Data warehousing Flashcards

1
Q

What is a data warehouse?

A. A storage area for operational data
B. A database optimized for transactional processing
C. A repository for historical and current data used for reporting and analysis
D. A file system for storing large amounts of unstructured data

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Which of the following is a characteristic of a data warehouse?

A. Real-time data updates
B. Transaction processing
C. Subject-oriented
D. Unstructured data storage

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is ETL in the context of data warehousing?

A. Extract, Transform, Load
B. Encrypt, Transfer, Load
C. Extract, Test, Log
D. Extract, Track, Learn

A

A

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is the purpose of OLAP in a data warehouse?

A. Online Link Analysis Processing
B. Online Analytical Processing
C. Online Automated Processing
D. Online Logical Access Processing

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Which schema is optimized for data warehouse query performance?

A. Star schema
B. Snowflake schema
C. ER diagram
D. Hierarchical schema

A

A

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

In data warehousing, a fact table consists of

A. Primary keys
B. Descriptive attributes
C. Dimension keys and measures
D. Foreign keys

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Which process involves combining data from multiple sources into a single, coherent data store?

A. Data cleaning
B. Data integration
C. Data staging
D. Data extraction

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is the purpose of a data mart?

A. To store detailed data for long-term analysis
B. To store data for operational purposes
C. To store summarized data for a specific group or department
D. To store unstructured data

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Which of the following is a key consideration when designing a data warehouse?

A. Minimizing data redundancy
B. Maximizing data normalization
C. Ensuring real-time data updates
D. Incorporating unstructured data

A

A

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Which component is responsible for providing query results to users in a data warehouse?

A. ETL Server
B. Data Warehouse Server
C. OLAP Server
D. Metadata Server

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What does the term “dimension” refer to in a data warehouse?

A. It refers to the measurement of data.
B. It represents a collection of facts.
C. It provides context to measures and attributes.
D. It represents the primary key in a fact table.

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Which data warehouse architecture involves integrating data from multiple sources before loading it into the warehouse?

A. Federated
B. Centralized
C. Hub-and-Spoke
D. Data Mart

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the purpose of metadata in a data warehouse?

A. To store primary data
B. To manage data security
C. To provide information about data characteristics
D. To encrypt data

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Which process involves identifying and correcting errors or inconsistencies in data?

A. Data integration
B. Data cleansing
C. Data aggregation
D. Data mining

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the primary function of a data warehouse manager?

A. Designing user interfaces for data access
B. Managing and maintaining data quality
C. Writing complex SQL queries
D. Developing machine learning algorithms

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What role does a star schema play in data warehousing?

A. It provides a visual representation of data relationships.
B. It facilitates data encryption in the warehouse.
C. It optimizes query performance by denormalizing tables.
D. It represents the relationships between dimensions.

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Which technique is used to summarize and aggregate data in a data warehouse?

A. Data mining
B. Roll-up
C. Data cleaning
D. Data staging

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is the role of surrogate keys in a data warehouse?

A. They act as primary keys in the source system.
B. They are used for indexing data in OLAP.
C. They represent the actual business keys in a dimension table.
D. They uniquely identify rows in a table.

A

D

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is the purpose of a slowly changing dimension (SCD. in a data warehouse?

A. To track changes in data over time
B. To store data without any changes
C. To eliminate redundant data
D. To improve data loading speed

A

A

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Which technology is commonly used for data extraction in a data warehouse environment?

A. JDBC (Java Database Connectivity)
B. HTML (Hypertext Markup Language)
C. XML (Extensible Markup Language)
D. API (Application Programming Interface)

A

A

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What is the primary goal of data warehousing?

A. To ensure data security
B. To facilitate data mining
C. To support decision-making processes
D. To replace transactional databases

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Which process involves transforming data into a consistent format for analysis in a data warehouse?

A. Data profiling
B. Data normalization
C. Data summarization
D. Data scrubbing

A

D

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is the purpose of a conformed dimension in a data warehouse?

A. To maintain data consistency across different data marts
B. To store unstructured data
C. To speed up data loading processes
D. To provide real-time analytics

A

A

24
Q

What is the purpose of aggregate tables in a data warehouse?

A. To store raw data from source systems
B. To manage data security permissions
C. To improve query performance by pre-calculating summarized data
D. To store metadata information

A

C

25
Q

Which technique helps in optimizing query performance by precomputing results in a data warehouse?

A. Data partitioning
B. Data replication
C. Indexing
D. Materialized views

A

D

26
Q

What is the primary role of OLAP cubes in a data warehouse?

A. To store unstructured data
B. To provide a graphical representation of data relationships
C. To facilitate multidimensional analysis
D. To perform real-time data updates

A

C

27
Q

Which type of data warehouse schema is more normalized compared to others?

A. Star schema
B. Snowflake schema
C. Fact constellation schema
D. Galaxy schema

A

B

28
Q

What is the primary challenge associated with real-time data warehousing?

A. Ensuring data consistency
B. Handling large volumes of historical data
C. Managing data integration from multiple sources
D. Maintaining data security

A

A

29
Q

What is the role of a data warehouse administrator?

A. Designing data models
B. Writing ETL scripts
C. Managing user access and permissions
D. Developing front-end applications

A

C

30
Q

Which process involves summarizing and condensing data from various sources in a data warehouse?

A. Data profiling
B. Data summarization
C. Data aggregation
D. Data scrubbing

A

C

.data integration is more accurate

Aggregation is a specific process of combining and summarizing data at a higher level of granularity, often involving mathematical operations.
Summarization is a broader term that encompasses various techniques to reduce the complexity or volume of data, including aggregation.
Integration involves combining data from different sources to provide a unified view, ensuring consistency and coherence across the integrated datasets.

31
Q

Which technology is commonly used for data presentation and reporting in a data warehouse?

A. HTML
B. OLAP
C. XML
D. JDBC

A

B

32
Q

What does a snowflake schema represent in a data warehouse?

A. A highly normalized data model
B. A denormalized data model
C. A hybrid data model
D. A representation of metadata

A

A

33
Q

Which of the following is a benefit of using a data warehouse?

A. Real-time transaction processing
B. Historical analysis and trend identification
C. Unstructured data storage
D. Limited data access for end-users

A

B

34
Q

What is the purpose of a data warehouse metadata repository?

A. To store raw data from source systems
B. To manage data security permissions
C. To provide information about data sources, structures, and relationships
D. To perform real-time data updates

A

C

35
Q

Which term refers to a database design technique that minimizes redundancy and dependency?

A. Normalization
B. Denormalization
C. Aggregation
D. Partitioning

A

A

36
Q

What is the primary purpose of data cleansing in a data warehouse?

A. To optimize query performance
B. To improve data quality by correcting errors and inconsistencies
C. To summarize data for reporting purposes
D. To encrypt sensitive data

A

B

37
Q

What is the primary role of a data warehouse architect?

A. Managing user access and permissions
B. Writing complex SQL queries
C. Designing and implementing data warehouse structures
D. Developing front-end applications

A

C

38
Q

Which data warehouse schema is a combination of multiple star schemas?

A. Snowflake schema
B. Fact constellation schema
C. Galaxy schema
D. Hybrid schema

A

B,aka C

39
Q

Which process involves loading data into a staging area before transferring it to a data warehouse?

A. Data scrubbing
B. Data integration
C. Data staging
D. Data mining

A

C

40
Q

What is the role of drill-down in OLAP analysis?

A. To analyze data at a higher level of aggregation
B. To view detailed data from a summarized level
C. To remove redundant data
D. To optimize query performance

A

B

41
Q

Which of the following is a challenge associated with data warehousing?

A. Real-time data processing
B. Data duplication
C. Limited data storage capacity
D. Minimal data integration

A

B

42
Q

Which process involves consolidating, cleaning, and organizing data from different sources for analysis?

A. Data normalization
B. Data integration
C. Data profiling
D. Data summarization

A

B

43
Q

What is the primary goal of a star schema in data warehousing?

A. To optimize storage space
B. To reduce data redundancy
C. To enhance query performance
D. To facilitate real-time updates

A

C

44
Q

Which type of schema allows more normalized data structures?

A. Star schema
B. Snowflake schema
C. Fact constellation schema
D. Hybrid schema

A

B

45
Q

What is the main advantage of using surrogate keys in a data warehouse?

A. They prevent data duplication
B. They enhance data security
C. They simplify data modeling
D. They provide a unique identifier for each row

A

D

46
Q

Which method is used to extract data from multiple, disparate sources in data warehousing?

A. API (Application Programming Interface)
B. SQL (Structured Query Language)
C. ETL (Extract, Transform, LoaD.
D. XML (Extensible Markup Language)

A

C

47
Q

What is the primary function of OLAP (Online Analytical Processing) in a data warehouse?

A. To optimize data extraction
B. To facilitate real-time updates
C. To support transaction processing
D. To perform complex multidimensional analysis

A

D

48
Q

Which process involves transforming and summarizing data into smaller aggregated sets in data warehousing?

A. Data profiling
B. Data summarization
C. Data aggregation
D. Data scrubbing

A

C

49
Q

What is the primary purpose of a slowly changing dimension (SCD. in a data warehouse?

A. To manage rapidly changing data
B. To track historical data changes over time
C. To store static data without any changes
D. To improve data loading performance

A

B

50
Q

Which technology is commonly used to create OLAP cubes for multidimensional analysis?

A. SQL Server
B. Excel
C. Oracle
D. Microsoft Analysis Services

A

D

51
Q

What is the function of OLTP (Online Transaction Processing) systems in relation to data warehousing?

A. OLTP systems are used for long-term data storage.
B. OLTP systems provide analytical capabilities.
C. OLTP systems facilitate real-time transaction processing.
D. OLTP systems optimize data retrieval for analysis.

A

C

52
Q

Which type of join is most commonly used in data warehousing to link fact and dimension tables?

A. Outer join
B. Inner join
C. Self join
D. Cross join

A

B

53
Q

What is the purpose of a data warehouse staging area?

A. To store finalized, processed data for reporting
B. To store raw data extracted from various sources temporarily
C. To optimize data retrieval speed
D. To facilitate real-time data updates

A

B

54
Q

What is the role of a data warehouse analyst?

A. Developing ETL processes
B. Managing hardware infrastructure
C. Writing SQL queries
D. Analyzing and interpreting data for decision-making

A

D

55
Q

What is the primary function of a data warehouse ETL (Extract, Transform, LoaD. tool?

A. To manage user access and permissions
B. To design and implement database structures
C. To extract, transform, and load data into the warehouse
D. To perform real-time analytics

A

C

56
Q

Which type of data warehouse architecture involves multiple independent data marts connected to a central repository?

A. Centralized
B. Hub-and-Spoke
C. Federated
D. Distributed

A

B