Data warehousing Flashcards

1
Q

What is a data warehouse?

A. A storage area for operational data
B. A database optimized for transactional processing
C. A repository for historical and current data used for reporting and analysis
D. A file system for storing large amounts of unstructured data

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Which of the following is a characteristic of a data warehouse?

A. Real-time data updates
B. Transaction processing
C. Subject-oriented
D. Unstructured data storage

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is ETL in the context of data warehousing?

A. Extract, Transform, Load
B. Encrypt, Transfer, Load
C. Extract, Test, Log
D. Extract, Track, Learn

A

A

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is the purpose of OLAP in a data warehouse?

A. Online Link Analysis Processing
B. Online Analytical Processing
C. Online Automated Processing
D. Online Logical Access Processing

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Which schema is optimized for data warehouse query performance?

A. Star schema
B. Snowflake schema
C. ER diagram
D. Hierarchical schema

A

A

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

In data warehousing, a fact table consists of

A. Primary keys
B. Descriptive attributes
C. Dimension keys and measures
D. Foreign keys

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Which process involves combining data from multiple sources into a single, coherent data store?

A. Data cleaning
B. Data integration
C. Data staging
D. Data extraction

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is the purpose of a data mart?

A. To store detailed data for long-term analysis
B. To store data for operational purposes
C. To store summarized data for a specific group or department
D. To store unstructured data

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Which of the following is a key consideration when designing a data warehouse?

A. Minimizing data redundancy
B. Maximizing data normalization
C. Ensuring real-time data updates
D. Incorporating unstructured data

A

A

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Which component is responsible for providing query results to users in a data warehouse?

A. ETL Server
B. Data Warehouse Server
C. OLAP Server
D. Metadata Server

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What does the term “dimension” refer to in a data warehouse?

A. It refers to the measurement of data.
B. It represents a collection of facts.
C. It provides context to measures and attributes.
D. It represents the primary key in a fact table.

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Which data warehouse architecture involves integrating data from multiple sources before loading it into the warehouse?

A. Federated
B. Centralized
C. Hub-and-Spoke
D. Data Mart

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the purpose of metadata in a data warehouse?

A. To store primary data
B. To manage data security
C. To provide information about data characteristics
D. To encrypt data

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Which process involves identifying and correcting errors or inconsistencies in data?

A. Data integration
B. Data cleansing
C. Data aggregation
D. Data mining

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the primary function of a data warehouse manager?

A. Designing user interfaces for data access
B. Managing and maintaining data quality
C. Writing complex SQL queries
D. Developing machine learning algorithms

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What role does a star schema play in data warehousing?

A. It provides a visual representation of data relationships.
B. It facilitates data encryption in the warehouse.
C. It optimizes query performance by denormalizing tables.
D. It represents the relationships between dimensions.

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Which technique is used to summarize and aggregate data in a data warehouse?

A. Data mining
B. Roll-up
C. Data cleaning
D. Data staging

A

B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is the role of surrogate keys in a data warehouse?

A. They act as primary keys in the source system.
B. They are used for indexing data in OLAP.
C. They represent the actual business keys in a dimension table.
D. They uniquely identify rows in a table.

A

D

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is the purpose of a slowly changing dimension (SCD. in a data warehouse?

A. To track changes in data over time
B. To store data without any changes
C. To eliminate redundant data
D. To improve data loading speed

A

A

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Which technology is commonly used for data extraction in a data warehouse environment?

A. JDBC (Java Database Connectivity)
B. HTML (Hypertext Markup Language)
C. XML (Extensible Markup Language)
D. API (Application Programming Interface)

A

A

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What is the primary goal of data warehousing?

A. To ensure data security
B. To facilitate data mining
C. To support decision-making processes
D. To replace transactional databases

A

C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Which process involves transforming data into a consistent format for analysis in a data warehouse?

A. Data profiling
B. Data normalization
C. Data summarization
D. Data scrubbing

A

D

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is the purpose of a conformed dimension in a data warehouse?

A. To maintain data consistency across different data marts
B. To store unstructured data
C. To speed up data loading processes
D. To provide real-time analytics

24
Q

What is the purpose of aggregate tables in a data warehouse?

A. To store raw data from source systems
B. To manage data security permissions
C. To improve query performance by pre-calculating summarized data
D. To store metadata information

25
Which technique helps in optimizing query performance by precomputing results in a data warehouse? ## Footnote A. Data partitioning B. Data replication C. Indexing D. Materialized views
D
26
What is the primary role of OLAP cubes in a data warehouse? ## Footnote A. To store unstructured data B. To provide a graphical representation of data relationships C. To facilitate multidimensional analysis D. To perform real-time data updates
C
27
Which type of data warehouse schema is more normalized compared to others? ## Footnote A. Star schema B. Snowflake schema C. Fact constellation schema D. Galaxy schema
B
28
What is the primary challenge associated with real-time data warehousing? ## Footnote A. Ensuring data consistency B. Handling large volumes of historical data C. Managing data integration from multiple sources D. Maintaining data security
A
29
What is the role of a data warehouse administrator? ## Footnote A. Designing data models B. Writing ETL scripts C. Managing user access and permissions D. Developing front-end applications
C
30
Which process involves summarizing and condensing data from various sources in a data warehouse? ## Footnote A. Data profiling B. Data summarization C. Data aggregation D. Data scrubbing
C | .data integration is more accurate ## Footnote Aggregation is a specific process of combining and summarizing data at a higher level of granularity, often involving mathematical operations. Summarization is a broader term that encompasses various techniques to reduce the complexity or volume of data, including aggregation. Integration involves combining data from different sources to provide a unified view, ensuring consistency and coherence across the integrated datasets.
31
Which technology is commonly used for data presentation and reporting in a data warehouse? ## Footnote A. HTML B. OLAP C. XML D. JDBC
B
32
What does a snowflake schema represent in a data warehouse? ## Footnote A. A highly normalized data model B. A denormalized data model C. A hybrid data model D. A representation of metadata
A
33
Which of the following is a benefit of using a data warehouse? ## Footnote A. Real-time transaction processing B. Historical analysis and trend identification C. Unstructured data storage D. Limited data access for end-users
B
34
What is the purpose of a data warehouse metadata repository? ## Footnote A. To store raw data from source systems B. To manage data security permissions C. To provide information about data sources, structures, and relationships D. To perform real-time data updates
C
35
Which term refers to a database design technique that minimizes redundancy and dependency? ## Footnote A. Normalization B. Denormalization C. Aggregation D. Partitioning
A
36
What is the primary purpose of data cleansing in a data warehouse? ## Footnote A. To optimize query performance B. To improve data quality by correcting errors and inconsistencies C. To summarize data for reporting purposes D. To encrypt sensitive data
B
37
What is the primary role of a data warehouse architect? ## Footnote A. Managing user access and permissions B. Writing complex SQL queries C. Designing and implementing data warehouse structures D. Developing front-end applications
C
38
Which data warehouse schema is a combination of multiple star schemas? ## Footnote A. Snowflake schema B. Fact constellation schema C. Galaxy schema D. Hybrid schema
B,aka C
39
Which process involves loading data into a staging area before transferring it to a data warehouse? ## Footnote A. Data scrubbing B. Data integration C. Data staging D. Data mining
C
40
What is the role of drill-down in OLAP analysis? ## Footnote A. To analyze data at a higher level of aggregation B. To view detailed data from a summarized level C. To remove redundant data D. To optimize query performance
B
41
Which of the following is a challenge associated with data warehousing? ## Footnote A. Real-time data processing B. Data duplication C. Limited data storage capacity D. Minimal data integration
B
42
Which process involves consolidating, cleaning, and organizing data from different sources for analysis? ## Footnote A. Data normalization B. Data integration C. Data profiling D. Data summarization
B
43
What is the primary goal of a star schema in data warehousing? ## Footnote A. To optimize storage space B. To reduce data redundancy C. To enhance query performance D. To facilitate real-time updates
C
44
Which type of schema allows more normalized data structures? ## Footnote A. Star schema B. Snowflake schema C. Fact constellation schema D. Hybrid schema
B
45
What is the main advantage of using surrogate keys in a data warehouse? ## Footnote A. They prevent data duplication B. They enhance data security C. They simplify data modeling D. They provide a unique identifier for each row
D
46
Which method is used to extract data from multiple, disparate sources in data warehousing? ## Footnote A. API (Application Programming Interface) B. SQL (Structured Query Language) C. ETL (Extract, Transform, LoaD. D. XML (Extensible Markup Language)
C
47
What is the primary function of OLAP (Online Analytical Processing) in a data warehouse? ## Footnote A. To optimize data extraction B. To facilitate real-time updates C. To support transaction processing D. To perform complex multidimensional analysis
D
48
Which process involves transforming and summarizing data into smaller aggregated sets in data warehousing? ## Footnote A. Data profiling B. Data summarization C. Data aggregation D. Data scrubbing
C
49
What is the primary purpose of a slowly changing dimension (SCD. in a data warehouse? ## Footnote A. To manage rapidly changing data B. To track historical data changes over time C. To store static data without any changes D. To improve data loading performance
B
50
Which technology is commonly used to create OLAP cubes for multidimensional analysis? ## Footnote A. SQL Server B. Excel C. Oracle D. Microsoft Analysis Services
D
51
What is the function of OLTP (Online Transaction Processing) systems in relation to data warehousing? ## Footnote A. OLTP systems are used for long-term data storage. B. OLTP systems provide analytical capabilities. C. OLTP systems facilitate real-time transaction processing. D. OLTP systems optimize data retrieval for analysis.
C
52
Which type of join is most commonly used in data warehousing to link fact and dimension tables? ## Footnote A. Outer join B. Inner join C. Self join D. Cross join
B
53
What is the purpose of a data warehouse staging area? ## Footnote A. To store finalized, processed data for reporting B. To store raw data extracted from various sources temporarily C. To optimize data retrieval speed D. To facilitate real-time data updates
B
54
What is the role of a data warehouse analyst? ## Footnote A. Developing ETL processes B. Managing hardware infrastructure C. Writing SQL queries D. Analyzing and interpreting data for decision-making
D
55
What is the primary function of a data warehouse ETL (Extract, Transform, LoaD. tool? ## Footnote A. To manage user access and permissions B. To design and implement database structures C. To extract, transform, and load data into the warehouse D. To perform real-time analytics
C
56
Which type of data warehouse architecture involves multiple independent data marts connected to a central repository? ## Footnote A. Centralized B. Hub-and-Spoke C. Federated D. Distributed
B