5.2 Quiz: Data Warehousing & Business Intelligence Flashcards
A good datawarehouse schema to track events is
a star schema.
a snowflake schema.
a factless fact table.
a fact constellation.
a factless fact table.
Star Schema:
Definition: A data warehouse schema design where a central fact table is connected to dimension tables in a star-like structure.
Characteristics: Simple, with one main fact table at the center and dimension tables surrounding it. Facilitates efficient querying and reporting.
Snowflake Schema:
Definition: A variation of the star schema where dimension tables are normalized into multiple related tables, creating a snowflake-like shape.
Characteristics: More normalized than a star schema, which can save storage space. However, it may introduce additional complexity in querying.
Factless Fact Table:
Definition: A fact table that lacks measures (numeric values) but captures the relationships between dimensions.
Use Cases: Useful for scenarios where you want to record events or associations without numerical data, such as tracking attendance or relationships between entities.
Fact Constellation:
Definition: A data warehouse design with multiple fact tables connected to shared dimension tables.
Characteristics: Involves multiple star or snowflake schemas sharing common dimensions. Suitable for complex analytical scenarios where different types of facts need to be analyzed together.
A data warehouse is integrated because information from different functional domains within an organization are brought together with external data in one centralized system.
True
False
True
Which OLAP operation is shown below?
2x8x4 -> 2x3x2
Dice
Roll-up
Slice
Drill-down
Dice
OLTP databases are usually normalized in order to optimize query performance. This is necessary to support users with quick results for complex queries. A normalized structure will improve the speed with which information can be retrieved from the database.
True
False
False
normalized to reduce redundancy and improve data integrity
denormalization= For systems where quick query performance is a priority,
Analytics is an example of
verification-based Business Intelligence.
discovery-oriented Business Intelligence.
discovery-oriented Business Intelligence.
An independent datawarehouse pulls data directly from the datawarehouse.
True
False
False ->indepent directly
dependent -> directly
Hybrid OLAP (HOLAP) tries to combine the best of both MOLAP and ROLAP. An RDBMS can then be used to store the detailed data in a relational data warehouse whereas the pre-computed aggregated data can be kept as a multidimensional array managed by a MDBMS.
True
False
True
In a star schema, the foreign key in a dimension table, refers to a single record in the fact table.
True
False
False;corresponds to multiple records
As the name suggests, a star schema has one large central fact table which is connected to various smaller dimension tables. As illustrated in this figure, the fact table has multiple foreign keys referring to each of the dimension tables, implementing a 1:N relationship type. The primary key of the fact table consists of the composition of all these foreign keys.
Which statement about Hybrid OLAP (HOLAP) is CORRECT?
Hybrid OLAP (HOLAP) tries to combine the best of both MOLAP and ROLAP.
In HOLAP, an RDBMS can be used to store the detailed data in a relational data warehouse whereas the pre-computed aggregated data can be kept as a multidimensional array managed by a MDBMS.
In HOLAP , the OLAP analysis can first start from the multidimensional database. If more detail is needed (e.g., during drill-down), the analysis can shift to the relational database.
All statements are correct.
All statements are correct.
Which statement about data marts is NOT CORRECT?
A data mart is a scaled down version of a data warehouse aimed at meeting the information needs of a homogeneous small group of end-users such as a department or business unit (e.g., marketing, finance, logistics, HR, etc.).
A data mart typically contains some form of aggregated data and is used as the primary source for report generation and analysis by this end-user group.
Data marts provide focused content such as finance, sales or accounting information, in a format tailored to the user group at hand.
Data marts can also improve query performance by offloading complex queries, and therefore workload, from other data sources (e.g., a data warehouse).
All statements are correct.
All statements are correct.
A snowflake schema
normalizes the dimension tables.
denormalizes the dimension tables.
normalizes the dimension tables.
Consider the following statement : a datawarehouse always contains up to date information. This statement is
True
False
false
A data warehouse is a large, centralized repository of data that is used for reporting and analysis. Unlike operational databases (OLTP databases), which are designed for transactional processing and often contain real-time or near-real-time data, data warehouses are typically updated periodically, such as daily, weekly, or monthly.
Consider the following statement: datawarehouses are usually designed at the logical level since there is no conceptual data model (e.g., ER, UML) available. This statement is
True
False
True
In case front-end tools such as reporting applications and data mining software get direct access to the operational databases, i.e. without the use of a true datawarehouse and/or OLAP server, we denote this as a “virtual datawarehouse”.
True
False
True
The non-volatile characteristic of datawarehouse implies that the data is
read-only.
write-only.
both read-only and write-only.
read-only
Independent data marts are standalone systems, drawing data directly from the operational systems, external sources or a combination of both.
True
False
True
OLAP is an example of verification based Business Intelligence (BI).
True
False
True
What statement about Multidimensional OLAP (MOLAP) is NOT CORRECT?
MOLAP can be fast in terms of data retrieval but it needs more storage to accomplish this.
MOLAP scales easily when the number of dimensions increases.
MDBMSs make use of proprietary data structures and data manipulation languages (DML), so no universal SQL-like standard is provided for data handling, which impedes their adoption.
MDBMSs are not optimized for transaction processing. Updating, inserting or deleting data is usually quite inefficient.
MOLAP scales easily when the number of dimensions increases.
Operational systems focus on
INSERT, UPDATE, DELETE and/or SELECT statements.
only INSERT statements.
only UPDATE statements.
only SELECT statements.
only DELETE statements.
INSERT, UPDATE, DELETE and/or SELECT statements.
In a star schema, every foreign key out of the set of foreign keys forming the composed key of a fact table, refers to one dimension table.
True
False
True
In OLAP, accessing information from 2 or more connected fact tables is an example of
drill-across
drill-up
drill-down
roll-across
drill-across
Drill-Across: This operation involves navigating through different fact tables to combine data from multiple sources. It allows users to analyze data from different dimensions and levels across separate but related fact tables.
An Operational Data Store (ODS) provides no query facilities.
True
False
False
An Operational Data Store (ODS) typically does provide query facilities. An ODS is a database that is used for short-term storage and integration of operational data from various sources. It acts as an intermediate storage and processing layer between the operational systems and the data warehouse.
A datawarehouse is non-volatile because the records are never overwritten.
True
False
True
One part of the ETL process consists in cleaning, integrating and transforming the input data. For most organizations, the design and implementation of these ETL functionalities requires a lot of effort, given that most data sources in organizations not integrated and contain many errors.
True
False
True