Data Warehousing and Business Intelligence Management (Including Big Data) Flashcards
The definition of _______________________: is the planning, implementation, and control processes to provide decision support data and support knowledge workers engaged in reporting, query, and analysis.
Data Warehousing and Business Intelligence
What are the goals of Data Warehousing and Business Intelligence?
- To build and maintain the technical environment and technical and business processes needed to deliver
integrated data in support of operational functions, compliance requirements, and business intelligence
activities. - To support and enable effective business analysis and decision making by knowledge workers.
What are two models and approaches to data warehousing
Inmon and Kimball
One of the component parts of Inmon’s definition of a data warehouse includes: ___________________ where the data warehouse is organized based on major business entities, rather than focusing on a functional or application.
subject oriented
A component part of Inmon’s definition of a data warehouse includes: ___________________ where the data warehouse unified and cohesive. The same key structures, encoding and decoding of structures, data definitions, naming conventions are applied consistently throughout the warehouse
Integrated
A component part of Inmon’s definition of a data warehouse includes: ___________________ where the data warehouse stores data as it exists in a set point in time. Records in the DW are like snapshots. Each one reflects the state of the data at a moment of time. This means that querying data based on a specific time period will always produce the same result, regardless of when the query
is submitted.
Time variant
In Inmon’s DW, __________________ , describes records are not normally updated as they are in operational systems. Instead, new data is appended to existing data. A set of records may represent different states of the same transaction.
non volatiles (records)
When warehouses were first established, cost and space considerations drove the need to ____________ data. This data can be persistent (stored in a table) or non-persistent (rendered in a view) in contemporary DW environments. The deciding factor in whether to persist data is usually performance.
Summarize
The focus of operational systems is current data. Warehouses contain __________ data as well. Often they house vast amounts of it.
Historical
in the context of the Corporate Information Factory (CIF):
__________ perform operational processes. Pulls from these systems bring data into the data warehouse and the operational data stores (ODS) where it can be analyzed.
Applications
in the context of the Corporate Information Factory (CIF): ________ is a database that stands between the operational source databases and the target databases. It is where the extract, transform, and load effort takes place. It is not used by end users. Most data in this area is transient, although typically there is some relatively small amount of persistent data.
Staging Area
in the context of the Corporate Information Factory (CIF):
In the _________ layer, data from disparate sources is transformed
so that it can be integrated into the standard corporate representation / model in the DW and ODS.
integration
An ______________ is integrated database of operational data. It may be sourced directly from applications or from other databases. These databases generally contain current or near term data
(30-90 days), while a DW contains historical data as well (often several years of data). Data in this database is volatile, while warehouse data is stable. Not all organizations use these. They evolved as to meet the need for low latency data. These sometimes serve as the primary source for a data warehouse; it may
also be used to audit a data warehouse.
ODS Operational Data Store
_____________ provide data prepared for analysis. This data is often a sub-set of warehouse data designed to support particular kinds of analysis or a specific group of data consumers. For example, these can aggregate data to support faster analysis. Dimensional modeling (using denormalization techniques) is often used to design user-oriented data marts.
Data marts
An ________ is a data mart focused on tactical decision support. It is
sourced directly from an ODS, rather than from a DW. It shares characteristics of the ODS: it contains current or near-term data. Its contents are volatile.
OpDM Operational Data Mart
The _________ provides a single integration point for corporate data to support management decision-making, and strategic analysis and planning. The data flows into this from the application systems and ODS, and flows out to the data marts, usually in one direction only. Data that needs correction is rejected, corrected at its source, and ideally re-fed through the system.
DW Data Warehouse