ADBS - Data Warehousing Flashcards
What is data warehousing?
Data warehousing is a set of hardware and software components used to better analyse data to make better business decisions.
What are the benefits of data warehousing?
- Understand business trends and make better forecasting decisions
- Bring better products to market in a more timely manner
- Analyse daily sales information and make quick decisions that can significantly affect your company’s performance
ROI - return on investment can range between 40% and 600%
Competitive Advantage - discover trends, customers and demands
Increase in productivity from decision makers - contains integrated database of consistent, subject oriented, historical data
Transforms data into meaningful information
Name 3 applications of data warehousing
- Sales and marketing analysis across all industries
- Inventory turn and product tracking in manufacturing
- Category management, vendor analysis, and marketing program effectiveness analysis in retail
- Profitability analysis or risk assessment in banking
- Claims analysis or fraud detection in insurance
What is a data warehouse?
A data warehouse is a subject-oriented, integrated, non-volatile, and time-variant collection of data in support of management’s decisions. (Inmon05)
What is subject oriented?
data stored in a warehouse is about the major subjects of an enterprise (such as customers, products, and sales)
What is integrated?
source data for a warehouse is from different enterprise-wide applications systems. The data is often inconsistent, for example, in different formats. The source data must be integrated into a consistent format in order to present a unified view of the data to users.
What is non-volatile?
New data is always added as a supplement to the database, rather than a replacement.
The database continually absorbs this new data, incrementally integrating it with the previous data.
What is time-variant?
data in a warehouse is only accurate and valid at some point in time or over some time interval.
What is Operational Data?
data used to run business. This data is normally stored, retrieved, and updated by an Online Transactional Processing (OLTP) system. An OLTP system may be, for example, a reservations system, an accounting application, or an order entry application.
What are the characteristics of operational data?
Updated often and through online transactions
Non-historical data (not more than three to six months old)
Optimised for transactional processing
Highly normalised in the relational database for easy update, maintenance, and integrity
What is informational data?
Informational data is typically stored in a format that makes analysis much easier. Analysis can be in the form of decision support (queries), report generation, executive information systems, and more in-depth statistical analysis.
What are the characteristics of informational data?
Summarised operational data
De-normalised and replicated data
Infrequently updated from the operational systems
Optimised for decision support applications
Possibly “read only” (no updates allowed)
Stored on separate systems to lessen impact on operational systems
What are differences between operational data and informational data
Operational Data
current, detailed, dynamic, repetitive processing, transaction driven, application oriented, day to day, serves clerical/operational users
Informational
historical data, summarised, static data, ad hoc unstructured heuristic, analysis driven, subject oriented, managerial, strategic
Name 10 problems of data warehousing
Underestimate resources for data loading Hidden problems with source systems Required data not captured Increased end user demands data homogenisation high demand for resources - space data ownership - sensitivity is lost high maintenance time to implement difficulty in integration
Describe the architecture of a data warehouse
Operational data, ODS, Load Manager, Warehouse Manager, Query manager,