Data warehousing Flashcards
what is data warehousing
collection of data that helps analysts to make informed decisions in an organisation, allow us to analyse data in a multi-dimensional space, results in data mining and data generalisation
what kind of data do data warehouses store
very large store of historical data
features of data warehouses
subject oriented
integrated
time - variant
non - volatile
what does subject oriented mean
provides information around a subject rather than ongoing operations. focuses on modelling and analysis of data. examples of this are product, customers, suppliers, sales, revenue etc.
what does integrated mean
constructed by integrating data from many different sources such as relation databases or flat files
time - variant
data collected is identified with a particular time period e.g. in the last 12 months
non - volatile
data is not erased when new data is added
what is different in a Data warehouse over a database (only data warehouse points)
Queries are often complex and present a general form of data.
it is a database but there is no frequent updating
OLAP query only needs read-only access to stored data
Doesn’t require transaction processing, recovery and concurrency controls as its stored separately from the operational database
what is different in an operational database over a database (only operational database points)
Built for well-known tasks and work such as searching particular records and indexing.
Support concurrent processing of multiple transactions (OLTP)
Concurrency control and recovery mechanisms are needed.
Allows to read and modify operations
maintains current data - goes through frequent changes on a daily basis
Types of data warehouse
Information processing
Analytical processing
Data mining
What is information processing
allows processing of data stored in it
can be processed by querying, basic statistical analysis, reporting using cross tabs, tables, charts or graphs
What is Analytical processing
supports analytical processing of the data within it by means of OLAP operations
these are: slice and dice, drill down, drill up and pivot
What is data mining
support knowledge discovery by finding hidden patterns and associates.
Constructing analytical models, performing classification and prediction.
Results can be presented using visualisation tools
What is Enterprise Data Warehouse(EDW)
Services entire enterprise
What might an EDW environment have
An EDW(enterprise data warehouse)
an operational datastore
physical and virtual data marts