Data warehousing Flashcards

1
Q

what is data warehousing

A

collection of data that helps analysts to make informed decisions in an organisation, allow us to analyse data in a multi-dimensional space, results in data mining and data generalisation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

what kind of data do data warehouses store

A

very large store of historical data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

features of data warehouses

A

subject oriented
integrated
time - variant
non - volatile

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

what does subject oriented mean

A

provides information around a subject rather than ongoing operations. focuses on modelling and analysis of data. examples of this are product, customers, suppliers, sales, revenue etc.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

what does integrated mean

A

constructed by integrating data from many different sources such as relation databases or flat files

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

time - variant

A

data collected is identified with a particular time period e.g. in the last 12 months

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

non - volatile

A

data is not erased when new data is added

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

what is different in a Data warehouse over a database (only data warehouse points)

A

Queries are often complex and present a general form of data.
it is a database but there is no frequent updating
OLAP query only needs read-only access to stored data
Doesn’t require transaction processing, recovery and concurrency controls as its stored separately from the operational database

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

what is different in an operational database over a database (only operational database points)

A

Built for well-known tasks and work such as searching particular records and indexing.
Support concurrent processing of multiple transactions (OLTP)
Concurrency control and recovery mechanisms are needed.
Allows to read and modify operations
maintains current data - goes through frequent changes on a daily basis

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Types of data warehouse

A

Information processing
Analytical processing
Data mining

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is information processing

A

allows processing of data stored in it

can be processed by querying, basic statistical analysis, reporting using cross tabs, tables, charts or graphs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is Analytical processing

A

supports analytical processing of the data within it by means of OLAP operations
these are: slice and dice, drill down, drill up and pivot

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is data mining

A

support knowledge discovery by finding hidden patterns and associates.
Constructing analytical models, performing classification and prediction.
Results can be presented using visualisation tools

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is Enterprise Data Warehouse(EDW)

A

Services entire enterprise

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What might an EDW environment have

A

An EDW(enterprise data warehouse)
an operational datastore
physical and virtual data marts

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

what are data marts

A

they are used by individual departments or groups

use a dimensional data model to build

17
Q

what is an operational data store

A

Subject oriented system, optimised for looking up 1 or 2 records at a time for decision making.
allows access to more current data more quickly
a hybrid form of data warehouse containing integrated information

18
Q

Is a data warehouse OLAP or OLTP

A

OLAP

19
Q

Is a operational database OLAP or OLTP

A

OLTP

20
Q

True or false? Historical processing of information is done by a Data warehouse

A

True, operational databases do day to day processing

21
Q

does a data warehouse or operational database analyse the business

A

A data warehouse analyses the business and operational databases run the business

22
Q

which would be larger in size, a data warehouse or an operational database

A

a datawarehouse(100GB - 100TB)

23
Q

what does the top tier include

A

Data mining, reporting tools for analysis and business intelligence.
Query tools, reporting tools, analysis or data mining tools

24
Q

what does the middle tier include

A

OLAP server
used to provide information to business analysts and managers
interacts with the bottom tier and passes on insights to the top tier tools

25
Q

what does the bottom tier include

A

database server extracts data from different sources using a gateway
data sources fed into the bottom tier can include operational databases and other types of front end data

26
Q

functions of data warehouse tools

A
data extraction
data cleaning
data transformation
data losing
refreshing
27
Q

what is refreshing

A

updating data sources to the warehouse

28
Q

data losing

A

sorting, summarising, consolidating, checking the integrity and building indices and patterns

29
Q

data transformation

A

convert the data to warehouse format

30
Q

data cleaning

A

finding and correcting errors in data

31
Q

data extraction

A

gather data from many different sources

32
Q

metadata

A

data about data which defines the data warehouse

used for building, maintaining and managing the data warehouse

33
Q

Fact tables

A

contain measurements of business(sales, purchase orders, shipment)
can be quite large(50 billion records, 1-5 terabytes)

34
Q

dimension tables

A

store the descriptions of the dimensions of the business (product, customer, vendor, store)
smaller than fact tables

35
Q

what are data warehouses used for

A

analysis and separate to the day-to-day operational database