Business Intelligence Flashcards
What is a Data Mart?
A subset of a data warehouse that supports the requirements of a particular department.
Characteristics:
Focuses only on the requirements of one department
Doesn’t contain detailed data
More easily understood and navigated
What are the reasons for creating it?
*Gives users access to the data they need to analyse most often
*Improves end-user response time
*Costs less to implement
*Uses less data, which simplifies ETL
What are the issues with data marts?
*Functionality
*Size
*Installation
*Load performance
*Internet access
What are the types of data mart?
DEPENDENT: created from a subset of the data in the DW; it doesn’t exist on its own. (Inmon’s Corporate Information Factory method)
INDEPENDENT: a small-scale DW that supports the requirements of a particular department. (Kimball’s method)
Advantages: easy to build, both organisationally and technically
Disadvantages: high ETL cost; no enterprise-wide business view available
What are the BI technologies?
Growth: as data warehouses grow, users need more powerful access tools that provide advanced analytical capabilities.
Two main types: OLAP and data mining. They differ in what they offer the user, which is why they are COMPLEMENTARY TECHNOLOGIES.
DW + OLAP/data mining = Business Intelligence technologies.
What is OLAP?
ONLINE ANALYTICAL PROCESSING: enables users to gain a deeper understanding of various aspects of their data through fast, consistent access to a wide variety of views of the data. It lets users view data through a model that better reflects its true dimensionality.
The types of analysis range from navigation and browsing to calculations and time-series analysis.
Name a few features and examples of OLAP applications.
JIT (just-in-time) information is computed data that reflects complex relationships and is calculated on the fly. The data model must be flexible, as the relationships may not be known in advance.
OLAP applications share the following features: multi-dimensional views of data, time intelligence and support for complex calculations.
Examples:
Finance: budgeting, financial performance
Sales: sales analysis
Production: production planning and defect analysis
What is multi-dimensional data?
Data that is better represented as a cube than as a 2-D relational table. We develop data cubes so that the relationships between the data along each dimension can be captured.
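As a rough illustration, a cube can be sketched as a mapping from dimension coordinates to a measure. The dimension names (product, region, quarter), the helper functions and the sales figures below are all invented for this example.

```python
from collections import defaultdict

# Hypothetical sales cube: (product, region, quarter) -> units sold.
# All names and figures are made up for illustration.
cube = {
    ("laptop", "north", "Q1"): 10,
    ("laptop", "south", "Q1"): 7,
    ("laptop", "north", "Q2"): 12,
    ("phone", "north", "Q1"): 20,
    ("phone", "south", "Q2"): 15,
}

DIMENSIONS = ("product", "region", "quarter")


def slice_cube(cube, **fixed):
    """Return the cells whose coordinates match the fixed dimension values."""
    positions = {DIMENSIONS.index(name): value for name, value in fixed.items()}
    return {
        coords: measure
        for coords, measure in cube.items()
        if all(coords[i] == v for i, v in positions.items())
    }


def total_by(cube, dimension):
    """Aggregate the measure along one dimension, summing out the others."""
    i = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for coords, measure in cube.items():
        totals[coords[i]] += measure
    return dict(totals)


north_q1 = slice_cube(cube, region="north", quarter="Q1")
by_product = total_by(cube, "product")
```

Fixing some dimensions gives a slice of the cube, while summing out dimensions gives a coarser view. Both are awkward to express against a single flat 2-D table.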
What is the multi-dimensional storage model?
Data is represented as a table of facts (such as sales) that is associated with other tables (DIMENSION TABLES) such as location and time.
The FACT TABLE holds the actual data (facts) and a foreign key to each dimension table. Each DIMENSION TABLE stores the attributes of its dimension, such as region or product details.
What are the multi-dimensional schemas?
The two most common schemas are STAR and SNOWFLAKE.
STAR: there is a fact table with a single table for each dimension, and the fact table contains a foreign key to each dimension table.
SNOWFLAKE: there is a fact table, but the dimension tables are organised into a HIERARCHY through normalisation. Each dimension can have its own dimensions.
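A minimal sketch of a star schema, using Python's built-in sqlite3 module with an in-memory database. The table names, column names and rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: one table per dimension (star schema).
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, region TEXT);

    -- Fact table: the measure plus a foreign key to each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        location_id INTEGER REFERENCES dim_location(location_id),
        amount INTEGER
    );

    -- Illustrative rows only.
    INSERT INTO dim_product VALUES (1, 'laptop'), (2, 'phone');
    INSERT INTO dim_location VALUES (1, 'north'), (2, 'south');
    INSERT INTO fact_sales VALUES (1, 1, 100), (1, 2, 50), (2, 1, 30), (2, 1, 20);
""")

# Join the fact table to its dimensions and total sales per region and product.
rows = conn.execute("""
    SELECT l.region, p.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_location AS l ON l.location_id = f.location_id
    GROUP BY l.region, p.name
    ORDER BY l.region, p.name
""").fetchall()
```

In a snowflake variant of this sketch, a dimension such as dim_location would itself hold a foreign key to a further normalised table (for example a hypothetical dim_region), at the cost of an extra join.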
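What are ROLLUP and CUBE?
Both are extensions of the SQL GROUP BY clause, used with the aggregate functions (SUM, COUNT, MIN, MAX and AVG). ROLLUP adds subtotals along the hierarchy of the grouping columns plus a grand total; CUBE adds subtotals for every combination of the grouping columns.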
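The difference between the two is easiest to see from the grouping sets each one produces. The sketch below imitates ROLLUP(region, product) and CUBE(region, product) with SUM in plain Python instead of calling a database; the rows, column names and helper functions are assumptions for illustration, and None stands in for the NULL that SQL puts in a subtotal row.

```python
from itertools import combinations

# Hypothetical sales rows: (region, product, amount).
sales = [
    ("north", "laptop", 100),
    ("north", "phone", 50),
    ("south", "laptop", 70),
]

COLUMNS = ("region", "product")


def aggregate(rows, grouping_set):
    """SUM(amount) grouped by the columns in grouping_set; others become None."""
    totals = {}
    for region, product, amount in rows:
        record = {"region": region, "product": product}
        key = tuple(record[c] if c in grouping_set else None for c in COLUMNS)
        totals[key] = totals.get(key, 0) + amount
    return totals


def rollup(rows):
    """ROLLUP: one grouping set per prefix of the column list."""
    result = {}
    for n in range(len(COLUMNS), -1, -1):
        result.update(aggregate(rows, COLUMNS[:n]))
    return result


def cube(rows):
    """CUBE: one grouping set per subset of the column list."""
    result = {}
    for n in range(len(COLUMNS), -1, -1):
        for grouping_set in combinations(COLUMNS, n):
            result.update(aggregate(rows, grouping_set))
    return result


rollup_result = rollup(sales)
cube_result = cube(sales)
```

Here rollup_result holds the detail rows, one subtotal per region and the grand total. cube_result additionally holds one subtotal per product, which ROLLUP(region, product) never produces.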
What is data mining?
The process of extracting valid, PREVIOUSLY UNKNOWN AND COMPREHENSIBLE INFORMATION from large databases and using it to make business decisions.
Involves the analysis of data and the use of software techniques to find hidden patterns and relationships in sets of data.
Reveals information that is hidden or unexpected, as there is little point in finding relationships that are already intuitive. These patterns are identified by examining the underlying rules and features in the data.
It can provide huge paybacks for companies that have made huge investments in a DW.
What are the Data Mining techniques?
1. PREDICTIVE MODELING: uses observations to form a model of the important characteristics of some phenomenon.
2. DATABASE SEGMENTATION: partitions the DB into an unknown number of segments of similar records.
3. LINK ANALYSIS: establishes links between records.
4. DEVIATION DETECTION: identifies outliers that express a deviation from previously known expectations.
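As a small illustration of deviation detection, the sketch below flags values that lie far from the mean of the sample, measured in standard deviations (a z-score). This is only one simple technique chosen for the example, not the only way to detect deviations; the function name, the threshold of 2.0 and the figures are assumptions.

```python
from statistics import mean, stdev


def find_outliers(values, threshold=2.0):
    """Return the values whose z-score exceeds the (assumed) threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical daily transaction totals with one unusual day.
daily_totals = [102, 98, 101, 99, 100, 103, 97, 250]
outliers = find_outliers(daily_totals)
```

With these figures only the value 250 is flagged. In practice the "previously known expectation" might come from a model or from historical norms instead of the sample's own mean.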
What are the applications of Data Mining?
Retail: market basket analysis, identifying patterns of buying
Banking: detecting patterns of fraudulent credit card use, identifying loyal customers
Insurance: claims analysis, predicting who will buy new policies
Medicine: characterising patient behaviour to predict surgery visits, identifying successful medical therapies for illnesses
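To make the retail example concrete, here is a minimal sketch of the first step of market basket analysis: counting how often pairs of items are bought together (their support). The baskets and the function name are invented for illustration; real tools go further and derive association rules from such counts.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "crisps"},
    {"bread", "butter"},
]


def pair_support(baskets):
    """Fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


support = pair_support(baskets)
```

In this made-up data, bread appears with milk in half of the baskets and with butter in half of the baskets, which is the kind of buying pattern a retailer would look for.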