Business Intelligence and Big Data (Week 8) Flashcards
What is Business Intelligence (BI)?
Tools and processes for analysing internal and external data to support decision-making
Examples of BI
–> Data Warehousing
–> Dashboards
–> OLAP (Online Analytical Processing)
–> Data mining / analytics
Availability of data
Data can be sourced from:
–> social media
–> ‘open data’, e.g. Google Maps
–> ‘smart’ devices connected to the internet
What drives BI?
–> Implementing performance management systems
–> Adhering to new regulations
–> Prioritizing Customer Relationship Management (CRM) and personalised marketing
–> Adapting to market trends like globalization and mergers
–> Embracing digital business, marketing, and social media
–> Fostering a data-driven organizational culture focused on analytics
What is Corporate Performance Management (CPM)?
Performance management that aids strategic decisions. Also includes processes and tech for measuring and monitoring performance
What is Customer Relationship Management (CRM)?
Systems that manage customer interactions and maximise the customer lifetime value for the firm
Analytical CRM
Analysis of customer data to provide insights or models to optimise aspects of our customer relationships
e.g. which customer segment to target for a retention campaign
Operational CRM
Systems supporting customer-facing processes
e.g. call-centres and customer service support
What is Regulatory Compliance?
Rules that firms have to abide by. Digital tech allows regulatory bodies to monitor and manage compliance more efficiently
What is Data Warehousing?
A database of copied transaction data that is then analysed to aid decision-making
Subject-Orientated Data Warehousing
Organised around the major subjects of an enterprise (e.g. customers, products, and sales) rather than the major application areas (e.g. customer invoicing, stock control, and order processing)
Integrated Data Warehousing
Combines data from different sources to keep a consistent and unified perspective for analysis. (Centralised and cross-functional)
Time-Variant Data Warehousing
Data is stored with a time reference (e.g. a timestamp) so that the historical record stays accurate
Non-Volatile Data Warehousing
New data is always added as a supplement to the data warehouse, rather than as a replacement
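The time-variant and non-volatile properties can be illustrated with a minimal sketch. The `warehouse` list, the `load_snapshot` helper, and the customer data are all hypothetical; a real warehouse would be a database, not a Python list.

```python
from datetime import date

# Hypothetical warehouse table: each row carries the date it was loaded.
warehouse = []

def load_snapshot(rows, load_date):
    """Append rows with a load date; existing rows are never overwritten."""
    for row in rows:
        warehouse.append({**row, "load_date": load_date})

load_snapshot([{"customer": "C1", "city": "Leeds"}], date(2023, 1, 1))
# The customer moves: the new record supplements the old one.
load_snapshot([{"customer": "C1", "city": "York"}], date(2023, 6, 1))

# Both versions remain available for historical analysis.
history = [r["city"] for r in warehouse if r["customer"] == "C1"]
print(history)
```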
What is Extract, Transform, Load (ETL)?
Tools that set up and configure an automated system that regularly updates the data warehouse
E (Extract): pull data from the source systems
T (Transform): clean and convert the data into the warehouse's format
L (Load): write the transformed data into the data warehouse
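The three ETL steps can be sketched as three small functions. This is only an illustration: the source rows, field names, and cleaning rules are assumptions, and real ETL tools read from and write to actual systems.

```python
# Hypothetical source rows, as they might be exported from an operational system.
source_rows = [
    {"order_id": "1", "amount": "19.99", "country": "uk"},
    {"order_id": "2", "amount": "5.00", "country": "UK "},
    {"order_id": "3", "amount": "", "country": "fr"},  # missing amount
]

def extract(rows):
    """Extract: read the rows from the source system (here, a list)."""
    return list(rows)

def transform(rows):
    """Transform: drop incomplete rows, convert types, standardise codes."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip rows with no amount
        cleaned.append({
            "order_id": int(row["order_id"]),
            "amount": float(row["amount"]),
            "country": row["country"].strip().upper(),
        })
    return cleaned

def load(rows, warehouse):
    """Load: append the transformed rows to the warehouse table."""
    warehouse.extend(rows)
    return warehouse

warehouse = load(transform(extract(source_rows)), [])
print(warehouse)
```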
What is Big Data?
Data that has high:
–> Volume (size of data)
–> Velocity (how fast new data is generated)
–> Variety (many different forms)
What is a Data Lake?
A store of data kept in its original (raw) format because it has the potential to be analysed and used for decision-making
What is Hadoop?
Open-source framework that has become popular for distributed storage and parallel processing of massive amounts of data
What is Online Analytical Processing (OLAP)?
Interactive analysis of large volumes of data from multiple dimensions
What are the 3 Elements of Multi-Dimensional Analysis?
–> Dimensions: perspectives from which to analyse data
e.g. time, product, geography, etc
–> Hierarchies within a dimension: level of detail
e.g. geography: world region – country – city – shop
–> Measures: numeric values such as units, revenue, cost, etc.,
or values derived from them, e.g. profit
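The three elements can be seen together in a small sketch. The records, field names, and the two-level geography hierarchy are made up for illustration.

```python
# Hypothetical fact records: dimension values plus numeric values (measures).
facts = [
    {"year": 2012, "product": "X", "city": "Leeds", "shop": "A",
     "revenue": 100.0, "cost": 60.0},
    {"year": 2012, "product": "Y", "city": "Leeds", "shop": "B",
     "revenue": 80.0, "cost": 50.0},
]

# A hierarchy within the geography dimension, from coarse to fine.
geography_hierarchy = ["city", "shop"]

# A derived value computed from the stored ones.
for fact in facts:
    fact["profit"] = fact["revenue"] - fact["cost"]

total_profit = sum(f["profit"] for f in facts)
print(total_profit)
```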
What is Drilling?
Navigating through a dimension hierarchy to desired level of detail
What is Drilling Down?
Go down the hierarchy or introduce extra dimension
e.g. –> Total sales
–> Total sales per city
–> Total sales per city per shop
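The drill-down sequence above can be sketched as grouping by successively more levels of the hierarchy. The `sales` records and the `total_by` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records with a city -> shop hierarchy.
sales = [
    {"city": "Leeds", "shop": "A", "amount": 100},
    {"city": "Leeds", "shop": "B", "amount": 50},
    {"city": "York", "shop": "C", "amount": 70},
]

def total_by(records, levels):
    """Sum 'amount' grouped by the given hierarchy levels."""
    totals = defaultdict(int)
    for r in records:
        key = tuple(r[level] for level in levels)
        totals[key] += r["amount"]
    return dict(totals)

print(total_by(sales, []))                # total sales
print(total_by(sales, ["city"]))          # total sales per city
print(total_by(sales, ["city", "shop"]))  # total sales per city per shop
```

Reading the three calls in reverse order gives the opposite navigation: each step removes a level and returns a more aggregated total.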
What is Drilling Up?
Climb up hierarchy or reduce dimensions
(e.g. get the measure at a more aggregated level)
What is Drilling Across?
Within same dimension select another attribute value
e.g. After viewing the results for 2011, change the selection to 2012
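A minimal sketch of drilling across, assuming made-up sales records: the view (sales per city) stays the same and only the selected year changes.

```python
# Hypothetical sales records with a year dimension.
sales = [
    {"year": 2011, "city": "Leeds", "amount": 100},
    {"year": 2011, "city": "York", "amount": 70},
    {"year": 2012, "city": "Leeds", "amount": 120},
    {"year": 2012, "city": "York", "amount": 90},
]

def sales_per_city(records, year):
    """Same view (sales per city), parameterised by the selected year."""
    result = {}
    for r in records:
        if r["year"] == year:
            result[r["city"]] = result.get(r["city"], 0) + r["amount"]
    return result

view_2011 = sales_per_city(sales, 2011)
view_2012 = sales_per_city(sales, 2012)  # same dimension, different value
print(view_2011, view_2012)
```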
What is Slicing?
Take horizontal or vertical cut of cube, i.e. restrict one dimension
e.g. –> Sales data for product X
–> Sales data for shop A
What is Dicing?
Restricting two or more dimensions
e.g. Sales data for products X and Y, in shops A and B, during the summer
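Slicing and dicing differ only in how many dimensions are restricted, which a single filter can show. The records and the `restrict` helper are assumptions for illustration.

```python
# Hypothetical sales records with product, shop, and season dimensions.
sales = [
    {"product": "X", "shop": "A", "season": "summer", "amount": 10},
    {"product": "Y", "shop": "B", "season": "summer", "amount": 20},
    {"product": "X", "shop": "C", "season": "summer", "amount": 30},
    {"product": "Z", "shop": "A", "season": "winter", "amount": 40},
    {"product": "Y", "shop": "A", "season": "winter", "amount": 50},
]

def restrict(records, **allowed):
    """Keep records whose value for each named dimension is in the allowed set."""
    return [
        r for r in records
        if all(r[dim] in values for dim, values in allowed.items())
    ]

# Slice: one dimension restricted (sales data for product X).
slice_x = restrict(sales, product={"X"})

# Dice: several dimensions restricted at once.
dice = restrict(sales, product={"X", "Y"}, shop={"A", "B"}, season={"summer"})

print(sum(r["amount"] for r in slice_x))
print(sum(r["amount"] for r in dice))
```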
Disadvantages of using OLAP
–> Inefficient to manually investigate 10,000s of data records
–> No prediction for the future,
e.g. of customer attrition (customers lost)
What is Data Mining/ Analytics?
Applying computational techniques to find interesting patterns or derive a predictive model
What is Predictive Analytics?
Using past data to predict future outcomes for individuals based on observable variables
3 tasks:
–> Classification
–> Regression or estimation
–> Forecasting
What is Descriptive Analytics?
Identifying and describing patterns present in the data
Via:
–> Association Analysis
–> Segmentation/ clustering
Predictive Analytics: Classification
Use input variables to classify subject into one of two or more predefined target classes
(e.g. predict whether individual customer will be good or bad payer)
Example Models: Decision tree, Scorecard
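A scorecard can be sketched as points added per input variable and compared with a cut-off. Every variable, point value, and the cut-off below is made up; a real scorecard is derived from past data.

```python
CUTOFF = 50  # hypothetical cut-off score

def score(applicant):
    """Add up points for each input variable (all point values are made up)."""
    points = 0
    points += 30 if applicant["owns_home"] else 0
    points += 20 if applicant["years_in_job"] >= 2 else 5
    points += 25 if applicant["missed_payments"] == 0 else 0
    return points

def classify(applicant):
    """Assign one of two predefined target classes."""
    return "good payer" if score(applicant) >= CUTOFF else "bad payer"

alice = {"owns_home": True, "years_in_job": 5, "missed_payments": 0}
bob = {"owns_home": False, "years_in_job": 1, "missed_payments": 2}
print(classify(alice), classify(bob))
```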
Predictive Analytics: Regression/ estimating
Predict value of a continuous (numeric) target variable
(e.g. profit in GBP, loss, etc.)
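One common regression technique is a least-squares straight line, sketched here. The spend and profit figures are invented, and the single input variable is an assumption; the cards above do not name a specific regression model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: advertising spend vs. profit in GBP.
spend = [1.0, 2.0, 3.0, 4.0]
profit = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(spend, profit)
predicted = intercept + slope * 5.0  # predicted profit for a spend of 5
print(intercept, slope, predicted)
```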
Predictive Analytics: Forecasting
Regression over time-series data
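A minimal sketch of forecasting as regression over time: the time index is the input variable, and the fitted trend is extended into future periods. The monthly figures and the `forecast_next` helper are hypothetical, and a straight-line trend is only one of many possible models.

```python
def forecast_next(series, steps=1):
    """Fit a straight-line trend to (t, value) pairs and extrapolate."""
    n = len(series)
    ts = range(n)  # time index 0, 1, ..., n-1
    mean_t = sum(ts) / n
    mean_y = sum(series) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, series))
    sxx = sum((t - mean_t) ** 2 for t in ts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    # Predict the periods that follow the last observation.
    return [intercept + slope * (n - 1 + k) for k in range(1, steps + 1)]

monthly_sales = [100.0, 110.0, 120.0, 130.0]
forecast = forecast_next(monthly_sales, steps=2)
print(forecast)
```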
Descriptive Analytics: Association
Detect frequently occurring patterns of items in a large transaction database
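A simple form of association analysis counts how often pairs of items appear together and keeps the frequent ones. The baskets and the support threshold are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transaction database: one set of items per basket.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
]

# Count every pair of items that occurs together in a basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

MIN_SUPPORT = 0.5  # hypothetical threshold: pair is in at least half the baskets

# Support = share of baskets that contain the pair.
frequent_pairs = {
    pair: count / len(transactions)
    for pair, count in pair_counts.items()
    if count / len(transactions) >= MIN_SUPPORT
}
print(frequent_pairs)
```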
Descriptive Analytics: Segmentation/ Clustering
Identify clusters or segments of homogeneous subjects
(e.g. having similar values for a series of variables)
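One widely used clustering technique is k-means, sketched here on a single variable. The spend values, the starting centres, and the choice of k-means itself are assumptions; the cards above do not name a specific algorithm.

```python
def kmeans_1d(values, centres, iterations=10):
    """Basic k-means on one numeric variable, starting from given centres."""
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for v in values:
            # Assign each value to its nearest centre.
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [
            sum(c) / len(c) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return centres, clusters

# Hypothetical annual spend per customer, in GBP.
spend = [10, 12, 11, 95, 100, 105]
centres, clusters = kmeans_1d(spend, centres=[0.0, 50.0])
print(centres, clusters)
```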
Types of Big Data Analytics
–> Text Mining
–> Image Processing
–> Social Network Analytics