Exam 2 - Chapter 4 Flashcards
Guest speaker
2 questions
THE ROLE OF THE ANALYST
- Transform data into ______
Transform data into information
THE ROLE OF THE ANALYST
- Master the ______ and _____ needed to provide insights from data
Master the technologies and tools needed to provide insights from data
THE ROLE OF THE ANALYST
- _____ a deep understanding of the ________ _______ and _______
Maintain a deep understanding of the business environment and requirements
THE ROLE OF THE ANALYST
- _____ ______ with data engineers and provide them with _______ and ____ ___
Maintain relationships with data engineers and provide them with requirements and data needs
THE ROLE OF THE ANALYST
- Be able to select the _______ _______ method based on ______ _____ and ______ ______
Be able to select the appropriate analytical method based on business objectives and data structures
pyramid of the role of the analyst
top - the business - receiver of information and knowledge
middle - the analysts - providers of methodology
bottom - the data warehouse - provider of data
Requirements for analysts can be summed up into 3 major competencies:
- Business Competencies
- Method Competencies
- Data Competencies
Business Competencies
Understand the business requirements and desired insights
Method Competencies
Understand and be able to apply the correct analytical tools and procedures
Data Competencies
Understand the data you have, the data you need, and how to bridge the gap
SELECTING THE ANALYTICAL METHOD
Question 1-3
Question 1: Determine with the process owner whether the quantitative analytical competencies or the data manager and report developer competencies are required.
Question 2: Determine whether hypothesis‐driven analytics, or data‐driven analytics can be expected to render the best decision support.
Question 3: Determine whether the data‐driven method has the objective of examining the correlation between one given dependent variable and a large number of other variables, or whether the objective is to identify different kinds of structures in data.
EXAMPLES OF DATA-ORIENTED COMPETENCIES
- Ad Hoc Reports
- Manual Reports
- Automated Reports
- On Demand
- Event Driven
- Self-service Reporting
HYPOTHESIS TESTING (STATISTICAL METHODS) – EXAMPLE QUESTIONS
Comparing data to a benchmark
Comparing data to each other
Are there specific factors about the line that impact the average fill level of the bottles?
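To illustrate "comparing data to a benchmark", here is a minimal sketch of a one-sample t statistic for bottle fill levels. The fill values, the 500 ml benchmark, and the function name `one_sample_t` are assumptions made up for illustration; the source does not name a specific test.

```python
import math
import statistics

def one_sample_t(sample, benchmark):
    """Return the t statistic for the null hypothesis mean(sample) == benchmark."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (mean - benchmark) / (sd / math.sqrt(n))

# Hypothetical fill levels in ml from one bottling line
fills = [497.0, 499.0, 498.0, 500.0, 496.0]
t = one_sample_t(fills, 500.0)  # negative t: the line fills below the benchmark on average
```

The statistic would then be compared against a t distribution with n - 1 degrees of freedom to decide whether the difference from the benchmark is larger than chance would explain.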
DATA MINING STEPS WITH TARGET VARIABLES
Step 1: Create the models
Step 2: Evaluate the models
Step 3: Use the models on new data
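The three steps above can be sketched with a very small classifier. This example uses a nearest-centroid model purely for brevity (the source does not name a model), and the customer data, labels, and function names are hypothetical.

```python
import math

def fit_centroids(rows, labels):
    """Step 1: create the model (one mean vector per class)."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(model, row):
    """Assign the class whose centroid is closest in Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(model, key=lambda label: dist(model[label], row))

def accuracy(model, rows, labels):
    """Step 2: evaluate the model on held-out labeled data."""
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)

# Hypothetical customers: (monthly_usage_hours, support_calls)
train_X = [(40, 1), (35, 0), (5, 6), (8, 5)]
train_y = ["stay", "stay", "churn", "churn"]
test_X = [(38, 2), (6, 7)]
test_y = ["stay", "churn"]

model = fit_centroids(train_X, train_y)
score = accuracy(model, test_X, test_y)
new_prediction = predict(model, (10, 4))  # Step 3: use the model on new, unlabeled data
```

Keeping the evaluation data separate from the training data is the point of Step 2: accuracy measured on the rows used to build the model would overstate how well it handles new customers.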
Classifies objects
Classifies objects into a set of pre-specified object classes (or categories) based on the values of relevant object attributes (features) and objects’ class labels
BENEFITS OF CLASSIFICATION
Identifying the class from a single attribute or a small number of attributes (e.g., gender, age) is manageable for human decision makers, but not when the number of attributes or the number of instances is large. Estimating or predicting the class or category of an action recipient supports time- and cost-effective decision making.
MOTIVATING BUSINESS QUESTIONS, COSTS AND BENEFITS
- How do we identify mobile phone service CUSTOMERS who are likely to churn (switch to another carrier)?
MOTIVATING BUSINESS QUESTIONS, COSTS AND BENEFITS
Churn or not: classes of customers
Which customers: identified by customers’ attribute-value information – e.g., age, income, gender, services subscribed, service utilization, etc.
MOTIVATING BUSINESS QUESTIONS, COSTS AND BENEFITS
- so what?
Potential actions – increase or decrease customer service for customers likely to churn
Costs – increased service cost or loss of loyal customers
Benefits – reduced churn rate or service cost
DATA MINING WITH NO TARGET VARIABLES
- Data reduction models: Principal Component Analysis (PCA)
- Clustering models: K-Means
- Market basket models: Association Rule Mining
An object (e.g., a customer) has a list of variables (e.g., attributes of a customer such as age, spending, gender etc.)
When measuring similarity between objects we measure similarity between _____ of objects.
When measuring similarity between objects we measure similarity between variables of objects.
object
- We use a distance function to ____ [1 sentence] _____ [2 sentence]
We use a distance function to measure dissimilarity between variables. Thus, the greater the distance between any two objects, the more dissimilar they are.
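As a small illustration, the sketch below uses Euclidean distance, which is one common choice; the source does not say which distance function is intended. The customers and their (age, monthly spending) values are hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Distance between two objects given as equal-length lists of numeric variables."""
    if len(a) != len(b):
        raise ValueError("objects must have the same number of variables")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical customers described by (age, monthly_spending)
alice = (30, 200)
bob = (33, 204)
carol = (60, 50)

d_ab = euclidean_distance(alice, bob)    # small distance: similar customers
d_ac = euclidean_distance(alice, carol)  # large distance: dissimilar customers
```

When variables are on very different scales, they are usually standardized first so that one variable does not dominate the distance.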
A distance matrix:
A distance matrix can be created with objects as the row and column indexes and the distances between objects as the elements.
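A minimal sketch of such a matrix, stored as a dictionary of dictionaries keyed by object name. Euclidean distance and the customer values are assumptions chosen for illustration.

```python
import math

def distance_matrix(objects):
    """Return matrix[i][j] = Euclidean distance between objects i and j."""
    names = list(objects)
    matrix = {}
    for i in names:
        matrix[i] = {}
        for j in names:
            matrix[i][j] = math.sqrt(
                sum((x - y) ** 2 for x, y in zip(objects[i], objects[j]))
            )
    return matrix

# Hypothetical customers described by (age, monthly_spending)
customers = {"alice": (30, 200), "bob": (33, 204), "carol": (60, 50)}
m = distance_matrix(customers)
```

The diagonal is zero (each object is at distance 0 from itself) and the matrix is symmetric, so only the entries above or below the diagonal carry distinct information.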