Business Intelligence (BI) And Business Analytics Flashcards

Question 1

Q

The ability to gather and make sense of information in the context of a business

Answer

A

Business intelligence

Question 2

Q

Purpose of business intelligence

Answer

A

Gain superior insight and understanding of the business and its ecosystem
Understand the past and the present -> predict the future
Make better decisions

Question 3

Q

Components of business intelligence

Answer

A

Storage (data warehouse/data marts)
Data mining tools (for business analytics)
Reporting and visualization tools (e.g. dashboards)

Question 4

Q

What is data mining?

Answer

A

The computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems

Question 5

Q

Customer segmentation

Answer

A

Who are the most valuable customers to a firm?

Question 6

Q

Marketing and promotion targeting

Answer

A

Identifying which customers will respond to each offer

Question 7

Q

Market basket analysis

Answer

A

Which products customers buy together, and how an organization can use this information to cross-sell more
A type of association rule mining that determines which products go together in a shopping cart at a retailer
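
As a rough illustration of the idea on this card, the sketch below counts how often each pair of products shows up in the same basket. The baskets and product names are made up for the example, and pair counting is only the simplest starting point, not a full association rule miner.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (illustrative data only).
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "cereal"],
    ["bread", "butter", "cereal"],
]

pair_counts = Counter()
for basket in baskets:
    # sorted(set(...)) gives every unordered pair one canonical form.
    for pair in combinations(sorted(set(basket)), 2):
        pair_counts[pair] += 1

top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)  # ('bread', 'butter') 3
```

The most frequent pair is a natural candidate for a cross-sell offer.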

Question 8

Q

Collaborative filtering

Answer

A

Personalizing an individual customer's experience based on trends and preferences exhibited by similar customers
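
A minimal sketch of this idea, assuming purchase histories are stored as sets and that "similar" means a high Jaccard overlap between those sets. The customer names, the data, and the choice of Jaccard similarity are all assumptions made for illustration; real systems use many other similarity measures.

```python
# Hypothetical purchase histories (illustrative data only).
purchases = {
    "ana": {"tent", "stove", "lantern"},
    "ben": {"tent", "stove", "boots"},
    "cy": {"novel", "lamp"},
}

def jaccard(a, b):
    """Overlap between two sets: size of intersection over size of union."""
    return len(a & b) / len(a | b)

def recommend(user, purchases):
    """Suggest items bought by the most similar other customer."""
    mine = purchases[user]
    scored = [
        (jaccard(mine, items), name)
        for name, items in purchases.items()
        if name != user
    ]
    _, nearest = max(scored)
    # Recommend what the nearest customer bought that this user has not.
    return sorted(purchases[nearest] - mine)

print(recommend("ana", purchases))  # ['boots']
```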

Question 9

Q

Customer churn

Answer

A

Which customers are more likely to leave and which retention strategies are most likely to succeed

Question 10

Q

Fraud detection

Answer

A

Uncover patterns consistent with criminal activity

Question 11

Q

Financial modeling

Answer

A

Building trading systems that adapt to historical trends, or risk models that identify the customers most likely to default on credit

Question 12

Q

Five classes of data mining tasks

Answer

A

Association detection (can be supervised or unsupervised)
Clustering (unsupervised)
Classification (supervised)
Regression (supervised)
Anomaly/outlier detection

Question 13

Q

Unsupervised data mining

Answer

A

Analysts do NOT create the model before running analysis
Apply data mining technique and observe results
Hypothesis created AFTER analysis as explanation for results
Ex. Cluster analysis, cluster creation for collaborative filtering

Question 14

Q

Supervised data mining

Answer

A

Model developed BEFORE analysis
Statistical techniques used to estimate parameters
Ex. Classification, regression analysis

Question 15

Q

Association rules mining

Answer

A

Determine which behaviors/outcomes go together
Find relationships among attributes in data that frequently occur together
Ex. Products bought together, symptoms and illnesses that manifest together

Question 16

Q

Product affinities

Answer

A

Likelihood of two or more products being sold together

Question 17

Q

Support (association rule evaluation)

Answer

A

How often do these things appear together?
The probability that things will occur together
s{product1, product2}
s(LHS & RHS)

Question 18

Q

Confidence (association rule evaluation)

Answer

A

Given LHS, how often do we see RHS?
P(RHS|LHS)
s(LHS & RHS) / s(LHS)
Asymmetric: c(LHS->RHS) generally differs from c(RHS->LHS)

Question 19

Q

Lift (association rule evaluation)

Answer

A

How often does LHS appear with RHS, compared to how often chance would predict RHS would occur anyway?
The ratio of observed support to the expected support assuming the events are independent
c(LHS->RHS) / s(RHS)
(s(LHS & RHS) / s(LHS)) / s(RHS)
>1 indicates positive correlation (co-occurrence is more likely than chance)
Approximately 1 indicates almost no correlation (the events are close to independent)
<1 indicates negative correlation (co-occurrence is less likely than chance)
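
The three association rule measures (support, confidence, lift) can be computed directly from a list of transactions. The sketch below assumes each transaction is a set of item names; the data and function names are hypothetical and chosen only to mirror the formulas on the cards.

```python
# Hypothetical transactions: each set is one shopping basket.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(lhs, rhs, baskets):
    """P(RHS | LHS) = s(LHS & RHS) / s(LHS)."""
    return support(set(lhs) | set(rhs), baskets) / support(lhs, baskets)

def lift(lhs, rhs, baskets):
    """Confidence of LHS -> RHS divided by the support of RHS."""
    return confidence(lhs, rhs, baskets) / support(rhs, baskets)

print(support({"bread", "butter"}, transactions))                 # 0.5
print(round(confidence({"bread"}, {"butter"}, transactions), 2))  # 0.67
print(confidence({"butter"}, {"bread"}, transactions))            # 1.0
print(round(lift({"bread"}, {"butter"}, transactions), 2))        # 1.33
```

In this toy data the two confidence values differ (0.67 vs 1.0), which shows the asymmetry, while lift is the same in both directions and is above 1, so bread and butter appear together more often than chance would predict.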

Question 20

Q

Complementary products lift

Answer

A

Greater than 1

Question 21

Q

Substitute products lift

Answer

A

Less than 1

Question 22

Q

Cluster analysis

Answer

A

Similar records (or characteristics) are grouped together
Does not rely on predefined categories (labels, groups) - records grouped together on the basis of self-similarity (unsupervised data mining)
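
To make the grouping by self-similarity concrete, here is a small sketch using k-means on one numeric attribute. k-means is one common clustering technique and is not named on the card; the spending figures, the starting centers, and the function name are assumptions for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then move each center to its group's mean."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical monthly spend per customer; no labels are supplied.
spend = [20, 22, 25, 200, 210, 220]
centers, clusters = kmeans_1d(spend, centers=[20, 220])
print(clusters)  # [[20, 22, 25], [200, 210, 220]]
```

No categories were defined in advance; the low-spend and high-spend groups emerge from the data, and the analyst interprets them afterwards.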

Question 23

Q

Classification

Answer

A

Arrange the data into predefined groups (supervised data mining)

Question 24

Q

Recursive partitioning

Answer

A

A technique for building a decision tree by repeatedly splitting records into subgroups until the desired level of purity is reached

Question 25

Q

Purity of a subgroup

Answer

A

The proportion of its records that belong to the same class
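
Purity and recursive partitioning can be sketched together. The example below assumes purity means the share of records in the subgroup's most common class, and it splits on a single numeric attribute only. The data, the function names, and the split-scoring rule (size-weighted purity) are illustrative assumptions, not a specific textbook algorithm.

```python
from collections import Counter

def purity(labels):
    """Share of records in the subgroup that belong to its most common class."""
    return Counter(labels).most_common(1)[0][1] / len(labels)

def partition(rows, min_purity=1.0):
    """Recursively split (value, label) rows until each subgroup is pure enough."""
    labels = [label for _, label in rows]
    values = sorted({value for value, _ in rows})
    # Stop when the subgroup is pure enough or cannot be split any further.
    if purity(labels) >= min_purity or len(values) < 2:
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    best = None
    # Candidate thresholds are midpoints between adjacent distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [row for row in rows if row[0] <= threshold]
        right = [row for row in rows if row[0] > threshold]
        # Score a split by the size-weighted purity of its two subgroups.
        score = (len(left) * purity([l for _, l in left])
                 + len(right) * purity([l for _, l in right])) / len(rows)
        if best is None or score > best[0]:
            best = (score, threshold, left, right)
    _, threshold, left, right = best
    return {
        "threshold": threshold,
        "left": partition(left, min_purity),
        "right": partition(right, min_purity),
    }

# Hypothetical records: (customer age, bought the product?).
rows = [(22, "no"), (25, "no"), (30, "no"), (41, "yes"), (45, "yes"), (52, "yes")]
tree = partition(rows)
print(purity(["yes", "yes", "yes", "no"]))  # 0.75
print(tree)
```

Here a single split at age 35.5 already produces two perfectly pure subgroups, so the recursion stops after one level.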

Question 26

Q

Error rate

Answer

A

Percent of misclassified records out of the total records in the validation data

Question 27

Q

Positive Predicted, Positive Actual

Answer

A

True positive

Question 28

Q

Positive predicted, negative actual

Answer

A

False positive

Question 29

Q

Negative predicted, positive actual

Answer

A

False negative

Question 30

Q

Negative predicted, negative actual

Answer

A

True negative

Question 31

Q

N (sum of predictions)

Answer

A

TP + FP + FN + TN

Question 32

Q

Accuracy

Answer

A

How often is the classifier correct?
(TP+TN)/N
1 - error rate

Question 33

Q

Misclassification rate (how often is it incorrect?)

Answer

A

1 - accuracy

Question 34

Q

Precision

Answer

A

When it predicts positive, how often is it correct?

TP/(TP+FP)
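
The confusion-matrix counts and the ratios built on them (N, accuracy, misclassification rate, precision) can be checked with a few lines of code. The labels below are made-up validation data, with 1 standing for the positive class; the function name is hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, FP, FN, TN) for paired actual and predicted labels."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

# Hypothetical validation data.
actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]

tp, fp, fn, tn = confusion_counts(actual, predicted)
n = tp + fp + fn + tn
accuracy = (tp + tn) / n
misclassification_rate = 1 - accuracy
precision = tp / (tp + fp)

print(tp, fp, fn, tn)          # 3 1 1 3
print(accuracy)                # 0.75
print(misclassification_rate)  # 0.25
print(precision)               # 0.75
```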

Question 35

Q

Specificity

Answer

A

When the actual class is negative, how often does it predict negative?

TN/(TN+FP)

Question 36

Q

When might a model with greater total error be chosen?

Answer

A

When the cost of one kind of misclassification is unacceptably high
Ex. Credit default, computer network intrusion, national security risk, presence of cancer
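
One way to see why a model with more total errors might still be preferred is to weight each kind of error by its cost. The error counts and cost figures below are invented for illustration, as is the function name; in practice the costs would come from the business context.

```python
def total_cost(fp, fn, cost_fp, cost_fn):
    """Total misclassification cost given per-error costs."""
    return fp * cost_fp + fn * cost_fn

# Hypothetical validation results for two screening models on the same records.
model_a = {"fp": 5, "fn": 10}   # 15 errors in total
model_b = {"fp": 30, "fn": 2}   # 32 errors in total

# Assumed costs: a false alarm means a follow-up test, a miss is far more serious.
COST_FP = 1
COST_FN = 50

cost_a = total_cost(model_a["fp"], model_a["fn"], COST_FP, COST_FN)
cost_b = total_cost(model_b["fp"], model_b["fn"], COST_FP, COST_FN)
print(cost_a, cost_b)  # 505 130
```

Model B makes more than twice as many errors, yet under these assumed costs it is the cheaper choice because it misses far fewer positive cases.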