Business Intelligence (BI) And Business Analytics Flashcards

1
Q

The ability to gather and make sense of information in the context of a business

A

Business intelligence

2
Q

Purpose of business intelligence

A

Gain superior insight and understanding of the business and its ecosystem
Understand the past and the present -> predict the future
Make better decisions

3
Q

Components of business intelligence

A

Storage (data warehouse/data marts)
Data mining tools (for business analytics)
Reporting and visualization tools (e.g. Dashboards)

4
Q

What is data mining?

A

The computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems

5
Q

Customer segmentation

A

Identifying which customers are the most valuable to a firm

6
Q

Marketing and promotion targeting

A

Identifying which customers will respond to each offer

7
Q

Market basket analysis

A

Which products customers buy together, and how an organization can use this information to cross-sell more
A type of association rule mining that determines which products go together in a shopping cart at a retailer
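
As a rough illustration of the idea, the sketch below counts how often each pair of products lands in the same cart. The basket contents and the `pair_counts` helper are made up for this example; real market basket tools go further and score the pairs with support, confidence, and lift.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping carts; each set is one transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

def pair_counts(transactions):
    """Count how many transactions contain each unordered product pair."""
    counts = Counter()
    for basket in transactions:
        # sorted() gives every pair one canonical order, e.g. ("bread", "butter")
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

Here bread and butter share three of the four carts, which makes them the obvious cross-sell candidate in this toy data.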

8
Q

Collaborative filtering

A

Personalizing an individual customer's experience based on trends and preferences exhibited by similar customers
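
One simple way to picture this is a nearest-neighbor recommendation. The sketch below is an assumption-heavy toy, not a production recommender: the purchase histories are invented, and Jaccard overlap is just one of several possible similarity measures.

```python
# Hypothetical purchase histories: customer -> set of products bought.
purchases = {
    "ana": {"tent", "stove", "lantern"},
    "ben": {"tent", "stove", "sleeping bag"},
    "cy": {"novel", "bookmark"},
}

def jaccard(a, b):
    """Overlap between two sets: shared items / all items (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(customer, histories):
    """Suggest items bought by the most similar other customer."""
    own = histories[customer]
    others = [name for name in histories if name != customer]
    nearest = max(others, key=lambda name: jaccard(own, histories[name]))
    # Recommend what the nearest neighbor bought that this customer has not.
    return sorted(histories[nearest] - own)

print(recommend("ana", purchases))
```

Ana's history overlaps most with Ben's, so she is offered the one item Ben bought that she has not.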

9
Q

Customer churn

A

Which customers are more likely to leave and which retention strategies are most likely to succeed

10
Q

Fraud detection

A

Uncover patterns consistent with criminal activity

11
Q

Financial modeling

A

Building trading systems that adapt to historical trends, or risk models that identify the customers most likely to default on credit

12
Q

Five classes of data mining tasks

A

Association detection (can be both)
Clustering (unsupervised)
Classification (supervised)
Regression (supervised)
Anomaly/outlier detection
13
Q

Unsupervised data mining

A

Analysts do NOT create the model before running analysis
Apply data mining technique and observe results
Hypothesis created AFTER analysis as explanation for results
Ex. Cluster analysis, cluster creation for collaborative filtering

14
Q

Supervised data mining

A

Model developed BEFORE analysis
Statistical techniques used to estimate parameters
Ex. Classification, regression analysis
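
A minimal sketch of the supervised workflow, assuming the analyst has already chosen a straight-line model y = intercept + slope * x: ordinary least squares then estimates the two parameters from the data. The spend and sales figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares estimates for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: advertising spend (x) and resulting sales (y).
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(spend, sales)
print(intercept, slope)
```

The model form comes first; the data only fills in the parameter values, which is the opposite order from the unsupervised case.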

15
Q

Association rules mining

A

Determine which behaviors/outcomes go together
Find relationships among attributes in data that frequently occur together
Ex. Products bought together, symptoms and illnesses that manifest together

16
Q

Product affinities

A

Likelihood of two or more products being sold together

17
Q

Support (association rule evaluation)

A

How often do these things appear together?
The probability that things will occur together
s({product1, product2})
s(LHS & RHS)
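
Support can be computed as the share of transactions that contain every item in the set. The sketch below assumes each transaction is a Python set; the carts and the `support` helper are illustrative only.

```python
# Hypothetical transactions; each set is one shopping cart.
transactions = [
    {"chips", "salsa"},
    {"chips", "salsa", "soda"},
    {"chips"},
    {"soda"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)  # subset test
    return hits / len(transactions)

print(support({"chips", "salsa"}, transactions))  # 2 of 4 carts
```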

18
Q

Confidence (association rule evaluation)

A

Given LHS, how often do we see RHS?
P(RHS|LHS)
s(LHS & RHS) / s(LHS)
Asymmetric: c(LHS->RHS) is generally not equal to c(RHS->LHS)
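
The asymmetry is easiest to see with numbers. In the sketch below (hypothetical carts, helper names chosen for this example), every cart with salsa also has chips, but not every cart with chips has salsa.

```python
# Hypothetical transactions; each set is one shopping cart.
transactions = [
    {"chips", "salsa"},
    {"chips", "salsa", "soda"},
    {"chips"},
    {"soda"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(lhs, rhs, transactions):
    """Estimate P(RHS | LHS) as support(LHS and RHS) / support(LHS)."""
    both = set(lhs) | set(rhs)
    return support(both, transactions) / support(lhs, transactions)

chips_to_salsa = confidence({"chips"}, {"salsa"}, transactions)  # 0.5 / 0.75
salsa_to_chips = confidence({"salsa"}, {"chips"}, transactions)  # 0.5 / 0.5
print(chips_to_salsa, salsa_to_chips)
```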

19
Q

Lift (association rule evaluation)

A

How often does LHS appear with RHS, compared to how often chance would predict RHS would occur anyway?
The ratio of observed support to the expected support assuming the events are independent
c(LHS->RHS) / s(RHS)
(s(LHS & RHS) / s(LHS)) / s(RHS)
> 1 indicates positive correlation (co-occurrence is more likely than chance)
Approximately 1 indicates almost no correlation (the events are close to independent)
< 1 indicates negative correlation (co-occurrence is less likely than chance)

20
Q

Complementary products lift

A

Greater than 1

21
Q

Substitute products lift

A

Less than 1
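
To make the complement and substitute cases concrete, the sketch below computes lift from raw transaction counts. All counts are hypothetical, and the product names are only there to suggest one pair that tends to be bought together and one pair that tends to replace each other.

```python
def lift(n_both, n_lhs, n_rhs, n_total):
    """Lift of LHS -> RHS computed from raw transaction counts."""
    support_both = n_both / n_total
    support_lhs = n_lhs / n_total
    support_rhs = n_rhs / n_total
    confidence = support_both / support_lhs
    return confidence / support_rhs

# Hypothetical counts out of 100 transactions.
chips_salsa = lift(n_both=16, n_lhs=40, n_rhs=20, n_total=100)  # complements
cola_brands = lift(n_both=5, n_lhs=50, n_rhs=40, n_total=100)   # substitutes
print(round(chips_salsa, 2), round(cola_brands, 2))
```

The chips and salsa pair comes out at about 2 (bought together twice as often as chance would predict), while the two cola brands come out at about 0.25.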

22
Q

Cluster analysis

A

Similar records (or characteristics) are grouped together
Does not rely on predefined categories (labels, groups) - records grouped together on the basis of self-similarity (unsupervised data mining)
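
The card does not name a specific algorithm; k-means is one common choice, sketched here in one dimension. The spend figures and starting centers are assumptions for illustration. Note that no labels are supplied: the groups emerge from the values alone.

```python
def k_means_1d(values, centers, iterations=10):
    """Cluster numbers around `centers` with a basic k-means loop."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical annual spend per customer.
spend = [10, 12, 11, 95, 100, 105]
centers, groups = k_means_1d(spend, centers=[0, 50])
print(centers, groups)
```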
23
Q

Classification

A

Arrange the data into predefined groups (supervised data mining)

24
Q

Recursive partitioning

A

A technique for creating a decision tree: records are split repeatedly until the subgroups reach the desired level of purity
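
A minimal sketch of the idea on a single numeric attribute. It is a simplification under stated assumptions: purity is taken as the majority-class share, whereas practical tree builders usually use measures such as Gini impurity or entropy, and the (income, outcome) records are hypothetical.

```python
from collections import Counter

def purity(labels):
    """Share of records that belong to the most common class."""
    return Counter(labels).most_common(1)[0][1] / len(labels)

def build_tree(records, min_purity=1.0):
    """Recursively split (value, label) records on a numeric threshold.

    Returns a class label for a leaf, or (threshold, left, right) for a split.
    """
    labels = [label for _, label in records]
    values = sorted({value for value, _ in records})
    if purity(labels) >= min_purity or len(values) < 2:
        return Counter(labels).most_common(1)[0][0]

    def weighted_purity(threshold):
        left = [l for v, l in records if v <= threshold]
        right = [l for v, l in records if v > threshold]
        return (len(left) * purity(left) + len(right) * purity(right)) / len(records)

    # Candidate thresholds: midpoints between neighboring distinct values,
    # so both sides of every candidate split are non-empty.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = max(candidates, key=weighted_purity)
    left = [r for r in records if r[0] <= best]
    right = [r for r in records if r[0] > best]
    return (best, build_tree(left, min_purity), build_tree(right, min_purity))

def predict(tree, value):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if value <= threshold else right
    return tree

# Hypothetical (income in thousands, outcome) records.
records = [(20, "default"), (25, "default"), (30, "default"),
           (40, "repay"), (60, "repay")]
tree = build_tree(records)
print(tree, predict(tree, 28))
```

Lowering `min_purity` stops the recursion earlier and gives a smaller tree.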

25
Q

Purity of a subgroup

A

The proportion of its records that belong to the same class

26
Q

Error rate

A

Percent of misclassified records out of the total records in the validation data
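
As a small worked example, the sketch below compares predicted and actual classes on a hypothetical validation set of five records and reports the misclassified fraction (multiply by 100 for a percentage).

```python
def error_rate(actual, predicted):
    """Fraction of validation records whose predicted class is wrong."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

# Hypothetical validation labels and model predictions.
actual = ["spam", "ham", "ham", "spam", "ham"]
predicted = ["spam", "ham", "spam", "spam", "ham"]
print(error_rate(actual, predicted))  # 1 of 5 records misclassified
```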

27
Q

Positive Predicted, Positive Actual

A

True positive

28
Q

Positive predicted, negative actual

A

False positive

29
Q

Negative predicted, positive actual

A

False negative

30
Q

Negative predicted, negative actual

A

True negative
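
The four outcomes above can be tallied directly from paired predictions and actual values. This sketch assumes boolean labels where True means positive; the sample data is made up.

```python
def confusion_counts(actual, predicted):
    """Tally TP, FP, FN, TN for boolean labels (True = positive)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["TP"] += 1      # predicted positive, actually positive
        elif p and not a:
            counts["FP"] += 1      # predicted positive, actually negative
        elif not p and a:
            counts["FN"] += 1      # predicted negative, actually positive
        else:
            counts["TN"] += 1      # predicted negative, actually negative
    return counts

# Hypothetical validation data.
actual = [True, True, False, False, True, False]
predicted = [True, False, True, False, True, False]
counts = confusion_counts(actual, predicted)
print(counts)
```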

31
Q

N (sum of predictions)

A

TP + FP + FN + TN

32
Q

Accuracy

A

How often is the classifier correct?
(TP+TN)/N
1 - error rate
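
A quick numeric check of the two formulas, using hypothetical confusion-matrix counts that sum to 100:

```python
def accuracy(tp, fp, fn, tn):
    """(TP + TN) / N, where N is the total number of predictions."""
    n = tp + fp + fn + tn
    return (tp + tn) / n

# Hypothetical counts from a validation set of 100 records.
tp, fp, fn, tn = 50, 10, 5, 35
acc = accuracy(tp, fp, fn, tn)
err = (fp + fn) / (tp + fp + fn + tn)  # misclassified share
print(acc, err)
```

With these counts, 85 of 100 predictions are correct and 15 are wrong, so accuracy and error rate sum to 1.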

33
Q

Misclassification rate (how often is it incorrect?)

A

1 - accuracy

34
Q

Precision

A

When it predicts positive, how often is it correct?

TP/(TP+FP)
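
A short sketch of the formula with hypothetical counts. Returning 0.0 when the model never predicts positive is a convention assumed here; strictly speaking, precision is undefined in that case.

```python
def precision(tp, fp):
    """TP / (TP + FP); 0.0 by convention if nothing was predicted positive."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0

# Hypothetical counts: 50 correct and 10 incorrect positive predictions.
print(precision(tp=50, fp=10))
```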

35
Q

Specificity

A

When the actual class is negative, how often does it predict negative?

TN/(TN+FP)

36
Q

When might a model with greater total error be chosen?

A

When the cost of one kind of misclassification is unacceptably high
Ex. Credit default, computer network intrusion, national security risk, presence of cancer
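
One way to reason about this trade-off is to weight each error type by its cost. In the sketch below, every number is hypothetical: the error counts for the two models and the relative costs of a false positive and a false negative are assumptions chosen to show the effect.

```python
def total_cost(fp, fn, cost_fp, cost_fn):
    """Cost-weighted sum of false positives and false negatives."""
    return fp * cost_fp + fn * cost_fn

# Hypothetical validation results for two models.
model_a = {"fp": 20, "fn": 10}  # 30 errors in total
model_b = {"fp": 60, "fn": 2}   # 62 errors in total

# Hypothetical costs: a missed positive is 50 times worse than a false alarm.
COST_FP = 1
COST_FN = 50

cost_a = total_cost(model_a["fp"], model_a["fn"], COST_FP, COST_FN)
cost_b = total_cost(model_b["fp"], model_b["fn"], COST_FP, COST_FN)
print(cost_a, cost_b)
```

Model B makes roughly twice as many errors, yet its total cost is far lower because it rarely commits the expensive kind.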