Week 4 - RFM (Logistic Regression) Flashcards

1
Q

Examples of Large Databases

A
  1. Online transactions (e-commerce)
    - Amazon: 300 million customer accounts
  2. Web browsing/click stream data
  3. Purchases at department/grocery/convenience stores
    - Albert Heijn: 16 million transactions per week
  4. Subscription data
    - Netflix: 200 million-plus subscribers
2
Q

Why Data Mining?

A
  1. Lots of data being collected
  2. Computers and technology are now cheaper and more powerful
  3. Gain a competitive edge
  4. Discover “hidden” info in the data
3
Q

Data in the real world is dirty. Why?

A

Missing values
Errors
Discrepancies

4
Q

Major Tasks in Data Preprocessing

A

Data cleaning: dealing with missing values, inconsistencies
Data integration: integration of multiple databases
Data transformation (e.g. date and time formats)
Data reduction: reduced volume, same result

5
Q

What is data mining for?

A
  1. Pattern Discovery
    - finding new, useful patterns in datasets
  2. Relationship Analysis
    - uncovering unexpected relationships and summarizing them
6
Q

Examples of Database Marketing Applications

A

Predicting customer response
* Likelihood of future purchase
* Likelihood of churn
* Marketing effectiveness
Market Basket Analysis
Click-stream Analytics

7
Q

What does RFM stand for?

A

Recency = Time passed since last purchase
Frequency = Frequency of purchase in a given period
Monetary value = Amount spent on average in a given period
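
A minimal sketch of how the three RFM values might be computed from a transaction log. The customer IDs, dates, amounts, and the `rfm` helper are hypothetical, and monetary value is taken here as the average amount per purchase in the period, which is only one possible reading of the definition above.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount)
transactions = [
    ("A", date(2024, 1, 5), 40.0),
    ("A", date(2024, 3, 20), 60.0),
    ("B", date(2024, 2, 1), 15.0),
]

def rfm(transactions, today):
    """Return {customer_id: (recency_days, frequency, avg_amount)}."""
    by_customer = defaultdict(list)
    for customer_id, purchase_date, amount in transactions:
        by_customer[customer_id].append((purchase_date, amount))

    result = {}
    for customer_id, purchases in by_customer.items():
        last_purchase = max(d for d, _ in purchases)
        recency = (today - last_purchase).days               # days since last purchase
        frequency = len(purchases)                           # number of purchases in the period
        monetary = sum(a for _, a in purchases) / frequency  # average amount per purchase
        result[customer_id] = (recency, frequency, monetary)
    return result

scores = rfm(transactions, today=date(2024, 4, 1))
```

Here customer A ends up with a lower recency and a higher frequency than customer B, so A would typically land in a better RFM segment.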

8
Q

RFM + limitations (3)

A

segmentation technique
-accurate
-easy
-can be computed for any database

limitations:
-does not take into account other factors
-prediction of next period only
-past behaviour may be due to past marketing activities

9
Q

What is Logistic Regression + Types of Logistic Regression (2)

A

Predicts a categorical (non-metric) outcome with two or more categories, e.g. purchased = yes (1) / not purchased = no (0)

Two categories = Binary Logistic Regression
More than two categories = Multinomial Logistic Regression (not covered in this course!)

10
Q

Objectives of Logistic Regression

A

Identify
- which factors (RFM) influence the likelihood of an event (purchasing)

Predict
- whether a customer will buy, based on their RFM scores
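
The prediction step can be sketched as follows: a binary logistic model turns a weighted sum of the RFM values into a purchase probability. The coefficient values, the `purchase_probability` helper, and the 0.5 cut-off are assumptions for illustration, not estimates from course data.

```python
import math

def purchase_probability(recency, frequency, monetary, coef):
    """Binary logistic model: P(buy) = 1 / (1 + exp(-z))."""
    z = (coef["intercept"]
         + coef["recency"] * recency
         + coef["frequency"] * frequency
         + coef["monetary"] * monetary)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients (B values), not estimated from any real data
coef = {"intercept": -1.0, "recency": -0.05, "frequency": 0.4, "monetary": 0.01}

p = purchase_probability(recency=12, frequency=2, monetary=50.0, coef=coef)
will_buy = p >= 0.5  # classify as "buy" when the probability reaches the cut-off
```

The signs of the coefficients cover the "identify" objective (which factors raise or lower the likelihood), while `p` covers the "predict" objective.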

11
Q

Logistic Regression Assumptions

A

No specific distribution required
No equal variance needed
Multicollinearity matters

12
Q

Omnibus Test

A

Is our model a better fit than Block 0 (the baseline with no IVs)?

Significant result = it is better to use this model than the Block 0 model with no IVs
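
As a rough sketch, the omnibus test can be seen as a likelihood-ratio chi-square: minus two times the difference between the log-likelihood of the Block 0 model and that of the model with IVs, with one degree of freedom per added metric IV. The log-likelihood values below are hypothetical, and `chi2_sf` is a hand-written helper (closed-form series for the chi-square upper tail), not a library function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_j x^(j-1) / (1*3*...*(2j-1))
    term, total = 1.0, 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total

def omnibus_test(ll_null, ll_model, n_predictors):
    """Likelihood-ratio chi-square of the fitted model against Block 0."""
    chi_square = -2.0 * (ll_null - ll_model)
    return chi_square, chi2_sf(chi_square, n_predictors)

# Hypothetical log-likelihoods for Block 0 and for a model with 3 IVs
chi_square, p_value = omnibus_test(ll_null=-120.0, ll_model=-110.0, n_predictors=3)
significant = p_value < 0.05  # True here: the model with IVs beats Block 0
```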

13
Q

Cox and Snell R square / Nagelkerke R square

A

Similar to R square in linear regression

-usefulness of the model
-between (Cox and Snell value) and (Nagelkerke value) of the variability in the DV is explained by this set of IVs
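
For reference, both values can be derived from the log-likelihoods of the Block 0 model and the fitted model. The sketch below uses the standard formulas; the log-likelihoods and the sample size are hypothetical.

```python
import math

def pseudo_r_squared(ll_null, ll_model, n):
    """Return (Cox and Snell R square, Nagelkerke R square)."""
    # Cox and Snell: 1 - (L0 / L1)^(2/n), written with log-likelihoods
    cox_snell = 1.0 - math.exp((2.0 / n) * (ll_null - ll_model))
    # Cox and Snell cannot reach 1; Nagelkerke rescales it by its maximum
    max_cox_snell = 1.0 - math.exp((2.0 / n) * ll_null)
    nagelkerke = cox_snell / max_cox_snell
    return cox_snell, nagelkerke

# Hypothetical values: Block 0 and model log-likelihoods, 200 customers
cox_snell, nagelkerke = pseudo_r_squared(ll_null=-120.0, ll_model=-110.0, n=200)
```

With these made-up numbers the reading would be "between roughly 9.5% and 13.6% of the variability in the DV is explained by this set of IVs".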

14
Q

Hosmer and Lemeshow Test

A

How well the predicted values match the actual observed values of the DV

A non-significant p-value (greater than 0.05) is what you want.

  • It means there is no significant difference between the predicted and actual values, indicating the model is a good fit
15
Q

Classification Accuracy of the model (Predicted vs Observed table)

A

How well the model predicts whether a purchase is made or not.
(how accurate are the predictions)
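
A small sketch of how such a Predicted vs Observed table could be built by hand. The observed outcomes, predicted probabilities, and the 0.5 cut-off are hypothetical, and `classification_table` is an illustrative helper.

```python
def classification_table(observed, predicted_prob, cutoff=0.5):
    """Count (observed, predicted) pairs and return the overall accuracy."""
    table = {(o, p): 0 for o in (0, 1) for p in (0, 1)}
    for obs, prob in zip(observed, predicted_prob):
        pred = 1 if prob >= cutoff else 0  # predicted purchase at or above the cut-off
        table[(obs, pred)] += 1
    correct = table[(0, 0)] + table[(1, 1)]  # cases where prediction matches observation
    accuracy = correct / len(observed)
    return table, accuracy

# Hypothetical data: 1 = purchased, 0 = not purchased
observed = [1, 0, 1, 1, 0, 0]
predicted_prob = [0.8, 0.3, 0.4, 0.9, 0.6, 0.1]

table, accuracy = classification_table(observed, predicted_prob)
```

Keys of `table` are (observed, predicted) pairs, so `table[(1, 0)]` counts buyers the model wrongly classified as non-buyers.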

16
Q

Exponentiated coefficients Exp(B)

A

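Shows the magnitude and direction of the effect of each IV on the DV as an odds ratio: Exp(B) above 1 means the odds of the event increase as the IV increases, Exp(B) below 1 means they decrease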
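
To illustrate, Exp(B) is simply e raised to the coefficient B, so it can be read as an odds ratio. The B values below are hypothetical.

```python
import math

# Hypothetical B coefficients from a fitted binary logistic regression
coefficients = {"recency": -0.05, "frequency": 0.4}

# Exp(B): multiplicative change in the odds for a one-unit increase in the IV
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

# Direction of each effect: above 1 raises the odds, below 1 lowers them
direction = {name: "increases odds" if ratio > 1 else "decreases odds"
             for name, ratio in odds_ratios.items()}
```

With these numbers, one extra purchase multiplies the odds of buying by about 1.49, while one extra unit of recency multiplies them by about 0.95.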

17
Q

Wald

A

Similar to the t-test in linear regression: tests whether each IV's coefficient differs significantly from zero
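
A minimal sketch of the calculation, assuming the usual form in which the Wald statistic is the squared ratio of a coefficient to its standard error and is compared against a chi-square distribution with 1 degree of freedom. The B and S.E. values are hypothetical.

```python
import math

def wald_test(b, se):
    """Return (Wald statistic, p-value) for a single coefficient."""
    wald = (b / se) ** 2
    # Upper tail of a chi-square with 1 df, written with the error function
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value

# Hypothetical coefficient and standard error
wald, p_value = wald_test(b=0.4, se=0.15)
significant = p_value < 0.05  # True here: the IV contributes to the model
```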