Exam 1 Flashcards

1
Q

Define Predictive Analytics

A

Technology that learns from experience (data) to predict the future behavior of individuals in order to drive better decisions. (John Elder's prediction of the stock market.)

2
Q

How does PA relate to organizational learning?

A

PA is the process by which an organization learns from the experience it has collectively gained across its team members and computer systems, optimizing operations by making everyday routines more efficient, e.g., catching more fraud, avoiding bad debtors, and luring more online customers.

3
Q

Each Application of PA is defined by what?

A

What’s predicted, and what’s done about it

4
Q

The Prediction Effect

A

A little prediction goes a long way

5
Q

Predictive Model

A

A mechanism that predicts behavior of an individual, such as click, buy, lie, or die.

6
Q

How does Machine Learning relate to the Predictive Model

A

It takes characteristics of the individual as input and provides a predictive score as output. The higher the score, the more likely it is that the individual will exhibit the predicted behavior.
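
A minimal sketch of this input-to-score idea in Python; the feature names and weights below are made up for illustration and are not from the text:

```python
# Hypothetical scoring function: characteristics in, predictive score out.
# Feature names and weights are invented purely for illustration.
def predictive_score(individual):
    weights = {"visits_last_month": 0.05, "owns_product": 0.30, "complaints": -0.20}
    return sum(weights[f] * individual.get(f, 0) for f in weights)

# Higher score = more likely to exhibit the predicted behavior.
print(predictive_score({"visits_last_month": 12, "owns_product": 1, "complaints": 2}))  # 0.5
```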

7
Q

How do training examples relate to prediction?

A

The predictive modeling process is a form of automated data crunching that learns from training examples, which must include both positive and negative examples. An organization needs to have positively identified in the past some cases of what it would like to predict in the future. The positive and negative examples are what a predictive model learns from: cases where the behavior happened and cases where it didn't.
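
For illustration, a tiny, made-up set of training examples pairing each individual's characteristics with a known past outcome, including both positive and negative cases:

```python
# Hypothetical training examples: (characteristics, label).
# Label 1 = the behavior was observed (positive example), 0 = it was not (negative example).
training_examples = [
    ({"visits_last_month": 12, "owns_product": 1}, 1),
    ({"visits_last_month": 1,  "owns_product": 0}, 0),
    ({"visits_last_month": 7,  "owns_product": 1}, 1),
    ({"visits_last_month": 2,  "owns_product": 0}, 0),
]
```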

8
Q

5 things that relate to the ethics around policing data?

A

Retain, Access, Share, Merge, React. What can be stored and for how long? How do you react to the data, and do you use it for something discriminatory? The ethics of analytics all comes down to how you use it.

9
Q

Retain

A

What is stored and for how long

10
Q

Access

A

Which employees, types of personnel, or group members may retrieve and look at which data elements

11
Q

Share

A

What data may be disseminated to which parties within the organization, and to what external organizations

12
Q

Merge

A

What data may be joined together, aggregated, or connected.

13
Q

React

A

How may each data element be acted upon, determining an organization’s response, outreach, or other behavior.

14
Q

How is predictive analytics the opposite of privacy invasion

A

PA in and of itself does not invade privacy; its core process is the opposite of privacy invasion. PA doesn't drill down to peer at any individual's data. Instead, it rolls up the data, learning patterns that hold true in general by way of rote number crunching across masses of customer data. Data mining often appears to be a culprit when people misunderstand and completely reverse its meaning. Privacy invasion would mean drilling down to the individual person; PA aggregates individual data to learn a pattern.

15
Q

is PA insight or Intrusion?

A

It depends on what the company does with the data (PA). Determining which employees might quit (or are pregnant) and laying them off for that reason would be bad.

16
Q

Is PA profiling? Why/Why not

A

It depends on how it is used. If it is found that most identified terrorists are Muslim or Arab, and you use this to discriminate against Muslims or Arabs, then it becomes profiling (specifically in policing).

17
Q

Why is transparency and Accountability so important in using PA for law enforcement?

A

Because of concerns about racism and demographic bias when criminal backgrounds and crime data drive decisions.

18
Q

What are the issues around the NSA and PA?

A

Both sides need to understand the other side better

19
Q

Open Data Movement

A

There's a ton of data available for free, and it is easy to obtain data from the web.

20
Q

How does storage cost influence data availability

A

It has gone from very expensive to essentially zero, and storage is flexible and elastic: you can get as much of it as you want, so why not collect the data and keep it?

21
Q

Data Effect

A

No matter what, data is always predictive. The predictions may not be right, but data is always predictive.

22
Q

Predictive Variable

A

The variable that we measure

23
Q

Dependent Variable

A

The variable that is impacted by the predictor variable

24
Q

Frequency

A

The frequency that things occur (or don’t)

25
Q

Behavioral Predictors

A

What that individual has done: how often, how many times a week, how many times a day, etc.

26
Q

Correlation vs Causation

A

Correlation means two things move in the same direction, e.g., the Dow Jones increases when an NFC team wins the Super Bowl; this is just by chance, it just happened to occur, and one doesn't cause the other. To establish causation you have to have a controlled experiment where you hold all things constant other than the variable you are manipulating.

27
Q

Fooled by Randomness

A

Randomness, chance, luck: we create reasons for something happening rather than attributing it to pure chance/luck/randomness, and we tend to explain random outcomes as non-random.

28
Q

Vast Search Pitfall

A

When you have huge amounts of data, many of the correlations are going to be false
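
A minimal simulation of this pitfall (using numpy, my choice of tool; the data here is pure noise):

```python
# Sketch: search enough random variables and some will correlate with the
# target purely by chance, even though every column here is pure noise.
import numpy as np

rng = np.random.default_rng(0)
target = rng.normal(size=200)              # a random "outcome"
candidates = rng.normal(size=(200, 5000))  # 5,000 random "predictors"

corrs = [abs(np.corrcoef(candidates[:, j], target)[0, 1]) for j in range(5000)]
print("strongest correlation found by chance:", max(corrs))
```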

29
Q

Machine Learning

A

Machine learning processes data to produce a predictive model. Data -> Machine Learning -> Predictive model.
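
A minimal sketch of that pipeline using scikit-learn (my library choice, not named in the text) with made-up data:

```python
# Data -> machine learning -> predictive model, on invented data.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[12, 1], [1, 0], [7, 1], [2, 0], [9, 1], [3, 0]])  # characteristics of individuals
y = np.array([1, 0, 1, 0, 1, 0])                                  # observed behavior (labels)

model = LogisticRegression().fit(X, y)        # the machine learning step
print(model.predict_proba([[5, 1]])[0, 1])    # predictive score for a new individual
```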

30
Q

Micro Risk

A

Any one customer amounts to only a microrisk, but these microrisks can add up to large risks.

31
Q

What is a micro risk and how does PA help mitigate it?

A

Any one customer amounts to only a microrisk, but these microrisks can add up to large risks.
PA serves as an antidote to the poisonous accumulation of microrisks: it stands vigil, prospectively flagging each microrisk so the organization can do something about it (e.g., via a credit score). What an organization effectively learns with PA is how to decrease risk by anticipating microrisks. All organizations benefit from measuring and predicting the risk of bad behavior (defaults, cancellations, dropouts, accidents, fraud, crime); in this way PA transforms risk into opportunity.

32
Q

Predictor Variable

A

One factor about the individual.

33
Q

Univariate Model

A

A model that considers only one predictor variable is called a univariate model.

34
Q

Multivariate Model

A

Model considers multiple factors at once, instead of just one
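
A small sketch of the distinction (scikit-learn, made-up data): the univariate model uses a single predictor variable, while the multivariate model uses several at once:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[12, 1, 0], [1, 0, 3], [7, 1, 1], [2, 0, 2], [9, 1, 0], [3, 0, 4]])
y = np.array([1, 0, 1, 0, 1, 0])

univariate = LogisticRegression().fit(X[:, [0]], y)   # one predictor variable only
multivariate = LogisticRegression().fit(X, y)         # several predictors considered at once
```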

35
Q

Decision Tree

A

A flowchart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf holds a class label. The decision tree is one of the most powerful and popular tools for classification and prediction.

36
Q

How one builds a decision tree from the data

A

Find the factor that best splits the data into groups (e.g., low risk and high risk), then find the factor that best breaks down the low-risk group even further into two subgroups, and keep going; do the same for the high-risk group. Divide and conquer, and then divide some more, but don't go too far. To use the tree, you start at the top and answer yes/no questions to arrive at a leaf; the leaf indicates the model's predictive output for that individual.
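
A minimal sketch of that divide-and-conquer process with scikit-learn's DecisionTreeClassifier (my choice of tool; the data is invented). Capping the depth is one simple way to avoid going too far:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Invented data: two factors per individual and a known high/low risk label.
X = np.array([[25, 0], [42, 1], [31, 0], [55, 1], [23, 0], [61, 1], [38, 1], [29, 0]])
y = np.array([1, 0, 1, 0, 1, 0, 0, 1])  # 1 = high risk, 0 = low risk

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)  # don't go too far: cap the splits
print(export_text(tree, feature_names=["age", "owns_home"]))
# The printout reads as nested yes/no (if-then) tests; each leaf gives the predicted class.
```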

37
Q

How do If-Then-Rules apply to decision tree logic

A

If the answer to a node's test is yes, go down the left branch; if no, go down the right branch; and so on until you reach a leaf.

38
Q

Lift

A

The boost in sales in response to marketing achieved by targeting with a model. A single metric that compares the performance of predictive models, a kind of predictive multiplier: it tells you how many more target customers you can identify with a model than without one.
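
A tiny worked example of lift with made-up numbers:

```python
# Made-up numbers, for illustration: compare the response rate among the
# model's top-scored customers to the overall (random-contact) response rate.
overall_rate = 200 / 10_000      # 2% of all customers respond
top_decile_rate = 80 / 1_000     # 8% respond among the model's top 10%

lift = top_decile_rate / overall_rate
print(lift)  # 4.0 -> targeting with the model finds 4x more responders than random
```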

39
Q

Overlearning

A

Assuming too much. Mistaking noise for information, assuming too much about what has been shown within data. You’ve overlearned if you’ve read too much into the numbers, led astray from discovering the underlying truth.

40
Q

Induction

A

Reasoning from detailed facts to general principles. It is an art form.

41
Q

Deduction

A

Reasoning from the general to the particular (or from cause to effect). Much more straightforward; it's applying known rules. "If all men are mortal and Socrates is a man, then deduction tells us Socrates is mortal."

42
Q

How does splitting data into training and test sets help validate PA models?

A

It is used to test for overlearning. Hold aside some data to test the model. Randomly select a test set and quarantine it. Use the remaining portion of the data, the training set, to create the model. Then, evaluate the resulting model across the test set. Since the test set was not used to create the model, there’s no way the model would have captured its esoteric aspects, its eccentricities. However well the model does on the test set is a reasonable estimation of how well the model does in general. A true evaluation of its ability to predict.
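
A minimal sketch of that quarantine procedure with scikit-learn (my tooling choice; the data is synthetic):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Randomly select a test set and quarantine it; build the model on the rest.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = DecisionTreeClassifier().fit(X_train, y_train)

# Evaluating on data the model never saw estimates how well it predicts in general.
print("test-set accuracy:", accuracy_score(y_test, model.predict(X_test)))
```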

43
Q

Induction Effect

A

Art drives machine learning: when followed by computer programs, strategies designed in part by informal human creativity succeed in developing predictive models that perform well on new cases. The fact that machine learning works tells us that we humans are smart enough; the hunches and intuitions that drive the design of methods that learn yet do not overlearn pan out. This is called the Induction Effect: figuring out what makes the best model.

44
Q

What’s gained by crowdsourcing PA?

A

PA crowdsourcing reaps the rewards brought by a diverse brainshare. The crowd almost always outperforms a single person.

45
Q

The Netflix Competition

A

Competition involving crowdsourcing. Challenged the world by requiring that the winner improve upon Netflix’s own established recommendation capabilities by 10%.

46
Q

Coopetition

A

Cooperative competition: what the collaboration in the Netflix competition was called, where teams ended up working together and splitting the prize.

47
Q

Why are Ensemble Models stronger?

A

An ensemble considers both models' predictions on a case-by-case basis and is trained to predict which cases are weak points for each component model. Where the models disagree, teaming them together provides the opportunity to improve performance.

48
Q

Ensemble Effect

A

When joined in an ensemble, predictive models compensate for one another's limitations, so the ensemble as a whole is more likely to predict correctly than its component models are. This provides robustness against overlearning: models merged together work better than the same models do on their own.
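
One way to sketch the idea is a soft-voting ensemble in scikit-learn (my choice of technique and tooling; the data is synthetic):

```python
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
X = rng.normal(size=(600, 5))
y = ((X[:, 0] * X[:, 1] + X[:, 2]) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Two different component models, merged so each can compensate for the other.
ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression()), ("dt", DecisionTreeClassifier(max_depth=4))],
    voting="soft",
).fit(X_tr, y_tr)
print("ensemble test accuracy:", accuracy_score(y_te, ensemble.predict(X_te)))
```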

49
Q

Why is it difficult for AI to understand language?

A

Because the rules of language are fluid. It is really hard to get a computer to understand shades of meaning, and idioms are especially difficult to understand.

50
Q

How does PA relate to answering questions like Jeopardy?

A

Answering questions is not itself prediction; rather, its models predict the correctness of an answer, giving each candidate answer a predictive score for its correctness.

51
Q

How does Watson work (funnel down)?

A

Given a question, it takes three main steps: collect thousands of candidate answers; for each answer, amass evidence; then apply predictive models to funnel down. After gathering thousands of candidate answers to a question, Watson funnels them down to spit out the single answer scored most highly by a predictive model.

52
Q

How do Ensemble Models help Watson?

A

Watson incorporates ensembling in three ways: Combining evidence. Separate specialized models for specific question types. For each question, Watson iteratively applies several phases of predictive models, each of which can compensate for mistakes made by prior phases.

53
Q

Churn Modeling

A

Keeping customers by targeting those predicted to leave. It is cheaper to convince a customer to stay than to acquire a new one.

54
Q

Response Modeling

A

Predicts the outcome for those we do contact, but not for those left uncontacted. Predicting which customers will purchase.

55
Q

Uplift Modeling

A

A predictive model that predicts the influence on an individual's behavior of applying one treatment over another: how much more likely is this treatment to generate the desired outcome than the alternative treatment? Send a brochure or don't? Offer a free phone or some other incentive? It figures out the influence of a particular promotion on an individual, i.e., how much more likely one treatment is to work than another.
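
A minimal sketch of one common way to build an uplift model (the "two-model" approach, which is an assumption on my part rather than the text's exact recipe), on invented data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 3))                 # customer characteristics
treated = rng.integers(0, 2, size=1000)        # 1 = sent the brochure, 0 = not
# Invented outcomes in which the treatment helps some customers more than others.
y = ((X[:, 0] + 0.8 * treated * (X[:, 1] > 0)
      + rng.normal(scale=0.5, size=1000)) > 0.5).astype(int)

# Fit one model per treatment group, then score uplift as the difference
# in predicted probability of the desired outcome.
model_treated = LogisticRegression().fit(X[treated == 1], y[treated == 1])
model_control = LogisticRegression().fit(X[treated == 0], y[treated == 0])

customer = X[:1]
uplift = (model_treated.predict_proba(customer)[0, 1]
          - model_control.predict_proba(customer)[0, 1])
print("predicted uplift from sending the brochure:", uplift)
```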

56
Q

Persuasion Effect

A

Although imperceptible, the persuasion of an individual can be predicted by uplift modeling, predictively modeling across two distinct training data sets that record, respectively, the outcomes of two competing treatments. Combine two paradigms: comparing treated and control results, and, predictive modeling. Being influenced is the thing that often happens to you that cannot be witnessed and that you can’t even be sure has happened afterward, but can be predicted in advance.

57
Q

Churn Uplift Modeling

A

Should we provide the customer a retention offer or not (active vs. passive treatment)? The objective is a positive impact from retention campaigns: identify which customers can be persuaded to stay and target retention efforts at those persuadable customers, because offers could be triggering some customers to leave who would otherwise have stayed. It compares treated (contacted) and control (not contacted) results using predictive modeling, determining the ROI of contacting only those who can be persuaded while avoiding the Do-Not-Disturbs and Sure Things.