Business Analytics Midterm 2 Flashcards

Question 1

Q

Model

Answer

A

A simplified representation of reality created to serve a purpose

Question 2

Q

Predictive model

Answer

A

A formula for estimating the unknown value of interest: the target (formula can be mathematical, logical statement, etc.)

Question 3

Q

Prediction

Answer

A

Estimate an unknown value (the target)

Question 4

Q

Instance/example

Answer

A

Represents a fact or data point. Described by a set of attributes (fields, columns, variables, or features)

Question 5

Q

Model induction

Answer

A

The creation of models from data

Question 6

Q

Training data

Answer

A

The input data for the induction algorithm

Question 7

Q

Beta estimates

Answer

A

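The coefficients (“weights”) of a model: each is multiplied by its attribute's value and the results are added to the intercept to calculate a prediction.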

Question 8

Q

Intercept: 1.5
Age: -0.3
Height: 1.2

What is the equation to predict the result for a 65-inch-tall person who is 38 years old?

Answer

A

y = 1.5 + (-0.3)(38) + (1.2)(65)
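
A minimal Python sketch of the same calculation, using the intercept and beta estimates from this card. The `predict` helper and the dictionary layout are illustrative assumptions, not part of the course material.

```python
# Beta estimates from the card: an intercept plus one weight per attribute.
intercept = 1.5
betas = {"age": -0.3, "height": 1.2}

def predict(instance):
    """Return intercept + sum of (beta * attribute value)."""
    return intercept + sum(betas[name] * instance[name] for name in betas)

# The 38-year-old, 65-inch person from the question.
person = {"age": 38, "height": 65}
y = predict(person)
print(round(y, 1))  # 68.1
```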

Question 9

Q

Information gain measures…

Answer

A

The reduction in entropy produced by splitting the data on new information (an attribute). Calculated by subtracting the weighted entropy of the children from the entropy of the parent (weight each child by its share of the parent's instances).
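
A short Python sketch of this calculation, assuming entropy is measured in bits (log base 2). The function names and the chip labels are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the weighted sum of the children's entropies."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical split: a mixed parent separated into two pure children.
parent = ["white"] * 5 + ["black"] * 5
children = [["white"] * 5, ["black"] * 5]
gain = information_gain(parent, children)
print(gain)  # 1.0
```

A split that leaves each child with the same class mix as the parent has a gain of 0.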

Question 10

Q

Entropy

Answer

A

Measures the general disorder of a dataset. Ex. A bag with 5 white chips and 5 black chips has an entropy of 1. A bag with 10 black chips has an entropy of 0.
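
The two chip examples can be checked with a small Python sketch, assuming the usual definition of entropy in bits (the sum of -p * log2(p) over the classes). The `entropy` function name is an illustrative choice.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Sum of -p * log2(p) over the classes present in labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())

mixed = entropy(["white"] * 5 + ["black"] * 5)  # 5 white chips, 5 black chips
pure = entropy(["black"] * 10)                  # 10 black chips
print(mixed, pure)  # 1.0 0.0
```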

Question 11

Q

Why is Laplace correction used?

Answer

A

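Laplace correction moderates probability estimates that come from small samples by pulling them toward 0.5 (for two classes): add 1 to the class count and add the number of classes to the total. Ex. 6 samples, 4 are positive. Chance for next person is 4/6 = 0.6667. With Laplace correction chance is (4 + 1)/(6 + 2) = 5/8 = 0.625. Here the probability decreases to be conservative; the effect fades as the sample size grows.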
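
A minimal Python sketch of the correction, assuming the common form (count + 1) / (total + number of classes). The function name and the `classes` parameter are hypothetical.

```python
def laplace_estimate(positives, total, classes=2):
    """Laplace-corrected probability: (positives + 1) / (total + classes)."""
    return (positives + 1) / (total + classes)

raw = 4 / 6                        # uncorrected frequency from the card
corrected = laplace_estimate(4, 6)
print(round(raw, 4), corrected)    # 0.6667 0.625

# A low raw frequency is pulled up toward 0.5 rather than down.
low_raw = 1 / 6
low_corrected = laplace_estimate(1, 6)
print(low_corrected)               # 0.25
```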

Question 12

Q

Two classification problems in creating a model

Answer

A

Target values are discrete with no order. Ex. Single, Married, Divorced, Widowed.
Target values are binary (0 and 1)

Question 13

Q

Classifier model (solution to classification)

Answer

A

Model predicts same set of discrete values as data. Ex. For binary data, model output is 0 or 1

Question 14

Q

Ranking (solution to classification)

Answer

A

Model predicts a score, where a higher score means the model thinks the example is more likely to be in one class.

Question 15

Q

Probability estimation

Answer

A

Model predicts a score between 0 and 1 that is meant to be the probability of being in that class. Ex. Titanic data.

Question 16

Q

Order the three classification solutions from least accurate to most

Answer

A

Classifier model: Don’t use it
Ranking
Probability: You can always rank/classify if you have probabilities

You can always go backwards (from probabilities to a ranking to a classification, each a less accurate method) but not forwards
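
A small Python sketch of going backwards from probabilities. The instance names, the probability values, and the 0.5 threshold are made up for illustration.

```python
# Hypothetical probability estimates for three instances.
probabilities = {"a": 0.9, "b": 0.2, "c": 0.6}

# Probabilities -> ranking: sort instances from highest to lowest score.
ranking = sorted(probabilities, key=probabilities.get, reverse=True)
print(ranking)  # ['a', 'c', 'b']

# Probabilities -> classification: apply a threshold (0.5 assumed here).
threshold = 0.5
classes = {name: int(p >= threshold) for name, p in probabilities.items()}
print(classes)  # {'a': 1, 'b': 0, 'c': 1}
```

The reverse does not work: the 0/1 labels alone cannot recover the ordering of "a" and "c" or their probabilities.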

Question 17

Q

Pruning

Answer

A

Simplifies a decision tree to prevent over-fitting

Question 18

Q

Pre-pruning vs. post-pruning

Answer

A

Pre-pruning: Stops growing a branch when information becomes unreliable.
Post-pruning: Takes fully-grown tree and discards unreliable parts.

Post-pruning is preferred!

(18 cards)