Classification Flashcards

1
Q

What is the process of modeling (3 things)?

A
  1. Express the real-life situation as math
  2. Analyze the math
  3. Turn the math back into a real-life solution
2
Q

Model meaning in analytics

A

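"model" can mean any of three levels of detail, e.g.:
-the general approach: regression
-the approach with specific attributes: regression based on size, weight, and distance
-the fitted equation: regression estimate = 37 + 81 x size + 76 x weight + 4 x distance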
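
To make the most specific meaning concrete, here is a minimal sketch that evaluates the fitted equation from this card. The function name and the input values are hypothetical, chosen only for illustration; only the coefficients come from the card.

```python
def regression_estimate(size: float, weight: float, distance: float) -> float:
    """Evaluate the fitted regression equation from the card."""
    return 37 + 81 * size + 76 * weight + 4 * distance


# Hypothetical inputs, not taken from the card.
estimate = regression_estimate(size=2.0, weight=1.0, distance=10.0)
print(estimate)
```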

3
Q

What is classification?

A

putting things into categories

4
Q

Data table vocab

A

row - data point
column - attribute, feature, covariate, predictor, factor, variable
response / outcome - the “answer” for each data point

5
Q

what is structured data?

A

data that can be described and stored in a structured way
ex: quantitative - credit score, age, sales
categorical - m/f, hair color

6
Q

what is unstructured data?

A

data that is not easily described and stored
ex: written text

7
Q

Data Types

A

quantitative - #s w/ meaning
ex: sales, age, temp, income
categorical - #s w/o meaning, or non-numeric categories
ex: zip codes - higher/lower not meaningful
binary data - (subset of categorical)
-only 2 values
ex: m/f, on/off, t/f
-can sometimes be treated as a quantitative measure

8
Q

Data Relations

A

Unrelated
-no relationship between data points
ex: different customers, loan applications

Time series
-same data recorded over time
-often recorded at equal intervals
ex: daily sales, stock prices, child’s height on each birthday

9
Q

Support Vector Machine line information

A

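m = # data points (rows)
n = # attributes (columns)
$x_{ij}$ = jth attribute (column) of ith data point
$y_i$ = response for data point i (+1 or -1)

line
$a_1 x_1 + a_2 x_2 + \dots + a_n x_n + a_0 = 0$, or $\sum_{j=1}^{n} a_j x_j + a_0 = 0$

for classification, the two margin lines are $a_1 x_1 + a_2 x_2 + \dots + a_n x_n + a_0 = \pm 1$ (any constant works; 1 is used here)

you could also say data point i is on the correct side when $(a_1 x_{i1} + a_2 x_{i2} + \dots + a_n x_{in} + a_0) y_i \ge 1$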
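
As a rough illustration of the line and the margin condition on this card, the sketch below evaluates $\sum_j a_j x_j + a_0$ for a point and checks whether the product with $y_i$ is at least 1. The coefficients and data points are hypothetical, and the helper names are not from the card.

```python
from typing import Sequence


def line_value(a: Sequence[float], a0: float, x: Sequence[float]) -> float:
    """Return a_1*x_1 + ... + a_n*x_n + a_0 for one data point x."""
    return sum(aj * xj for aj, xj in zip(a, x)) + a0


def classify(a: Sequence[float], a0: float, x: Sequence[float]) -> int:
    """Return +1 or -1 depending on which side of the line x falls."""
    return 1 if line_value(a, a0, x) >= 0 else -1


def satisfies_margin(a: Sequence[float], a0: float, x: Sequence[float], y: int) -> bool:
    """Check the condition (sum_j a_j x_ij + a_0) * y_i >= 1."""
    return line_value(a, a0, x) * y >= 1


# Hypothetical coefficients and points, for illustration only.
a, a0 = [1.0, -1.0], 0.0
print(classify(a, a0, [3.0, 1.0]))              # line value is 2.0
print(satisfies_margin(a, a0, [1.0, 1.5], -1))  # product is 0.5, inside the margin
```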

10
Q

Distance between support vectors

A

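$\frac{2}{\sqrt{\sum_{j=1}^{n} a_j^2}}$

the $a_j$ are the coefficients shared by both margin lines; the two lines are parallel and differ only in the constant (+1 vs -1)

Note:
m = # data points (rows)
n = # attributes (columns)
$x_{ij}$ = jth attribute (column) of ith data point
$y_i$ = response for data point i

line
$a_1 x_1 + a_2 x_2 + \dots + a_n x_n + a_0 = 0$, or $\sum_{j=1}^{n} a_j x_j + a_0 = 0$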
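
A small sketch of the distance formula on this card, 2 divided by the square root of the sum of squared coefficients. The coefficient values are hypothetical; note that $a_0$ does not appear in the formula.

```python
import math
from typing import Sequence


def margin_width(a: Sequence[float]) -> float:
    """Distance between the two margin lines: 2 / sqrt(sum_j a_j^2)."""
    return 2 / math.sqrt(sum(aj ** 2 for aj in a))


# Hypothetical coefficients: smaller coefficients give a wider margin.
print(margin_width([3.0, 4.0]))
print(margin_width([0.6, 0.8]))
```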

11
Q

How do you maximize the margin in SVM?

A

$\min_{a_0, \dots, a_n} \sum_{j=1}^{n} a_j^2$ subject to $(a_1 x_{i1} + a_2 x_{i2} + \dots + a_n x_{in} + a_0) y_i \ge 1$ for each data point i

Note:
m = # data points (rows)
n = # attributes (columns)
$x_{ij}$ = jth attribute (column) of ith data point
$y_i$ = response for data point i

line
$a_1 x_1 + a_2 x_2 + \dots + a_n x_n + a_0 = 0$, or $\sum_{j=1}^{n} a_j x_j + a_0 = 0$

12
Q

How to calculate error in svm?

A

Correct side of line - $\left(\sum_{j=1}^{n} a_j x_{ij} + a_0\right) y_i - 1 \ge 0$
Wrong side of line - $\left(\sum_{j=1}^{n} a_j x_{ij} + a_0\right) y_i - 1 < 0$
-the amount by which it is less than 0 is the error

Note:
m = # data points (rows)
n = # attributes (columns)
$x_{ij}$ = jth attribute (column) of ith data point
$y_i$ = response for data point i

line
$a_1 x_1 + a_2 x_2 + \dots + a_n x_n + a_0 = 0$, or $\sum_{j=1}^{n} a_j x_j + a_0 = 0$

13
Q

Svm error for data point

A

$\max\left\{0,\ 1 - \left(\sum_{j=1}^{n} a_j x_{ij} + a_0\right) y_i\right\}$
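
The per-point error, max{0, 1 - (sum_j a_j x_ij + a_0) y_i}, can be sketched as below. The coefficients and points are hypothetical; the three cases show a point beyond the margin, a point inside the margin, and a point on the wrong side.

```python
from typing import Sequence


def point_error(a: Sequence[float], a0: float, x: Sequence[float], y: int) -> float:
    """Error for one data point: 0 if it is on or beyond its margin line,
    otherwise the amount by which (a.x + a0) * y falls short of 1."""
    score = sum(aj * xj for aj, xj in zip(a, x)) + a0
    return max(0.0, 1 - score * y)


# Hypothetical classifier and points, for illustration only.
a, a0 = [1.0, -1.0], 0.0
print(point_error(a, a0, [3.0, 1.0], 1))    # beyond the margin
print(point_error(a, a0, [1.0, 1.5], -1))   # correct side, but inside the margin
print(point_error(a, a0, [2.0, 1.0], -1))   # wrong side of the line
```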

14
Q

SVM total error

A

$\sum_{i=1}^{m} \max\left\{0,\ 1 - \left(\sum_{j=1}^{n} a_j x_{ij} + a_0\right) y_i\right\}$

15
Q

SVM margin denominator

A

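$\sqrt{\sum_{j=1}^{n} a_j^2}$

-minimizing $\sum_{j=1}^{n} a_j^2$ also minimizes its square root, which is why the SVM equation uses the sum of squares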

16
Q

SVM Equation

A

$\min_{a_0, \dots, a_n} \sum_{i=1}^{m} \max\left\{0,\ 1 - \left(\sum_{j=1}^{n} a_j x_{ij} + a_0\right) y_i\right\} + \lambda \sum_{j=1}^{n} a_j^2$
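
As a sketch of what is being minimized, the function below evaluates the SVM objective (total error plus lambda times the margin term) for one candidate set of coefficients. It only scores a candidate; it does not search for the minimizing coefficients. All data values are hypothetical.

```python
from typing import Sequence


def svm_objective(
    a: Sequence[float],
    a0: float,
    xs: Sequence[Sequence[float]],
    ys: Sequence[int],
    lam: float,
) -> float:
    """Total error over all data points plus lam * sum_j a_j^2."""
    total_error = 0.0
    for x, y in zip(xs, ys):
        score = sum(aj * xj for aj, xj in zip(a, x)) + a0
        total_error += max(0.0, 1 - score * y)
    margin_term = sum(aj ** 2 for aj in a)
    return total_error + lam * margin_term


# Hypothetical data: three points with responses +1, -1, -1.
xs = [[3.0, 1.0], [1.0, 1.5], [2.0, 1.0]]
ys = [1, -1, -1]
print(svm_objective([1.0, -1.0], 0.0, xs, ys, lam=0.5))
```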

17
Q

What happens when lambda (SVM) increases and decreases

A

Lambda controls the tradeoff between error and margin. As lambda increases, a larger margin becomes more important than avoiding mistakes on data points.

As lambda decreases toward 0, the margin term drops toward 0 and classifying data points correctly becomes more important than having a large margin.
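
The tradeoff can be seen by scoring two candidate classifiers under different values of lambda. This is a sketch on a hypothetical one-attribute data set: the "steep" candidate has large coefficients (narrow margin, no error), while the "gentle" candidate has small coefficients (wide margin, some error).

```python
from typing import Sequence


def objective(a: Sequence[float], a0: float, xs, ys, lam: float) -> float:
    """Total error plus lam * sum of squared coefficients."""
    error = sum(
        max(0.0, 1 - (sum(aj * xj for aj, xj in zip(a, x)) + a0) * y)
        for x, y in zip(xs, ys)
    )
    return error + lam * sum(aj ** 2 for aj in a)


# Hypothetical data with a single attribute.
xs = [[-2.0], [-1.0], [0.2], [1.0], [2.0]]
ys = [-1, -1, 1, 1, 1]

steep = ([5.0], 0.0)   # narrow margin, classifies every point with no error
gentle = ([1.0], 0.0)  # wide margin, the point at 0.2 falls inside it

for lam in (0.01, 1.0):
    scores = {
        "steep": objective(*steep, xs, ys, lam),
        "gentle": objective(*gentle, xs, ys, lam),
    }
    print(lam, min(scores, key=scores.get), scores)
```

With the small lambda the error term dominates and the steep candidate scores lower; with the large lambda the margin term dominates and the gentle candidate scores lower.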

18
Q

what is a support vector?

A

a point that holds up (supports) the shape; in SVM, the data points that the margin lines rest on
-can support from sides or top
-can have more than one line

19
Q

Support vector machine model

A

-determines support vectors automatically from the data (hence machine)

20
Q

Where is the classifier in relation to the support vectors

A

the classifier is between the two parallel margin lines that pass through the support vectors (halfway between, unless the intercept is shifted)

21
Q

How can you weight an svm to be more conservative in a direction?

A

-hard classification - if giving a bad loan is 2x as bad as not giving a good loan, we could shift the intercept to $\frac{2}{3}(a_0 - 1) + \frac{1}{3}(a_0 + 1) = a_0 - \frac{1}{3}$
-soft classification - add a multiplier to each error term: >1 for more costly errors and <1 for less costly errors
Note: the intercept can be anywhere between $a_0 - 1$ and $a_0 + 1$ without making any mistakes on the data (the line is still within the margin)
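
The hard-classification adjustment is a weighted average of the two margin intercepts, $a_0 - 1$ and $a_0 + 1$. The sketch below reproduces the 2x case from this card. Generalizing the weight to r / (r + 1) for a cost ratio r is an assumption made here for illustration, not something the card states, and the value of $a_0$ is hypothetical.

```python
def shifted_intercept(a0: float, cost_ratio: float) -> float:
    """Weighted average of the margin intercepts a0 - 1 and a0 + 1.

    cost_ratio is how many times worse one type of error is than the other.
    The r / (r + 1) weighting is an assumed generalization of the card's
    2x example (weights 2/3 and 1/3).
    """
    w = cost_ratio / (cost_ratio + 1)
    return w * (a0 - 1) + (1 - w) * (a0 + 1)


# Hypothetical intercept a0 = 5 with the card's 2x cost ratio.
print(shifted_intercept(5.0, 2.0))
# Equal costs leave the intercept in the middle of the margin.
print(shifted_intercept(5.0, 1.0))
```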

22
Q

When you maximize the margin what are you doing?

A

minimizing the sum of squares of the coefficients

23
Q

Do you need to scale data for SVM? ( needs some work)

A

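Yes! Maximizing the margin means minimizing the sum of squares of the coefficients, and the size of each coefficient depends on the units of its attribute, so the attributes need to be on the same order of magnitude for the coefficients to be comparable.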

24
Q

In SVM, what does it mean when a coefficient's value is close to 0?

A

the corresponding attribute is not relevant for classification, e.g. if your classifier is a vertical line, the attribute on the vertical axis does not matter

25
Q

Scaling equation

A

$x_{\min j} = \min_i x_{ij}$
$x_{\max j} = \max_i x_{ij}$

for each data point i:
$x_{ij}^{\text{scaled}} = \frac{x_{ij} - x_{\min j}}{x_{\max j} - x_{\min j}}$
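
A minimal sketch of the scaling equation applied to one column (factor j), subtracting the column minimum and dividing by the column range. The sample values are illustrative, and the sketch assumes the column is not constant (max differs from min).

```python
from typing import List, Sequence


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Scale one column to [0, 1]: (x - min) / (max - min).

    Assumes the column has at least two distinct values.
    """
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]


# Illustrative column of values.
print(min_max_scale([5, 12, 27, 29]))
```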

26
Q

Standardization equation

A

factor j has mean $\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_{ij}$ (m = # data points)
factor j has standard deviation $\sigma_j$

for each data point i:
$x_{ij}^{\text{standardized}} = \frac{x_{ij} - \mu_j}{\sigma_j}$
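
A sketch of the standardization equation for one column, subtracting the mean and dividing by the standard deviation. The card does not say whether the population or sample standard deviation is meant; the population version (`statistics.pstdev`) is used here as an assumption. The sample values are hypothetical.

```python
import statistics
from typing import List, Sequence


def standardize(column: Sequence[float]) -> List[float]:
    """Standardize one column: (x - mean) / sd.

    Uses the population standard deviation (an assumption) and
    assumes the column is not constant.
    """
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(x - mean) / sd for x in column]


# Hypothetical column with mean 5 and population sd 2.
print(standardize([2, 4, 4, 4, 5, 5, 7, 9]))
```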

27
Q

General scaling

A

$x_{ij}^{\text{scaled}[a,b]} = x_{ij}^{\text{scaled}[0,1]} (b - a) + a$
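
A sketch of general scaling to an arbitrary range [a, b]: first scale to [0, 1], then multiply by (b - a) and add a, so the column minimum maps to a and the maximum maps to b. The column values and the target range are hypothetical.

```python
from typing import List, Sequence


def scale_to_range(column: Sequence[float], a: float, b: float) -> List[float]:
    """Scale one column to [a, b] via its [0, 1] scaling.

    Assumes the column has at least two distinct values.
    """
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) * (b - a) + a for x in column]


# Hypothetical column scaled to the range [200, 800].
print(scale_to_range([0, 5, 10], 200, 800))
```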

28
Q

what does scaling do?

A

gets all values between 0 and 1 (or any other lower and upper bounds)

29
Q

what does standardization do?

A

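rescales each factor to a common mean and standard deviation, as for a standard normal distribution (it shifts and rescales the data but does not make it normal)

-commonly mean = 0 and sd = 1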

30
Q

When to use scaling?

A

-data required within a bounded range
-ex: neural networks
-ex: optimization models that need bounded data
-ex: batting avgs 0-1
-ex: RGB color intensities 0-255
-ex: SAT scores 200-800

you can always try both and see what works best

31
Q

When to use standardization?

A

ex - principal component analysis
-clustering

you can always try both and see what works best

32
Q

How does KNN work?

A

For a new data point, KNN finds the k closest points and counts how many belong to each class. The most common class becomes the new data point's class.
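
A minimal sketch of this voting procedure, assuming straight-line (Euclidean) distance. The points, labels, and function name are hypothetical; ties in the vote are broken by whichever class was counted first, so an odd k is a sensible choice with two classes.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def knn_classify(
    points: Sequence[Sequence[float]],
    labels: Sequence[Hashable],
    query: Sequence[float],
    k: int,
) -> Hashable:
    """Return the most common class among the k points closest to query."""
    # Sort the known points by straight-line distance to the query.
    by_distance = sorted(
        zip(points, labels), key=lambda pair: math.dist(pair[0], query)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical data: two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["red", "red", "red", "blue", "blue", "blue"]
print(knn_classify(points, labels, (2, 2), k=3))
print(knn_classify(points, labels, (6, 5), k=3))
```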

33
Q

KNN considerations

A

-which type of distance to use?
  -straight-line distance
  -weighted distances
-unimportant attributes can be removed (when weight is close to 0)
-what is a good k value? (choose using validation)
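
One possible form of a weighted distance is sketched below: each attribute's squared difference is multiplied by a weight before summing. This particular form is an assumption, since the card does not define the weighting. It shows why an attribute with weight close to 0 can be removed: it barely changes the distance.

```python
import math
from typing import Sequence


def weighted_distance(
    p: Sequence[float], q: Sequence[float], weights: Sequence[float]
) -> float:
    """Weighted straight-line distance: sqrt(sum_j w_j * (p_j - q_j)^2).

    The weighting scheme is an assumed form for illustration.
    """
    return math.sqrt(sum(w * (pj - qj) ** 2 for w, pj, qj in zip(weights, p, q)))


# Hypothetical points; a weight of 0 removes the second attribute entirely.
print(weighted_distance((0, 0), (3, 4), (1, 1)))
print(weighted_distance((0, 0), (3, 4), (1, 0)))
```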

34
Q

Which of these is a datapoint?
A survey of 25 people recorded each person’s family size and type of car

-The 14th person’s family size and car type
-14th person’s family size
-the car type of each person

A

-The 14th person’s family size and car type

A data point is all the information about one observation

35
Q

Which of these is structured data?
-a person's twitter feed
-the amount of money in a person's bank account

A

-the amount of money in a person's bank account

every entry will be a number of dollars and cents

36
Q

Which of these is time series data?

-avg cost of a house in us every year since 1820
-the height of each professional basketball player in the nba at the start of the season

A

-avg cost of a house in us every year since 1820

the same thing measured at yearly time intervals

37
Q

Which term measures error in classifying all of the data points

A

$\sum_{i=1}^{m} \max\left\{0,\ 1 - \left(\sum_{j=1}^{n} a_j x_{ij} + a_0\right) y_i\right\}$

38
Q

When you are multiplying your error term, would a higher number favor or disfavor classification errors?

A

disfavor: a higher multiplier makes those errors more costly, so the model works harder to avoid them

39
Q

Which dataset is scaled between 0 and 1?
- 5,12,27,29
-0.0,0.2,0.6,1.0
-0.3,0.4,0.7,0.75

A

0.0,0.2,0.6,1.0

40
Q

What is the purpose of classification models?

A

putting things into categories
-differentiate
