! Supervised Learning: Regression Flashcards
1
Q
Regression
A
- finding & predicting the relationship between independent variables & a continuous output/dependent variable
- e.g. predict the price of a car given a set of features (age, brand)
- supervised ML
2
Q
Regression - Tasks
A
- Linear Regression
- Neural Networks
3
Q
Linear Regression - Goal
A
- learn a linear model that we can use to predict a new y given a previously unseen x with as little error as possible
- parametric
4
Q
Linear Regression - Method
A
- Linear equation: y’ = β0 + β1x1 + … + βixi
- estimating the coefficients (weights wi, bias b) so that the model
- predicts the continuous variable (y’) based on the other variable(s) (xi) in the best way
- finding the straight line minimizing the distance between predicted & actual output (sketch below)
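A minimal sketch of this prediction step (Python/NumPy assumed; the feature values & coefficients below are purely illustrative, not fitted):

```python
import numpy as np

def predict(X, weights, bias):
    # Linear model: y' = bias + w1*x1 + ... + wi*xi, per row of X
    return bias + X @ weights

# Hypothetical car-price example: features = [age in years, brand score]
X = np.array([[3.0, 0.8],
              [10.0, 0.5]])
weights = np.array([-1500.0, 8000.0])  # assumed coefficients, not fitted
bias = 20000.0
print(predict(X, weights, bias))       # [21900.  9000.]
```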
5
Q
Linear Regression - Steps
A
- Define cost / loss function: measures the inaccuracy of the model's predictions
- Find parameters minimizing the loss function: makes the model as accurate as possible -> Gradient Descent
6
Q
Linear Regression - Steps - Gradient Descent
A
- method to find the minimum of the model's (y’ = β0 + β1x1 + … + βixi) loss function by an iterative process
- cost function f(β0, β1) = z <- focus on β0 & β1 (the other variables of the cost function are given)
- Guess β0 & β1
- [∂z/∂β0, ∂z/∂β1] <- get the partial derivatives of the loss function with respect to each beta (= how much the total loss increases or decreases if β0 or β1 is increased by a very small amount)
- Adjust β0 & β1 accordingly: if a partial derivative < 0 -> increase that β; if > 0 -> decrease it
- Repeat the last two steps until the partial derivatives ≈ 0 (sketch below)
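A minimal sketch of these steps for simple linear regression (Python/NumPy assumed; learning rate & iteration count are illustrative):

```python
import numpy as np

def gradient_descent(x, y, lr=0.05, steps=2000):
    b0, b1 = 0.0, 0.0                       # guess beta0 & beta1
    n = len(x)
    for _ in range(steps):                  # repeat until (approximately) converged
        err = (b0 + b1 * x) - y             # prediction error per data point
        dz_db0 = (2 / n) * err.sum()        # partial derivative dz/dbeta0 of the MSE
        dz_db1 = (2 / n) * (err * x).sum()  # partial derivative dz/dbeta1 of the MSE
        b0 -= lr * dz_db0                   # adjust against the gradient's sign
        b1 -= lr * dz_db1
    return b0, b1

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])          # roughly y = 1 + 2x
print(gradient_descent(x, y))               # ~ (1.15, 1.94)
```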
7
Q
Linear Regression - Steps - define cost function
A
- a) Take the average of
- b) the squares (-> no negative numbers; penalizes large differences) of
- c) the differences between each data point (yi) & the model's prediction (β1xi + β0)
- MSE = (1/n) · Σ (yi − (β1xi + β0))² (sketch below)
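A minimal sketch of this MSE cost (Python/NumPy assumed; data values are illustrative):

```python
import numpy as np

def mse(x, y, b0, b1):
    diffs = y - (b1 * x + b0)    # c) difference per data point
    return (diffs ** 2).mean()   # b) square, a) average

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.1, 5.9])
print(mse(x, y, b0=0.0, b1=2.0))  # ~0.0067: small loss for a near-perfect line
```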
8
Q
Linear Regression - Solutions to over- & underfitting
A
- Use more training data
- Use regularization = a penalty added to the loss function when the model assigns too much weight to 1 feature or to too many features: cost + λ · Σβi² <- λ = hyperparameter -> higher: more penalty (sketch below)
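A minimal sketch of the L2-regularized loss (Python/NumPy assumed; the λ value is illustrative; penalizing only β1 and not the bias is a common convention):

```python
import numpy as np

def ridge_loss(x, y, b0, b1, lam):
    mse = ((y - (b1 * x + b0)) ** 2).mean()  # ordinary cost
    return mse + lam * b1 ** 2               # cost + lambda * sum(beta_i^2)

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
print(ridge_loss(x, y, 0.0, 2.0, lam=0.1))   # 0.4: penalty even at a perfect fit
```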
9
Q
Regression - Types
A
- Simple = 1 independent variable
- Multiple = > 1 independent variable
10
Q
Linear Regression - Assumptions / Disadvantages
A
- variables = quantitative (categorical -> encode as binary); measured at a continuous level
- No significant outliers
- Residuals (errors) of the best-fit regression line follow a normal distribution
- relationship between dependent & independent variables = linear
- all observations = independent
11
Q
Supervised Learning Goal
A
- find the relation btw input & output data based on already known answers (labeled data)
- apply it to predict outcomes for new data
12
Q
Supervised Learning - Applications
A
image classification (dog or cat?), fraud detection, spam filtering
13
Q
Logistic Regression
A
- predict a discrete class based on probability
- put linear regression f(x) into the sigmoid function P(Y=1) = 1 / (1 + e^(−f(x))) -> result = probability btw 0 & 1 (sketch below)
- threshold: class decision based on tolerance for false positives / negatives
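A minimal sketch of this prediction step (Python/NumPy assumed; coefficients & threshold are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))          # squashes f(x) into (0, 1)

def predict_class(x, b0, b1, threshold=0.5):
    p = sigmoid(b0 + b1 * x)             # P(Y=1)
    return p, int(p >= threshold)        # class decision via threshold

print(predict_class(2.0, b0=-3.0, b1=2.0))   # (~0.73, 1)
```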
14
Q
Logistic Regression - Log-odds ratio
A
- log-odds = ln(p / (1 − p)) <- the linear part β0 + β1x1 + … + βixi of the model equals the log-odds of P(Y=1) (sketch below)
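A minimal sketch (Python; the probability value is illustrative):

```python
import math

def log_odds(p):
    # ln(p / (1 - p)): inverts the sigmoid back to the linear score
    return math.log(p / (1 - p))

print(log_odds(0.5))    # 0.0  (even odds)
print(log_odds(0.73))   # ~1.0 (the linear score that produced p = 0.73 above)
```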
15
Q
Logistic Regression - Con
A
- risk of overfitting with a large number of independent variables
- needs a sample with evenly balanced categories