Regression Flashcards

1
Q

What is Gaussian quadrature, and what are its disadvantages?

A

Fitting n data points exactly with a polynomial of degree n - 1 (n coefficients).

This fits every data set with zero error, which is bad because data normally includes errors and outliers. It also takes a lot of time and computing power. It is often desirable to fit a polynomial with fewer parameters than there are data points.
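
As a minimal sketch of why an exact fit is a problem, the code below passes a degree n - 1 polynomial through n points using the Lagrange form. The data values and the name `lagrange_interpolate` are hypothetical, chosen only for illustration; the point at x = 3 is a deliberate outlier.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the degree n-1 polynomial through the n points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Hypothetical data: roughly y = 2x, with an outlier at x = 3.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 2.0, 3.9, 12.0, 8.1]

# The polynomial reproduces every data point, outlier included.
fitted = [lagrange_interpolate(xs, ys, x) for x in xs]
max_residual = max(abs(f - y) for f, y in zip(fitted, ys))

# Between the data points the curve is free to bend to accommodate the outlier.
between = lagrange_interpolate(xs, ys, 3.5)
```

Because the residual at every data point is zero, the fit gives no indication that one of the points is an outlier.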

2
Q

What is the equation for the simple linear regression model?

A

Y = ax + b + e

e = error term (random variable, normally distributed with mean 0)
a = slope
b = intercept
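
To make the model concrete, here is a small sketch that generates data from Y = ax + b + e. The coefficient values, the noise standard deviation `sigma`, and the helper name `simulate` are all assumptions made for this example.

```python
import random

def simulate(a, b, xs, sigma, seed=0):
    """Draw y = a*x + b + e for each x, with e ~ Normal(0, sigma)."""
    rng = random.Random(seed)
    return [a * x + b + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical true slope and intercept.
a_true, b_true = 2.0, 1.0
xs = [i / 100 for i in range(1000)]
ys = simulate(a_true, b_true, xs, sigma=0.5)

# Recover the error terms; their average should be close to 0.
errors = [y - (a_true * x + b_true) for x, y in zip(xs, ys)]
mean_error = sum(errors) / len(errors)
```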

3
Q

How do we find the coefficients in the simple linear regression equation?

A

Rearrange for the error term and form the ESS (error sum of squares: the sum of all the squared error values). Then partially differentiate the ESS with respect to each coefficient (a and b), set each derivative equal to 0, and solve the resulting linear equations.
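
Solving those two linear equations gives the standard closed form a = Sxy / Sxx and b = ymean - a * xmean. The sketch below applies it to a small hypothetical data set; the function name `fit_line` is an illustrative choice.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    a = s_xy / s_xx
    b = y_mean - a * x_mean
    return a, b

# Hypothetical data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]
a, b = fit_line(xs, ys)  # a is about 1.96, b is about 0.22

# At the minimum of the ESS both partial derivatives vanish, which means
# the residuals sum to zero and are orthogonal to x.
residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
d_ess_db = sum(residuals)
d_ess_da = sum(x * r for x, r in zip(xs, residuals))
```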

4
Q

What is the correlation coefficient, R?

A

A measure of the strength of the linear relationship between two variables.

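Close to 0 = uncorrelated
Positive = variables are positively correlated
Negative = variables are negatively correlated
Large magnitude (close to 1 or -1) = greater degree of correlation

For simple linear regression its magnitude equals the square root of R^2, and its sign matches the sign of the slope.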
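
A minimal sketch of the usual (Pearson) formula, R = Sxy / sqrt(Sxx * Syy), on three hypothetical data sets. The function name `correlation` is an assumption for this example.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

xs = [1.0, 2.0, 3.0, 4.0]
r_positive = correlation(xs, [2.0, 4.0, 6.0, 8.0])   # perfect positive line
r_negative = correlation(xs, [8.0, 6.0, 4.0, 2.0])   # perfect negative line
r_none = correlation(xs, [1.0, -1.0, -1.0, 1.0])     # no linear trend
```

The last data set follows a clear curved pattern, yet its R is zero, since R only measures linear association.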

5
Q

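How do you modify the ESS to get the RSS (regression sum of squares)?

A

Change yi to y bar (mean y value), so each squared term compares the regression's estimate with the mean instead of with the observed value.

ESS represents the error of the regression's estimate around the actual value

RSS represents the deviation of the regression's estimate around the mean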

6
Q

What is the TSS (total sum of squares)?

A

Sum (from i = 1 to n): (yi - ymean)^2

Is equal to ESS + RSS

Represents the variation of y around its mean
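
The identity TSS = ESS + RSS can be checked numerically. This is a sketch on hypothetical data and it assumes the line is fitted by least squares with an intercept, which is the case where the identity holds.

```python
# Hypothetical data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]

# Least-squares fit of y = a*x + b.
n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
a = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
b = y_mean - a * x_mean
y_hat = [a * x + b for x in xs]

ess = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # observed vs estimate
rss = sum((yh - y_mean) ** 2 for yh in y_hat)         # estimate vs mean
tss = sum((y - y_mean) ** 2 for y in ys)              # observed vs mean
```

For this data the ESS is about 0.044, the RSS about 38.416 and the TSS about 38.46.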

7
Q

What is the R^2 statistic?

A

The fraction of the total variance (TSS) accounted for by the regression

R^2 = RSS/TSS

A value closer to 1 shows that the estimated regression function fits the data better

Useful even when a nonlinear model is used, whereas the R statistic is only useful for describing the strength of linear relationships.
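
A short sketch of R^2 = RSS/TSS computed from any model's fitted values. The helper name `r_squared` and the data are hypothetical. Note the assumption that the fitted values come from a least-squares fit with an intercept; otherwise this ratio is not guaranteed to lie between 0 and 1.

```python
def r_squared(ys, y_hat):
    """R^2 = RSS / TSS for observed values ys and model estimates y_hat."""
    y_mean = sum(ys) / len(ys)
    rss = sum((yh - y_mean) ** 2 for yh in y_hat)
    tss = sum((y - y_mean) ** 2 for y in ys)
    return rss / tss

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]

# Least-squares slope and intercept for this particular data set.
a, b = 1.96, 0.22
y_hat = [a * x + b for x in xs]

r2 = r_squared(ys, y_hat)  # close to 1: the line explains almost all the variation
```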
