Intro to linear models Flashcards

1
Q

what is a model?

A

a formal representation of a system - basically, all of statistics is about models.

we represent mathematical models as functions, giving them arguments and operations that allow us to make predictions (which can be tested)

2
Q

what is a good model?

A
  1. the model is represented as a function
  2. the function is represented as a line
  3. the line yields the predictions we would expect if the model were true
  4. when we collect new data to test those predictions, the data match the model well
3
Q

what is a linear model?

A

a model of a linear relationship.
we use linear models to try to explain variation in an outcome (DV) using one or more predictors (IVs)

4
Q

what is the intercept?

A

the point where our model line crosses the y-axis, i.e. the value of y when x = 0

5
Q

what is the slope?

A

the gradient of the model line, i.e. the rate of change in y for a unit change in x

6
Q

linear model equation:

A

yi = β0 + β1xi + ϵi

y = intercept + (slope × x) + residual
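
as a minimal sketch (not from the cards; parameter values are assumed for illustration), the equation can be read as a recipe for generating data:

```python
# Sketch: simulate data from yi = β0 + β1*xi + ϵi.
# The parameter values below are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(1)
n = 100
b0, b1, sigma = 2.0, 0.5, 1.0    # hypothetical intercept, slope, error sd

x = rng.uniform(0, 10, n)        # predictor values
e = rng.normal(0, sigma, n)      # residuals: normal, mean 0, sd sigma
y = b0 + b1 * x + e              # outcome produced by the linear model
```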

7
Q

what is the residual?

A

a measure of how well the model fits each data point.
it is the vertical distance (on the y-axis) between the model line and each data point

residuals should be:
- normally distributed
- mean of 0
- sd of σ (meaning the spread of the errors should be constant)

8
Q

what is least squares?

A

least squares estimates minimise the residual sum of squares,
meaning they minimise the distance between the observed values of y and the model-predicted values (^y)

9
Q

residual sum of squares (SSresidual)

A

equation:
SS residual = (observed y value - model estimated y value) squared and then added up for all observations

minimising the SS residual means that our predicted values are as close as they can get to each of our observed values
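
a small illustration in Python (made-up data) of why least squares is a minimisation: the RSS of the least squares line is smaller than that of any other line:

```python
# Sketch with made-up data: the least squares line minimises the
# residual sum of squares (RSS) relative to any alternative line.
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

def rss(b0, b1):
    y_hat = b0 + b1 * x              # model-estimated y values
    return np.sum((y - y_hat) ** 2)  # squared gaps, summed

b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

print(rss(b0_hat, b1_hat))  # minimised RSS
print(rss(1.0, 1.2))        # any other line gives a larger RSS
```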

10
Q

calculating intercept

A

intercept = mean of y - (slope estimate × mean of x)
(see slope calculations)

11
Q

calculating slope

A

β1 = SPxy / SSx
slope = sum of cross products / sums of squared deviations of x

SPxy = sum of cross products = (observed x value - sample mean of x) * (observed y value - sample mean of y) added up for all observations

SSx = sums of squared deviations of x = (observed x value - sample mean of x) squared, then added up for all observations
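
a quick check of these two formulas in Python (made-up data), compared against numpy's own least squares fit:

```python
# Sketch: slope = SPxy / SSx, intercept = mean(y) - slope * mean(x).
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

sp_xy = np.sum((x - x.mean()) * (y - y.mean()))  # sum of cross products
ss_x = np.sum((x - x.mean()) ** 2)               # squared deviations of x

b1 = sp_xy / ss_x                # slope
b0 = y.mean() - b1 * x.mean()    # intercept uses the slope estimate

print(b1, b0)
print(np.polyfit(x, y, 1))       # returns [slope, intercept] - should match
```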

12
Q

example interpretation of a linear model for how hours of study affect test score:

A

intercept = value of y when x is 0 e.g. expected test score for a student who studied 0 hours

slope = change in y for a unit increase in x = expected increase (or decrease) in test score for every additional hour studied.

13
Q

estimated sd of the error (residual) equation

A

^σ = square root of (SSresidual / (n - k - 1))
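
in code (reusing the made-up data from the earlier sketches):

```python
# Sketch: estimated residual sd = sqrt(SSresidual / (n - k - 1)).
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
b1, b0 = np.polyfit(x, y, 1)

n, k = len(y), 1                            # one predictor, so k = 1
ss_res = np.sum((y - (b0 + b1 * x)) ** 2)   # residual sum of squares
sigma_hat = np.sqrt(ss_res / (n - k - 1))   # estimated sd of the errors
print(sigma_hat)
```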

14
Q

what is multiple regression?

A

a linear model with multiple predictors. the model finds the optimal prediction of the outcome from the multiple predictors, taking into account their redundancy (correlation) with one another

15
Q

uses of multiple regression:

A
  • prediction
  • theory testing
  • covariate control (assessing the effect of one predictor, controlling for the influence of the others)
16
Q

multiple regression linear model equation:

A

yi = β0 + β1x1i + β2x2i + ϵi

for each additional predictor (x) we have an additional β coefficient

interpretation:
- β0 = the predicted y value when all Xs are 0
- β1 = partial regression coefficient = change in y for one unit change in x1 when all other Xs are held constant
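
a minimal sketch (simulated data, assumed coefficient values) of fitting such a model with plain numpy:

```python
# Sketch: multiple regression via least squares on a design matrix.
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 - 0.3 * x2 + rng.normal(size=n)  # assumed true model

# design matrix: a column of 1s for the intercept, one column per predictor
X = np.column_stack([np.ones(n), x1, x2])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]
print(b0, b1, b2)   # intercept and partial regression coefficients
```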

17
Q

what is meant by ‘holding constant’?

A

refers to the effect of the predictor when the values of all other predictors are fixed

18
Q

3 ways to evaluate our linear model:

A
  1. evaluating the significance of individual effects
  2. evaluating the overall quality of the model
  3. evaluating model assumptions
19
Q

evaluating the significance of individual effects

A

this is essentially hypothesis testing. the steps are:
1. start with a good research question
2. create a hypothesis from the question (remember we only ever test the null hypothesis)
3. define the null (usually β1 = 0, as that means x has no effect on y)
4. choose a significance level
5. calculate the test statistic for the β coefficient: t = ^β / SE(^β)
6. evaluate the t-statistic against the null distribution, using p-values or critical values (see the sketch below)
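
a sketch of steps 5-6 for a simple regression (made-up data; the SE formula is on the next card, and for a single predictor it reduces to the square root of σ̂² / SSx):

```python
# Sketch: t = beta_hat / SE(beta_hat), evaluated against a t-distribution.
import numpy as np
from scipy import stats

x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.4, 7.8, 8.5])

b1, b0 = np.polyfit(x, y, 1)
n, k = len(y), 1
ss_res = np.sum((y - (b0 + b1 * x)) ** 2)
se_b1 = np.sqrt((ss_res / (n - k - 1)) / np.sum((x - x.mean()) ** 2))

t_stat = b1 / se_b1                                   # tests beta1 = 0
p_value = 2 * stats.t.sf(abs(t_stat), df=n - k - 1)   # two-sided p-value
print(t_stat, p_value)
```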

20
Q

standard error of the slope - SE(^β) equation

A

SE(^βj) = square root of [ (SSresidual / (n - k - 1)) / (SSxj × (1 - R^2xj)) ]

where SSxj = sum of (xj - mean of xj)², and R^2xj is the multiple correlation of predictor j with the other predictors
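
to make the (1 - R^2xj) part concrete, a sketch with two correlated predictors (simulated data), computing the SE of β1 by hand:

```python
# Sketch: SE of a slope in multiple regression, including the
# (1 - R^2_xj) penalty for correlation among predictors.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)                 # correlated predictors
y = 1.0 + 0.8 * x1 - 0.3 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
betas = np.linalg.lstsq(X, y, rcond=None)[0]
ss_res = np.sum((y - X @ betas) ** 2)
k = 2

# R^2 of x1 regressed on the other predictors (here just x2)
s, i = np.polyfit(x2, x1, 1)
r2_x1 = 1 - np.sum((x1 - (i + s * x2)) ** 2) / np.sum((x1 - x1.mean()) ** 2)

se_b1 = np.sqrt((ss_res / (n - k - 1)) /
                (np.sum((x1 - x1.mean()) ** 2) * (1 - r2_x1)))
print(se_b1)
```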

21
Q

SE is smaller when:

A
  • residual variance is smaller
  • sample size is larger
  • with fewer predictors
  • when a predictor is not correlated with the other predictors (small R^2xj)
22
Q

confidence intervals for β coefficients (slope)

A

^β1 +/- t* × SE(^β1), where t* is the critical t-value for the chosen confidence level

for our variable to be statistically significant, the null hypothesis value (usually 0) must not be contained within the confidence interval
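
a sketch of a 95% CI for the slope, reusing the simple-regression numbers from the t-test sketch:

```python
# Sketch: 95% confidence interval for the slope.
import numpy as np
from scipy import stats

x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.4, 7.8, 8.5])
b1, b0 = np.polyfit(x, y, 1)
n, k = len(y), 1
ss_res = np.sum((y - (b0 + b1 * x)) ** 2)
se_b1 = np.sqrt((ss_res / (n - k - 1)) / np.sum((x - x.mean()) ** 2))

t_crit = stats.t.ppf(0.975, df=n - k - 1)   # critical value for a 95% CI
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
print(ci)   # significant at alpha = .05 if 0 lies outside this interval
```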

23
Q

evaluating overall model quality

A

the aim of linear models is to explain y as a function of x - in reality x does not account for all the variance in y, leaving us with residual variance

24
Q

sums of squares

A

we can break down variation in our data based on sums of squares:
SStotal = SSmodel + SSresidual

this means total variation in y = variation explained by our model + residual variance

25
Q

coefficient of determination - R^2

A

R^2 quantifies the amount of variance in the outcome that is accounted for by our predictors. it is presented as a decimal (or percentage), and the more variability accounted for, the better

equations:
R^2 = SSmodel / SStotal
R^2 = 1 - SSresidual / SStotal
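
a sketch (made-up data) verifying the decomposition from the previous card and computing R^2 both ways:

```python
# Sketch: SStotal = SSmodel + SSresidual, and R^2 two equivalent ways.
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

ss_total = np.sum((y - y.mean()) ** 2)
ss_model = np.sum((y_hat - y.mean()) ** 2)
ss_resid = np.sum((y - y_hat) ** 2)

print(np.isclose(ss_total, ss_model + ss_resid))      # decomposition holds
print(ss_model / ss_total, 1 - ss_resid / ss_total)   # both equal R^2
```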

26
Q

total sum of squares (SStotal)

A

SStotal = (observed y value - mean of y) squared and then added up for all observations

SStotal is the sum of the squared distances of each data point from the mean of y (since the mean is our baseline/best guess of what y should be)

27
Q

model sum of squares (SSmodel)

A

SSmodel = (model estimated y value - mean of y) squared and then added up for all observations

it measures the distance from the model predicted line to the line for the mean of y

28
Q

adjusted R^2

A

this is used when our linear model has more than one predictor, as it adjusts R^2 for n and k. with more predictors there is more chance of random sampling fluctuation, which inflates R^2

equation:
adjusted R^2 = 1 - (1 - R^2) * (n-1)/(n-k-1)
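
a one-line check with hypothetical values:

```python
# Sketch: adjusted R^2 with hypothetical R^2 = .30, n = 50, k = 3.
r2, n, k = 0.30, 50, 3
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(adj_r2)   # always a little lower than R^2
```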

29
Q

what is an F-test?

A

F-tests assess the significance of the model as a whole by testing the significance of the F-ratio

30
Q

what is the F-ratio?

A

the F-ratio is the ratio of explained to unexplained variance. it tests the null hypothesis that all regression slopes are 0, which would mean our predictors tell us nothing about the outcome.

if our predictors do explain some variance, the F-ratio will be significant - bigger F-ratios indicate better models. when the null is true the F-ratio will be close to 1, so we want it to be > 1, meaning there is more model variance than residual variance

equation:
F = MSmodel / MSresidual
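
a sketch (made-up data) computing the F-ratio from mean squares, which the next card defines:

```python
# Sketch: F = MSmodel / MSresidual, evaluated against an F-distribution.
import numpy as np
from scipy import stats

x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.4, 7.8, 8.5])
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x
n, k = len(y), 1

ms_model = np.sum((y_hat - y.mean()) ** 2) / k       # SSmodel / dfModel
ms_resid = np.sum((y - y_hat) ** 2) / (n - k - 1)    # SSresidual / dfResidual
f_ratio = ms_model / ms_resid
p_value = stats.f.sf(f_ratio, k, n - k - 1)
print(f_ratio, p_value)
```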

31
Q

what are mean squares?

A

mean squares are sums of squares calculations divided by the associated degrees of freedom

32
Q

residual degrees of freedom

A

= n - k - 1

based on our model, in which we estimate k beta terms (hence -k) and the intercept (hence -1)

33
Q

total degrees of freedom

A

= n - 1

once the mean of y is estimated, all but one value of y are free to vary, hence -1

34
Q

model degrees of freedom

A

= k

as it depends on the number of beta estimates (k)

35
Q

evaluating F-ratios

A

an F-ratio is evaluated against an F-distribution with dfModel and dfResidual, using our alpha level and critical values

36
Q

what are degrees of freedom?

A

the maximum number of logically independent values which have freedom to vary in the data sample

37
Q

standardising β coefficients

A

standardising allows us to compare the effects of variables measured on arbitrary scales or scales with differing units - but be careful: for regression, unstandardised coefficients are often more useful

equation:
^β*i = ^βi * (Sx/Sy)
standardised β = estimated β * (sd of x / sd of y)

38
Q

z-scoring β coefficients

A

method of standardising for continuous variables that transforms the IV and DV into z-scores (mean = 0, sd = 1) prior to fitting the model

equation
zx = (xi - mean of x) / Sx
translation: divide the deviation from the mean by the standard deviation (do this for both x and y values)
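
a sketch (made-up data) showing z-scoring and the fact, noted on the next card, that the standardised slope in simple regression equals r:

```python
# Sketch: z-score x and y, then refit; standardised slope equals r.
import numpy as np

x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.4, 7.8, 8.5])

zx = (x - x.mean()) / x.std(ddof=1)   # deviation from mean over sd
zy = (y - y.mean()) / y.std(ddof=1)

b1_std, b0_std = np.polyfit(zx, zy, 1)
print(b0_std)                            # intercept is 0 (up to rounding)
print(b1_std, np.corrcoef(x, y)[0, 1])   # standardised slope equals r
```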

39
Q

interpreting standardised regression coefficients

A

R^2, the F-test and t-tests remain the same
our beta coefficients will change
e.g. β0 = 0 when all variables are standardised
β1 = increase in y (in sd units) for every sd increase in x
in simple regression, the standardised slope equals the correlation coefficient (r)

40
Q

what are categorical variables?

A

variables that can only take discrete values that are mutually exclusive. a binary variable is a type of categorical variable with only 2 levels.

41
Q

what is dummy coding?

A

binary variables are coded as 0 and 1 and often referred to as dummy variables - when we have multiple dummies we use the general procedure of dummy coding.

one level is chosen as a baseline and all other levels are compared to this baseline. for a categorical variable with k levels we create k-1 dummy variables.

42
Q

interpretation of dummy coding coefficients

A

β0 = expected value of y when x is 0, i.e. the mean of our baseline group
β1 = the predicted difference between the means of the two groups

this interpretation becomes more complicated as we add more predictors

43
Q

steps in dummy coding:

A
  1. choose a baseline level
  2. assign everyone in the baseline group 0 for all k-1 dummy variables
  3. assign everyone in the next group a 1 for the first dummy variable and 0 for all others
  4. repeat step 3 until all k-1 dummy variables have their 0s and 1s assigned
  5. enter the dummy variables into your regression (see the sketch below)
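
a sketch of the whole procedure with hypothetical data (three revision methods, matching the example on the next card):

```python
# Sketch: k - 1 = 2 dummy variables for a 3-level categorical predictor.
import numpy as np

method = np.array(["reread", "reread", "summarise", "summarise",
                   "selftest", "selftest"])      # re-read is the baseline
score = np.array([60., 62., 70., 72., 80., 84.])

d1 = (method == "summarise").astype(float)  # dummy 1: summarise vs baseline
d2 = (method == "selftest").astype(float)   # dummy 2: self test vs baseline
X = np.column_stack([np.ones(len(score)), d1, d2])

b0, b1, b2 = np.linalg.lstsq(X, score, rcond=None)[0]
print(b0)       # mean of the baseline (re-read) group
print(b1, b2)   # each group mean's difference from the baseline
```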
44
Q

dummy coding results interpretation example:

A

test score based on 3 methods of revising: re-read (baseline), summarise notes, or self test

β0 = mean of re-reading
β1 = difference between mean of summarise and intercept
β2 = difference between mean of self test and intercept