Week 3: Regression Flashcards
What is needed for simple linear regression? - (3)
- What sort of measurement = a DV (outcome variable)
- How many predictors = 1
- What type of predictor variable = continuous
Regression is a way of predicting things you have not measured - (2)
Predicting an outcome variable from one predictor variable. OR
Predicting a dependent variable from one independent variable.
Regression predicts variable y from
variable x
Regression is used
when we know that x should influence y (instead of y influencing x)
Regression is used to create a linear model of the relationship between
two variables
Regression differs from correlation in that it adds a
constant b0
In regression we create a model to predict y - (2)
Regression equation
In regression we test how well the model we created to predict y
fits the data
The straight-line equation in the regression model has 2 parameters - (2)
The gradient (describing how the outcome changes for a unit increment of the predictor)
The intercept (of the vertical axis), which tells us the value of the outcome variable when the predictor is zero
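For reference, the cards below break down the simple regression equation, which can be written as Yi = b0 + b1Xi + εi (outcome = intercept + slope × predictor + error).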
Yi in regression equation means
outcome variable e.g., album sales
b0 in regression equation means
intercept
b1 in regression equation means - (2)
Regression coefficient for predictor
e.g., direction and strength of the relationship between advertising budget & album sales
εi in regression equation means - (2)
error
e.g., error in album sales not explained by advertising budget
Xi in regression equation means - (2)
Predictor variable
e.g., advertising budget
Example of using simple linear regression equation to predict values - (3)
- Imagine you want to spend £5 on advertising: you plug that value into the fitted equation from the slide (see the sketch below)
- Based on the model, we can predict that if we spend £5 on advertising, we will sell 550 albums
- The error term shows this prediction will not be perfect, as there is always a margin for error; the value the equation produces for the outcome variable is also known as the predicted value in a regression
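A minimal sketch of plugging a value into a fitted regression equation; the coefficients b0 = 100 and b1 = 90 here are hypothetical placeholders (chosen only so the prediction comes out at the 550 albums quoted above), not the actual values from the slide.

```python
# Hypothetical coefficients: in practice b0 and b1 come from the fitted model.
def predict(x, b0=100.0, b1=90.0):
    """Predicted outcome y-hat = b0 + b1 * x (the error term is ignored)."""
    return b0 + b1 * x

print(predict(5))  # spending £5 on advertising -> predicted 550 album sales
```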
The closer the sum of squares of the model (SSM) is to the total sum of squares (SST) for the data,
the better the model accounts for the data, and the smaller the residual sum of squares (SSR) must be
Formula of Sum of Squares of Model (SSM)
SSM = SST (total) - SSR (residual)
SST (total) uses the
difference between the observed data and the mean value of Y
Sum of squares (residual) uses the difference between the
observed data and the model
Sum of squares model uses the difference between
mean value of Y and the model
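As a sketch of how these three sums of squares fit together, the snippet below computes SST, SSR and SSM for a tiny made-up data set; the numbers are illustrative only.

```python
import numpy as np

# Made-up data purely for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1, b0 = np.polyfit(x, y, 1)             # least-squares slope and intercept
y_hat = b0 + b1 * x                       # the model's predictions

sst = np.sum((y - y.mean()) ** 2)         # observed data vs mean of Y (total)
ssr = np.sum((y - y_hat) ** 2)            # observed data vs model (residual)
ssm = np.sum((y_hat - y.mean()) ** 2)     # model vs mean of Y

print(sst, ssm + ssr)                     # SST = SSM + SSR (up to rounding)
```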
In simple linear regression, R^2 is the proportion of variance in DV (outcome variable; Y) that is explained
by IV (predictor variable X) in regression model
The R squared (Pearson’s correlation coefficient squared) is the
coefficient of determination
R^2 gives you the overall fit of the model, and is thus reported in the
model summary
Adjusted R squared tells us
how well R squared generalises to the population
Adjusted R squared indicates how well a
predictor variable explains the variance in the outcome variable, but adjusts the statistic based on the number of independent variables in the model.
Adjusted R squared will always be lower or equal to R^2 value because..
It’s a more conservative statistic for how much variance in the outcome variable the predictor variable explains
If you add more useful variables, adjusted R squared will
increase
If you add more and more useless variables to the model, what will happen to adjusted R squared?
adjusted R squared will decrease
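A sketch of the standard adjusted R squared formula, which penalises R squared by the number of predictors k relative to the sample size n; the example values below are made up.

```python
def adjusted_r_squared(r2, n, k):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# With one useful predictor the penalty is tiny; with many near-useless predictors
# (k large, R^2 barely improved) the adjusted value drops well below R^2.
print(adjusted_r_squared(0.50, n=30, k=1))   # ~0.48
print(adjusted_r_squared(0.52, n=30, k=10))  # ~0.27
```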
How to calculate R squared in simple linear regression?
R^2 = SSM/SST = (SST - SSR)/SST
R squared gives the ratio of
explained variance (SSM) to total variance (SST)
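A one-line sketch of that ratio, using made-up sums of squares.

```python
# Hypothetical sums of squares for illustration.
ss_total, ss_residual = 600.0, 120.0
ss_model = ss_total - ss_residual       # SSM = SST - SSR
r_squared = ss_model / ss_total         # 0.8: the model explains 80% of the total variance
print(r_squared)
```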
The F ratio tests whether the line is better than the mean, i.e. whether the
overall model (fitted regression line) is a good fit
What are mean squares (MS)? - (3)
- Sums of squares (SS) are total values
- They can be expressed as averages
- These averages are called mean squares (MS)
SSM divided by its DF gives the
mean squares for the model (MSM)
SSR divided by its DF gives the
mean squares for the residuals (MSR)
What is the calculation of the F ratio?
F = MSM/MSR
F ratio measures the ratio of
MSM to MSR
If the model is good then the F ratio will
be large, because the model accounts for a large portion of variance (MSM) compared to what is left in the residuals (MSR)
Diagram of SST/SSM/SSR labelled for ANOVA
The DF in SSM/DF represents the
number of variables in the model
The DF in SSR/DF represents the
number of observations minus the number of parameters
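A sketch putting the last few cards together: the F ratio built from mean squares, with made-up sums of squares and the degrees of freedom for a simple (one-predictor) regression.

```python
# Hypothetical values for illustration.
ss_model, ss_residual = 480.0, 120.0
n_obs, n_predictors = 50, 1

df_model = n_predictors                    # number of variables in the model
df_residual = n_obs - (n_predictors + 1)   # observations minus estimated parameters (slope + intercept)

ms_model = ss_model / df_model             # MSM
ms_residual = ss_residual / df_residual    # MSR
f_ratio = ms_model / ms_residual           # large F: the model explains much more than is left over
print(f_ratio)
```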
What are the line coefficients in this regression output for the model parameters of the regression line?
- The line coefficients are the intercept b0 and the slope b1
What is b1 (slope) in this regression output for the model parameters of the regression line?
- The change in the outcome associated with a unit change in the predictor
What is the standard error in this regression output for the model parameters of the regression line?
It indicates how far off you would be, on average, if you were to use the independent variable and the model to predict scores on the dependent variable
Where is beta in the output of simple linear regression for the model parameters of the regression line?
beta = r
the standardised coefficient gives the correlation coefficient in simple regression
What does this part of simple linear regression output show for model parameters of regression line?
t statistic and associated p-value
Regression compares the
variance that we cannot explain vs the variance we can explain with the model
Assumptions of linear regression - (7)
- Variable type = the outcome must be continuous and predictors can be continuous or dichotomous
- Non-zero variance = predictors must not have zero variance
- Independence = all values of the outcome should come from a different person
- Linearity = the relationship we model is, in reality, linear
- Homoscedasticity = for each value of the predictors the variance of the error term should be constant
- Independent errors = for any pair of observations, the error terms should be uncorrelated (see Durbin-Watson test)
- Normally distributed errors
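As a sketch of how some of these assumptions might be checked in Python with statsmodels; the arrays x and y below are fabricated and simply stand in for real data.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Fabricated data purely for illustration.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, 100)

model = sm.OLS(y, sm.add_constant(x)).fit()
residuals = model.resid

# Independent errors: a Durbin-Watson statistic near 2 suggests uncorrelated errors.
print(durbin_watson(residuals))

# Linearity / homoscedasticity: plot residuals against fitted values and look for
# a flat, even spread (not a curve or a cone).
# Normality of errors: inspect a histogram or Q-Q plot of the residuals.
```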
Diagram of a non-linear relationship in simple linear regression
Diagram of good vs bad homoscedasticity on a plot of residual values (vertical axis) against predicted values (horizontal axis)
Good = the data points occupy all four quarters of the plot
Bad = the residuals look like a cone
What does homoscedasticity mean?
“having the same scatter.”
Diagram of heteroscedasticity and homoscedasticity
Another diagram of homoscedasticity and heteroscedasticity
Heteroscedasticity -> points higher on the x axis have larger variance than lower ones; points are at widely varying distances from the regression line
Diagram of good and bad normality of errors = frequency histograms of residuals
Bad histogram = positively skewed
Correlation does not mean
causation, e.g. even if the relationship seems to make sense
Spurious correlations can occur when an
unknown variable could drive the effect
Example that correlation does not mean causation
relationship between the predictor variable - visits to the pub - and the outcome variable - exam score. There is a correlation between the two variables, but would we really think that more visits to the pub would cause better exam performance? Perhaps there was a third variable that might explain the link? Maybe there was a support session on statistics that was held between 4 and 5pm in a building next to a Pub?
Example of a simple linear regression question:
Do poverty levels predict the number of teen births?
Example of simple linear regression:
Do poverty levels predict the number of teen births?
What are x and y? - (2)
x = poverty rate, which is the percent of the state’s population living in households with incomes below the federally defined poverty level.
y = year 2002 birth rate per 1000 females 15 to 17 years old
Example of simple linear regression:
Do poverty levels predict the number of teen births?
x = poverty rate, which is the percent of the state’s population living in households with incomes below the federally defined poverty level.
y = year 2002 birth rate per 1000 females 15 to 17 years old
What is H0 and H1? - (2)
H0: The slope equals 0, i.e. poverty levels do not predict teen birth rate
H1: The slope is different than 0, i.e. poverty levels predict teen birth rate
Example of simple linear regression:
Do poverty levels predict the number of teen births?
x = poverty rate, which is the percent of the state’s population living in households with incomes below the federally defined poverty level.
y = year 2002 birth rate per 1000 females 15 to 17 years old
What does its fitted model y = 4.267 + 1.373x mean? - (2)
The slope (Β1= 1.373) indicates that the 15 to 17 year old birth rate increases 1.373 units, on average, for each one unit (one percent) increase in the poverty rate.
The intercept (B0 = 4.267) means that if there were states with a poverty rate of 0, the predicted average for the 15 to 17 year old birth rate would be 4.267 for those states.
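For illustration, plugging a hypothetical poverty rate into the fitted model above; the 10% figure is made up, not taken from the data set.

```python
# Fitted model reported above: y-hat = 4.267 + 1.373 * x
b0, b1 = 4.267, 1.373
poverty_rate = 10.0                             # hypothetical poverty rate (%)
predicted_birth_rate = b0 + b1 * poverty_rate   # predicted births per 1000 females aged 15-17
print(predicted_birth_rate)                     # 17.997
```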
Correlation vs regression
In correlation - (5)
- Does not imply causation
- All we can say is that two variables are related/associated
- X and y can be swapped
- One outcome value
- No regression line on scatterplots!
Correlation vs regression
In regression - (4)
- Independent variable influences the dependent (outcome) variable
- X and y cannot be swapped!
- Has a model: equation to allow predictions outside of current measurements
- Regression line of the model on a scatterplot
If p in SPSS output is 0.000 we report as
p < 0.001 as p is never 0
What does this SPSS simple linear regression output show?
Our model is significantly better at predicting the data than the null model (F (1, 118) = 729.43, p<.001) and explains 86% of the variance in our data (R2=.86)
What does this SPSS output show? - (2)
y = 3.19 x + 391.67
Our model is significantly better at predicting the data than the null model (F (1, 118) = 729.43, p<.001) and explains 86% of the variance in our data (R2=.86). For every 1 unit increase in alcohol there is a 3.19 unit increase in brake reaction time (B = 3.19, t = 27.01, p<.001)
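A sketch of how the quantities quoted in that write-up (R squared, F with its df, B, t, p) come out of a fitted model; the alcohol and reaction_time arrays below are fabricated stand-ins for the SPSS data set, so the numbers will not match the output above.

```python
import numpy as np
import statsmodels.api as sm

# Fabricated data for illustration only.
rng = np.random.default_rng(1)
alcohol = rng.uniform(0, 10, 120)
reaction_time = 391.67 + 3.19 * alcohol + rng.normal(0, 5.0, 120)

fit = sm.OLS(reaction_time, sm.add_constant(alcohol)).fit()

print(fit.rsquared)                  # R-squared (proportion of variance explained)
print(fit.fvalue, fit.f_pvalue)      # F ratio and its p-value (model vs null model)
print(fit.df_model, fit.df_resid)    # degrees of freedom reported with F
print(fit.params, fit.tvalues)       # intercept and slope (B) with their t statistics
```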
Which of the following statements about Pearson’s correlation coefficient is not true?
A.It can only be used with continuous variables
B.It can be used as an effect size measure
C.It varies between –1 and +1
D.A correlation coefficient of zero indicates there is no relationship between the variables
A - via biserial and point-biserial correlations, the Pearson correlation coefficient can be used with binary and categorical variables
A psychologist was interested in whether the amount of news people watch (minutes per day) predicts how depressed they are (from 0 = not depressed to 7 = very depressed). What does the standardized beta tell us in the output?
A - As news exposure decreases by 0.224 standard deviations, depression increases by 1 standard deviation
B - As news exposure increases by 1 minute, depression decreases by 0.224 units
C – As news exposure decreases by 0.224 minutes, depression increases by 1 unit
D - As news exposure increases by 1 standard deviation, depression decreases by 0.224 of a standard deviation
D - the standardised beta coefficient of -0.224 for news exposure
A psychologist was interested in whether the amount of news people watch predicts how depressed they are.
In this table, what does the value 4.404 represent?
A - The ratio of how much the prediction of depression has improved by fitting the model, compared to how much variability there is in depression scores
B - The ratio of how much error there is in the model, compared to how much variability there is in depression scores
C - The proportion of variance in depression explained by news exposure
D - The ratio of how much the prediction of depression has improved by fitting the model, compared to how much error still remains
D
We only know the error, not the overall variability. The other options are wrong because we do not know how much variability there is in depression scores in the population; we only measure the variability of the observed sample. If, instead of ending with “depression scores”, options A and B had specified “of the sample”, they could have been correct.
The coefficient of determination:
A.Is the square root of the variance
B.Is a measure of the amount of variability in one variable that is shared by the other variable
C.Is the square root of the correlation coefficient
D.Indicates whether the correlation coefficient is significant
B
The proportion of the variation in the outcome variable (Y) that is predictable from the predictor variable (X).
A measure of how much variability in one variable can be “explained by another”.
R² shows how well terms (data points) fit a model curve or line.
An R² value of 0.78 indicates that 78% of the variation in Y is determined by the relationship between Y and X.
The correlation between 2 variables A and B is 0.12, with a significance of p < 0.01.
What can we conclude?
A. there is a small relationship between A and B
B. There is a substantial relationship between A and B
C. That variable A causes variable B
D. That variable B causes variable A
A -
+/- 0.1 represents a small effect, +/- 0.3 a medium effect and +/- 0.5 a large effect
The table below contains scores from 6 people on 2 different scales that measure attitude towards reality TV shows
Using the scores above, the scales are likely to
A. correlate positively
B. correlate negatively
C. be uncorrelated
A - high scores on one scale tend to produce high scores on the other, and low scores on one also correspond with low scores on the other
A Pearson’s correlation coefficient of -0.5 would be represented by a scatterplot in which
A. There is a moderately good fit between the regression line and the individual data points on the scatterplot
B. Half of the data points sit perfectly on the line
C. The regression line slopes upwards
D. The data cloud looks like a circle and the regression line is flat
A
If two variables are significantly correlated, r = 0.67,
then…
A. They share variance
B. They have no unique variance
C. The relationship is weak
D. The variables are independent
A, not D, as variables that are correlated are not independent
What do the results in the table below show?
A. In a sample of 100 people, there was a strong negative relationship between work productivity and time spent on Facebook, r = –.94, p < .001
B. In a sample of 100 people, there was a weak negative relationship between work productivity and time spent on Facebook, r = –.94, p < .001
C. In a sample of 100 people, there was a non-significant negative relationship between work productivity and time spent on Facebook, r = –.94, p < .001
A
A Pearson’s correlation of –.71 was found between the number of hours spent at work and energy levels in a sample of 300 participants. Which of the following conclusions can be drawn from this finding?
A. There was a strong negative relationship between the number of hours spent at work and energy levels
B. Spending more time at work caused participants to have less energy
C. The amount of time spent at work accounted for 71% of the variance in energy levels
D. The estimate of the correlation will be imprecise
A
Example of simple linear regression:
A child psychologist was interested in whether playing video games was associated with child aggression. She collected data on 666 children and adolescents. She recorded how long each week the children spent playing video games, and then rated how aggressive the children were in a social situation.
The variables are:
VideoGames: hours spent playing video games per week
Aggression: rating of child aggressiveness (higher scores indicate increased aggression)
Report R^2 to 2 DP
Report DF
Report p value
Report Adjusted R squared - (4)
- R squared = 0.03
- DF = 1,664 (DF of regression, DF of residual)
- p-value < 0.001
- Adjusted R squared = 0.02
The p-value for the F statistic in simple linear regression tells us whether…
The p-value for the t statistic in simple linear regression tells us whether… - (2)
- the overall proportion of variance in the outcome explained by the predictor is significant (F)
- the predictor itself is significant (t is used to calculate the significance of the predictor)