Week 7 Flashcards
What does covary mean
Whether two variables vary together (change in relation to each other)
What does regression do
To predict values of one variable from another
How do you find a prediction
By finding the regression line
What is on the X-axis
The predictor/independent variable
What is on the Y-axis
The outcome/dependent variable
What is the simple regression equation
Yi = b0 + b1Xi + ei
What is Y
The outcome variable
What is b0
The intercept: the point at which the regression line crosses the Y-axis, i.e. the value of Yi when X = 0
What is b1
The slope of the line: a measure of how much Y changes as X changes. The larger the value of b1 (regardless of its sign), the steeper the slope. For one unit of change on the X-axis, b1 tells you how much change there is on the Y-axis.
What is X
The predictor variable
What is e
The error (residual or prediction error): the difference between the observed value of the outcome variable and the value the model predicts
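A minimal Python sketch tying the equation together: it estimates b0 and b1 by least squares and computes the residuals e. The x and y values are made up for illustration, not course data.

```python
# Minimal sketch: fit Yi = b0 + b1*Xi + ei by least squares.
# x and y are made-up illustrative data.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # predictor
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # outcome

# Least-squares estimates of the slope and intercept
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x       # values predicted by the regression line
residuals = y - y_hat     # e: observed minus predicted

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")   # b0 = 0.14, b1 = 1.96 for this data
```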
What is the regression line
The line of best fit: the line that best represents the data, i.e. the line that minimises the residuals
Regression analysis
Rather than only looking at relationships, we are interested in making predictions: if we know a participant’s score on X, can we predict their value on Y?
What are the key statistics in regression
Overall model fit: the R2 value and the F value
How the variables relate to each other: the intercept and the beta values (the slope of the regression line)
Residual sum of squares
Residuals can be positive or negative; if we simply added them up, the positive ones would cancel out the negative ones, so we square them before adding. We refer to this total as the sum of squared residuals, or residual sum of squares (SSr).
SSr is a gauge of how well the model fits the data: the smaller SSr, the better the fit.
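A short sketch of SSr, reusing the made-up data and predictions from the fitting sketch above (the y_hat values are copied from that fit, purely illustrative):

```python
# Sketch: square the residuals before summing so positives and negatives
# do not cancel; the total is the residual sum of squares (SSr).
import numpy as np

y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])            # observed outcome (made-up)
y_hat = np.array([2.10, 4.06, 6.02, 7.98, 9.94])   # predictions from the fit above

residuals = y - y_hat
ss_r = np.sum(residuals ** 2)    # the smaller SSr, the better the fit
print(f"SSr = {ss_r:.3f}")
```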
What is total sum of squares
Using the sample mean of the observed Y as a baseline model (assuming no relationship between Y and X).
The total sum of squares (SSt) is the sum of squared differences between the observed Y values and the sample mean.
In this baseline model, SSt = SSr.
Model sum of squares
The sum of squared differences between the values predicted by the model and the sample mean; equivalently, SSm = SSt − SSr, the improvement of the model over the baseline.
SSt = total variance
The total variance in the outcome variable, which can be partitioned into two parts: SSt = SSm + SSr
SSr = residual or error variance
variance not explained by the model
SSm = model variance
variance explained by the model
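A sketch of the partition SSt = SSm + SSr, on the same made-up data used above:

```python
# Sketch: partition the total variance into model and residual parts.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

ss_t = np.sum((y - y.mean()) ** 2)      # total: observed Y vs the mean of Y
ss_r = np.sum((y - y_hat) ** 2)         # residual: variance not explained
ss_m = np.sum((y_hat - y.mean()) ** 2)  # model: variance explained

print(f"SSt = {ss_t:.3f}, SSm = {ss_m:.3f}, SSr = {ss_r:.3f}")
print(f"SSm + SSr = {ss_m + ss_r:.3f}")  # equals SSt
```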
Regression statistic: R2
R2 = SSm / SSt
This provides the proportion of variance accounted for by the model.
R2 values range between 0 and 1. The higher the value, the better the model; R2 can be interpreted as a percentage of variance explained.
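A one-line sketch of R2, using the SSm and SSt values produced by the partitioning sketch above (illustrative numbers from the made-up data, not real results):

```python
# Sketch: R2 = SSm / SSt, the proportion of variance explained by the model.
ss_m, ss_t = 38.416, 38.508   # values from the made-up data above

r_squared = ss_m / ss_t
print(f"R^2 = {r_squared:.3f}, i.e. about {r_squared:.0%} of the variance in Y")
```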
Regression statistic: F ratio
F = MSm / MSr
F is the ratio of explained variance to unexplained variance; in other words, F is the ratio of how good the model is to how bad it is. MSm and MSr are the model and residual sums of squares divided by their degrees of freedom, and MSm should be larger than MSr.
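A sketch of the F ratio, again using the sums of squares from the made-up example; the degrees of freedom assume n = 5 observations and k = 1 predictor:

```python
# Sketch: F = MSm / MSr, where each mean square is a sum of squares
# divided by its degrees of freedom.
ss_m, ss_r = 38.416, 0.092   # from the made-up data above
n, k = 5, 1                  # n observations, k predictors (simple regression)

ms_m = ss_m / k              # model mean square
ms_r = ss_r / (n - k - 1)    # residual mean square
f_ratio = ms_m / ms_r        # explained vs unexplained variance
print(f"F = {f_ratio:.1f}")  # well above 1, so the model explains a lot
```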
Overall test
The hypothesis in regression can be phrased in various ways:
Null hypothesis: the predicted values of Y are the same regardless of the value of X.
Does the model explain a significant amount of variance in the outcome variable?
Coefficients
Characteristics of the regression line: the beta values (the slope of the regression line) and the intercept
Unstandardised beta
The value of the slope, b1: the change in Y for every one-unit change in X, in the original units of measurement.
It is important to look at whether b1 is positive or negative.
If b1 is 0, there is no relationship between X and Y.
If a variable significantly predicts an outcome, its b-value should be different from zero.
This hypothesis is tested using a t-test.
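A sketch of the t-test on the slope, t = b1 / SE(b1), on the same made-up data; scipy is assumed to be available, and is used only for the p-value:

```python
# Sketch: test whether the slope b1 is significantly different from zero.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
residuals = y - (b0 + b1 * x)

n = len(x)
ms_r = np.sum(residuals ** 2) / (n - 2)               # residual variance
se_b1 = np.sqrt(ms_r / np.sum((x - x.mean()) ** 2))   # standard error of b1

t = b1 / se_b1
p = 2 * stats.t.sf(abs(t), df=n - 2)                  # two-tailed p-value
print(f"b1 = {b1:.2f}, t({n - 2}) = {t:.2f}, p = {p:.4f}")
```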