Week 18: Developing the linear model (content) Flashcards
How do you estimate the association between 2 variables? (one outcome and one predictor)
model <- lm(mean.acc ~ SHIPLEY,
data = clearly.both.subjects)
summary(model)
How do you estimate the association between multiple variables? (one outcome and multiple predictors)
model <- lm(mean.acc ~ SHIPLEY + HLVA + FACTOR3 + AGE + NATIVE.LANGUAGE,
data = clearly.both.subjects)
summary(model)
(Basically, add each predictor to the right-hand side of the formula, joined with +.)
Do we assume normal distribution in residuals and predictor variables?
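Residuals: yes. Predictors: no, the linear model does not require predictor variables to be normally distributed.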
What is a residual?
It is the difference between observed and predicted outcomes
How accurate are models in terms of residuals?
Better models should show smaller differences between observed and predicted outcome values
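A minimal sketch of the two residual cards above, assuming simulated stand-in data (the values and the name sim.data are made up here, not taken from clearly.both.subjects). It works out residuals by hand as observed minus predicted, then compares the total squared residuals of a model with a predictor against a model with none.

```r
# Simulated stand-in data: made-up values for illustration only,
# not the clearly.both.subjects data used in the cards.
set.seed(1)
n <- 100
SHIPLEY <- rnorm(n, mean = 33, sd = 5)
mean.acc <- 0.4 + 0.01 * SHIPLEY + rnorm(n, sd = 0.1)
sim.data <- data.frame(mean.acc, SHIPLEY)

model <- lm(mean.acc ~ SHIPLEY, data = sim.data)

# Residual = observed outcome minus the outcome the model predicts
by.hand <- sim.data$mean.acc - fitted(model)

# TRUE if the hand calculation matches what R stores for the model
all.equal(unname(by.hand), unname(residuals(model)))

# A model with no predictors (intercept only), for comparison
null.model <- lm(mean.acc ~ 1, data = sim.data)

# Residual sums of squares: a smaller total means the predictions
# sit closer to the observed outcomes
sum(residuals(model)^2)
sum(residuals(null.model)^2)
```

Adding a predictor can never increase the residual sum of squares in the sample the model was fitted to, so a smaller value on its own does not show the predictor matters.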
Are residuals normally distributed?
We assume they are: positive and negative prediction errors should roughly balance out
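One way to look at this assumption in practice, sketched on simulated data (x, y and the model here are hypothetical). The residuals from a model with an intercept average out to zero, and shapiro.test() gives a rough formal check of normality.

```r
# Simulated data for illustration only
set.seed(2)
n <- 200
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n)

model <- lm(y ~ x)
res <- residuals(model)

# With an intercept in the model, positive and negative residuals
# cancel out, so the mean is zero up to rounding error
mean(res)

# How many predictions were too low (positive residual) or too high
sum(res > 0)
sum(res < 0)

# Shapiro-Wilk test: a small p-value suggests a departure from normality
shapiro.test(res)
```

A Q-Q plot is often more informative than the test: qqnorm(res) followed by qqline(res) draws one.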
Significance of the garden of forking paths in data analysis
It describes how scientists can make false discoveries when they do not pre-specify a data analysis plan and instead choose “one analysis for the particular data they saw”.
Understanding the linear model with multiple predictors
y = β_0 + β_1 x_1 + β_2 x_2 + ⋯ + ϵ
> The intercept β_0, plus
The coefficient for the effect of e.g. AGE, β_1, multiplied by x_1, a person’s age, plus
The same kind of coefficient-times-value product for any number of other variables, plus
The error ϵ: mismatches between observed and predicted outcomes
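The equation can be checked term by term. The sketch below assumes simulated data (AGE, SHIPLEY and mean.acc are made-up stand-ins for the real variables) and rebuilds each prediction as intercept plus coefficient times value, then adds the error back to recover the observed outcome.

```r
# Simulated stand-in data: made-up values for illustration only
set.seed(3)
n <- 150
AGE <- round(runif(n, 18, 70))
SHIPLEY <- rnorm(n, mean = 33, sd = 5)
mean.acc <- 0.5 + 0.002 * AGE + 0.008 * SHIPLEY + rnorm(n, sd = 0.08)
sim.data <- data.frame(mean.acc, AGE, SHIPLEY)

model <- lm(mean.acc ~ AGE + SHIPLEY, data = sim.data)
b <- coef(model)  # named vector: "(Intercept)", "AGE", "SHIPLEY"

# beta_0 + beta_1 * x_1 + beta_2 * x_2
predicted <- b["(Intercept)"] +
  b["AGE"] * sim.data$AGE +
  b["SHIPLEY"] * sim.data$SHIPLEY

# epsilon: the mismatch between observed and predicted outcomes
error <- sim.data$mean.acc - predicted

# The hand-built predictions match fitted(), and prediction + error
# gives back the observed outcome
all.equal(unname(predicted), unname(fitted(model)))
all.equal(unname(predicted + error), sim.data$mean.acc)
```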
Identifying the key information in linear models with multiple predictors
> Coefficient estimate = Estimate
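Standard error = Std. Error
t value = t value
p-value = Pr(>|t|)
R-squared = Multiple R-squared and Adjusted R-squared (the adjusted value corrects for the number of predictors, so it is the safer one to report with multiple predictors)
F-statistic = F-statistic (reported with its degrees of freedom and p-value)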
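The same pieces of information can be pulled out of the summary object instead of read off the printout. A sketch on simulated data (all variable values are made up for illustration):

```r
# Simulated stand-in data: made-up values for illustration only
set.seed(4)
n <- 150
AGE <- round(runif(n, 18, 70))
SHIPLEY <- rnorm(n, mean = 33, sd = 5)
mean.acc <- 0.5 + 0.002 * AGE + 0.008 * SHIPLEY + rnorm(n, sd = 0.08)
sim.data <- data.frame(mean.acc, AGE, SHIPLEY)

model <- lm(mean.acc ~ AGE + SHIPLEY, data = sim.data)
s <- summary(model)

# Coefficient table: one row per term, with columns
# Estimate, Std. Error, t value, Pr(>|t|)
coef(s)

# A single cell, e.g. the estimate for AGE
coef(s)["AGE", "Estimate"]

# Multiple and adjusted R-squared
s$r.squared
s$adj.r.squared

# F-statistic with its numerator and denominator degrees of freedom
s$fstatistic
```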
What is R-squared?
The proportion of variation in the outcome that we can predict, given our model
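R-squared can be worked out by hand as one minus the ratio of leftover (residual) variation to total variation in the outcome. A sketch on simulated data (x1, x2 and y are hypothetical), including the adjustment for the number of predictors:

```r
# Simulated data for illustration only
set.seed(5)
n <- 120
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 2 + 0.6 * x1 - 0.3 * x2 + rnorm(n)

model <- lm(y ~ x1 + x2)
p <- 2  # number of predictors, not counting the intercept

ss.res <- sum(residuals(model)^2)  # variation the model fails to predict
ss.tot <- sum((y - mean(y))^2)     # total variation in the outcome

r2 <- 1 - ss.res / ss.tot
adj.r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Both should match the values reported by summary()
all.equal(r2, summary(model)$r.squared)
all.equal(adj.r2, summary(model)$adj.r.squared)
```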
What is the F-statistic?
A test of whether the model as a whole predicts the outcome significantly better than a model with no predictors
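One way to see what the F-statistic tests is to compare the fitted model with an intercept-only model directly. The sketch below uses simulated data (x1, x2 and y are hypothetical) and recovers the p-value from the F-statistic with pf().

```r
# Simulated data for illustration only
set.seed(6)
n <- 120
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 2 + 0.6 * x1 - 0.3 * x2 + rnorm(n)

full <- lm(y ~ x1 + x2)
null <- lm(y ~ 1)  # intercept only: predicts the mean for everyone

# Model comparison: does the full model reduce residual variation
# by more than chance would lead us to expect?
comparison <- anova(null, full)
comparison

# The F-statistic stored in summary() is the same test
f <- summary(full)$fstatistic
p.value <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)

all.equal(unname(f["value"]), comparison$F[2])
all.equal(unname(p.value), comparison$`Pr(>F)`[2])
```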
What are the uncertainties when looking at sample data?
> The nature of the expected change in outcome
The ways that expected changes might vary between individual participants or between groups of participants
The random ways that specific responses can be produced
What do uncertainties in data mean we have to do?
We have to think carefully about the conclusions we draw
3 assumptions about data analysis
Validity, measurement, generalisability
Important: is everything some sort of linear model?
Yes! Most common statistical tests are special cases of linear models, or close approximations to them
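As one illustration of a common test being a special case, an independent-samples t-test (assuming equal variances) gives the same result as a linear model with a two-level group predictor. A sketch on simulated data (group and score are hypothetical):

```r
# Simulated data for illustration only: two groups of 30 scores
set.seed(7)
group <- factor(rep(c("a", "b"), each = 30))
score <- c(rnorm(30, mean = 50, sd = 10), rnorm(30, mean = 56, sd = 10))

# Classic Student's t-test, assuming equal variances in both groups
tt <- t.test(score ~ group, var.equal = TRUE)

# The same comparison as a linear model: the "groupb" coefficient
# is the difference between the group means (b minus a)
m <- summary(lm(score ~ group))

# The t values have the same size but opposite sign, because
# t.test() takes a minus b while lm() estimates b minus a
tt$statistic
coef(m)["groupb", "t value"]

# The p-values are identical
tt$p.value
coef(m)["groupb", "Pr(>|t|)"]
```

The match depends on var.equal = TRUE. The default in t.test() is the Welch version, which does not assume equal variances and so will differ slightly from lm().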