Week 18 - Developing the linear model (content) Flashcards

1
Q

How do you estimate the association between two variables (one outcome and one predictor)?

A

# fit a linear model: outcome mean.acc predicted by one predictor (SHIPLEY)
model <- lm(mean.acc ~ SHIPLEY,
            data = clearly.both.subjects)
# coefficient estimates, standard errors, t values, p-values and model fit
summary(model)

2
Q

How do we estimate the association between multiple variables (one outcome and multiple predictors)?

A

# add multiple predictors to the model formula with +
model <- lm(mean.acc ~ SHIPLEY + HLVA + FACTOR3 + AGE + NATIVE.LANGUAGE,
            data = clearly.both.subjects)
summary(model)

(i.e. you simply add the extra predictor variables to the formula with +)

3
Q

Do we assume normal distributions for the residuals and the predictor variables?

A

We assume that the residuals are normally distributed; the predictor variables themselves do not need to be.

4
Q

What is a residual?

A

It is the difference between observed and predicted outcomes
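
A minimal sketch of this in R, assuming a model fitted as in card 1:

# residuals by hand: observed outcomes minus model-predicted outcomes
resid.by.hand <- clearly.both.subjects$mean.acc - fitted(model)
# the same values R stores for us (assuming no rows were dropped for missing data)
head(resid.by.hand)
head(residuals(model))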

5
Q

How is model accuracy reflected in the residuals?

A

Better models show smaller differences between observed and predicted outcome values, i.e. smaller residuals

6
Q

Are residuals normally distributed?

A

Yes, under the model's assumptions: positive and negative prediction errors should balance out around zero
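
A quick sketch of how you might eyeball this assumption in R (model as above):

hist(residuals(model))    # should look roughly symmetric around zero
qqnorm(residuals(model))  # quantile-quantile plot of the residuals
qqline(residuals(model))  # points close to this line suggest normality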

7
Q

What is the significance of the garden of forking paths in data analysis?

A

Describes how scientists can make false discoveries when they do not pre-specify a data analysis plan and instead choose “one analysis for the particular data they saw”.

8
Q

Understanding the linear model with multiple predictors

A

y = β₀ + β₁x₁ + β₂x₂ + ⋯ + ϵ

> The intercept β₀, plus
the coefficient for a predictor (e.g. the effect of AGE) β₁ multiplied by a person's value on that predictor x₁ (e.g. their age), plus
any number of further coefficient × predictor terms, plus
the error ϵ: the mismatches between observed and predicted outcomes
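
A hedged sketch linking the equation to R output; the variable names follow the earlier cards, and the age of 30 is just an illustration (assuming AGE is in years):

# fit a model with AGE as the single predictor
model <- lm(mean.acc ~ AGE, data = clearly.both.subjects)
b <- coef(model)  # b[1] is the intercept beta_0, b[2] the slope beta_1
# a prediction 'by hand': intercept plus slope times a person's age
b[1] + b[2] * 30  # predicted mean.acc for a 30-year-old
# mismatches between such predictions and observed outcomes are the errors (epsilon)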

9
Q

Identifying the key information in linear models with multiple predictors

A

> Coefficient estimate = Estimate
Standard error = Std. Error
t value = t value
p-value = Pr(>|t|)
R-squared = Multiple R-squared / Adjusted R-squared (the adjusted version tends to be more accurate)
F-statistic
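
All of these can be pulled out of the summary() object; a sketch assuming the multi-predictor model from card 2:

model.sum <- summary(model)
model.sum$coefficients   # Estimate, Std. Error, t value, Pr(>|t|) per predictor
model.sum$r.squared      # R-squared
model.sum$adj.r.squared  # adjusted R-squared
model.sum$fstatistic     # F value with its degrees of freedom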

10
Q

What is R-squared?

A

Indicates how much outcome variation we can predict, given our model

11
Q

What is the F-statistic?

A

It tests whether the model as a whole predicts the outcome better than chance, i.e. whether the predictors are significantly related to the outcome
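
A sketch of the overall model test in R (model as above):

f <- summary(model)$fstatistic            # F value, numerator df, denominator df
pf(f[1], f[2], f[3], lower.tail = FALSE)  # p-value for the whole model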

12
Q

What are the uncertainties when looking at sample data?

A

> The nature of the expected change in outcome
The ways that expected changes might vary between individual participants or between groups of participants
The random ways that specific responses can be produced

13
Q

What do uncertainties in data require us to do?

A

Think carefully about the conclusions we draw from sample data

14
Q

What are the 3 assumptions about data analysis?

A

Validity, measurement, generalisability

15
Q

Important: is everything some sort of linear model?

A

Yes! Most common statistical tests are special cases of linear models, or are close approximations
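
For example, an independent-samples t-test is just the linear model y ~ group; a small sketch with simulated data (names are illustrative):

set.seed(1)
group <- rep(c("a", "b"), each = 20)
y <- rnorm(40) + (group == "b") * 0.5
t.test(y ~ group, var.equal = TRUE)  # classic two-sample t-test
summary(lm(y ~ group))               # matching t and p on the group coefficient (signs aside)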

16
Q

ANOVA as a linear model

A

y = β₀ + β₁X + β₂Z + β₃XZ + ϵ

> If you have a 2 x 2 factorial design, with two factors factor.1 and factor.2, and a dataset with variables X, Z coding for group membership
> Then the mean outcome for the baseline conditions = β₀
> The estimates of the slopes β₁, β₂ tell us about the average differences between groups
> The estimate of the slope β₃ tells us about the interaction
> And we can code the model like this: lm(y ~ factor.1 * factor.2)
> Or like this: Anova(aov(y ~ factor.1 * factor.2, data), type = "II"), where Anova() comes from the car package
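
A runnable sketch of the 2 x 2 case with simulated data (names are illustrative):

library(car)
set.seed(1)
dat <- expand.grid(factor.1 = c("a", "b"), factor.2 = c("c", "d"), rep = 1:10)
dat$y <- rnorm(nrow(dat))
summary(lm(y ~ factor.1 * factor.2, data = dat))              # beta_0 ... beta_3
Anova(aov(y ~ factor.1 * factor.2, data = dat), type = "II")  # the ANOVA table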

17
Q

Extensions to the linear model: outcomes, predictors, and errors

A

> The outcome can be generalised to analyse data that are not metric or do not come from normal distributions
The predictors can be curvilinear, categorical, or involve interactions
The errors can be independent or non-independent

18
Q

Extensions to the linear model: binary or dichotomous outcomes

A

> Binary outcomes are very common in Psychology: yes or no; correct or incorrect; left or right visual field, etc.
The change in code is small, e.g. glm(ratings ~ predictors, family = "binomial")
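
A minimal logistic-regression sketch with simulated binary data (names are illustrative):

set.seed(1)
skill <- rnorm(100)
correct <- rbinom(100, size = 1, prob = plogis(0.8 * skill))  # 1 = correct, 0 = incorrect
model <- glm(correct ~ skill, family = "binomial")
summary(model)  # coefficients are on the log-odds scale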

19
Q

Extensions to the linear model: non-independence of observations

A

> Much – maybe most – psychological data are collected in ways that guarantee the non-independence of observations
e.g. We test children in classes, patients in clinics, individuals in regions
e.g. We test participants in multiple trials in an experiment, recording responses to multiple stimuli
These data should be analysed using linear mixed-effects models (Meteyard & Davies, 2020) which we study at MSc
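
A sketch of the kind of model meant here, using the lme4 package; the grouping variable class is hypothetical, standing in for whatever clusters your observations:

library(lme4)
# random intercept per class: observations within a class may be correlated
model <- lmer(mean.acc ~ SHIPLEY + (1 | class), data = clearly.both.subjects)
summary(model)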

20
Q

SUMMARY: KEY POINTS

A
  • Linear models are a very general, flexible, and powerful analysis method
  • We can use them on the assumption that prediction errors (residuals) are normally distributed
  • They can include multiple predictor variables
  • When we plan an analysis, we should use contextual information (theory and measurement understanding) to specify our model
  • When we critically evaluate our own or others' findings, we should consider validity, measurement, and generalisability
  • When we report an analysis, we should:
    ◦ Explain what we did, specifying the method (linear model), the outcome variable (accuracy) and the predictor variables (health literacy, reading strategy, reading skill and vocabulary)
    ◦ Report the overall model fit statistics (F, R^2)
    ◦ Report the significant effects (β, t, p) and describe the nature of the effects (does the outcome increase or decrease?)