Week 6: Linear Regression and Correlation Flashcards

1
Q

Which commands can be used to explore variables?

A

<codebook>
<histogram>
<summarize varnames, detail>

2
Q

What are the independent and dependent variables in the association between age and executive function?

A

Age (independent)
Executive function (dependent)

3
Q

What command is used for linear regression?

A

<regress depvar indepvar>

4
Q

What does the intercept (β0) represent in regression analysis?

A

The predicted value of the dependent variable when the independent variable is 0

5
Q

What is the equation for simple linear regression?

A

y = β0 + β1x

6
Q

How do you centre a variable around the mean?

A

<sum varname, meanonly>
<gen varname_cent = varname - r(mean)>
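
A minimal sketch of the two commands above. It uses Stata's bundled auto dataset and its mpg variable, which are stand-ins chosen for illustration and not the course data:

```stata
* Toy example on Stata's bundled auto dataset (an assumption; not the course data)
sysuse auto, clear
* Store the mean of mpg in r(mean) without printing a table
summarize mpg, meanonly
* Centre mpg so that 0 corresponds to the sample mean
generate mpg_cent = mpg - r(mean)
* The centred variable should now have a mean of approximately 0
summarize mpg_cent
```

The gen command has to run straight after sum, because r(mean) is overwritten by the next command that stores results.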

7
Q

How is the Pearson correlation coefficient calculated?

A

<correlate> or <pwcorr>

8
Q

What is the interpretation of a Pearson correlation coefficient of -0.43?

A

A moderate negative association

9
Q

What is the formula to calculate the CI for a slope (β1)?

A

95% CI: β1 ± 1.96 × SE(β1)

10
Q

Why should the regression equation not be used outside the range of observed data?

A

Because the relationship outside the range may not be linear

11
Q

What does the R² value represent in linear regression?

A

The proportion of variance in the dependent variable explained by the independent variable
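
A short worked illustration: in a simple linear regression with a single predictor, R² equals the square of the Pearson correlation coefficient. Taking r = -0.43 from the earlier card purely as an example value:

```latex
\[
R^2 = r^2 = (-0.43)^2 \approx 0.18
\]
```

Under that assumption, roughly 18% of the variance in the dependent variable would be explained by the independent variable.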

12
Q

How do you compute a scatterplot?

A

<twoway(scatter varnames)>

13
Q

How do you compute a line of best fit?

A

<twoway(lfit varnames)>

14
Q

How do you combine a scatterplot and line of best fit into one graph?

A

<twoway(scatter varnames) (lfit varnames)>
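
A minimal sketch of the overlaid graph. It uses Stata's bundled auto dataset, with price and mpg as stand-in variables (an assumption for illustration, not the course data). In each plot the y variable is listed first and the x variable second:

```stata
* Toy example on Stata's bundled auto dataset (an assumption; not the course data)
sysuse auto, clear
* Scatterplot of price (y) against mpg (x) with the fitted regression line on top
twoway (scatter price mpg) (lfit price mpg)
```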

15
Q

Explain what 𝛽0 means in the context of the association between age and executive function
𝛽0 = 29.33
𝛽1 = -0.185

A

𝛽0 (_cons in the Stata output) is the predicted mean executive function when age = 0, i.e., the point where the regression line crosses the y axis. Here, the predicted executive function at age 0 is 29.33.

16
Q

Is it meaningful to talk about individuals aged 0 in the context of the association between age and executive function?

A

No. To rectify this, we centre the exposure around the mean:
1. Find the mean of age and create a new variable (e.g., age_cent) by subtracting the mean from each value of age. Use the following commands:
<sum age, meanonly>
<gen age_cent = age - r(mean)>
2. Draw a scatterplot and line of best fit using age_cent
3. The scale on the x axis will now be different, because the data are centred around the mean
4. 0 on the x axis now represents the mean age

17
Q

How do you do simple calculations in Stata?

A

<display> or <di> for short

18
Q

In the context of the association between age and executive function, after centring the exposure around the mean, how would you interpret the new _cons (𝛽0)?
Mean age = 65
_cons = 17.26

A

At the mean age (65 years, i.e., centred age = 0), the predicted executive function is 17.26
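
As a rough check, the centred intercept can be recovered from the uncentred estimates given in the earlier card (𝛽0 = 29.33, 𝛽1 = -0.185) by evaluating the regression line at the mean age. The superscript label below is notation introduced here for illustration:

```latex
\[
\beta_0^{\text{cent}} = \beta_0 + \beta_1 \bar{x} = 29.33 + (-0.185)(65) \approx 17.3
\]
```

This is close to the reported _cons of 17.26. The small gap is presumably due to rounding of the slope and the mean age.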

19
Q

How would you interpret a 95% CI of -0.2018 to -0.1691 and a p-value of < 0.001?
Context: Association between age and executive function
𝛽1: -0.18

A

We are 95% confident that the true decrease in executive function per additional year of age is between 0.1691 and 0.2018 points. The interval is narrow, suggesting the estimate of the slope is precise.
The p-value for 𝛽1 is < 0.001, so there is strong evidence against H0 (no association, 𝛽1 = 0). This suggests executive function declines by about 0.18 points per one-year increase in age.
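
The interval can be tied back to the formula 𝛽1 ± 1.96 × SE. The standard error is not given on this card, so the value below is back-calculated from the interval width and is only approximate (Stata's regress uses a t-distribution multiplier, which is close to 1.96 in large samples):

```latex
\[
\text{SE}(\beta_1) \approx \frac{0.2018 - 0.1691}{2 \times 1.96} = \frac{0.0327}{3.92} \approx 0.0083,
\qquad
\frac{0.2018 + 0.1691}{2} \approx 0.185
\]
```

The midpoint of the interval matches the magnitude of the slope estimate, as expected for a symmetric confidence interval.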

20
Q

Outline the steps for fitting a simple linear regression of executive function on sex:

A

1. Check how sex is coded (male = 1, female = 2)
2. Sex needs to be recoded so male = 0 and female = 1:
<gen sex01 = sex>
<recode sex01 (1=0) (2=1)>
<label define sex 0 "male" 1 "female">
<label values sex01 sex>
3. Visualise the data using this command:
<twoway (scatter var_names) (lfit var_names), xlabel(0(1)1)>
4. Fit a regression model using:
<regress depvar sex01>

21
Q

How do you compute a Pearson correlation?

A

<correlate>
or
<pwcorr>

22
Q

How do you compute a Spearman rank correlation?

A

<spearman>
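
A minimal sketch comparing the correlation commands side by side. It uses Stata's bundled auto dataset, with price and mpg as stand-in variables (an assumption for illustration, not the course data):

```stata
* Toy example on Stata's bundled auto dataset (an assumption; not the course data)
sysuse auto, clear
* Pearson correlation matrix
correlate price mpg
* Pairwise Pearson correlation, with the p-value shown under the coefficient
pwcorr price mpg, sig
* Spearman rank correlation
spearman price mpg
```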

23
Q

How do you formally interpret the results of a linear regression and correlation?

A

The results of a linear regression showed that (predictor) was a significant predictor of (outcome) (𝛽1 = value, 95% CI [a, b], p < 0.05). For each one-unit increase in (predictor), (outcome) increased/decreased by an estimated (𝛽1) points. The intercept (𝛽0) was (value) (95% CI [a, b], p < 0.05), representing the predicted (outcome) when (predictor) is equal to its mean (after centring). A Pearson correlation analysis indicated that the strength of the association was (strong/moderate/weak) (r = value).