Data Analysis - Interpretation Flashcards

1
Q

What is the take home message regarding any form of models?

A

ALL models are wrong, but some are useful for specific purposes

2
Q

What 4 things does the statistical doctor look for?

A
  1. observe
  2. guess
  3. test
  4. assess
3
Q

What can be observed from this scatterplot?

A
  1. hypertension and CHD values appear to increase together
  2. they have a correlation of 0.70

(the red line is the fitted linear regression line - correlation has no line of its own)

4
Q

How do these diagrams demonstrate the difference between correlation and regression?

A

correlation indicates whether two variables do or do not change together, and how strongly

regression quantifies how the variables change together
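
The distinction can be sketched in a few lines of Python. The numbers below are invented for illustration and are not the data behind the scatterplot. Rescaling one variable leaves the correlation unchanged, while the regression slope changes with the units:

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def pearson_r(x, y):
    """Pearson correlation: strength of the linear association, unit-free."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ols_slope(x, y):
    """Least-squares slope: change in y per unit change in x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical values, invented for this sketch.
x = [10.0, 12.0, 14.0, 16.0, 18.0]
y = [2.1, 2.9, 3.2, 4.1, 4.6]
y_scaled = [10 * v for v in y]  # same data expressed in different units

r = pearson_r(x, y)
r_scaled = pearson_r(x, y_scaled)
slope = ols_slope(x, y)
slope_scaled = ols_slope(x, y_scaled)

print(f"correlation: {r:.3f} (rescaled: {r_scaled:.3f})")
print(f"slope: {slope:.3f} (rescaled: {slope_scaled:.3f})")
```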

5
Q

What is significant about the slope of a linear regression model?

How is it calculated?

A

slope = change in height / change in horizontal distance (rise over run)

the slope of a linear regression model quantifies how much the outcome changes for each one-unit change in the predictor
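
As a rough sketch of the rise-over-run calculation, the function below takes any two points on a straight line and returns its slope. The two points are hypothetical: they are chosen to lie on a line with slope 0.32 and an assumed intercept of 0.5, which does not come from the cards.

```python
def slope_between(p1, p2):
    """Rise over run between two points (x1, y1) and (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        raise ValueError("a vertical line has no defined slope")
    return (y2 - y1) / (x2 - x1)

# Two hypothetical points on the line y = 0.5 + 0.32 * x
point_a = (10.0, 3.7)
point_b = (20.0, 6.9)

line_slope = slope_between(point_a, point_b)
print(f"slope = {line_slope:.2f}")
```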

6
Q

What does the number highlighted in red demonstrate?

A

the coefficient

the slope of the line that is used as a model to represent the relationship between CHD and hypertension

the interpretation is that for every step we take in hypertension, our CHD changes by 0.32

7
Q

What is a better interpretation of the “slope”?

A

for every 1 percentage point difference in the prevalence of hypertension, we see, on average, a 0.32 percentage point difference in the prevalence of CHD
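
A small sketch of how this reading is applied, assuming both prevalences are measured in percent. The function name and the 5-point example are illustrative only; 0.32 is the coefficient from the card.

```python
SLOPE = 0.32  # coefficient from the card

def expected_chd_difference(hypertension_difference, slope=SLOPE):
    """Average difference in CHD prevalence (percentage points) that the
    model associates with a given difference in hypertension prevalence."""
    return slope * hypertension_difference

# Two hypothetical areas whose hypertension prevalence differs by 5 points
difference = expected_chd_difference(5.0)
print(f"expected CHD difference: {difference:.2f} percentage points")
```

This describes a difference between groups in the data, not what would happen to one area if its hypertension prevalence were changed.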

8
Q

Why is it important to be careful with presentation when showing a relationship between 2 variables?

A

two graphs may have the same slope but look different because their axes use different scales

9
Q

What is R squared?

What is the interpretation and meaning?

A

it is a goodness-of-fit statistic with values between 0 and 1

interpretation:

the closer the value is to 1, the better the model fits the data

meaning:

the proportion of the outcome's variability that the model explains
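
One common way to compute R squared is 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below does this for a simple linear regression fitted by least squares. The data values are invented for illustration.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def r_squared(x, y):
    """Share of the variability in y that the fitted line explains."""
    slope, intercept = fit_line(x, y)
    my = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Hypothetical values, invented for this sketch.
x = [10.0, 12.0, 14.0, 16.0, 18.0]
y = [2.1, 2.9, 3.2, 4.1, 4.6]

r2 = r_squared(x, y)
print(f"R squared = {r2:.3f}")
```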

10
Q

In which ways is the data limited, when the conclusion about the relationship between the prevalence of CHD and hypertension is made?

A

the data used are only from one year, only from England and only from patients

conclusions cannot be extended beyond the year, country or patient group that the data are taken from

11
Q

Why is the population not usually studied?

A

the group that we are most interested in is the population

the whole population is usually impossible or impractical to study, so samples are studied and the results are generalised

12
Q

What 4 questions must be asked when thinking about generalising?

A
  1. how reliable is my model?
  2. would I get the same coefficient if I built my model using different data?
  3. would I get the same goodness-of-fit if I used the same model on different data?
  4. how likely am I to make the correct conclusion?
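
One way to explore the second question is resampling. The sketch below uses a bootstrap, which is a technique swapped in here for illustration and not one named in the cards. It refits the slope on many resamples of the same data to see how much the coefficient moves. The data, the seed and the number of resamples are all assumptions.

```python
import random

def ols_slope(x, y):
    """Least-squares slope, or None if every x value is identical."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def bootstrap_slopes(x, y, n_resamples=1000, seed=0):
    """Refit the slope on resamples drawn with replacement; returns sorted slopes."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        s = ols_slope([x[i] for i in idx], [y[i] for i in idx])
        if s is not None:  # skip degenerate resamples with a single x value
            slopes.append(s)
    return sorted(slopes)

# Hypothetical values, invented for this sketch.
x = [8.0, 10.0, 11.0, 13.0, 14.0, 16.0, 17.0, 19.0]
y = [1.9, 2.4, 2.6, 3.5, 3.6, 4.4, 4.5, 5.3]

slopes = bootstrap_slopes(x, y)
# Approximate middle 95% of the resampled slopes
low = slopes[int(0.025 * len(slopes))]
high = slopes[int(0.975 * len(slopes)) - 1]
print(f"slope on all data: {ols_slope(x, y):.3f}")
print(f"approximate 95% range across resamples: {low:.3f} to {high:.3f}")
```

A narrow range suggests the coefficient would be similar on comparable data. It says nothing about data from a different year, country or patient group.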