2.2 Correlation - Topic 2 Data Presentation and Interpretation Flashcards
1
Q
Bivariate data:
A
data which has pairs of values for 2 variables
2
Q
How can you present bivariate data?
A
on a scatter diagram, e.g. one showing how breathing rate affects pulse rate
3
Q
Independent/explanatory variable:
A
- variable that is changed in the experiment
- on x-axis
4
Q
Dependent/response variable:
A
- variable that is measured
- on y-axis
5
Q
Use of interpolation:
A
- if a value of the independent variable is known, the regression line can be used to predict or estimate the corresponding value of the dependent variable
- interpolation = making a prediction within the range of the given data
- predictions should only be made within this range
6
Q
Dangers of extrapolation:
A
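- extrapolation = making a prediction outside the range of the given data
- gives a much less reliable estimate than interpolation, as the relationship may not hold outside that range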
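A minimal Python sketch of the difference between interpolation and extrapolation. The function name `predict`, the coefficients a = 27 and b = 2.75, and the breathing-rate values are all made up for illustration; none of them come from the flashcards.

```python
def predict(a, b, x, x_data):
    """Estimate y = a + b*x and report whether x lies inside the observed range."""
    estimate = a + b * x
    if min(x_data) <= x <= max(x_data):
        kind = "interpolation"   # inside the data range: reasonably reliable
    else:
        kind = "extrapolation"   # outside the data range: treat with caution
    return estimate, kind

# Hypothetical observed values of the independent variable
breathing_rate = [12, 14, 16, 18, 20]

inside = predict(27, 2.75, 15, breathing_rate)    # x = 15 is within 12..20
outside = predict(27, 2.75, 30, breathing_rate)   # x = 30 is beyond the data
```

The arithmetic is identical in both calls; only the label changes, which is the point: the line will always return a number, but outside the data range there is no evidence that the linear model still holds.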
7
Q
What is linear regression?
A
- line of best fit: a linear model that approximates the relationship between the 2 variables
- least squares regression line: the line that minimises the sum of the squares of the vertical distances of the data points from the line
- regression line of y on x is written as y = a + bx, where a is the y-intercept and b is the gradient
- if b is +ve, the data will be +vely correlated
- if b is -ve, the data will be -vely correlated
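As a rough illustration of the least squares idea, the sketch below computes a and b for the regression line of y on x using the standard formulas b = Sxy / Sxx and a = mean(y) - b * mean(x). The function name `regression_y_on_x` and the data values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def regression_y_on_x(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sxy and Sxx: sums of products and squares of deviations from the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx            # gradient
    a = mean_y - b * mean_x    # y-intercept
    return a, b

# Made-up data: breathing rate (x) and pulse rate (y)
breathing_rate = [12, 14, 16, 18, 20]
pulse_rate = [60, 66, 70, 77, 82]

a, b = regression_y_on_x(breathing_rate, pulse_rate)
print(f"y = {a:.2f} + {b:.2f}x")
```

For this data the gradient b comes out positive, which matches the positive correlation visible in the values (pulse rate rises as breathing rate rises).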
8
Q
Types of correlation:
A
correlation is described as positive, negative or zero, and as strong or weak
9
Q
What can correlation tell us?
A
- correlation does not always imply causation
- 2 variables can have a causal relationship (where a change in 1 variable causes a change in the other)
- need to look at the context of the question to determine if they have a causal relationship
10
Q
When is a regression line more valid?
A
the stronger the correlation, the more accurately the regression line will model the data
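One common way to put a number on the strength of linear correlation is the product moment correlation coefficient r, which is not named on this card and is brought in here only as an assumption for illustration. The sketch below compares r for two made-up data sets; the function name `pmcc` and all values are hypothetical.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Made-up data with points lying close to a straight line
r_strong = pmcc([12, 14, 16, 18, 20], [60, 66, 70, 77, 82])

# Made-up data with points scattered widely
r_weak = pmcc([1, 2, 3, 4, 5], [3, 1, 4, 2, 3])

print(f"strong: r = {r_strong:.3f}, weak: r = {r_weak:.3f}")
```

A value of r close to +1 or -1 means the points lie near a straight line, so a regression line describes them well; a value near 0 means a regression line would be a poor model.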
11
Q
When are regression lines different?
A
- order of variables is important - regression line of y on x will be different from regression line of x on y
- normally only make predictions for dependent variable
- if making predictions for dependent variable need to use regression line of y on x
- if making predictions for independent variable need to use regression line of x on y
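To see that the two regression lines really are different, the sketch below fits both to the same made-up data by swapping the roles of the variables. The helper name `least_squares` and the data are hypothetical; the formulas are the standard least squares ones.

```python
def least_squares(us, vs):
    """Return (intercept, gradient) for the regression line of v on u."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    s_uu = sum((u - mean_u) ** 2 for u in us)
    gradient = s_uv / s_uu
    return mean_v - gradient * mean_u, gradient

# Made-up data: x is the independent variable, y the dependent variable
xs = [12, 14, 16, 18, 20]
ys = [60, 66, 70, 77, 82]

a, b = least_squares(xs, ys)   # y on x:  y = a + b*x
c, d = least_squares(ys, xs)   # x on y:  x = c + d*y

# Rearranging y = a + b*x gives x = -a/b + (1/b)*y,
# whose gradient 1/b is not the same as d.
print(f"y on x rearranged: x = {-a / b:.3f} + {1 / b:.4f}y")
print(f"x on y:            x = {c:.3f} + {d:.4f}y")
```

Simply rearranging the y on x line does not give the x on y line, because each line minimises squared distances in a different direction. The two only coincide when the points lie exactly on a straight line.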