correlation and regression Flashcards
what may be related to each other? - give an example
- two datasets may be related
e.g., height, weight
when can you see the relationship of the datasets?
- when you look at them on a graph
what were the first statistics invented for?
- for analysing co-relationships
when is there probably a mistake in data?
- if your data shows a perfect straight line
- if more than one data point lies a long way from all the others
when might data be worth checking for mistakes?
- if there’s no relationship at all between things you really expect to be related
what is the definition of regression?
- finding the line of best fit by minimising the differences between the data points and the line
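A minimal Python sketch of the "minimise the difference between the data and the line" idea, assuming the usual least-squares version (minimising the sum of squared vertical differences). The function name `fit_line` and the height/weight numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much x and y change together (sum of cross-deviations).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much x changes on its own (sum of squared deviations).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: heights in cm, weights in kg.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 67, 74, 84]
slope, intercept = fit_line(heights, weights)
print(f"weight = {slope:.2f} * height + {intercept:.2f}")
```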
what does a correlation report about a relationship?
- strength and direction of a relationship
what is a residual?
- difference between an observed value and a predicted value in regression analysis
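As a rough illustration of the residual definition above, the sketch below subtracts each predicted value from the observed one. The line y = 2x + 1 and the data are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def residuals(xs, ys, slope, intercept):
    """Return observed minus predicted for each point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical observations and a hypothetical fitted line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 6, 10]
res = residuals(xs, ys, slope=2, intercept=1)
print(res)
```

A positive residual means the point sits above the line and a negative one means it sits below.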
what is a zero correlation?
- no linear relationship between the variables
- data points form a cluster with no upward or downward trend
what is a positive correlation?
- relationship between two variables that tend to move in the same direction
what is a negative correlation?
- two individual variables generally move in opposite directions
what would you do to get the line of best fit?
- could try adjusting the line manually, but it wouldn’t be the best fit
- need to use maths instead
what equation gives the correlation coefficient r, which measures how closely the data follow a line of best fit?
r = Sxy / (Sx · Sy)
what does Sxy stand for?
- how much x and y change together
what is Sx · Sy?
- how much x and y change separately
what is the equation to work out r?
$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$
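The equation for r can be sketched in Python as follows. The function name `pearson_r` and the sample data are assumptions for illustration; the variable names mirror Sxy, Sx and Sy from the cards above.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient r = Sxy / (Sx * Sy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sxy: how much x and y change together.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sx and Sy: how much x and y change separately.
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return sxy / (sx * sy)


# Made-up data with a fairly strong positive relationship.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 3))
```

If every x (or every y) is identical, Sx (or Sy) is zero and the division fails, so this sketch assumes both variables actually vary.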
what two aspects of a relationship does an r value tell you?
- direction
- strength
what value is r when the correlation is positive?
- if r is above 0
0 < r ≤ 1
what value is r when the correlation is negative?
- r is below zero
-1 ≤ r < 0
what is the value of r when the correlation is strong?
- if r is close to +1 or -1
r ≈ ±1
what is the r value when the correlation is weak?
- r is close to zero
r ≈ 0
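The direction and strength rules above could be turned into a small helper like the one below. The cut-offs of 0.7 for "strong" and 0.3 for "weak" are arbitrary assumptions, not values from these cards, and the function name `describe_r` is hypothetical.

```python
def describe_r(r, strong=0.7, weak=0.3):
    """Return (direction, strength) for a correlation coefficient r.

    The `strong` and `weak` thresholds are illustrative assumptions.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")

    # Direction comes from the sign of r.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "zero"

    # Strength comes from how far r is from zero.
    size = abs(r)
    if size >= strong:
        strength = "strong"
    elif size <= weak:
        strength = "weak"
    else:
        strength = "moderate"
    return direction, strength


print(describe_r(0.9))
print(describe_r(-0.2))
```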
when are r values especially useful?
- useful for values in the middle, e.g., -0.4 to 0.4