Topic 4 (statistics) Flashcards
correlation
measure of the strength and direction of the linear relationship between 2 variables (how closely the points follow a straight line of best fit)
-only gives evidence of linear relationship
types of correlation
positive (the stronger the correlation, the stronger the relationship between the variables) (majority of data in 1st and 3rd quadrants, with axes drawn through the mean point)
negative (majority of data in 2nd and 4th quadrants)
zero (points spread across all 4 quadrants) (there could still be a non-linear relationship)
Bivariate data
data for which each item requires the values of 2 variables
independent variable
explanatory variable
one that is set independently of the other variable
plotted along x-axis
dependent variable
response variable
values are determined by values of independent variable
plotted along y-axis
spurious
correlation without a causal connection
causal
a change in one variable has a direct impact on the other
Product moment correlation coefficient (PMCC)
-measure of strength (and direction) of linear correlation, written r, with -1 ≤ r ≤ 1
PMCC
r = 1: perfect positive linear correlation (exact straight line with positive gradient)
r = -1: perfect negative linear correlation (exact straight line with negative gradient)
r = 0: no linear correlation (there may still be a relationship of some other sort)
remains unchanged through coding (if the data are coded using a linear transformation to make the values more manageable, the PMCC stays the same, provided the scale factors are positive)
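A minimal Python sketch of the PMCC and its behaviour under coding. The function name `pmcc`, the data values, and the particular coding u = (x - 1000) / 10, v = 10y are all made up for illustration; the formula used is the standard r = Sxy / sqrt(Sxx × Syy).

```python
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illustrative (invented) bivariate data with awkwardly large x values
xs = [1010, 1020, 1030, 1040, 1050]
ys = [2.1, 2.9, 4.2, 4.8, 6.1]

# Coding with linear transformations (both scale factors positive)
us = [(x - 1000) / 10 for x in xs]
vs = [10 * y for y in ys]

r_raw = pmcc(xs, ys)
r_coded = pmcc(us, vs)
print(r_raw, r_coded)  # the two values agree up to rounding error
```

Points lying exactly on a line with positive gradient, such as `pmcc([1, 2, 3], [2, 4, 6])`, should give r = 1 (up to rounding), and a line with negative gradient should give r = -1.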
correlation & regression:
- methods for investigating relationship between variables in statistics
correlation: measure of the degree to which 2 variables are related
regression: describing the relationship between 2 variables (equation of the line of best fit for the data)
interpolated value
sits within the plotted range of data
extrapolated value
sits outside range of plotted values
residual
difference between the true (observed) value and the value on the regression line: residual = observed y - predicted y
- data value above regression line = residual > 0
- data value below regression line = residual < 0
perfect scenario (what the least squares regression line achieves):
sum of residuals = 0
sum of squared residuals = as small as possible (as close to 0 as possible)
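The residual properties can be checked numerically. This is a sketch under assumptions: the data are invented, the helper name `least_squares_line` is hypothetical, and the line is fitted with the usual least squares formulas b = Sxy / Sxx and a = mean(y) - b × mean(x).

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def sum_squared_residuals(xs, ys, a, b):
    """Sum of (observed y - predicted y)^2 for the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Illustrative (invented) data
xs = [1, 2, 3, 4, 5]
ys = [2.1, 2.9, 4.2, 4.8, 6.1]

a, b = least_squares_line(xs, ys)

# residual = observed y - predicted y (positive if the point is above the line)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
sum_res = sum(residuals)
rss = sum_squared_residuals(xs, ys, a, b)

# Any other line should give a larger sum of squared residuals
rss_shifted = sum_squared_residuals(xs, ys, a + 0.1, b)
rss_tilted = sum_squared_residuals(xs, ys, a, b + 0.05)
print(sum_res, rss, rss_shifted, rss_tilted)
```

The positive and negative residuals cancel, so `sum_res` is zero up to rounding, which is why the squared residuals are used to judge the fit.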
regression line
of y on x
y= a + bx
a = y-intercept, b = gradient (b > 0 when PMCC > 0) (b < 0 when PMCC < 0)
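The link between the sign of b and the sign of the PMCC can be seen from the formulas b = Sxy / Sxx and r = Sxy / sqrt(Sxx × Syy): both take their sign from Sxy. A short sketch, assuming invented data sets and a hypothetical helper name `regression_and_pmcc`:

```python
from math import sqrt

def regression_and_pmcc(xs, ys):
    """Return (a, b, r) for the regression line of y on x, y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx                  # gradient
    a = mean_y - b * mean_x        # y-intercept
    r = sxy / sqrt(sxx * syy)      # PMCC
    return a, b, r

# Illustrative (invented) data: one rising trend, one falling trend
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]
falling = [9, 7, 6, 4, 1]

a_up, b_up, r_up = regression_and_pmcc(xs, rising)
a_down, b_down, r_down = regression_and_pmcc(xs, falling)
print(b_up, r_up)      # both positive
print(b_down, r_down)  # both negative
```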
interpolation
estimating the dependent variable for a value of the independent variable within the range of the data
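A small sketch of using a regression line to estimate y while flagging whether the estimate is an interpolation or an extrapolation. The function name `estimate_y`, the coefficients a = 1.05 and b = 0.99, and the data range 1 to 5 are hypothetical values chosen only for illustration.

```python
def estimate_y(x_new, a, b, x_min, x_max):
    """Estimate y from y = a + b*x and label the kind of estimate.

    x_min and x_max are the smallest and largest x values in the data
    that the regression line was fitted to.
    """
    if x_min <= x_new <= x_max:
        kind = "interpolation"   # within the range of the data
    else:
        kind = "extrapolation"   # outside the range of the data
    return a + b * x_new, kind

# Hypothetical regression line y = 1.05 + 0.99x fitted to data with 1 <= x <= 5
a, b = 1.05, 0.99
x_min, x_max = 1, 5

inside = estimate_y(3.5, a, b, x_min, x_max)
outside = estimate_y(12, a, b, x_min, x_max)
print(inside)
print(outside)
```

The arithmetic is the same in both cases; the label matters because an extrapolated estimate relies on the linear pattern continuing beyond the observed data.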