Module 4 Flashcards
What range of values can Pearson’s correlation coefficient (r) take?
-1 to +1
values of r close to -1 or +1 indicate a strong linear association
What does an r value close to 0 indicate?
- little linear association between variables
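A minimal sketch of how r could be computed by hand, to make the -1, 0 and +1 cases concrete. The function name `pearson_r` and the data are made up for illustration and do not come from the module.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient r for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (an assumption, not from the course)
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # perfect increasing straight line
y_down = [10, 8, 6, 4, 2]  # perfect decreasing straight line
y_flat = [2, 1, 3, 1, 2]   # no linear trend

print(pearson_r(x, y_up))    # close to +1
print(pearson_r(x, y_down))  # close to -1
print(pearson_r(x, y_flat))  # close to 0
```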
What are the hypotheses for Pearson’s correlation?
H0: ρ = 0 (no linear association)
H1: ρ ≠ 0 (linear association)
What p-value shows a significant linear correlation?
p < 0.05 (at the usual 5% significance level)
What are the assumptions of Pearson’s correlation coefficient?
- linear association
What test is used if there is no linear association between two variables?
- Spearman’s (rank) correlation coefficient
- requires the association to be monotonic
Define monotonic.
- always increasing or always decreasing (but doesn’t have to be at the same rate)
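A rough sketch of the idea behind Spearman’s coefficient: replace each value by its rank, then compute Pearson’s r on the ranks. The helper names (`ranks`, `spearman_rho`, `pearson_r`) and the data are hypothetical, chosen only to show a relationship that is monotonic but not a straight line.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient r for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based positions i..j
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Made-up data: y always increases with x, but along a curve
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

print(spearman_rho(x, y))  # close to 1: the association is perfectly monotonic
print(pearson_r(x, y))     # below 1: the association is not perfectly linear
```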
Does a correlation between 2 variables mean there is a cause and effect relationship?
- no
- there may be an unobserved variable that causes this
What does correlation measure?
- magnitude of the association between 2 variables
What does regression measure?
- magnitude of dependence of one variable upon another
What is the idea of linear regression?
- find the relationship between the independent (X) and dependent (Y) variables
- want to determine the straight line that best ‘fits’ the data
Can you have more than one independent variable for regression?
- yes
What is the linear regression model formula?
Yi = β0 + β1 Xi + εi
What are the three main steps in regression analysis?
- estimate equation (find coefficients)
- assess model (significance and assumptions)
- use good model to make predictions
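The first and third steps can be sketched as follows, assuming that the "best fit" line is the ordinary least-squares line (the cards do not name the method). The function names `fit_line` and `predict` and the data are made up for illustration. The second step, assessing the model, is not covered here.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: value of y when x = 0
    return b0, b1

def predict(b0, b1, x_new):
    """Predicted y at x_new from the fitted line."""
    return b0 + b1 * x_new

# Made-up illustrative data (an assumption, not from the course)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)   # step 1: estimate the equation
print(b0, b1)             # approximately 0.05 and 1.99
print(predict(b0, b1, 6)) # step 3: predict y at x = 6, approximately 11.99
```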
In R Commander, which output values are β0 and β1?
- β0 is the value in the (Intercept) row of the Estimate column
- β1 is the value in the row below it, also in the Estimate column
What is β1?
- regression coefficient (slope of the line)
What is β0?
- y-intercept
- the value of Y when X=0
What are the assumptions for regression?
- Y and X are linearly related
- the values of Y are independent from each other
- the random part of Y (error) is normally distributed around 0 with constant variance
What is the residual?
- the difference between the value we observe and the value our model predicts at a given value of x (observed minus predicted)
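A small sketch of computing residuals, assuming a least-squares line fitted with the hypothetical helper `fit_line` and made-up data. For a least-squares line with an intercept, the residuals average out to zero.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def residuals(x, y, b0, b1):
    """Observed y minus predicted y at each x."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Made-up illustrative data (an assumption, not from the course)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)
res = residuals(x, y, b0, b1)
print(res)                  # one residual per observation
print(sum(res) / len(res))  # mean residual, close to 0
```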
What assumptions about the residuals does residual analysis check?
- normally distributed
- mean of zero
- constant variance (homoscedasticity)