Week 7 Regression Flashcards
To provide an overview of the lecture on Regression
What is the correlation coefficient?
*the index of the degree of association between two variables, typically Pearson r or related product-moment correlation
What do Bivariate regression and Multiple Regression do, and how do they differ?
- Bivariate regression allows for the prediction of one variable from another variable
- Multiple regression is an extension of bivariate regression, where the relationship is determined between a single DV (criterion) and multiple predictors
Why would you use correlational research?
*Some variables, such as personality traits or sex, do not lend themselves to an experimental design, yet they are of interest to behavioural scientists. *However, most variables that cannot be studied experimentally can be studied correlationally
Remind me; what is a positive, negative and neutral relationship?
- Positive relationship: two variables that move in the same direction, e.g., generally, as height increases so too does weight.
- Negative relationship: two variables that move in opposite directions, e.g., alcohol intake & driving skill
- Neutral relationship: (flat line?) no relationship e.g. driver’s shoe size & the number of kilometres travelled.
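A minimal sketch of how the sign of Pearson's r reflects these three kinds of relationship. The data values and variable names below are made up for illustration and are not from the lecture.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical illustrative data
height = [150, 160, 170, 180, 190]
weight = [52, 58, 69, 77, 90]          # rises with height
drinks = [0, 1, 2, 3, 4]
driving_skill = [9, 8, 6, 5, 2]        # falls as drinks rise
shoe_size = [6, 7, 8, 9, 10]
km_travelled = [30, 10, 20, 10, 30]    # no linear trend

r_pos = pearson_r(height, weight)             # close to +1
r_neg = pearson_r(drinks, driving_skill)      # close to -1
r_none = pearson_r(shoe_size, km_travelled)   # zero for this data
print(r_pos, r_neg, r_none)
```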
What exactly does Bivariate or Linear Regression achieve?
- bivariate regression enables an equation to be developed to predict one variable (Y) from the other (X)
- The regression coefficient is the value by which the score on the predictor variable is multiplied to predict the score on the criterion variable
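As a rough sketch, the prediction equation can be written Y' = a + bX, where b is the regression coefficient and a is the intercept. The function name `fit_line` and the data are assumptions made up for this example.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for the equation Y' = a + b*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    b = sp / ss_x                # regression coefficient (slope)
    a = mean_y - b * mean_x      # intercept
    return a, b

# Hypothetical data: hours studied (X, predictor) and exam score (Y, criterion)
hours = [1, 2, 3, 4, 5]
score = [52, 54, 61, 63, 70]

a, b = fit_line(hours, score)
predicted = a + b * 6            # predicted score for 6 hours of study
print(a, b, predicted)
```

Here the predictor score (6 hours) is multiplied by b and added to a to give the predicted criterion score.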
So a correlation coefficient quantifies the strength of the association between two variables. However, different levels of measurement (for example, nominal, ordinal and scale) require different analytical techniques.
What choices do I have?
- Continuous measures (ratio or interval) generally use Pearson’s r as it is the most common parametric correlational analysis that measures the direction and strength of a linear relationship.
- Spearman’s rho and Kendall’s tau are correlational analyses for ordinal (ranked) data
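One way to see the difference between the two choices: Spearman's rho can be computed as Pearson's r applied to the ranks of the scores. The sketch below assumes that approach (with tied scores given the average of their ranks) and uses made-up data; the helper names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranked scores."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: Y always increases with X, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

rho = spearman_rho(x, y)   # perfect agreement in rank order
r = pearson_r(x, y)        # lower, because the relationship is not linear
print(rho, r)
```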
Why is bivariate regression easier to interpret than multiple regression?
- Bivariate regression is easier to interpret than multiple regression as bivariate = only 2 variables
- In Multiple Regression, any degree of inter-correlation among the IVs can lead to ambiguous output
How are Regression relationships best represented?
- Relationships are best shown by a scatterplot
- Only when a linear relationship exists can you use SPSS to do further correlational &/or regression analysis
- Venn diagrams best visually represent the strength of the prediction in a relationship.
- The Coefficient of Determination is r², which indicates the proportion of variance in one variable predicted by the other
My IVs are not inter-correlated, how do I interpret my output?
- When the IVs are uncorrelated, simply add their individual squared coefficients
- That is, to find R² one simply adds the r² values for each IV
So when my IVs are not intercorrelated I can simply add their individual coefficients; what happens when they are intercorrelated?
- when correlated the overlap becomes problematic because it is no longer just an additive process
- we only look at the unique variance explained by each IV in Multiple Regression.
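For the two-predictor case, the standard formula R² = (r_y1² + r_y2² − 2·r_y1·r_y2·r_12) / (1 − r_12²) shows both situations. The sketch below uses made-up correlation values; the function name is hypothetical, and the "unique variance" shown is the squared semipartial correlation (R² minus the r² of the other IV).

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R-squared for a DV predicted from two IVs, from the three correlations.

    r_y1, r_y2: correlation of each IV with the DV
    r_12:       correlation between the two IVs
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical correlations of each IV with the DV
r_y1, r_y2 = 0.5, 0.4

# Uncorrelated IVs: R-squared is just the sum of the two r-squared values
uncorrelated = r_squared_two_predictors(r_y1, r_y2, 0.0)   # 0.25 + 0.16

# Correlated IVs (r_12 = 0.6): the overlap is not counted twice
correlated = r_squared_two_predictors(r_y1, r_y2, 0.6)

# Unique variance explained by each IV once the other is in the equation
unique_1 = correlated - r_y2 ** 2
unique_2 = correlated - r_y1 ** 2
print(uncorrelated, correlated, unique_1, unique_2)
```

In this example the correlated case gives an R² below the simple sum of the two r² values, because part of what each IV explains is shared with the other.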
If I have inter-correlated IVs in multiple regression, is there another way to interpret the output?
- Venn diagrams best visually represent the strength of the prediction in a relationship
- look at the residuals which are the difference between the actual Y score & predicted Y score.
- The larger the R, the smaller the combined residuals
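A small sketch of the residual idea for a single predictor, using made-up data: each residual is actual Y minus predicted Y, and R² can be written as 1 − SS_residual / SS_total, so smaller squared residuals mean a larger R. Variable names are assumptions for illustration.

```python
# Hypothetical data: hours studied (X) and exam score (Y)
hours = [1, 2, 3, 4, 5]
score = [52, 54, 61, 63, 70]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(score) / n

# Least-squares line Y' = a + b*X
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, score)) / sum(
    (x - mean_x) ** 2 for x in hours
)
a = mean_y - b * mean_x

predicted = [a + b * x for x in hours]
residuals = [y - p for y, p in zip(score, predicted)]   # actual minus predicted

ss_res = sum(e ** 2 for e in residuals)                 # unexplained variation
ss_tot = sum((y - mean_y) ** 2 for y in score)          # total variation in Y
r_squared = 1 - ss_res / ss_tot
print(residuals, ss_res, r_squared)
```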
When do we use inferential statistics?
- When we want to draw inferences about populations from information available in samples.
- We can describe the relationship (association) between 2 variables through the direction and strength of those 2 variables.
- Inferential statistics are used when we want to test a hypothesis about that relationship: the degree of association found in the sample is compared with a critical value from a table, and the result is then inferred to the population.
- The hypothesis is tested to assess whether a significant relationship is identified between 2 given variables.
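One common way to carry out this test is to convert r to a t statistic, t = r·√((n − 2) / (1 − r²)) with n − 2 degrees of freedom, and compare it with the tabled critical value. The sketch below assumes a made-up sample (r = .60, n = 27) and a two-tailed test at α = .05, for which the tabled critical t with 25 degrees of freedom is 2.060.

```python
from math import sqrt

def t_for_r(r, n):
    """t statistic for testing whether a Pearson r differs from zero (df = n - 2)."""
    return r * sqrt((n - 2) / (1 - r ** 2))

# Hypothetical sample result
r, n = 0.6, 27
t = t_for_r(r, n)

# Critical value read from a standard t table: two-tailed, alpha = .05, df = 25
critical_t = 2.060
significant = abs(t) > critical_t
print(t, significant)
```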
I have heard there are different terms instead of DV, IV, and so on; what are some of these terms?
- IVs are called predictor variables
- DV is known as the criterion variable
*Remember Multiple Regression is NOT causal
What are r, R and R²?
- Bivariate correlation coefficient = (r)
- Bivariate regression = (R)
- Multiple correlation is big R & represents a linear combination of predictors ascertaining a line of best fit
- SMC (Squared multiple correlation) Proportion of Variance explained = R².
- R² gives the predictive value of the analysis (between 0 and 1)
What are the assumptions of Regression & other correlational designs?
- Curvilinear relationships – check prior to analyses
- Sample size
- Outliers – univariate, multivariate (Mahalanobis distance, & Casewise Diagnostics)
- Normality, linearity, homoscedasticity, homogeneity of variance
- Homogeneity of variance-covariance
- Multicollinearity and Singularity
- Care needs to be taken with interaction terms (Centre your data in this case)
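A minimal sketch of why centring helps with interaction terms, using made-up scores: the raw product X1·X2 tends to correlate strongly with X1 itself, whereas the product of the centred (mean-subtracted) scores correlates far less with it. Names and data are assumptions for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

def centre(values):
    """Subtract the mean from every score so the new mean is zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Hypothetical predictor scores
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

# Interaction term built from raw scores
raw_product = [a * b for a, b in zip(x1, x2)]
r_raw = pearson_r(x1, raw_product)

# Interaction term built from centred scores
c1, c2 = centre(x1), centre(x2)
centred_product = [a * b for a, b in zip(c1, c2)]
r_centred = pearson_r(c1, centred_product)

print(r_raw, r_centred)   # the centred version is much closer to zero here
```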