lecture 5- multiple regression Flashcards
hierarchical regression
Hierarchical regression is a statistical technique that examines the relationship between a set of predictor variables and a dependent variable by entering predictors into the model in a planned sequence of steps (blocks). It is commonly used when researchers want to know how much additional variance in the dependent variable can be explained by adding new variables, or blocks of variables (predictors), over and above what is explained by the variables entered at earlier steps.
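A minimal numeric sketch of the idea, using simulated data and a plain-numpy OLS fit (all variable names and coefficients here are hypothetical, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)        # hypothetical illustrative data
n = 200
x1 = rng.normal(size=n)               # predictor entered at step 1
x2 = rng.normal(size=n)               # predictor added at step 2
y = 0.5 * x1 + 0.3 * x2 + rng.normal(size=n)

def r_squared(X, y):
    """R^2 from an ordinary least-squares fit (intercept included)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

r2_step1 = r_squared(x1[:, None], y)                # block 1: x1 only
r2_step2 = r_squared(np.column_stack([x1, x2]), y)  # block 2: x1 + x2
delta_r2 = r2_step2 - r2_step1   # variance explained over and above step 1
```

The quantity of interest is `delta_r2`, the R² change between steps; statistical packages report a significance test (F-change) for it alongside the raw value.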
adjusted R2
more predictors = better fit = more variance explained = higher R2 — even when the extra predictors add nothing meaningful. Adjusted R2 corrects for this by penalising the model for the number of predictors, so it only increases when a new predictor improves the fit by more than would be expected by chance.
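A quick numeric sketch (the R², n, and k values are hypothetical) using the standard formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1):

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2: penalises R^2 for the number of predictors k, given sample size n."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Adding three weak predictors nudges R^2 up (.50 -> .51),
# but adjusted R^2 goes DOWN, flagging the extra predictors as not worthwhile:
print(adjusted_r2(0.50, n=50, k=2))   # ~0.479
print(adjusted_r2(0.51, n=50, k=5))   # ~0.454
```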
nuisance or suppressor variables
Regression can also be useful as a means of dealing with nuisance or suppressor variables: i.e., variables
that “add noise”, “obscure”, “hide” or “suppress” the relationship between the predictor variable you are interested in and the criterion variable.
Example: a neuropsychologist wants to test whether the levels of a neurotransmitter (neurotransmitter A)
explain the level of memory deficits in a neurological sample.
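A minimal simulation of the suppressor idea (all variable names and effect sizes are hypothetical): a nuisance variable adds noise to the criterion, and regressing it out lets the predictor–criterion relationship show through more clearly:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
age = rng.normal(size=n)       # hypothetical nuisance variable that "adds noise"
nt_a = rng.normal(size=n)      # predictor of interest (neurotransmitter A)
deficit = 0.3 * nt_a + 1.5 * age + rng.normal(size=n)   # criterion

# Raw correlation: the predictor's effect is obscured by age-related variance
r_raw = np.corrcoef(nt_a, deficit)[0, 1]

# Regress the nuisance variable out of the criterion, then re-correlate
Z = np.column_stack([np.ones(n), age])
beta, *_ = np.linalg.lstsq(Z, deficit, rcond=None)
deficit_resid = deficit - Z @ beta
r_controlled = np.corrcoef(nt_a, deficit_resid)[0, 1]   # noticeably larger
```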
collinearity
Two predictors are said to be collinear if they are highly correlated with one another (r>.75).
In other words, they may be measuring the same construct.
→ it is difficult to estimate the independent contributions of each variable
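A small sketch of how collinearity can be detected (simulated data; the r > .75 threshold is the one given above, and the variance inflation factor is one common follow-up diagnostic):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.3 * rng.normal(size=n)   # nearly the same construct as x1

r = np.corrcoef(x1, x2)[0, 1]   # r > .75 flags the pair as collinear
vif = 1 / (1 - r**2)            # variance inflation factor (two-predictor case)
print(r, vif)
```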
collinearity : solutions
Unless you have a research question that specifically requires keeping both predictors, the best approach is
to drop one of these predictors from your model.
Or conduct a PCA (principal component analysis).
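A sketch of the PCA route using plain numpy (SVD on centred data): the two collinear predictors are replaced by a single component that carries almost all of their shared variance, which can then be entered as one predictor.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.3 * rng.normal(size=n)   # collinear with x1

X = np.column_stack([x1, x2])
Xc = X - X.mean(axis=0)                     # centre before PCA
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Xc @ Vt[0]                            # first principal component scores
explained = s[0]**2 / (s**2).sum()          # proportion of variance in PC1
```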
categorical predictors
The dependent variable (Y) in regression has to be continuous and normally distributed.
However, essentially any type of variable can be employed as a predictor (i.e., independent) variable.
Researchers often want to use, or are forced to use, categorical variables as predictors.
Example: A researcher wants to look at predictors of political preferences. Among the various variables she
thinks may be important is area of residence.
* How can one use categorical data as a predictor?
* And how many categories can we use? (urban vs. rural; inner city vs. suburban vs. rural)
categorical predictor solution
The solution is to use dummy coding.
To code category membership where k = number of categories, you need k - 1 dummy variables:
* 2 categories (e.g., urban/rural; m/f) → you need 1 dummy variable (the levels are 1 or 0)
* 3 categories (e.g., inner city/suburban/rural; m/f/other) → you need 2 dummy variables
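The area-of-residence example above can be dummy coded with pandas (the data values are made up; `drop_first=True` drops one level so that k categories yield k − 1 dummies, with the dropped level as the reference category):

```python
import pandas as pd

area = pd.Series(["inner city", "suburban", "rural", "suburban", "inner city"])

# k = 3 categories -> k - 1 = 2 dummy variables; "inner city" becomes the reference
dummies = pd.get_dummies(area, prefix="area", drop_first=True)
print(dummies.columns.tolist())   # two dummy columns, each coded 1/0
```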
non-linear relationships
Linear regressions only provide a valid measure of the relationship between two variables when that
relationship is linear (when it can be described by a straight line)
However, in real life, relationships between variables are often not linear.
For example, changes in cognitive performance across the lifespan will typically show an inverted U shape.
Even if we restrict attention just to adults, the decline in cognitive abilities is rarely linear
non-linear relationships : solution
Solution: use polynomials
testing non-linear relationships
* non-linear quadratic
* non-linear cubic
A non-linear quadratic relationship can be modelled by entering the square of the predictor/independent variable.
A non-linear cubic relationship can be modelled by entering the cube of the predictor/independent variable.
Use hierarchical regression and:
1. enter the predictor variable first
2. then enter the square of the predictor variable, and examine whether there is a significant change in R2
for the model: if there is a significant change, then there is a significant non-linear (quadratic) component to
the relationship between the predictor and criterion
3. then add the cube: if there is a significant change in R2 then there is a significant non-linear (cubic)
component to the relationship between the predictor and criterion
If the square or cube term makes a significant contribution, it should be kept in the model.
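The three steps above can be sketched with simulated data (an inverted-U, age-by-performance pattern like the one mentioned earlier; all numbers are hypothetical). The sketch reports only the raw R² changes; in practice the significance of each ΔR² is judged with an F-change test in a statistics package.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400
age = rng.uniform(-2, 2, size=n)                 # centred predictor
performance = 1.0 - 0.8 * age**2 + rng.normal(scale=0.5, size=n)  # inverted U

def r_squared(X, y):
    """R^2 from an OLS fit (intercept included)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

# Step 1: predictor alone; step 2: add its square; step 3: add its cube
r2_linear = r_squared(age[:, None], performance)
r2_quad = r_squared(np.column_stack([age, age**2]), performance)
r2_cubic = r_squared(np.column_stack([age, age**2, age**3]), performance)

delta_quad = r2_quad - r2_linear    # large here -> quadratic component present
delta_cubic = r2_cubic - r2_quad    # near zero here -> no cubic component
```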