Choosing Regression Models Flashcards
1
Q
SELECTING REGRESSORS
A
- a potentially useful regressor should:
1. have a statistically significant (“statsig”) effect on DV
2. be able to discriminate between values of DV (categorical IV)
3. be strongly associated w/DV (continuous IV)
2
Q
CATEGORICAL VARIABLES: DV PREDICTION EVALUATION
A
- dichotomous variable (e.g. gender)
- compare DV means between groups using t-test/ANOVA
- if statsig -> has discriminatory value
- can explain/predict variation in DV
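A minimal sketch of the comparison above, assuming a pooled-variance two-sample t-test and hypothetical DV scores for the two groups of a dichotomous IV. The function name `pooled_t` and the data are illustrative only; the critical value 2.306 is the standard two-tailed 5% cutoff for 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Return (t, df) for a two-sample t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (sample variances weighted by df)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / se, df

# Hypothetical DV scores split by a dichotomous IV
group_a = [52, 55, 58, 60, 61]
group_b = [44, 47, 49, 50, 53]

t, df = pooled_t(group_a, group_b)
T_CRIT_DF8 = 2.306  # two-tailed critical t at alpha = 0.05, df = 8
print(f"t = {t:.2f}, df = {df}, statsig = {abs(t) > T_CRIT_DF8}")
# t is about 3.85 with df = 8, so the group difference is statsig here
```

With more than two groups the same idea extends to a one-way ANOVA F-test.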
3
Q
CONTINUOUS VARIABLES: DV PREDICTION EVALUATION
A
- continuous variable (e.g. height)
- test association w/DV using correlation test
- if statsig -> has discriminatory value
- can explain/predict variation in DV
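A rough sketch of the correlation test above, assuming Pearson's r and the usual conversion of r to a t statistic with n - 2 degrees of freedom. The height and DV values are hypothetical, and `pearson_r` and `r_to_t` are illustrative helper names.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r_to_t(r, n):
    """Convert r to a t statistic; returns (t, df) with df = n - 2."""
    return r * sqrt((n - 2) / (1 - r ** 2)), n - 2

# Hypothetical continuous IV (height) and DV values
height = [150, 160, 165, 170, 180, 185]
dv = [52, 58, 63, 64, 75, 77]

r = pearson_r(height, dv)
t, df = r_to_t(r, len(height))
T_CRIT_DF4 = 2.776  # two-tailed critical t at alpha = 0.05, df = 4
print(f"r = {r:.3f}, t = {t:.2f}, df = {df}, statsig = {abs(t) > T_CRIT_DF4}")
```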
4
Q
UNIVARIATE ANALYSES
A
- 1 IV
- one-way ANOVA
- simple linear regression
5
Q
MULTIVARIATE ANALYSES
A
- >1 IV (2 or more)
- 2-way ANOVA
- multiple linear regression
- some of a regressor's discriminatory value may already be accounted for by regressors already present in model (e.g. gender/income/height/age)
- adding regressor may not add as much to predictive value as anticipated
- impact of individual regressors can only be truly assessed “in presence of all other regressors”
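To illustrate why an added regressor may contribute less than anticipated, here is a small sketch that uses the standard formula for R^2 with two regressors, written in terms of the pairwise correlations. The correlation values are hypothetical, and `r2_two_regressors` is an illustrative name.

```python
def r2_two_regressors(r_y1, r_y2, r_12):
    """R^2 of a model with two regressors, from pairwise correlations.

    r_y1, r_y2: correlation of each regressor with the DV
    r_12: correlation between the two regressors
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

r_y1, r_y2 = 0.6, 0.5          # hypothetical correlations with the DV
r2_first_only = r_y1 ** 2      # R^2 with regressor 1 alone = 0.36

# Case 1: the two regressors are unrelated to each other
gain_uncorrelated = r2_two_regressors(r_y1, r_y2, 0.0) - r2_first_only

# Case 2: the two regressors overlap heavily (r_12 = 0.8)
gain_correlated = r2_two_regressors(r_y1, r_y2, 0.8) - r2_first_only

print(f"gain in R^2, unrelated regressors:  {gain_uncorrelated:.3f}")
print(f"gain in R^2, overlapping regressors: {gain_correlated:.3f}")
# The second regressor adds about 0.25 in case 1 but only about 0.001 in case 2
```

The same regressor looks useful on its own (r = 0.5 with the DV) yet adds almost nothing once an overlapping regressor is already in the model.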
6
Q
“IN THE PRESENCE OF ALL OTHER REGRESSORS”
A
- observed effect of any individual regressor in multiple regression model = only accurate:
1. in presence of other specific regressors also in model
2. for sample on which values are based
- if same regressor is entered into a different multiple regression model:
1. coefficient values/direction likely to change
2. statsig likely to change
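A small sketch of how a coefficient can change value and direction between models. It fits a simple regression and a two-regressor regression by solving the least-squares normal equations directly. The data are hypothetical and noise-free (constructed as y = 3*x1 - 2*x2), and the helper names are illustrative.

```python
from statistics import mean

def cross(u, v):
    """Sum of cross-products of deviations from the mean."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def simple_slope(x, y):
    """Slope of y on a single regressor x."""
    return cross(x, y) / cross(x, x)

def two_regressor_slopes(x1, x2, y):
    """Slopes (b1, b2) of y on x1 and x2 via the 2x2 normal equations."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

# Hypothetical data: x2 is correlated with x1, and y = 3*x1 - 2*x2
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, 1, 2, 2, 3, 3]
y = [1, 4, 5, 8, 9, 12]

b2_alone = simple_slope(x2, y)
b1_both, b2_both = two_regressor_slopes(x1, x2, y)
print(f"x2 alone:        slope = {b2_alone:+.1f}")
print(f"x2 alongside x1: slope = {b2_both:+.1f}")
# x2 has slope +4.0 on its own but -2.0 once x1 is also in the model
```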
7
Q
EFFECTIVENESS (VS EFFICIENCY)
A
- highest R^2 (most complete)
- will have more regressors
- will be effective BUT not efficient
8
Q
EFFICIENCY (VS EFFECTIVENESS)
A
- highest F-ratio (i.e. most statsig)
- will have single most important regressor
- will be efficient BUT not particularly effective
9
Q
EFFECTIVENESS VS EFFICIENCY: COMPROMISE
A
- will contain only “best” regressors available
- manageable number of regressors
- reasonably effective
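The trade-off across the three cards above can be sketched with the standard overall F-ratio computed from R^2, sample size n, and number of regressors k. The three candidate models and their R^2 values are hypothetical, and `f_ratio` is an illustrative name.

```python
def f_ratio(r2, n, k):
    """Overall F-ratio of a regression model with k regressors and n cases."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

n = 50  # hypothetical sample size
# Hypothetical candidate models: name -> (number of regressors, R^2)
models = {
    "single best regressor": (1, 0.40),
    "compromise": (3, 0.55),
    "all regressors": (8, 0.58),
}

f_values = {}
for name, (k, r2) in models.items():
    f_values[name] = f_ratio(r2, n, k)
    print(f"{name:22s} k = {k}  R^2 = {r2:.2f}  F = {f_values[name]:.2f}")
# F is 32.00, about 18.74, and about 7.08 respectively:
# R^2 rises as regressors are added while the F-ratio falls
```

In this made-up example the 3-regressor model keeps most of the R^2 of the full model while retaining a much higher F-ratio.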
10
Q
BEST MODEL FOR SAME REGRESSOR NUMBER
A
- Choose model w/highest R^2ADJ value
- Gives “best value” per regressor
- Will also have acceptably high R^2/F-ratio value
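For reference, a short sketch of how R^2ADJ is computed from R^2, assuming the standard adjustment for sample size n and number of regressors k. The two candidate models are hypothetical, and `adjusted_r2` is an illustrative name.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k regressors fitted to n cases."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n, k = 50, 3  # hypothetical: both candidate models use 3 regressors
adj_a = adjusted_r2(0.55, n, k)
adj_b = adjusted_r2(0.50, n, k)
print(f"model A: R^2 = 0.55, R^2ADJ = {adj_a:.4f}")
print(f"model B: R^2 = 0.50, R^2ADJ = {adj_b:.4f}")
# About 0.5207 vs 0.4674: with the same k, the model with the higher R^2
# also has the higher R^2ADJ
```

The adjustment always pulls R^2 down, and more so as k grows relative to n.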
11
Q
BEST MODEL FOR DIFFERENT REGRESSOR NUMBER
A
- Choose model w/2nd highest R^2ADJ value
- Best compromise between effectiveness/efficiency
- R^2 value reasonably high (effective; large % variance in DV explained)
- Reasonably high F-ratio value (efficient; only useful regressors included)