Multiple linear regression Flashcards

Question

# Definition: the process of coding a categorical variable into dichotomous variables

Answer 1

Dummy coding
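A minimal sketch of dummy coding in plain Python, assuming one category is chosen as the reference level so that k categories yield k - 1 dichotomous (0/1) variables. The function name `dummy_code`, the `is_` prefix, and the marital-status data are all hypothetical and only for illustration.

```python
def dummy_code(values, reference):
    """Turn a categorical variable into 0/1 indicator variables.

    The reference category gets no column of its own: it is the case
    where every indicator equals 0.
    """
    levels = sorted(set(values) - {reference})
    return {
        f"is_{level}": [1 if v == level else 0 for v in values]
        for level in levels
    }


# Hypothetical categorical predictor with three categories.
marital = ["single", "married", "divorced", "married"]
dummies = dummy_code(marital, reference="single")

for name, column in dummies.items():
    print(name, column)
```

Each resulting column can then enter the regression as a dichotomous predictor, with its coefficient read relative to the reference category.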

Answer 2

Semi-partial (part) correlation

Answer 3

A coefficient of -1.584 indicates that a 1-unit increase in self-esteem is predicted to be associated with a 1.584-unit decrease in loneliness

Answer 4

Coefficients depend on the scale of the predictors and the DV. If self-esteem ranges from 0 to 2, a 1-unit change is a huge difference; if the range is 0 to 100, a 1-unit change is very small

Answer 5

In the current regression, β = -.257, meaning that a 1 SD increase in self-esteem is predicted to be associated with a .257 SD decrease in loneliness
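The link between an unstandardised coefficient and β can be sketched in a few lines of Python, using the usual conversion β = b × (SD of predictor / SD of outcome). The standard deviations below are hypothetical placeholders, not values from the regression discussed here, and the function name is illustrative.

```python
def standardized_beta(b: float, sd_x: float, sd_y: float) -> float:
    """Convert an unstandardised slope b into a standardised beta."""
    # beta = b * (SD of predictor / SD of outcome)
    return b * sd_x / sd_y


# Hypothetical SDs, chosen only to show the effect of rescaling the predictor.
beta_narrow = standardized_beta(b=-1.584, sd_x=2.0, sd_y=4.0)

# Same predictor on a scale 10 times wider: b shrinks tenfold, its SD grows tenfold.
beta_wide = standardized_beta(b=-0.1584, sd_x=20.0, sd_y=4.0)

print(beta_narrow, beta_wide)
```

Under these assumed values the two β values agree even though the two b values differ by a factor of 10, which is why β is easier to compare across predictors measured on different scales.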

Answer 6

R² = MEV / total variance. Model explained variance (MEV) = 5 - 4 = 1; total variance = 5. So R² = 1 / 5 = 0.2 = 20%
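The same arithmetic as a short Python sketch. It assumes that the 4 subtracted from the total variance of 5 is the residual (unexplained) variance; the card does not say so explicitly, and the function and argument names are illustrative.

```python
def r_squared(total_variance: float, residual_variance: float) -> float:
    """Proportion of total variance explained by the model."""
    # Assumption: model explained variance = total - residual.
    model_explained = total_variance - residual_variance
    return model_explained / total_variance


r2 = r_squared(total_variance=5, residual_variance=4)
print(f"R² = {r2:.2f} ({r2:.0%})")
```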

Answer 7

There are no fixed rules for what a small, medium, or large R² is, but many people consider .04 (4%) as small, .09 (9%) as medium, and .25 (25%) as large

Answer 8

Gives an estimate of R² in the population. It takes into account sample size, the number of predictors, etc…
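Assuming this card describes adjusted R², one commonly used version of the adjustment is 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the sample size and k the number of predictors. The sketch below uses R² = .07 with two predictors; the sample size of 100 is a hypothetical value chosen for illustration.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjust R² for sample size n and number of predictors k."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 to compute adjusted R²")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# n = 100 is an assumed sample size, not a value from these cards.
adj = adjusted_r_squared(r2=0.07, n=100, k=2)
print(f"adjusted R² = {adj:.3f}")
```

The adjusted value is never larger than the sample R², and the gap widens as n shrinks or k grows.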

Answer 9

Sampling error increases as sample size **decreases** and as the number of predictors **increases**

Answer 10

Sampling error increases as sample size decreases and as the number of predictors increases

Answer 11

Shrinkage is best evaluated using a **cross validation study**

Answer 12

It indicates that the regression model does not generalise well to the population

Answer 13

Most useful when using a variant for multiple regression

Answer 14

The difference between these two is the unique variance of self-esteem (15% - 10% = 5%). f² for the unique effect of self-esteem would be: f² = (.15 − .10) / (1 − .15) = .05 / .85 = .059
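The f² calculation above can be written as a small Python function. The argument names `r2_full` (R² with self-esteem in the model) and `r2_reduced` (R² without it) are illustrative labels for the .15 and .10 in the card.

```python
def cohens_f2_unique(r2_full: float, r2_reduced: float) -> float:
    """Cohen's f² for the unique contribution of one predictor (or set)."""
    # Unique variance divided by the variance the full model leaves unexplained.
    return (r2_full - r2_reduced) / (1 - r2_full)


f2 = cohens_f2_unique(r2_full=0.15, r2_reduced=0.10)
print(f"f² = {f2:.3f}")
```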

Answer 15

* Outcome variable must be continuous; predictors can be continuous or dichotomous
* Predictors must not have zero variance
* Independence
* Linearity
* Independent errors
* Normally-distributed errors
* Homoscedasticity

Answer 16

Partial regression plot (i.e., residual and predicted values). The absence of a clear pattern means the assumption has been met

Answer 17

For any 2 observations, the errors or residual terms should be independent of (i.e., uncorrelated with) one another. Observations should be independent and there should not be any systematic relationship amongst the residuals

Answer 18

Residuals for the regression model should be random and normally distributed, with mean = 0. Note: this does not mean the predictors have to be normally distributed (although normally distributed predictors do improve the chances of this assumption being met)

Answer 19

Homoscedasticity

Answer 20

1. How you assessed the assumptions and whether they appear violated
2. Any attempts to address violated assumptions
3. Whether these attempts resolved the assumption issues or if they were still violated
4. Descriptive statistics (e.g. mean and SD/median and IQR/frequency and percent)
5. Standardised coefficient, effect size, R², confidence intervals and p-value

Answer 21

Add b_kX_k, where k = the number of predictor variables included. In multiple regression, b_k captures the unique relationship, conditional on (adjusted for) all other predictors in the model

Answer 22

The β value with the largest magnitude (ignoring the +/- signs) is the best predictor. Thus, self-esteem is the best predictor of loneliness; indeed, it is the only significant predictor

Answer 23

R Square (R²) tells you the proportion of variability in the outcome variable that is accounted for by the predictor variables. In our example, 7% of the variability in loneliness is accounted for by self-esteem and number of exercise days (when considered together)

Answer 24

**Tolerance:** predictor variables with tolerances \< .10 are multicollinear with 1+ other predictors, which is concerning (i.e., you have a multicollinearity problem)

**VIF:** VIF = 1/tolerance; predictor variables with VIF \> 10 are multicollinear with 1+ other predictors, which is concerning (i.e., you have a multicollinearity problem)
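A short Python sketch of how the two diagnostics relate. It assumes the standard definition of tolerance as 1 - R²_j, where R²_j comes from regressing predictor j on all the other predictors; the R²_j values and function names below are hypothetical.

```python
def tolerance_from_r2(r2_j: float) -> float:
    """Tolerance of predictor j: variance NOT shared with the other predictors."""
    return 1.0 - r2_j


def vif(tolerance: float) -> float:
    """Variance inflation factor, VIF = 1 / tolerance."""
    return 1.0 / tolerance


def is_concerning(tolerance: float) -> bool:
    """Apply the rule-of-thumb cut-offs: tolerance < .10 or VIF > 10."""
    return tolerance < 0.10 or vif(tolerance) > 10


# Hypothetical R²_j values for two predictors.
for r2_j in (0.20, 0.95):
    tol = tolerance_from_r2(r2_j)
    print(f"R²_j = {r2_j:.2f}: tolerance = {tol:.2f}, "
          f"VIF = {vif(tol):.1f}, concerning = {is_concerning(tol)}")
```

Because VIF is just the reciprocal of tolerance, the two cut-offs flag the same predictors; reporting either one is enough.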

Answer 25

Multicollinearity

Answer 26

1. Consider deleting one of the offending variables
2. Combine the variables with high intercorrelations into a single measure

Answer 27

The variability in loneliness uniquely accounted for by self-esteem = (-.261)² × 100 = 6.81%. The variability in loneliness uniquely accounted for by exercise = (.058)² × 100 = 0.34%
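The same calculation in Python, assuming -.261 and .058 are the semi-partial (part) correlations for self-esteem and exercise; the function name is illustrative.

```python
def unique_variance_percent(part_correlation: float) -> float:
    """Square a semi-partial (part) correlation and express it as a percentage."""
    return part_correlation ** 2 * 100


part_correlations = {"self-esteem": -0.261, "exercise": 0.058}

for name, sr in part_correlations.items():
    print(f"{name}: {unique_variance_percent(sr):.2f}%")
```

Squaring removes the sign, so the direction of the relationship has to be read from the coefficient itself.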

Answer 28

The overall significance of the regression equation can be evaluated by computing an **F-ratio**

Answer 29

A significant F-ratio indicates that the equation predicts a significant proportion of the variability in the Y scores (i.e., more than would be expected by chance alone)
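A sketch of how the F-ratio can be computed from R², using the standard formula F = (R² / k) / ((1 - R²) / (n - k - 1)) with k predictors and n cases. R² = .07 with two predictors matches the example in these cards; the sample size of 100 is hypothetical.

```python
def f_ratio(r2: float, n: int, k: int) -> tuple[float, int, int]:
    """Return the F-ratio and its degrees of freedom for a regression model."""
    df_model = k
    df_residual = n - k - 1
    # Explained variance per model df over unexplained variance per residual df.
    f = (r2 / df_model) / ((1 - r2) / df_residual)
    return f, df_model, df_residual


# n = 100 is an assumed sample size, not a value from these cards.
f, df1, df2 = f_ratio(r2=0.07, n=100, k=2)
print(f"F({df1}, {df2}) = {f:.2f}")
```

Judging significance still requires comparing F against the F distribution with these degrees of freedom, which statistical software reports as the p-value.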