W4/5: Practice Questions for Predictions Flashcards
What kind(s) of analysis immediately come to mind when we talk of association among variables? Give some examples.
Symmetric forms of relationship, in which all variables have the same functional role and form:
Continuous variables - correlation (covariance)
Categorical variables - contingency table
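A minimal sketch in Python of the two symmetric summaries, using hypothetical simulated data (the variable names and values are illustrative only):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Continuous pair: the Pearson correlation is symmetric, r(x, y) == r(y, x)
x = rng.normal(size=100)
y = 0.5 * x + rng.normal(size=100)
print(round(np.corrcoef(x, y)[0, 1], 3))

# Categorical pair: a contingency table simply cross-classifies frequencies
smoker = rng.choice(["yes", "no"], size=100)
drinker = rng.choice(["yes", "no"], size=100)
print(pd.crosstab(smoker, drinker))
```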
What kind(s) of analysis immediately come to mind when we talk of prediction of one variable by other variables?
Linear regression, which involves prediction of scores on a continuous dependent variable (DV) by one or more independent variables (IVs), either continuous or categorical.
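A short sketch of this asymmetric case, assuming statsmodels and simulated data (the coefficient values are made up for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)

# Hypothetical data: a continuous DV predicted by two IVs
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
y = 1.0 + 0.6 * x1 - 0.3 * x2 + rng.normal(size=200)

X = sm.add_constant(np.column_stack([x1, x2]))  # intercept column + IVs
fit = sm.OLS(y, X).fit()
print(fit.params)          # intercept and partial regression coefficients
print(fit.predict(X)[:5])  # predicted scores on the DV
```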
If we have a scatterplot between two variables and a regression line is placed on the graph, where will we find the predicted values from the regression of Y on X?
On the regression line.
What is the distance between the observed and predicted Y values called in such a graph (with the regression line), and what symbol is it signified by?
The residual, signified by e_i.
Why are (1) a covariance/correlation and (2) a contingency table said to reflect a symmetric relationship among variables?
Because all variables have the same functional role and form.
(1) Covariance/correlation: its value does not depend on which variable is specified first.
(2) Contingency table: the frequencies do not depend on which variable forms the rows and which forms the columns.
Why is linear regression said to reflect an asymmetric relationship among variables?
Because not all variables have the same functional role and form: the DV is being predicted, while the IVs do the predicting.
What is the residual term in linear regression equal to?
Residual = Observed score on DV - Predicted score on DV
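In symbols (a standard notation, which may differ slightly from the unit notes):

```latex
e_i = Y_i - \hat{Y}_i, \qquad \hat{Y}_i = b_0 + b_1 X_{1i} + \dots + b_k X_{ki}
```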
What does the total sum of squares of a dependent variable get decomposed into in a linear regression model?
SStotal = SSreg (Explained/accounted for by the regression model) + SSres (Not explained/accounted for by the regression model)
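A quick numeric check of the decomposition, assuming statsmodels and simulated data; it also shows that R^2 = SSreg / SStotal:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.normal(size=100)
y = 2.0 + 0.8 * x + rng.normal(size=100)

fit = sm.OLS(y, sm.add_constant(x)).fit()
ss_total = np.sum((y - y.mean()) ** 2)
ss_reg = np.sum((fit.fittedvalues - y.mean()) ** 2)
ss_res = np.sum(fit.resid ** 2)

print(np.isclose(ss_total, ss_reg + ss_res))        # True: SStotal = SSreg + SSres
print(np.isclose(fit.rsquared, ss_reg / ss_total))  # True: R^2 = SSreg / SStotal
```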
What is the difference between a simple regression model and a multiple regression model?
Simple: 1 IV
Multiple: >1 IV
List three advantages of using a confidence interval on R^2
1) The CI can indicate whether the data are consistent with no prediction, or with prediction, at the population level.
2) The CI width can indicate the precision of the interval estimate of R^2.
3) A lower bound close to (but not equal to) 0 indicates the regression model may explain only a trivial amount of variation in the DV.
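To make these three points concrete, here is one way to obtain an interval estimate for R^2; the unit may teach an exact method (e.g. one based on the noncentral F distribution), so treat this percentile-bootstrap sketch as an assumption rather than the prescribed procedure:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 150
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 0.4 * x1 + 0.2 * x2 + rng.normal(size=n)
X = np.column_stack([x1, x2])

def r_squared(y, X):
    return sm.OLS(y, sm.add_constant(X)).fit().rsquared

boot = []
for _ in range(2000):
    idx = rng.integers(0, n, size=n)        # resample cases with replacement
    boot.append(r_squared(y[idx], X[idx]))
lower, upper = np.percentile(boot, [2.5, 97.5])
print(round(r_squared(y, X), 3), (round(lower, 3), round(upper, 3)))
```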
Why is a regression coefficient in a multiple regression analysis referred to as being a partial regression coefficient?
Because its value indicates the expected change in the DV for a one-unit increase in the focal IV after the overlap of the focal IV with the other IVs (due to their joint correlations) has been partialled out.
How does the interpretation of a standardised partial regression coefficient differ from that of its corresponding unstandardised partial regression coefficient?
Standardised: interpreted in SD units, so coefficients can be compared on a common metric.
Unstandardised: interpreted in raw-score units, so coefficients cannot be directly compared in size because their size depends on the metric of the IV to which each is attached.
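A sketch of the difference, assuming statsmodels and simulated IVs measured on deliberately different raw metrics; it also shows that each standardised coefficient can be recovered as b multiplied by (SD of the IV / SD of the DV):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x1 = rng.normal(scale=10.0, size=200)   # IV on a wide raw metric
x2 = rng.normal(scale=0.5, size=200)    # IV on a narrow raw metric
y = 0.05 * x1 + 2.0 * x2 + rng.normal(size=200)

X = np.column_stack([x1, x2])
b = sm.OLS(y, sm.add_constant(X)).fit().params[1:]       # unstandardised b's
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)         # z-scored IVs
zy = (y - y.mean()) / y.std(ddof=1)                      # z-scored DV
beta = sm.OLS(zy, sm.add_constant(Z)).fit().params[1:]   # standardised betas

print(b)                                          # sizes reflect each IV's raw metric
print(beta)                                       # SD units, so sizes are comparable
print(b * X.std(axis=0, ddof=1) / y.std(ddof=1))  # same values as beta
```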
Why would you expect the partial regression coefficients for a set of IVs in a multiple regression analysis to differ from the value of the regression coefficients obtained when each IV is used separately in a set of simple regression analyses?
Because the IVs correlate with one another as well as with the DV.
A partial regression coefficient estimates the effect of each IV after its overlap with the other IVs in predicting the DV has been partialled out (using the least squares estimator), whereas a simple regression coefficient ignores the other IVs. See the sketch below.
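A small demonstration of why the values differ, assuming statsmodels and simulated IVs that are correlated with each other:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
x1 = rng.normal(size=300)
x2 = 0.7 * x1 + rng.normal(size=300)     # IVs correlate with each other
y = 0.5 * x1 + 0.5 * x2 + rng.normal(size=300)

simple_b1 = sm.OLS(y, sm.add_constant(x1)).fit().params[1]
partial_b1 = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit().params[1]

print(round(simple_b1, 2))   # absorbs x1's overlap with x2, so it is larger here
print(round(partial_b1, 2))  # overlap with x2 partialled out, close to 0.5
```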
Define a 95% confidence interval that is placed around a sample R^2 value in a way that is analogous to the interpretation of the sample R^2 itself
We are 95% confident that the population R^2 value lies between the lower bound and the upper bound of the interval.
Define a 95% confidence interval that is placed around a partial regression coefficient in a way that is analogous to the interpretation of the coefficient itself
We are 95% confident that a one-unit increase in the focal IV results in an expected change in DV scores ranging between the lower bound and the upper bound, holding scores on the other IVs constant.
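In practice such intervals come straight from the fitted model; a minimal sketch assuming statsmodels (conf_int returns one interval for the intercept and one for each partial coefficient):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
x1 = rng.normal(size=120)
x2 = rng.normal(size=120)
y = 0.6 * x1 - 0.2 * x2 + rng.normal(size=120)

fit = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print(fit.conf_int(alpha=0.05))   # 95% CI rows: intercept, then each partial coefficient
```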