W4: RQ for Predictions 1 Flashcards by andy Sitoh

Predictions: What is it, and what is the keyword? What does it not do?

Using knowledge about one or more constructs to indicate standing on another construct.

Indicate / Account for / Explain (if used descriptively)
Not cause
- A barometer indicates / accounts for / explains (if used descriptively) the weather, but it does not cause the weather

What is in a good RQ involving prediction?

1) Statement ending with “?”
2) Include all relevant constructs
3) Indicate all relevant populations
4) Use “predict” as the “driving word” (key)

DV/IV in a prediction RQ. X/Y.

Does the meaning and focus of RQ change if X and Y swapped?

DV: Y-Axis
- Being Predicted (Predicted)
IV: X-Axis
- Doing the predicting (Predictor)
Meaning and focus of the RQ change depending on which variable is defined as the DV and which as the IV

What is “Variation”. How is it measured

Variation

Total amount of variability in a distribution of scores from the mean
- Sum of squared deviation scores; or
- Sum of squares
  - Hence, gets larger as n increases

What is “Variance”. How is it measured and expressed

Variance

Average Sum of Squares in a distribution of scores
Expressed in a squared metric, relative to the scores on which it is calculated
- Hence, it is independent of sample size

What is “Standard Deviation”. How is it expressed

Standard Deviation

Square Root of Variance
Expressed in same metric as scores on which it is calculated
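
As a rough illustration of variation, variance and standard deviation, here is a minimal Python sketch. The scores are made up, and dividing by n - 1 (the sample formula) is an assumption about which "average" the card means.

```python
import math

def variation(scores):
    """Sum of squares: total squared deviation of the scores from their mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def variance(scores):
    """Average sum of squares (assumed here: sample formula, dividing by n - 1)."""
    return variation(scores) / (len(scores) - 1)

def standard_deviation(scores):
    """Square root of the variance, back in the metric of the scores."""
    return math.sqrt(variance(scores))

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up scores with mean 5
doubled = scores * 2                # same distribution, twice the sample size

print(variation(scores), variation(doubled))   # variation grows as n grows
print(variance(scores), variance(doubled))     # variance stays on the same scale
print(standard_deviation(scores))              # in the same units as the scores
```

Doubling the sample doubles the sum of squares, while the variance stays close to its original value, which is the sense in which variance does not depend on sample size.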

What is the geometrical interpretation of deviations

SD
- Length
Variance
- Area (It is a square)
Variation
- Sum of all the Areas/Squares
- (That’s why it’s called sum of squares)

What is the key distinguishing feature between correlation and regression?

Correlation
- Symmetric Relationship
Regression
- Asymmetric Relationship

What is a Symmetric Relationship. In terms of correlation; IV/DV; scatterplot

Variables have the same role and function in the characteristic of scores being summarised.

Cor (A,B) = Cor (B,A)
No IV/DV
Scatterplot: Variable on X/Y axis does not matter

What is an Asymmetric Relationship? In terms of correlation; IV/DV; scatterplot

Variables have different roles and functions in the characteristic of scores being summarised.

Regression of A on B ≠ Regression of B on A (the correlation itself is unchanged: Cor (A,B) = Cor (B,A))
IV/DV declared a priori
Scatterplot: Which variable goes on the X/Y axis is fundamentally important

What is the formula and conceptual formulation of a correlation

Correlation

Sum of cross products of deviation z-scores / (n-1)
r_xy = Σ (Z_xi × Z_yi) / (n - 1)
- Standardised (z) measure of strength and direction of association
- (X and Y deviations in z will not be the same!)
We can look at the size of correlation to determine the strength of the association directly (-1 to 1)

What is the conceptual formulation of a covariance

Covariance

Sum of cross product of deviation scores / (n-1)
S_xy = Σ [(X_i - μ_x) × (Y_i - μ_y)] / (n - 1)
- Unstandardised (using deviation) measure of strength and direction of association
- (X and Y deviations will not be the same!)
We cannot look at the size of covariance to determine the strength of the association directly (-∞ to +∞)

Can we calculate correlation from covariance, and vice versa? In what circumstances can we do that?

Yes.

But we have to know the standard deviations of variable x and y

r_xy = S_xy / (sd_x × sd_y)
S_xy = r_xy × sd_x × sd_y
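
The two conversion formulas can be checked numerically. The sketch below uses made-up scores and hypothetical helper names (`sd`, `covariance`, `correlation`). It computes the correlation from z-scores and the covariance from deviation scores, then converts each into the other.

```python
import math

def mean(v):
    return sum(v) / len(v)

def sd(v):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(v)
    return math.sqrt(sum((s - m) ** 2 for s in v) / (len(v) - 1))

def covariance(x, y):
    """Sum of cross products of deviation scores / (n - 1)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Sum of cross products of z-scores / (n - 1)."""
    mx, my, sx, sy = mean(x), mean(y), sd(x), sd(y)
    zx = [(a - mx) / sx for a in x]
    zy = [(b - my) / sy for b in y]
    return sum(p * q for p, q in zip(zx, zy)) / (len(x) - 1)

x = [1, 2, 3, 4, 5]   # made-up scores
y = [2, 4, 5, 4, 5]

s_xy = covariance(x, y)
r_xy = correlation(x, y)
r_from_cov = s_xy / (sd(x) * sd(y))    # covariance -> correlation
s_from_r = r_xy * sd(x) * sd(y)        # correlation -> covariance

print(s_xy, r_xy, r_from_cov, s_from_r)
```

Both conversions need only the two standard deviations, and swapping the arguments of `correlation` leaves the result unchanged.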

What is the alternative name for the line of best fit? What does it do?

Linear Regression Line

Summarises the relationship between 2 variables

Formula to calculate the slope of the regression line

b_x = r_xy × (SD_y / SD_x)

Sample correlation × ratio of the respective standard deviations (SD of Y over SD of X).
- Know value of sample correlation
- Know respective standard deviations
Therefore, the slope equals the correlation only if the two SDs are the same, which is unlikely.

Alternatively, for any two points (X₁, Y₁) and (X₂, Y₂) on the regression line,

b = (Y₂ - Y₁) / (X₂ - X₁)
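
A small sketch of both routes to the slope, using made-up scores. The intercept formula a = mean(Y) - b × mean(X) is not on the card; it is assumed here (it holds for a least-squares line) so that two points on the line can be generated.

```python
import math

def mean(v):
    return sum(v) / len(v)

def sd(v):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(v)
    return math.sqrt(sum((s - m) ** 2 for s in v) / (len(v) - 1))

def correlation(x, y):
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cross / ((len(x) - 1) * sd(x) * sd(y))

x = [1, 2, 3, 4, 5]   # made-up IV scores
y = [2, 4, 5, 4, 5]   # made-up DV scores

r_xy = correlation(x, y)
b_x = r_xy * (sd(y) / sd(x))        # slope from correlation and SDs
a = mean(y) - b_x * mean(x)         # assumed intercept formula (see lead-in)

# Rise over run between two points on the fitted line
x1, x2 = 1, 5
y1, y2 = a + b_x * x1, a + b_x * x2
b_two_points = (y2 - y1) / (x2 - x1)

# Swapping the roles of the variables gives a different slope
b_y = r_xy * (sd(x) / sd(y))

print(b_x, b_two_points, b_y)
```

The two routes agree on the slope for Y regressed on X, while regressing X on Y yields a different value even though the correlation is the same in both directions.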

What is b_x? A sample statistic (SS) or a population parameter (PP)?

b_x
- Sample statistic
- An estimate of the population parameter β_x
  - Don’t forget, a population parameter can never be “calculated”

How can the slope value be interpreted?

For every 1-unit increase on the X variable,
the predicted value of the Y variable increases by ____ units

What is the full regression equation

Y_i = a + bX_i + e_i
- Y_i
  - Observed scores on DV
- X_i
  - Observed scores on IV
- e_i
  - Residual scores
  - Difference between observed and predicted scores on DV
- a
  - Intercept
- b
  - Regression coefficient (slope)

What is the regression model equation

Ŷ_i = a + bX_i

Ŷ_i
- Predicted scores on DV
X_i
- Observed scores on IV
a
- Intercept
b
- Regression coefficient
- Expected change in scores on DV for each unit change of IV

What is the SS in linear regression?

SS_total = SS_reg + SS_res

SS_reg
- Sum of squared deviations of the predicted scores (on the linear regression line) from the mean of the DV
- i.e. variation explained by linear regression model
SS_res
- Sum of squared deviations of the observed scores from the predicted scores (on the linear regression line)
- i.e. variation not explained by linear regression model
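
To see the split of the total sum of squares into an explained and an unexplained part, the following sketch fits a least-squares line to made-up scores and computes each term. The variable names are illustrative only.

```python
def mean(v):
    return sum(v) / len(v)

x = [1, 2, 3, 4, 5]   # made-up IV scores
y = [2, 4, 5, 4, 5]   # made-up DV scores
mx, my = mean(x), mean(y)

# Least-squares slope and intercept
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx
predicted = [a + b * xi for xi in x]

ss_total = sum((yi - my) ** 2 for yi in y)                   # observed vs mean
ss_reg = sum((p - my) ** 2 for p in predicted)               # predicted vs mean
ss_res = sum((yi - p) ** 2 for yi, p in zip(y, predicted))   # observed vs predicted

print(ss_total, ss_reg, ss_res)
```

With these scores the explained and unexplained parts add up to the total, apart from floating-point rounding.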

In a regression equation, what do â and b̂ aim to do? What method is this? Is it biased?

Ordinary least squares (OLS) estimator

Find values of â and b̂ that minimise the sum of squared residuals
- Minimise the squared differences between observed and predicted values of the DV
- Therefore, maximise strength of prediction
OLS is unbiased :)
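
A quick way to see the "least squares" idea is to compare the sum of squared residuals at the OLS estimates with the same quantity at slightly different values of the intercept and slope. This is only a sketch on made-up scores; `ols` and `ss_res` are hypothetical helper names.

```python
def ss_res(a, b, x, y):
    """Sum of squared residuals for the line a + b * x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

def ols(x, y):
    """Closed-form least-squares intercept and slope for one IV."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

x = [1, 2, 3, 4, 5]   # made-up IV scores
y = [2, 4, 5, 4, 5]   # made-up DV scores

a_hat, b_hat = ols(x, y)
best = ss_res(a_hat, b_hat, x, y)

# Nudge the intercept and/or slope away from the OLS values
nearby = [ss_res(a_hat + da, b_hat + db, x, y)
          for da in (-0.1, 0.0, 0.1)
          for db in (-0.1, 0.0, 0.1)
          if (da, db) != (0.0, 0.0)]

print(best, min(nearby))
```

Every nudged line has a larger sum of squared residuals than the OLS line for these scores.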

What is the difference between simple and multiple linear regression model

Simple

One intercept and One regression coefficient
Y_i = a + bX_i + e_i

Multiple

One intercept and p partial regression coefficients (where p >= 2)
Y_i = a + b_1X_1i + … + b_pX_pi + e_i

What is the aim in research using linear regression

Use sample regression estimates to make an inference about corresponding unknown population parameter values

Is the coefficient value for each IV in a simple regression the same as its coefficient value in a multiple regression? Explain.

Different

Multiple regression
- Correlation among IVs in their relationship to DV and with each other is partialled out/removed.
  - i.e. if there are no correlations among the IVs, simple regression coefficients = multiple regression coefficients
- Slope of each edge (partial regression coefficient) is an effect that is independent of the other IVs
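
The point about uncorrelated IVs can be illustrated with two tiny made-up data sets, one where the two IVs are uncorrelated and one where they are correlated. The sketch uses the closed-form least-squares solution for two IVs. The function names are hypothetical.

```python
def sum_cross(u, v):
    """Sum of cross products of deviation scores for u and v."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def simple_slope(x, y):
    """Slope from regressing y on a single IV."""
    return sum_cross(x, y) / sum_cross(x, x)

def partial_slopes(x1, x2, y):
    """Least-squares partial slopes from regressing y on two IVs."""
    s11, s22, s12 = sum_cross(x1, x1), sum_cross(x2, x2), sum_cross(x1, x2)
    s1y, s2y = sum_cross(x1, y), sum_cross(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

# Made-up data: IVs uncorrelated (their cross products sum to zero)
u1, u2, uy = [0, 0, 1, 1], [0, 1, 0, 1], [1, 3, 4, 8]
# Made-up data: IVs positively correlated
c1, c2, cy = [1, 2, 3, 4, 5], [2, 1, 4, 3, 5], [3, 4, 8, 7, 11]

print(simple_slope(u1, uy), partial_slopes(u1, u2, uy))   # slope for the first IV matches
print(simple_slope(c1, cy), partial_slopes(c1, c2, cy))   # slope for the first IV differs
```

In the uncorrelated case the simple and partial slopes coincide. In the correlated case the partial slope for the first IV is noticeably smaller than its simple slope, because the overlap with the second IV has been removed.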

What is the interpretation of the intercept in a regression model

Predicted value on the DV when people have a score of zero on all independent variables in the model

What are the lower and upper bounds in a 95% confidence interval?

2.5% and 97.5%

How do we interpret a 95% confidence interval of 0.10 and 0.86 in a regression?

We can be 95% confident that the population coefficient value for the regression of ____ on ___ is between 0.10 and 0.86

What is an unbiased 95% interval estimator? What is it NOT?

- Over a large number of repeated samples drawn from the population, the CIs calculated in each sample will contain the true population parameter value 95% of the time on average
  - i.e. the actual coverage rate will be 95% over the long run
- NOT a 95% chance that the population parameter will be captured in any one interval
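
The long-run reading of a 95% interval can be sketched with a small simulation. Everything here is an assumption made for illustration: a true slope of 0.5, fixed X values, normally distributed errors with a known SD (so the multiplier 1.96 is used instead of a t value), and a fixed random seed.

```python
import math
import random

def slope_ci_covers(true_b, x, sigma, rng):
    """Draw one sample, build a 95% CI for the slope, report whether it covers true_b."""
    y = [1.0 + true_b * xi + rng.gauss(0, sigma) for xi in x]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    ss_x = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / ss_x
    se = sigma / math.sqrt(ss_x)   # standard error of b when the error SD is known
    return b - 1.96 * se <= true_b <= b + 1.96 * se

rng = random.Random(42)   # fixed seed so the run is repeatable
x = list(range(20))       # fixed IV values
reps = 5000               # number of repeated samples

coverage = sum(slope_ci_covers(0.5, x, 2.0, rng) for _ in range(reps)) / reps
print(coverage)           # proportion of intervals containing the true slope
```

Each individual interval either contains the true slope or it does not. The 95% describes the proportion of intervals that do so across the repeated samples.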

What if the interval estimator is biased?

Actual coverage rate will be smaller/larger than the nominal rate over the long run (e.g. 89%/98%)

What if the interval estimator is consistent?

Actual coverage rate will get increasingly closer to 95% over the long run as sample size increases. (If it is 98% and goes to 99% as n increases, it is not consistent!)