W4: RQ for Predictions 1 Flashcards

1
Q

Predictions: What is it, and what is the keyword? What does it not do?

A

Using knowledge about one or more constructs to indicate standing on another construct.

  • Indicate / Account for / Explain (if used descriptively)
  • Not cause
    • A barometer indicates / accounts for / explains (if used descriptively) the weather, but it does not cause the weather
2
Q

What is in a good RQ involving prediction?

A
  1. Phrased as a question (ends with ?)
  2. Include all relevant constructs
  3. Indicate all relevant populations
  4. Use predict as “driving word” (key)
3
Q

DV/IV in a prediction RQ. X/Y.

Does the meaning and focus of RQ change if X and Y swapped?

A
  • DV: Y-Axis
    • Being Predicted (Predicted)
  • IV: X-Axis
    • Doing the predicting (Predictor)
  • Meaning and focus of the RQ change depending on which variable is defined as the DV and which as the IV
4
Q

What is “Variation”. How is it measured

A

Variation

  • Total amount of variability in a distribution of scores from the mean
    • Sum of squared deviation scores; or
    • Sum of squares
      • Hence, gets larger as n increases
5
Q

What is “Variance”. How is it measured and expressed

A

Variance

  • Average of the squared deviation scores (sum of squares divided by n - 1)
  • Expressed in a squared metric, relative to the scores on which it is calculated
  • Because it is an average, it does not grow with sample size the way variation does
6
Q

What is “Standard Deviation”. How is it expressed

A

Standard Deviation

  • Square Root of Variance
  • Expressed in same metric as scores on which it is calculated
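
A minimal sketch of the three measures on the last three cards, using made-up scores. It assumes the "average" in variance divides by n - 1, matching the (n-1) used in the covariance and correlation cards; the function names are illustrative only.

```python
import math

def variation(scores):
    """Sum of squared deviations from the mean (the sum of squares)."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def variance(scores):
    """Sum of squares averaged over n - 1 (assumed sample formula)."""
    return variation(scores) / (len(scores) - 1)

def standard_deviation(scores):
    """Square root of the variance, back in the metric of the scores."""
    return math.sqrt(variance(scores))

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores, mean = 5
doubled = scores * 2                # same distribution, twice the n

print(variation(scores), variation(doubled))   # 32.0 64.0
print(variance(scores), variance(doubled))     # roughly 4.57 and 4.27
print(standard_deviation(scores))              # roughly 2.14
```

Doubling n doubles the variation, while the variance stays at about the same size.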
7
Q

What is the geometrical interpretation of deviations

A
  • SD
    • Length
  • Variance
    • Area (It is a square)
  • Variation
    • Sum of all the Areas/Squares
    • (That’s why it’s called sum of squares)
8
Q

What is the key distinguishing feature between correlation and regression?

A
  • Correlation
    • Symmetric Relationship
  • Regression
    • Asymmetric Relationship
9
Q

What is a Symmetric Relationship. In terms of correlation; IV/DV; scatterplot

A

Variables have the same role and function in the characteristic of scores being summarised.

  • Cor (A,B) = Cor (B,A)
  • No IV/DV
  • Scatterplot: Variable on X/Y axis does not matter
10
Q

What is an Asymmetric Relationship? In terms of correlation; IV/DV; scatterplot

A

Variables have different roles and functions in the characteristic of scores being summarised.

  • Regression of A on B ≠ regression of B on A (the correlation itself is the same either way)
  • IV/DV declared a priori
  • Scatterplot: Variable on X/Y axis fundamentally important
11
Q

What is the formula and conceptual formulation of a correlation

A

Correlation

  • Sum of cross products of z-scores / (n-1)
  • rxy = Σ(Zxi × Zyi) / (n-1)
    • Standardised (z) measure of strength and direction of association
    • (X and Y deviations in z will not be the same!)
  • We can look at the size of correlation to determine the strength of the association directly (-1 to 1)
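
The z-score formula above can be sketched as follows. The data are hypothetical, and the helper names (`z_scores`, `correlation`) are illustrative rather than from any library.

```python
import math

def z_scores(values):
    """Standardise scores: deviation from the mean divided by the SD."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def correlation(x, y):
    """Sum of cross products of z-scores, divided by n - 1."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

x = [1, 2, 3, 4, 5]   # hypothetical scores
y = [2, 4, 5, 4, 5]

print(correlation(x, y))   # roughly 0.77
print(correlation(y, x))   # same value: the relationship is symmetric
```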
12
Q

What is the conceptual formulation of a covariance

A

Covariance

  • Sum of cross products of deviation scores / (n-1)
  • Sxy = Σ[(Xi - Mx) × (Yi - My)] / (n-1), where Mx and My are the sample means
    • Unstandardised (using deviation) measure of strength and direction of association
    • (X and Y deviations will not be the same!)
  • We cannot look at the size of covariance to determine the strength of the association directly (-∞ to +∞)
13
Q

Can we calculate correlation from covariance, and vice versa? In what circumstances can we do that?

A

Yes.

But we have to know the standard deviations of variable x and y

  • rxy = Sxy / (sdx × sdy)
  • Sxy = rxy × sdx × sdy
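
A short sketch of the two conversions, on hypothetical data. It also rescales X (the `x_scaled` list is an invented example) to show why the size of a covariance cannot be read directly while the correlation can.

```python
from statistics import mean, stdev

def covariance(x, y):
    """Sum of cross products of deviation scores, divided by n - 1."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation_from_covariance(s_xy, sd_x, sd_y):
    return s_xy / (sd_x * sd_y)

def covariance_from_correlation(r_xy, sd_x, sd_y):
    return r_xy * sd_x * sd_y

x = [1, 2, 3, 4, 5]               # hypothetical scores
y = [2, 4, 5, 4, 5]
x_scaled = [v * 100 for v in x]   # same variable in a different unit

s_xy = covariance(x, y)
r_xy = correlation_from_covariance(s_xy, stdev(x), stdev(y))
s_scaled = covariance(x_scaled, y)
r_scaled = correlation_from_covariance(s_scaled, stdev(x_scaled), stdev(y))

print(s_xy, s_scaled)   # covariance changes with the unit of X
print(r_xy, r_scaled)   # correlation does not
```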
14
Q

What is the alternative name for the line of best fit? What does it do?

A

Linear Regression Line

  • Summarise relationship between 2 variables
15
Q

Formula to calculate the slope of the regression line

A

bx = rxy × (SDy / SDx)

  • Sample correlation × ratio of the standard deviations (DV over IV).
    • Know value of sample correlation
    • Know respective standard deviations
  • Therefore, the slope equals the correlation only when the two SDs are the same, which is unlikely in practice.

Alternatively, for any two points (X1, Y1) and (X2, Y2) on the regression line,

b = (Y2 - Y1) / (X2 - X1)
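
Both slope formulas can be checked on a small hypothetical data set. The intercept formula used here (mean of Y minus b times mean of X) is the standard least-squares result and is not stated on the card; the helper names are illustrative.

```python
from statistics import mean, stdev

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cross / ((len(x) - 1) * stdev(x) * stdev(y))

def slope(iv, dv):
    """b = r * (SD of DV / SD of IV): regression of dv on iv."""
    return pearson_r(iv, dv) * stdev(dv) / stdev(iv)

x = [1, 2, 3, 4, 5]   # hypothetical IV scores
y = [2, 4, 5, 4, 5]   # hypothetical DV scores

b_y_on_x = slope(x, y)              # Y is the DV
b_x_on_y = slope(y, x)              # roles swapped: a different slope
a = mean(y) - b_y_on_x * mean(x)    # assumed standard intercept formula

# Rise over run between two points on the fitted line gives the same b.
x1, x2 = 1, 4
y1, y2 = a + b_y_on_x * x1, a + b_y_on_x * x2
rise_over_run = (y2 - y1) / (x2 - x1)

print(b_y_on_x, b_x_on_y, rise_over_run)
```

Swapping which variable is the DV changes the slope even though the correlation is unchanged, which is the asymmetry that separates regression from correlation.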

16
Q

What is bx? A sample statistic (SS) or a population parameter (PP)?

A
  • bx
    • Sample statistic
    • An estimate of the population parameter βx
      • Don’t forget, a population parameter can never be “calculated”, only estimated
17
Q

How can the slope value be interpreted?

A
  • For every 1 unit increase on the X variable
  • The predicted value of the Y variable changes by ____ units (increases if the slope is positive, decreases if negative)
18
Q

What is the full regression equation

A
  • Yi = a + bXi + ei
    • Yi
      • Observed scores on DV
    • Xi
      • Observed scores on IV
    • ei
      • Residual scores
      • Difference between observed and predicted scores on DV
    • a
      • Intercept
    • b
      • Regression coefficient (slope)
19
Q

What is the regression model equation

A

Y^i = a + bXi

  • Y^i
    • Predicted scores on DV
  • Xi
    • Observed scores on IV
  • a
    • Intercept
  • b
    • Regression coefficient
    • Expected change in scores on DV for each unit change of IV
20
Q

What is the SS in linear regression?

A

SStotal = SSreg + SSres

  • SSreg
    • Sum of squared deviations of the predicted scores (on the regression line) from the mean of the DV
    • i.e. variation explained by linear regression model
  • SSres
    • Sum of squared deviations of the observed scores from the predicted scores (on the regression line)
    • i.e. variation not explained by linear regression model
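
The partition of the sum of squares can be verified numerically. This is a sketch on hypothetical data; the slope is computed as the sum of cross products over the sum of squares of X, which is algebraically the same as rxy × (SDy / SDx).

```python
from statistics import mean

x = [1, 2, 3, 4, 5]   # hypothetical IV scores
y = [2, 4, 5, 4, 5]   # hypothetical DV scores
mx, my = mean(x), mean(y)

# Least-squares slope and intercept
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx
predicted = [a + b * xi for xi in x]

ss_total = sum((yi - my) ** 2 for yi in y)                     # observed vs mean
ss_reg = sum((p - my) ** 2 for p in predicted)                 # predicted vs mean
ss_res = sum((yi - p) ** 2 for yi, p in zip(y, predicted))     # observed vs predicted

print(ss_total, ss_reg, ss_res)   # ss_reg + ss_res matches ss_total
```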
21
Q

In a regression equation, what do a^ and b^ aim to do? What method is this? Is it biased?

A

Ordinary least squares estimator

  • Find values of a^ and b^ to minimise the sum of squared residuals
    • Minimise the difference between observed and predicted values of DV
    • Therefore, Maximize strength of prediction
  • OLS is unbiased :)
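
A small check of the "minimise the sum of squared residuals" idea: any nudge away from the least-squares values of a^ and b^ makes the sum of squared residuals larger. The data and the nudge sizes are hypothetical, and `ols` is an illustrative helper, not a library function.

```python
from statistics import mean

def ss_residual(a, b, x, y):
    """Sum of squared differences between observed and predicted DV scores."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

def ols(x, y):
    """Closed-form least-squares intercept and slope for one IV."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

x = [1, 2, 3, 4, 5]   # hypothetical IV scores
y = [2, 4, 5, 4, 5]   # hypothetical DV scores

a_hat, b_hat = ols(x, y)
best = ss_residual(a_hat, b_hat, x, y)

# Every small change to the intercept or slope gives a worse fit.
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
worse = [ss_residual(a_hat + da, b_hat + db, x, y) for da, db in nudges]
print(best, worse)
```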
22
Q

What is the difference between simple and multiple linear regression model

A

Simple

  • One intercept and One regression coefficient
  • Yi = a + bXi + ei

Multiple

  • One intercept and p partial regression coefficients (where p >= 2)
  • Yi = a + b1X1i + … + bpXpi + ei
23
Q

What is the aim in research using linear regression

A

Use sample regression estimates to make an inference about corresponding unknown population parameter values

24
Q

Is the coefficient value for each IV in simple regression the same as its coefficient value in multiple regression? Explain.

A

Different

  • Multiple regression
    • Correlation among IVs in their relationship to DV and with each other is partialled out/removed.
      • i.e. if there are no correlations among the IVs, the simple and multiple regression coefficients are the same
    • Slope of each edge (partial regression coefficient) is an effect that is independent of the other IVs
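
The point about uncorrelated IVs can be illustrated with two predictors. This sketch uses the standard two-predictor formula for a partial regression coefficient, b1 = (ry1 - ry2 × r12) / (1 - r12²) × (SDy / SD1), which is not given on the card; all data are made up.

```python
from statistics import mean, stdev

def pearson_r(u, v):
    mu, mv = mean(u), mean(v)
    cross = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cross / ((len(u) - 1) * stdev(u) * stdev(v))

def simple_slope(x, y):
    """Slope of y on x alone."""
    return pearson_r(x, y) * stdev(y) / stdev(x)

def partial_slope(x1, x2, y):
    """Slope of y on x1 with x2 also in the model (two-IV case only)."""
    r_y1, r_y2, r_12 = pearson_r(x1, y), pearson_r(x2, y), pearson_r(x1, x2)
    return (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2) * stdev(y) / stdev(x1)

y = [1, 3, 4, 8]                 # hypothetical DV scores
x1 = [-1, -1, 1, 1]              # hypothetical IV
x2_uncorrelated = [-1, 1, -1, 1] # correlates 0 with x1
x2_correlated = [-1, 0, 0, 1]    # correlates about 0.71 with x1

print(simple_slope(x1, y))                       # 2 (up to rounding)
print(partial_slope(x1, x2_uncorrelated, y))     # same as the simple slope
print(partial_slope(x1, x2_correlated, y))       # 0.5 (up to rounding)
```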
25
Q

What is the interpretation of the intercept in a regression model

A

Predicted value on the DV when people score zero on all independent variables in the model

26
Q

What are the lower and upper bounds in a 95% confidence interval?

A

2.5% (lower bound) and 97.5% (upper bound)

27
Q

How do we interpret a 95% confidence interval of 0.10 and 0.86 in a regression?

A

We can be 95% confident that the population coefficient value for the regression of ____ on ___ is between 0.10 and 0.86

28
Q

What is an unbiased 95% interval estimator. What is it NOT

A
  • Over a large number of repeated samples drawn from the population, the CIs calculated in each sample will contain the true population parameter value 95% of the time on average
  • (i.e. actual coverage rate will be 95% over the long run)

NOT a 95% chance that the population parameter is captured in any one particular interval
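
The long-run idea can be sketched with a small simulation. To stay short it uses a z-interval for a mean with a known population SD as a stand-in for a regression coefficient interval; the population values, sample size, number of repetitions, and seed are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

def coverage_rate(n=30, reps=4000, mu=50.0, sigma=10.0, level=0.95, seed=1):
    """Share of repeated-sample intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        m = sum(sample) / n
        if m - half_width <= mu <= m + half_width:
            hits += 1
    return hits / reps

print(coverage_rate())   # close to 0.95 over many repeated samples
```

Each single interval either contains the true value or it does not; the 95% describes the procedure across repeated samples.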

29
Q

What if the interval estimator is biased

A

Actual coverage rate will be smaller/larger than the nominal rate over the long run (e.g. 89%/98%)

30
Q

What if the interval estimator is consistent

A

Actual coverage rate will get increasingly closer to 95% over the long run as sample size increases.

(If it is 98% and goes to 99% as n increases, it is still not consistent!)