W4: RQ for Predictions 1 Flashcards

1
Q

What is a prediction

A

Using knowledge about one/more constructs to indicate people’s standing on another construct.

Used in the sense of indication (not explanation or cause)
e.g. a barometer predicts, but does not explain/cause the weather

2
Q

What makes a good research question (RQ) involving prediction?

A
  1. Is phrased as a question ending with "?"
  2. Includes all relevant constructs
  3. Indicates the relevant population
  4. Uses "predict" as the key "driving word"
3
Q

In a prediction RQ, which variable (X/Y) is the DV and which is the IV? Do the meaning and focus of the RQ change if X and Y are swapped?

A

DV: Being Predicted - Y Variable
IV: Predictor - X Variable.

Yes: the meaning and focus change depending on which variable is defined as the DV and which as the IV.

4
Q

What is “Variation”. How is it measured

A

Variation:

Total amount of variability in a distribution of scores from the mean.

Measured:

Sum of squared deviation scores (or Sum of Squares). Gets larger as n increases

5
Q

What is “Variance”. How is it measured

A

Variance:

AVERAGE squared deviation in a distribution of scores: the Sum of Squares divided by N (population) or by n - 1 (sample)

Measured:

Expressed in a squared metric, relative to the scores on which it is calculated

6
Q

What is “Standard Deviation”. How is it measured

A

Standard Deviation:

Square Root of Variance (both population and samples)

Measured:

Expressed in same metric as scores on which it is calculated
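
The three measures on the last few cards build on each other, which a short worked example can make concrete. The sketch below uses a made-up list of scores (an assumption for illustration only) and treats it as a sample, so the variance divides by n - 1:

```python
import math
import statistics

scores = [2, 4, 4, 5, 7, 8]  # hypothetical sample of scores
n = len(scores)
mean = sum(scores) / n

# Variation: sum of squared deviation scores (Sum of Squares)
sum_of_squares = sum((x - mean) ** 2 for x in scores)

# Variance: Sum of Squares averaged over df (n - 1 for a sample); squared metric
sample_variance = sum_of_squares / (n - 1)

# Standard deviation: square root of the variance; same metric as the scores
sample_sd = math.sqrt(sample_variance)

# Variation grows with n: repeating every score doubles the Sum of Squares
doubled = scores * 2
doubled_mean = sum(doubled) / len(doubled)
doubled_sum_of_squares = sum((x - doubled_mean) ** 2 for x in doubled)

print(sum_of_squares, sample_variance, round(sample_sd, 3), doubled_sum_of_squares)
```

The hand calculation agrees with `statistics.variance` and `statistics.stdev`, which also use n - 1.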

7
Q

What are the key distinguishing features between correlation and regression?

A

Correlation: Symmetric Relationship
Regression: Asymmetric Relationship
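
The symmetric/asymmetric distinction can be checked numerically. The sketch below uses two made-up variables (hypothetical data, and helper names `correlation` and `slope` chosen here for illustration): the correlation is the same whichever variable comes first, but the regression slope depends on which variable is treated as the DV.

```python
import statistics

# Hypothetical paired scores, for illustration only
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 1.0, 4.0, 3.0, 6.0]


def cross_products(x, y):
    """Sum of cross-products of deviation scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))


def correlation(x, y):
    """Pearson correlation of x and y."""
    return cross_products(x, y) / (cross_products(x, x) * cross_products(y, y)) ** 0.5


def slope(iv, dv):
    """Slope from regressing dv on iv."""
    return cross_products(iv, dv) / cross_products(iv, iv)


print(correlation(a, b), correlation(b, a))  # identical: symmetric
print(slope(a, b), slope(b, a))              # different: asymmetric
```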

8
Q

What is a Symmetric Relationship. In terms of correlation; IV/DV; scatterplot

A

Variables have the SAME role and function in the characteristic of scores being summarised.

Cor (A,B) = Cor (B,A)
No IV/DV
Scatterplot: Variable on X/Y axis does not matter

9
Q

What is an Asymmetric Relationship. In terms of correlation; IV/DV; scatterplot

A

Variables have DIFFERENT roles and functions in the characteristic of scores being summarised.

Reg (A on B) ≠ Reg (B on A), even though Cor (A,B) = Cor (B,A)
IV/DV declared a priori
Scatterplot: Which variable is on the X/Y axis is fundamentally important

10
Q

What is the conceptual formulation of a correlation

A

Sum of the cross-products of z-scores, divided by df (n - 1)

> STANDARDIZED (conversion to z) measure of strength and direction of association

11
Q

What is the conceptual formulation of a covariance

A

Sum of the cross-products of deviation scores, divided by df (n - 1)

> UNSTANDARDIZED (using deviation) measure of strength and direction of association

12
Q

Can we calculate correlation from covariance, and vice versa?

A

Yes
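
The conversion runs through the two standard deviations: dividing the covariance by SDx × SDy gives the correlation, and multiplying the correlation by SDx × SDy gives the covariance back. A minimal sketch, using made-up scores (an assumption for illustration) and sample statistics with df = n - 1:

```python
import statistics

# Hypothetical paired scores, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]
df = len(x) - 1

mx, my = statistics.mean(x), statistics.mean(y)
sdx, sdy = statistics.stdev(x), statistics.stdev(y)

# Covariance: sum of cross-products of deviation scores / df
cov_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / df

# Correlation: sum of cross-products of z-scores / df
zx = [(xi - mx) / sdx for xi in x]
zy = [(yi - my) / sdy for yi in y]
r_from_z = sum(a * b for a, b in zip(zx, zy)) / df

# Converting in both directions
r_from_cov = cov_xy / (sdx * sdy)
cov_from_r = r_from_z * sdx * sdy

print(cov_xy, r_from_z, r_from_cov, cov_from_r)
```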

13
Q

What is the line of best fit

A

Linear Regression Line

14
Q

What is the slope value of a regression line: Formulation

A

Correlation (xy) x (SDy / SDx)
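
The slope formula can be linked to the covariance with one substitution. The derivation below is a sketch that assumes sample statistics throughout; the symbols $r_{xy}$ (correlation), $s_x$ and $s_y$ (standard deviations) and $s_{xy}$ (covariance) are notation chosen here, not taken from the cards:

```latex
\[
b \;=\; r_{xy}\,\frac{s_y}{s_x}
  \;=\; \frac{s_{xy}}{s_x\, s_y}\cdot\frac{s_y}{s_x}
  \;=\; \frac{s_{xy}}{s_x^{2}}
\]
```

So the slope is the covariance of X and Y divided by the variance of X. Swapping X and Y changes the denominator, which is why the slope changes when the IV and DV trade places.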

15
Q

What is b (the slope value of a regression line), and what is it an estimate of?

A

It is a sample statistic and also an estimate of the corresponding population parameter

16
Q

How can the slope value be interpreted?

A

For every 1-unit increase in the X variable, the predicted value of the Y variable changes by b units

17
Q

What is the full regression equation

A

Yi = a + bXi + ei

Yi = observed scores on DV
Xi = observed scores on IV
ei = residual scores (difference between observed and predicted scores ON DV)
a = intercept
b = regression coefficient (slope)
18
Q

What is the regression model equation

A

Y^i = a + bXi

Y^i = predicted scores on DV
Xi = observed scores on IV
a = intercept
b = regression coefficient (expected change in scores on DV for each unit change of IV)
19
Q

In a regression equation, what do a^ and b^ aim to do?

A

Using the Ordinary Least Squares (OLS) estimator,

a^ and b^ aim to MINIMIZE the sum of squared residuals (i.e. maximise the strength of prediction)
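
The minimising property can be checked directly. The sketch below fits a simple regression with the closed-form OLS solution on made-up data (hypothetical scores; the helper names `fit_ols` and `sum_squared_residuals` are chosen here for illustration), then nudges the intercept and slope to confirm that every alternative line has a larger sum of squared residuals.

```python
import statistics

# Hypothetical scores on the IV (x) and DV (y), for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]


def fit_ols(x, y):
    """Closed-form OLS estimates (intercept a, slope b) for one IV."""
    mx, my = statistics.mean(x), statistics.mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return a, b


def sum_squared_residuals(a, b, x, y):
    """Sum of squared differences between observed and predicted DV scores."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


a_hat, b_hat = fit_ols(x, y)
best = sum_squared_residuals(a_hat, b_hat, x, y)

# Every nudged line should fit worse than the OLS line
nudged = [
    sum_squared_residuals(a_hat + da, b_hat + db, x, y)
    for da in (-0.5, 0.0, 0.5)
    for db in (-0.2, 0.0, 0.2)
    if (da, db) != (0.0, 0.0)
]

print(a_hat, b_hat, best, min(nudged))
```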

20
Q

What is the difference between simple and multiple linear regression model

A

Simple:
- One intercept + One regression coefficient
Yi = a + bXi + ei

Multiple:
- One intercept + p partial regression coefficients, one per independent variable (where p >= 2)

Yi = a + b1X1i + … + bpXpi + ei

21
Q

What is the aim in research using linear regression

A

Use sample regression estimates to make an inference about corresponding unknown population parameter values

22
Q

In RStudio, when specifying a linear regression, where do the IV and DV go, and what are their properties?

A

DV is always on the left of the ~ in the model formula. Must be numeric

IV is always on the right of the ~. Either numeric or factor

23
Q

Is the coefficient value for each IV in a simple regression the same as its coefficient value in a multiple regression? Explain.

A

Different.

In a multiple regression, the correlation AMONG the IVs in their relationship to the DV is partialled out/removed.
> The slope of each IV is an effect that is INDEPENDENT of the other IVs
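
A small constructed example shows the difference. In the sketch below the DV is built, by assumption, as exactly 1 + 2·x1 + 3·x2 from two correlated made-up IVs, so the partial coefficients are known in advance; the variable names and data are hypothetical. The two-IV solution uses the standard normal equations written out in deviation-score sums.

```python
import statistics

# Hypothetical, correlated IVs; DV constructed as y = 1 + 2*x1 + 3*x2 (no error)
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1 + 2 * u + 3 * v for u, v in zip(x1, x2)]


def s(u, v):
    """Sum of cross-products of deviation scores for u and v."""
    mu, mv = statistics.mean(u), statistics.mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))


# Simple regressions: each IV on its own
simple_b1 = s(x1, y) / s(x1, x1)
simple_b2 = s(x2, y) / s(x2, x2)

# Multiple regression with both IVs: solve the two normal equations
det = s(x1, x1) * s(x2, x2) - s(x1, x2) ** 2
b1 = (s(x1, y) * s(x2, x2) - s(x2, y) * s(x1, x2)) / det
b2 = (s(x2, y) * s(x1, x1) - s(x1, y) * s(x1, x2)) / det
a = statistics.mean(y) - b1 * statistics.mean(x1) - b2 * statistics.mean(x2)

print(simple_b1, simple_b2)  # inflated: each absorbs the other IV's effect
print(a, b1, b2)             # recovers the constructed values 1, 2, 3
```

Because x1 and x2 are correlated, each simple slope also picks up part of the other IV's effect; the multiple regression removes that overlap.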

24
Q

What is the interpretation of the intercept in a regression model

A

Predicted value on the DV when people have a score of zero on all independent variables in the model

25
Q

What are the lower and upper bounds in a 95% confidence interval

A

2.5% and 97.5%

26
Q

How do we interpret a 95% confidence interval of 0.10 to 0.86

A

We can be 95% confident that the population coefficient value for the regression of ____ on ___ is between 0.10 and 0.86

27
Q

What is an unbiased 95% confidence interval

A

Over a large number of repeated samples drawn from the population, the confidence interval calculated in each sample will contain the true population parameter value 95% of the time on average

i.e. actual coverage rate will be 95% over the long run
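
The long-run idea can be simulated. The sketch below is a simplification under stated assumptions: it builds intervals for a population mean rather than a regression coefficient, draws from a normal population whose mean and standard deviation are made up and treated as known, and uses a fixed seed so the run is repeatable.

```python
import random
from statistics import NormalDist

random.seed(1)

# Hypothetical population values and simulation settings
MU, SIGMA, N, REPS = 50.0, 10.0, 25, 4000

# Half-width of a 95% interval for the mean when sigma is known
z = NormalDist().inv_cdf(0.975)
half_width = z * SIGMA / N ** 0.5

hits = 0
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_mean = sum(sample) / N
    # Does this sample's interval contain the true population mean?
    if sample_mean - half_width <= MU <= sample_mean + half_width:
        hits += 1

coverage = hits / REPS
print(f"Proportion of intervals containing the true mean: {coverage:.3f}")
```

With these settings the proportion should land close to 0.95, which is what an unbiased interval estimator delivers over many repeated samples.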

28
Q

What if the interval estimator is biased

A

Actual coverage rate will be smaller/larger than the nominal rate OVER THE LONG RUN (e.g. 89%/98%)

29
Q

What if the interval estimator is consistent

A

Actual coverage rate will get INCREASINGLY closer to 95% over the long run as sample size increases
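
Bias and consistency can both be seen in one simulation. The sketch below uses a deliberately flawed interval for a population mean, chosen here for illustration and not taken from the cards: it plugs the sample SD into a normal (z) critical value, which makes intervals too narrow in small samples. The population values, the sample sizes and the `coverage` helper are all hypothetical.

```python
import random
from statistics import NormalDist


def coverage(n, reps=4000, mu=50.0, sigma=10.0, seed=1):
    """Share of nominal 95% z-intervals (using the sample SD) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = sum(sample) / n
        sd = (sum((v - mean) ** 2 for v in sample) / (n - 1)) ** 0.5
        half_width = z * sd / n ** 0.5
        if mean - half_width <= mu <= mean + half_width:
            hits += 1
    return hits / reps


rates = {n: coverage(n) for n in (5, 20, 100)}
for n, rate in rates.items():
    print(f"n = {n:3d}: actual coverage = {rate:.3f}")
```

At n = 5 the actual coverage falls well short of the nominal 95% (a biased interval estimator), and it moves closer to 95% as n grows (a consistent one).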