W4: RQ for Predictions 1 Flashcards by andy Sitoh

What is a prediction?

Using knowledge about one or more constructs to indicate people’s standing on another construct.

"Predict" is used in the sense of indication (not explanation or cause)
e.g. a barometer predicts the weather, but does not explain or cause it

What makes a good RQ involving prediction?

1) A statement ending with "?"
2) Includes all relevant constructs
3) Indicates the relevant population
4) Uses "predict" as the key “driving word”

DV/IV in a prediction RQ (X/Y). Do the meaning and focus of the RQ change if X and Y are swapped?

DV: the variable being predicted (Y variable)
IV: the predictor (X variable)

Yes. The meaning and focus change depending on which variable is defined as the DV and which as the IV.

What is “Variation”? How is it measured?

Variation:

Total amount of variability of the scores in a distribution around the mean.

Measured:

Sum of squared deviation scores (or Sum of Squares). Gets larger as n increases.

What is “Variance”? How is it measured?

Variance:

AVERAGE of the squared deviation scores, i.e. the Sum of Squares divided by the number of scores (or by df) in a distribution (applies to both populations and samples)

Measured:

Expressed in a squared metric, relative to the scores on which it is calculated

What is “Standard Deviation”? How is it measured?

Standard Deviation:

Square Root of Variance (both population and samples)

Measured:

Expressed in same metric as scores on which it is calculated
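
A minimal worked sketch of the three measures above, using made-up scores. Dividing by n - 1 for a sample is the usual convention and is an assumption here; the cards only say "average". All variable names are illustrative.

```python
import math

scores = [2.0, 4.0, 4.0, 6.0, 9.0]  # made-up sample scores
n = len(scores)
mean = sum(scores) / n

# Variation: sum of squared deviation scores (Sum of Squares)
sum_of_squares = sum((score - mean) ** 2 for score in scores)

# Variance: Sum of Squares averaged over N (population) or df = n - 1 (sample)
population_variance = sum_of_squares / n
sample_variance = sum_of_squares / (n - 1)

# Standard deviation: square root of the variance, back in the original metric
sample_sd = math.sqrt(sample_variance)

print(sum_of_squares, population_variance, sample_variance, sample_sd)
```

Adding more scores can only add to the Sum of Squares, which is why variation grows with n while variance and standard deviation do not have to.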

What is the key distinguishing feature between correlation and regression?

Correlation: Symmetric Relationship
Regression: Asymmetric Relationship

What is a symmetric relationship, in terms of correlation, IV/DV and scatterplot?

Variables have the SAME role and function in the characteristic of scores being summarised.

Cor (A,B) = Cor (B,A)
No IV/DV
Scatterplot: which variable goes on the X or Y axis does not matter

What is an asymmetric relationship, in terms of correlation, IV/DV and scatterplot?

Variables have DIFFERENT roles and functions in the characteristic of scores being summarised.

Regression of A on B ≠ regression of B on A (unlike Cor (A,B), which is the same either way)
IV/DV declared a priori
Scatterplot: which variable goes on the X or Y axis is fundamentally important

What is the conceptual formulation of a correlation?

Sum of the cross products of z-scores, divided by df

> STANDARDIZED (conversion to z) measure of strength and direction of association

What is the conceptual formulation of a covariance?

Sum of the cross products of deviation scores, divided by df

> UNSTANDARDIZED (using deviation) measure of strength and direction of association

Can we calculate correlation from covariance, and vice versa?

Yes. Correlation = Covariance / (SDx × SDy), and Covariance = Correlation × SDx × SDy
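
A small sketch of the two conceptual formulas and the conversion between them, on made-up paired scores. It assumes df = n - 1 and sample standard deviations throughout; variable names are illustrative.

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up scores on X
y = [2.0, 4.0, 5.0, 4.0, 5.0]  # made-up scores on Y
n = len(x)
df = n - 1  # assumed degrees of freedom

mean_x = sum(x) / n
mean_y = sum(y) / n
dev_x = [xi - mean_x for xi in x]
dev_y = [yi - mean_y for yi in y]
sd_x = math.sqrt(sum(d ** 2 for d in dev_x) / df)
sd_y = math.sqrt(sum(d ** 2 for d in dev_y) / df)

# Covariance: sum of cross products of deviation scores / df (unstandardised)
covariance = sum(dx * dy for dx, dy in zip(dev_x, dev_y)) / df

# Correlation: sum of cross products of z-scores / df (standardised)
z_x = [d / sd_x for d in dev_x]
z_y = [d / sd_y for d in dev_y]
correlation = sum(zx * zy for zx, zy in zip(z_x, z_y)) / df

# Converting in either direction only needs the two standard deviations
correlation_from_cov = covariance / (sd_x * sd_y)
covariance_from_cor = correlation * sd_x * sd_y

print(covariance, correlation, correlation_from_cov, covariance_from_cor)
```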

What is the line of best fit?

The linear regression line

What is the formulation of the slope value of a regression line?

b = Correlation (X,Y) × (SDy / SDx)
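
As a quick check on the slope formula, the sketch below computes it from the correlation and the two standard deviations, and compares it with the least-squares slope written as (sum of cross products) / (Sum of Squares of X). The data are made up and that second expression is a standard identity, not something taken from the cards.

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up IV scores
y = [2.0, 4.0, 5.0, 4.0, 5.0]  # made-up DV scores
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

sd_x = math.sqrt(ss_x / (n - 1))
sd_y = math.sqrt(ss_y / (n - 1))
r_xy = sp_xy / math.sqrt(ss_x * ss_y)

# Slope from the card: correlation rescaled by the ratio of standard deviations
slope_from_r = r_xy * (sd_y / sd_x)

# Same slope from the least-squares identity
slope_least_squares = sp_xy / ss_x

print(slope_from_r, slope_least_squares)
```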

What is bx (the slope value of a regression line), and what is it an estimate of?

It is a sample statistic and also an estimate of the corresponding population parameter

How can the slope value be interpreted?

For every 1-unit increase on the X variable, the predicted value of the Y variable changes by b units (an increase if b is positive, a decrease if b is negative)

What is the full regression equation?

Yi = a + bXi + ei

Yi = observed scores on DV
Xi = observed scores on IV
ei = residual scores (difference between observed and predicted scores ON DV)
a = intercept

What is the regression model equation?

Y^i = a + bXi

Y^i = predicted scores on DV
Xi = observed scores on IV
a = intercept
b = regression coefficient (expected change in scores on DV for each unit change of IV)
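
The link between the full equation and the model equation can be sketched on made-up data: each observed score is its predicted score plus a residual. The intercept formula a = mean(Y) - b × mean(X) is the standard least-squares result and is an assumption here, since the cards do not state it.

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up observed scores on the IV
y = [2.0, 4.0, 5.0, 4.0, 5.0]  # made-up observed scores on the DV
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept (standard formulas, assumed)
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

# Model equation: predicted scores on the DV
predicted = [a + b * xi for xi in x]

# Residuals: observed minus predicted scores on the DV
residuals = [yi - yhat for yi, yhat in zip(y, predicted)]

# Full equation: observed = a + b * X + residual
rebuilt = [a + b * xi + ei for xi, ei in zip(x, residuals)]

print(a, b)
print(predicted)
print(residuals)
```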

In a regression equation, what do a^ and b^ aim to do?

Using the Ordinary Least Squares (OLS) estimator:

a^ and b^ aim to MINIMIZE the sum of squared residuals (i.e. maximise the strength of prediction)
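
One way to see the "minimize" claim is to compare the sum of squared residuals at the least-squares estimates with the value at a few nudged alternatives. This is only an illustration on made-up data, not a proof; the helper name `sum_squared_residuals` and the nudge sizes are arbitrary choices.

```python
def sum_squared_residuals(a, b, x, y):
    """Sum of squared differences between observed and predicted DV scores."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


x = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up IV scores
y = [2.0, 4.0, 5.0, 4.0, 5.0]  # made-up DV scores
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares estimates (standard closed-form formulas, assumed)
b_hat = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a_hat = mean_y - b_hat * mean_x

ssr_ols = sum_squared_residuals(a_hat, b_hat, x, y)

# Any other intercept/slope pair tried here gives a larger sum
nudges = [(-0.5, 0.0), (0.5, 0.0), (0.0, -0.1), (0.0, 0.1), (0.3, -0.1)]
ssr_nudged = [
    sum_squared_residuals(a_hat + da, b_hat + db, x, y) for da, db in nudges
]

print(ssr_ols, ssr_nudged)
```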

What is the difference between a simple and a multiple linear regression model?

Simple:
- One intercept + One regression coefficient
Yi = a + bXi + ei

Multiple:
- One intercept + p partial regression coefficients, one per independent variable (where p >= 2)

Yi = a + b1X1i + … + bpXpi + ei

What is the aim in research using linear regression?

Use sample regression estimates to make an inference about corresponding unknown population parameter values

In RStudio, where do the IV and DV go in a linear regression formula, and what are their properties?

The DV is always on the left of the formula. It must be numeric.

The IV is always on the right of the formula. It can be either numeric or a factor.

Is the coefficient value for each IV in a simple regression the same as its coefficient value in a multiple regression? Explain.

Different.

In a multiple regression, the correlation AMONG the IVs in their relationship to the DV is partialled out/removed.
> The slope of each IV is an effect that is INDEPENDENT of the other IVs
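
A small sketch of why the coefficients differ, on made-up data where the DV is built exactly as 1 + 2·X1 + 3·X2 and the two IVs are correlated. The two-predictor solution below comes from solving the normal equations on centred scores, which is one standard way to get OLS estimates and is an assumption, not the course's method.

```python
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up IV 1
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]  # made-up IV 2, correlated with IV 1
y = [1.0 + 2.0 * u + 3.0 * v for u, v in zip(x1, x2)]  # DV with known slopes


def centred(values):
    """Deviation scores: each value minus the mean."""
    mean = sum(values) / len(values)
    return [value - mean for value in values]


def cross(u, v):
    """Sum of cross products of two lists of deviation scores."""
    return sum(ui * vi for ui, vi in zip(u, v))


d1, d2, dy = centred(x1), centred(x2), centred(y)
s11, s22, s12 = cross(d1, d1), cross(d2, d2), cross(d1, d2)
s1y, s2y = cross(d1, dy), cross(d2, dy)

# Simple regression of Y on X1 alone: slope also absorbs X2's overlap with X1
simple_b1 = s1y / s11

# Multiple regression of Y on X1 and X2: partial slopes
det = s11 * s22 - s12 ** 2
multiple_b1 = (s22 * s1y - s12 * s2y) / det
multiple_b2 = (s11 * s2y - s12 * s1y) / det

print(simple_b1, multiple_b1, multiple_b2)
```

Here the simple slope for X1 is larger than its partial slope because X1 and X2 move together, so the simple model credits X1 with part of X2's effect.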

What is the interpretation of the intercept in a regression model?

Predicted value on the DV when people score zero on all independent variables in the model

What are the lower and upper bounds of a 95% confidence interval?

The 2.5% and 97.5% points

How do we interpret a 95% confidence interval of 0.10 to 0.86?

We can be 95% confident that the population coefficient value for the regression of ____ on ___ is between 0.10 and 0.86

What is an unbiased 95% confidence interval?

Over a large number of repeated samples drawn from the population, the confidence interval calculated in each sample will contain the true population parameter value 95% of the time on average, i.e. the actual coverage rate will be 95% over the long run

What if the interval estimator is biased?

The actual coverage rate will be smaller or larger than the nominal rate OVER THE LONG RUN (e.g. 89% or 98%)

What if the interval estimator is consistent?

The actual coverage rate will get INCREASINGLY closer to 95% over the long run as sample size increases
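
The long-run idea of a coverage rate can be sketched with a small simulation. To keep it short, this uses a 95% interval for a mean with a known population SD rather than a regression coefficient; that swap, the population values, the seed and the number of repetitions are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(2024)  # fixed seed so the run is repeatable
true_mean, true_sd = 50.0, 10.0  # hypothetical population values
n, reps = 30, 2000  # sample size and number of repeated samples

# Half-width of a 95% interval when the population SD is known
z = NormalDist().inv_cdf(0.975)
half_width = z * true_sd / math.sqrt(n)

hits = 0
for _ in range(reps):
    sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
    sample_mean = sum(sample) / n
    # Does this sample's interval contain the true population mean?
    if sample_mean - half_width <= true_mean <= sample_mean + half_width:
        hits += 1

coverage = hits / reps  # actual coverage rate across repeated samples
print(coverage)
```

For an unbiased interval estimator like this one, the printed rate should sit close to the nominal 0.95; a biased estimator would settle at some other value however many samples are drawn.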

(29 cards)