Associations Between Two Continuous Variables Flashcards

1
Q

Sometimes we are interested in testing if two continuous variables are associated with one another. What is the most common form of association studied?

A

Linear association

2
Q

Positive association

A

When people who score high (or low) on the first variable also score high (or low) on the second variable

3
Q

Negative association

A

When people score high on the first variable and low on the second (or vice versa)

4
Q

What is the most common index of a linear association?

A

Pearson correlation coefficient

5
Q

Sum of products of deviations (SP)

A

Reflects the co-variability (shared variation) of X and Y
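
A minimal sketch of the SP computation in Python (toy numbers, not from the source):

# SP = sum of (X - X mean) * (Y - Y mean) across pairs
x = [2, 4, 6, 8]
y = [1, 3, 5, 7]
mx = sum(x) / len(x)
my = sum(y) / len(y)
sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
print(sp)  # 20.0 -- every pair deviates in the same direction, so SP is positive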

6
Q

What produces big positive SP values?

A

Lots of above/above pairs (both scores above their means)

AND

Lots of below/below pairs (both scores below their means)

7
Q

What produces big negative SP values?

A

Lots of above/below pairs and lots of below/above pairs

8
Q

What produces near 0 SP values?

A

Equal mix of above/above, below/below, above/below, and below/above pairs
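
A small illustration (toy data) of how the mix of pair types drives the sign of SP:

def sp(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

print(sp([1, 2, 3, 4], [1, 2, 3, 4]))  # 5.0: all above/above and below/below pairs
print(sp([1, 2, 3, 4], [4, 3, 2, 1]))  # -5.0: all above/below and below/above pairs
print(sp([1, 2, 3, 4], [3, 1, 4, 2]))  # 0.0: an even mix, so the products cancel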

9
Q

r squared is referred to as the…

A

coefficient of determination

10
Q

r squared reflects…

A

the proportion of variance that our predictor variable accounts for in our outcome variable (variability explained by linear regression)
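
A quick numerical check, assuming numpy is available (toy data), that r squared matches the proportion of variance explained by the least-squares line:

import numpy as np

x = np.array([1.0, 2, 3, 4, 5])
y = np.array([2.0, 4, 5, 4, 5])
r = np.corrcoef(x, y)[0, 1]
b, a = np.polyfit(x, y, 1)                 # least-squares slope and intercept
ss_total = np.sum((y - y.mean()) ** 2)
ss_error = np.sum((y - (b * x + a)) ** 2)
print(r ** 2, 1 - ss_error / ss_total)     # both print 0.6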

11
Q

3 factors influencing the size of r:

A

Distribution of variables
- Perfect correlations only exist if the shapes of the two distributions are exactly the same (positive) or exactly opposite (negative)

Reliability of measures
- Perfect correlations only exist with perfect reliability in both measures

Restriction of range
- Restricting the range of scores on one or both variables can weaken correlations
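
A sketch of the restriction-of-range effect on simulated data (numpy assumed available):

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = 0.6 * x + rng.normal(size=2000)          # true linear association
keep = x > 1                                  # keep only a narrow slice of X
print(np.corrcoef(x, y)[0, 1])               # full-range r
print(np.corrcoef(x[keep], y[keep])[0, 1])   # restricted r is noticeably weaker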

12
Q

Regression analysis using a single predictor variable is referred to as…

A

“simple regression”

13
Q

Regression analysis involving two or more predictors is referred to as…

A

“multiple regression”

14
Q

When two variables are linearly associated, this association can be described using a simple equation:

A

Y = bX + a

15
Q

What do each of the variables in the regression equation (Y = bX + a) represent?

A

Y - represents scores on the outcome variable
b - represents slope of best fitting line
X - represents scores on the predictor variable
a - fixed constant representing the Y intercept
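
A minimal fit of Y = bX + a with numpy's least squares (toy data):

import numpy as np

x = np.array([1.0, 2, 3, 4, 5])
y = np.array([2.0, 4, 5, 4, 5])
b, a = np.polyfit(x, y, 1)   # returns slope first, then intercept
print(b, a)                  # 0.6 and 2.2 for these numbers
print(b * x + a)             # predicted Y for each X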

16
Q

Standard error of estimate

A

A measure of the standard distance between a regression line and the actual data points

Basically how much error variance is in our model
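
In standard notation this is sqrt(SS error / df error); a sketch with toy data (numpy assumed):

import numpy as np

x = np.array([1.0, 2, 3, 4, 5])
y = np.array([2.0, 4, 5, 4, 5])
b, a = np.polyfit(x, y, 1)
ss_error = np.sum((y - (b * x + a)) ** 2)
see = np.sqrt(ss_error / (len(x) - 2))   # df error = n - 2 in simple regression
print(see)                               # about 0.894 here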

17
Q

How is SS error related to r?

A

As r approaches ±1, SS error becomes smaller

As r approaches 0, SS error becomes larger
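
The standard identity behind this: SS error = (1 − r²) · SS Y, so SS error shrinks toward 0 as r approaches ±1 and grows toward SS Y as r approaches 0.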

18
Q

What are the null and alternative hypotheses for the b value in a simple regression?

A

H0: β = 0 (there is no linear association between X and Y, so the population slope is 0)

H1: β ≠ 0
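
A sketch of this test with scipy (toy data; scipy assumed available):

from scipy import stats

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
res = stats.linregress(x, y)
print(res.slope, res.pvalue)   # p-value for H0: population slope = 0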

19
Q

To test the null of a simple regression we partition the variance in Y (DV) into two components:

A
1. Variability in Y predicted from the linear association with X

2. Variability in Y due to error (residual variability)
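
In symbols: SS Y (total) = SS regression + SS error, where SS regression = r² · SS Y.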

20
Q

What are the 4 assumptions that simple regression (and its NHST) is based on?

A
  1. Independence of observations
  2. Linear relationship between X and Y
  3. Residuals are normally distributed with a mean of 0
  4. Homoscedasticity of residuals
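
A quick residual check touching assumptions 3 and 4, as a sketch (toy data; numpy and scipy assumed):

import numpy as np
from scipy import stats

x = np.array([1.0, 2, 3, 4, 5, 6, 7, 8])
y = np.array([2.0, 4, 5, 4, 5, 7, 8, 9])
b, a = np.polyfit(x, y, 1)
resid = y - (b * x + a)
print(resid.mean())                 # ~0 by construction for least squares
print(stats.shapiro(resid).pvalue)  # rough normality check on the residuals
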
21
Q

What makes regression more so like t-tests and less so like ANOVA?

A

No follow-up tests are relevant because there is only a single focused effect to interpret (the slope of the one predictor), much like a t-test's single comparison.

Regression is not an omnibus test.

22
Q

Total squared error is also known as…

A

sum of squared error (SS error)

23
Q

Sum of squares (SS)

A

Sum of the squared deviations from the mean
A higher SS value indicates a large degree of variability
A lower SS value indicates the data do not vary much from the mean

24
Q

Regression degrees of freedom equals…

A

the number of predictors, k (so 1 for simple regression)

25
Q

Error degrees of freedom equals…

A

n − k − 1 (so n − 2 for simple regression)

26
Q

Anytime an SS value is divided by its df value it is…

A

an index of variance (a mean square, MS)
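
A sketch of how the mean squares combine into the regression F ratio (standard formulas; toy data, numpy assumed):

import numpy as np

x = np.array([1.0, 2, 3, 4, 5])
y = np.array([2.0, 4, 5, 4, 5])
b, a = np.polyfit(x, y, 1)
ss_total = np.sum((y - y.mean()) ** 2)
ss_error = np.sum((y - (b * x + a)) ** 2)
ms_reg = (ss_total - ss_error) / 1       # df regression = k = 1 predictor
ms_error = ss_error / (len(x) - 2)       # df error = n - 2
print(ms_reg / ms_error)                 # F = 4.5 for these numbers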

27
Q

The Pearson correlation coefficient (r) is…

A

an index of association that assesses the magnitude and direction of linear relation between two variables

AND

an index of co-variability of X and Y relative to the variability of X and Y separately
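
In SP terms: r = SP / sqrt(SS X · SS Y).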

28
Q

z-score represents…

A

an individual score’s standing within the distribution for that score

Basically, a z-score of 1 is 1 standard deviation above the mean
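
The usual formula is z = (X − M) / SD; a minimal sketch (toy scores):

scores = [2, 4, 4, 4, 5, 5, 7, 9]
m = sum(scores) / len(scores)
sd = (sum((s - m) ** 2 for s in scores) / len(scores)) ** 0.5  # population SD
print([(s - m) / sd for s in scores])  # mean is 5, SD is 2, so 2 maps to -1.5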

29
Q

3 special cases of Pearson correlation

A

Point biserial correlation
- Correlation between dichotomous variable and continuous variable

Phi coefficient
- Correlation between two dichotomous variables

Spearman rank-order correlation
- Correlation between ordinal variables
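
Sketches of all three with scipy (toy data; scipy assumed available):

from scipy import stats

group = [0, 0, 0, 1, 1, 1]                   # dichotomous
score = [3, 4, 5, 6, 7, 8]                   # continuous
print(stats.pointbiserialr(group, score))    # point biserial correlation

a = [0, 0, 1, 1, 0, 1]
b = [0, 1, 1, 1, 0, 1]
print(stats.pearsonr(a, b))                  # Pearson r on two 0/1 variables = phi

r1 = [1, 2, 3, 4, 5]
r2 = [2, 1, 4, 3, 5]
print(stats.spearmanr(r1, r2))               # Spearman rank-order correlation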

30
Q

Why put r into z scores? (think… r formula using z scores)

A

Because it allows us to standardize r and compare it across different studies. Dividing by the sample size standardizes as well, since sample size differs from study to study
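
In that form, r = Σ(zX · zY) / n (or n − 1 with sample SDs), which is unit-free by construction.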

31
Q

When we standardize both X and Y (in z form), their means equal zero. Thus, the variability (SS) in each of them…

A

has to be equivalent (SSY = SSX)

32
Q

What is the difference between the homogeneity of variance assumption and the homoscedasticity of residuals assumption?

A

“Homogeneity of variance” is used in the ANOVA context

“Homoscedasticity” is used in the regression context

Both assume that the variance in residuals is the same everywhere