Mod 13, Correlation/Regression Flashcards

1
Q

Regression:

A

logical extension of correlation; moves beyond describing the strength of the association to making specific predictions based on it (for any given value of x, we can predict the value of y)
A significant correlation tells us there is a positive or negative relationship between two variables; REGRESSION allows us to predict one variable from the other
After verifying that a correlation is significant, you can determine the equation of the line that best fits the data
Regression Line: the “line of best fit,” which can be used to predict the value of y for a given value of x
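
The fit-then-predict step above can be sketched in plain Python (the data values are made up for illustration):

```python
# Minimal sketch: fit a least-squares regression line y-hat = a + b*x,
# then use it to predict y for a new x value.

def fit_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope b = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]        # perfectly linear toy data: y = 2x
a, b = fit_line(xs, ys)
predicted = a + b * 6        # predict y for a given x = 6
```

With this toy data the line is y = 2x, so the prediction for x = 6 is 12.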

2
Q

Correlation Coefficient:

A

quantifies the strength and direction of an association between two variables
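
A sketch of how the coefficient is computed, assuming Pearson's r (covariance divided by the product of the standard deviations); the data are made up:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by the two spreads, so r is in [-1, 1]."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r_pos = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # perfect positive association
r_neg = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # perfect negative association
```

The sign gives the direction (positive vs. negative) and the magnitude gives the strength.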

3
Q

Line of Best Fit

A

The closer the dots are to the line of best fit, the STRONGER THE RELATIONSHIP
Straight line drawn through the center of the data points that best expresses the association between the two variables

4
Q

Conditions for Pearson Correlation

A

Random sampling
Continuous interval or ratio data
Normally distributed variables
No outliers
Linear association between variables (cannot detect non-linear associations)

5
Q

Pearson coefficients and standardized comparisons

A

Because Pearson coefficients possess a common metric (standard deviation units), they allow us to compare the strengths of different relationships with one another (STANDARDIZES THEM LIKE A Z-SCORE)

6
Q

Coefficient of Determination

A

r², how much of the variance in y we are actually able to explain; the MORE VARIATION WE CAN EXPLAIN, the STRONGER THE ASSOCIATION BETWEEN VARIABLES (fewer third-variable problems) and the closer the dots are to the line
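
Since the coefficient of determination is just the squared correlation, it can be sketched directly (toy data for illustration):

```python
import math

def r_squared(xs, ys):
    """Coefficient of determination: the squared Pearson correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    r = cov / (sx * sy)
    return r * r

# e.g. r = 0.70 would give r-squared = 0.49: 49% of the variance in y explained
r2 = r_squared([1, 2, 3, 4], [2, 4, 6, 8])   # perfect line, so all variance explained
```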

7
Q

Rank Correlation conditions/ monotonicity

A

Used when variables are not an interval or ratio measurement, or if their association is not linear
Used when Pearson’s cannot be used
Spearman’s Correlation: instead of using raw data, uses ranked values
CONDITIONS: random sampling, both variables must be at least ordinal, variables must increase monotonically with one another
Monotonicity: refers to whether or not one set of scores tends to increase or decrease alongside another set
Linear associations: ARE MONOTONIC (but not every monotonic association is linear)
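
The "Pearson on ranked values" idea can be sketched as follows (made-up data; for simplicity this assumes no tied scores, which would need average ranks):

```python
import math

def ranks(values):
    """Rank 1 = smallest value; assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Spearman's correlation: Pearson's r computed on the ranks, not the raw data."""
    return pearson_r(ranks(xs), ranks(ys))

# monotonic but clearly non-linear: the ranks agree perfectly, so rho = 1.0
rho = spearman_rho([1, 2, 3, 4], [1, 4, 9, 100])
```

Pearson's r on this data would be well below 1, but the rank correlation is a perfect 1.0 because y always increases with x.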

8
Q

Cohen’s d effect sizes

A

0.2 = small
0.5 = medium
0.8 = large
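
A sketch of how d is computed (mean difference divided by the pooled standard deviation), with made-up groups:

```python
import math

def cohens_d(group1, group2):
    """Cohen's d: (mean1 - mean2) / pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)   # sample variances
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

d = cohens_d([2, 4, 6], [1, 3, 5])   # means differ by 1, pooled SD is 2
```

Here d = 0.5, a medium effect by the benchmarks above.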

9
Q

Residuals

A

the difference between observed values of y (the dependent variable) and the values of y predicted by the regression line (residual = observed y − predicted y)
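
A one-line sketch, using hypothetical observed and predicted values:

```python
# Hypothetical observed y values and the values a regression line predicted for them
observed  = [2.0, 4.0, 6.0]
predicted = [2.1, 3.9, 6.2]

# residual = observed y - predicted y, one per data point
residuals = [obs - pred for obs, pred in zip(observed, predicted)]
```

Positive residuals mean the line under-predicted that point; negative residuals mean it over-predicted.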

10
Q

Standard Error of Estimate

A

The standard deviation of the observed y-values about the predicted y-value for a given x value: a COMMON MEASURE OF THE ACCURACY OF PREDICTIONS; the closer the observed y values are to the predicted y values, the smaller the standard error of estimate will be
WANT IT TO BE AS SMALL AS POSSIBLE = LESS ERROR IS ALWAYS BETTER
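
A sketch of the calculation, assuming the usual simple-regression formula with n − 2 degrees of freedom (hypothetical values):

```python
import math

def standard_error_of_estimate(observed, predicted):
    """Square root of the summed squared residuals over n - 2 (simple regression)."""
    n = len(observed)
    sse = sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted))
    return math.sqrt(sse / (n - 2))

perfect = standard_error_of_estimate([1, 2, 3, 4], [1, 2, 3, 4])  # no error at all
noisy = standard_error_of_estimate([0, 0, 0, 0], [1, 1, 1, 1])    # every point off by 1
```

Perfect predictions give a standard error of 0; the larger the residuals, the larger it grows.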

11
Q

Difference between simple and multiple regression

A

Simple Regression: predicting values on an outcome variable using info from ONE predictor variable
Multiple Regression: predicting values on an outcome variable from values on MORE than one predictor variable

12
Q

Beta weights:

A

weighting of each of the predictors; how much the outcome changes with a one-unit change in that predictor (for standardized beta weights, a one-standard-deviation change)

13
Q

Assumptions of Multiple Regressions

A

Linearity between each of the predictors and the outcome
Residuals are normally distributed about 0, with constant spread: points aren't narrowing together or widening apart from one another down the line
No extreme multicollinearity: can't have predictors that are too highly correlated; we want our predictors to be explaining DIFFERENT aspects of our outcome, so correlations between predictors SHOULDN'T BE ABOVE .8 OR BELOW -.8
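
The |.8| rule of thumb for multicollinearity can be sketched as a pairwise check on the predictors; the predictor names and data below are hypothetical:

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def too_collinear(pred_a, pred_b, cutoff=0.8):
    """Flag a predictor pair whose correlation exceeds the |.8| rule of thumb."""
    return abs(pearson_r(pred_a, pred_b)) > cutoff

# hypothetical predictors: the first two nearly duplicate each other
hours_studied    = [1, 2, 3, 4, 5]
hours_in_library = [1.1, 2.0, 3.2, 3.9, 5.1]
sleep_hours      = [8, 6, 9, 5, 7]

flag_ab = too_collinear(hours_studied, hours_in_library)  # near-duplicates: flagged
flag_ac = too_collinear(hours_studied, sleep_hours)       # weakly related: fine
```

A flagged pair is explaining the same aspect of the outcome, so one of the two predictors would usually be dropped or combined.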
