Mod 13, Correlation/Regression Flashcards
Regression:
logical extension of correlation; moves beyond describing the strength of the association by making more specific predictions based on that association (for any given value of x, we can predict the value of y)
A significant correlation tells us that there is a significant positive or negative relationship between two variables; REGRESSION allows us to predict one variable from another variable
After verifying that a correlation is significant, you can determine the equation of the line that best fits the data
Regression Line: “line of best fit” can be used to predict the value of y for a given value of x
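A minimal sketch of fitting a line of best fit and using it to predict y for a given x, using only the standard library; the data points are made up for illustration:

```python
from statistics import mean

x = [1, 2, 3, 4, 5]             # hypothetical predictor values
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # hypothetical outcome values

mx, my = mean(x), mean(y)
# least-squares slope: b = sum((x - mx)(y - my)) / sum((x - mx)^2)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
    sum((xi - mx) ** 2 for xi in x)
a = my - b * mx                 # intercept: line passes through (mx, my)

def predict(x_new):
    """Predict y for a given x using the fitted regression line."""
    return a + b * x_new
```

Once `a` and `b` are fitted, `predict` can be called for any x, including values not in the original data.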
Correlation Coefficient:
quantifies the strength and direction of an association between two variables
Line of Best Fit
The closer the dots are to the line of best fit, the STRONGER THE RELATIONSHIP
Straight line drawn through the center of the data points that best expresses the association between the two variables
Conditions for Pearson Correlation
Random sampling
Continuous interval or ratio data
Normally distributed variables
No outliers
Linear association between variables (cannot detect non-linear associations)
Pearson coefficients and standardized comparisons
Because Pearson coefficients possess a common metric (standard deviation units), they allow us to compare the strengths of relationships with one another (STANDARDIZES IT LIKE A Z SCORE)
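A sketch of the "standardized like a z score" idea: Pearson's r can be computed as the mean product of paired z-scores. The data are illustrative:

```python
from statistics import mean, pstdev

x = [2, 4, 6, 8, 10]
y = [1, 3, 4, 7, 9]

def z_scores(values):
    # convert raw scores to standard-deviation units (z-scores)
    m, s = mean(values), pstdev(values)  # population SD for this form
    return [(v - m) / s for v in values]

# r = average of the products of paired z-scores
r = mean(zx * zy for zx, zy in zip(z_scores(x), z_scores(y)))
```

Because both variables are converted to the same metric first, r is unit-free and always falls between -1 and +1.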
Coefficient of Determination
how much of the variance in y we are actually able to explain; THE MORE VARIATION WE CAN EXPLAIN, THE STRONGER THE ASSOCIATION BETWEEN VARIABLES (fewer third-variable problems) and the closer the dots are to the line
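The arithmetic here is simple: the coefficient of determination is r squared, read as a proportion of explained variance. A hypothetical r is used below:

```python
r = 0.8              # hypothetical Pearson correlation
r_squared = r ** 2   # proportion of variance in y explained by x
# about 0.64: the predictor explains ~64% of the variance in y,
# leaving ~36% unexplained (third variables, measurement error, etc.)
```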
Rank Correlation conditions/ monotonicity
Used when variables are not an interval or ratio measurement, or if their association is not linear
Used when Pearson’s cannot be used
Spearman’s Correlation: instead of using raw data, uses ranked values
CONDITIONS: random sampling, both variables must be at least ordinal, variables must increase monotonically with one another
Monotonicity: refers to whether or not one set of scores tends to increase or decrease alongside another set
Linear associations ARE MONOTONIC (but not all monotonic associations are linear)
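A sketch of Spearman's correlation as described above: rank each variable, then apply the Pearson formula to the ranks. The example assumes no tied scores, and the curved-but-monotonic data are made up:

```python
from statistics import mean, pstdev

def ranks(values):
    """Rank scores from 1 (smallest) upward; no tie handling."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def pearson(x, y):
    # standard Pearson formula: covariance over the product of SDs
    mx, my, sx, sy = mean(x), mean(y), pstdev(x), pstdev(y)
    return mean((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (sx * sy)

# monotonic but NON-linear: y always grows as x grows, along a curve
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]   # x squared

rho = pearson(ranks(x), ranks(y))
```

Pearson's r on the raw data would be below 1 because the association is curved, but Spearman's rho is a perfect 1.0 here because the ranks line up exactly.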
Cohen’s d effect sizes
0.2= small
0.5=medium
0.8=large
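A sketch of how Cohen's d is computed so the benchmarks above have something to attach to: the difference between two group means divided by a pooled standard deviation. The groups are hypothetical:

```python
from statistics import mean, stdev
from math import sqrt

group_a = [5, 6, 7, 8, 9]
group_b = [7, 8, 9, 10, 11]

na, nb = len(group_a), len(group_b)
# pooled SD: each group's sample variance weighted by its df
pooled_sd = sqrt(((na - 1) * stdev(group_a) ** 2 +
                  (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2))
d = (mean(group_b) - mean(group_a)) / pooled_sd
# d is about 1.26 here: well past the 0.8 "large" benchmark
```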
Residuals
the difference between observed values of y (dependent variable) and predicted values of y
Standard Error of Estimate
The standard deviation of the observed y values about the predicted y value for a given x value: a COMMON MEASURE OF THE ACCURACY OF PREDICTIONS. The closer the observed y values are to the predicted y values, the smaller the standard error of estimate will be
WANT IT TO BE AS SMALL AS POSSIBLE = LESS ERROR IS ALWAYS BETTER
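A sketch tying residuals and the standard error of estimate together: fit a line, take each observed-minus-predicted residual, and summarize their spread. Data are illustrative:

```python
from math import sqrt
from statistics import mean

x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

# fit the least-squares line y_hat = a + b*x
mx, my = mean(x), mean(y)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
    sum((xi - mx) ** 2 for xi in x)
a = my - b * mx

# residual = observed y minus predicted y
ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
n = len(x)
see = sqrt(ss_res / (n - 2))   # smaller = more accurate predictions
```

Because these points sit tightly around the line, `see` comes out small (around 0.17 in the units of y); widely scattered points would inflate it.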
Difference between simple and multiple regression
Simple Regression: predicting values on one variable using info from one predictor variable
Multiple Regression: PREDICTING values on an outcome variable from values on MORE than one predictor variable
Beta weights:
weighting of each of the factors; how much the outcome changes with one unit change in that predictor
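A sketch of how the weights enter a multiple-regression prediction equation; the intercept and weights below are hypothetical, not fitted from data:

```python
b0 = 1.0           # intercept
b1, b2 = 0.5, 2.0  # weight per one-unit change in each predictor

def predict(x1, x2):
    """Outcome changes by b1 per unit of x1 and by b2 per unit of x2."""
    return b0 + b1 * x1 + b2 * x2

# Raising x1 by one unit (holding x2 constant) raises the
# prediction by exactly b1:
# predict(3, 4) - predict(2, 4) == 0.5
```

Note that each weight describes the change in the outcome holding the other predictors constant.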
Assumptions of Multiple Regressions
Linearity between each of the predictors and the outcome
Residuals are normally distributed about 0, with constant spread (homoscedasticity): the points aren't narrowing together or widening apart from one another down the line
No extreme multicollinearity: can't have predictors that are too highly correlated; we want our predictors to be explaining DIFFERENT aspects of our outcome, so their correlation SHOULDN'T BE ABOVE .8 OR BELOW -.8
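A sketch of a simple multicollinearity check: correlate two predictors and flag them if |r| exceeds the .8 rule of thumb from the flashcard above. The predictor values are made up (and deliberately redundant):

```python
from statistics import mean, pstdev

def pearson(x, y):
    # Pearson's r: covariance over the product of the SDs
    mx, my, sx, sy = mean(x), mean(y), pstdev(x), pstdev(y)
    return mean((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (sx * sy)

pred1 = [1, 2, 3, 4, 5]
pred2 = [2, 4, 6, 8, 10]   # exactly double pred1, so r is 1.0

r = pearson(pred1, pred2)
too_collinear = abs(r) > 0.8   # True: these predictors are redundant
```

In practice you would run this check on every pair of predictors before fitting the model, and drop or combine any pair that fails it.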