Correlation and Regression Flashcards
Is there a relationship between the amount of time spent revising for an exam and exam performance?
Correlation
After controlling for exam anxiety, is there an association between revision time and exam performance?
Correlation
When seeing the terms relationships/associations/controlling, what general category of analysis would you choose?
Correlation and regression
When would you use a correlation over a regression considering they measure similar things?
Correlation is used when you want a quick summary of the direction and strength of the relationship between two or more numeric variables.
When you’re looking to PREDICT or optimise/explain a number response between the variables (how X influences Y) then you are looking at regression.
Regression = how one variable affects another
Correlation = the degree of relationship between two variables so strength and direction.
What does a -1 correlation tell us regarding the association between variables?
There is a perfect negative correlation. Therefore there is an association.
What does a +1 correlation tell us regarding the association between variables?
There is a perfect positive correlation.
What does a positive correlation mean?
As the FIRST (X) variable increases, the SECOND (Y) variable also increases.
What does a negative correlation mean?
As one variable increases, the second one decreases.
What do correlations measure?
The pattern of responses across variables.
As you get closer to a perfect negative or positive correlation (a true -1 or +1), does the association get weaker or stronger?
Stronger
What does an association of 0.0 indicate?
No association, which is consistent with the null hypothesis.
How many tails can a significance test have?
Either one tailed or two tailed
How is the sample size and alpha/error rate important regarding whether a correlation is statistically significant or not?
Because the critical value that a correlation must exceed to be statistically significant depends on both the degrees of freedom (which come from the sample size) and the chosen alpha/error rate.
In a one tailed correlation what is the alpha level?
0.05, placed entirely in one tail, because the effect is tested in one direction only.
In a two tailed correlation, the alpha level is 0.025 in each tail. Why?
Because it is testing the correlation in EITHER direction, so the overall 0.05 alpha is split between the two tails.
Which is more powerful - one tailed or two tailed correlation?
One tailed, because you are more certain about the direction of the hypothesis (e.g. from prior empirical evidence).
Why would you use a two tailed correlation?
When uncertain about your hypothesis.
Sample size and alpha value need to be considered when looking at correlation significance. What is true regarding assessing if a Pearson r is significant, in terms of the NUMBER OF DEGREES OF FREEDOM?
The size of the correlation (regardless of direction) must be MORE THAN the critical value given for that degree of freedom.
Would you expect to see a higher Pearson r value for a big n or a small n?
A higher r value for a small n, because with few people in a study a moderate correlation is more likely to be due to chance than in a design with a large sample, so a larger r is needed to reach significance.
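One way to see why sample size matters: the t statistic used to test a Pearson r, t = r × √((n − 2) / (1 − r²)), grows with n for the same r. A minimal sketch with made-up values:

```python
import math

# t statistic for testing a Pearson correlation:
# t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom.
def t_from_r(r: float, n: int) -> float:
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# The same moderate correlation gives much weaker evidence with few
# participants than with many.
t_small = t_from_r(0.4, 10)    # n = 10
t_large = t_from_r(0.4, 100)   # n = 100
```

With r = .4, t is roughly 1.23 at n = 10 (short of the two-tailed .05 critical value for 8 df) but over 4 at n = 100.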
What does variance tell us?
A. How much scores deviate from the mean of the distribution
B. Variance is the average squared distance from the mean
C. Both
C
It is essentially the measure of how far away the data points are from the mean.
Why do we have to square the distances from the mean of the distribution when it comes to variance?
Because the data points fall BOTH above and below the mean. If we averaged those deviations WITHOUT squaring them, the positive and negative distances would cancel each other out.
So why is the standard deviation (SD) the square root of the variance?
Because the deviations are squared before they are averaged, so taking the square root of the variance returns the SD to the original units of the data.
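A minimal sketch of the two definitions (population form, illustrative numbers only):

```python
# Variance: the average squared distance of scores from the mean.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean = sum(data) / len(data)                               # 5.0
variance = sum((x - mean) ** 2 for x in data) / len(data)  # 4.0
# SD: square root of the variance, back in the original units.
sd = variance ** 0.5                                       # 2.0
```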
How is the covariance of the two variables similar to the variance?
It tells us how much two variables differ from their means.
So instead of variance telling us how far data points are from mean for one variable, covariance shows how much TWO variables together differ from their means.
When dealing with covariance, why is it important to standardise it?
Because the units of measurement affect the size of the covariance, so the same relationship can produce different covariance values.
How do we standardise the variables to deal with covariance problems, such as different units of measurement leading to different covariance outcomes?
We divide by the standard deviations of both variables. This is the correlation coefficient which is relatively unaffected by units of measurement
The standardised version of covariance is known as…?
The correlation coefficient (Pearson)
Covariance is unstandardised and correlation is standardised. Why?
Because the problem with covariance is the units of measurement: with raw scores the covariance changes when the units change. You standardise to get around this, and the covariance of standardised variables is a correlation.
This is advantageous because standardising continuous variables puts them on the SAME scale, which makes two or more variables easier to compare. You might also hear it called SCALING. It makes the variances equal, so you are no longer looking at COVARIANCE in raw units.
What is the covariance of standardised variables essentially?
A correlation. When you calculate this you get a correlation coefficient.
How do we standardise a variable?
By subtracting the mean and dividing by the SD
If we already have the co variance of X and Y, what do we need to do to standardise them?
Divide by the SD of X and Y
What does dividing the covariance by the SD do in terms of the RANGE of the correlation coefficient?
We force it to be between -1 and +1, which describes how well a straight line fits the data. So correlation coefficients cannot be less than -1 or more than +1. BUT, covariance can be anything!
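A sketch of both routes to the same coefficient, with made-up data: dividing the covariance by both SDs, and equivalently taking the covariance of the z-scored variables.

```python
# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
sx = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
sy = (sum((b - my) ** 2 for b in y) / n) ** 0.5

# Route 1: covariance divided by both standard deviations.
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
r = cov / (sx * sy)

# Route 2: z-score both variables first, then take their covariance.
zx = [(a - mx) / sx for a in x]
zy = [(b - my) / sy for b in y]
r_z = sum(a * b for a, b in zip(zx, zy)) / n
```

The raw covariance here (1.2) could be any size depending on the units, but the standardised version is forced into the -1 to +1 range.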
True or false: correlations AND covariances must be between -1 and +1
False - covariance can be anything
Why is keeping correlations between -1 and +1 advantageous?
For comparisons
What does a pearson correlation do regarding what it measures with variables?
It measures the DIRECTION and DEGREE of linear relationship between two interval/ratio variables. The + or - denotes the direction of the relationship, so whether it is positive or negative.
Can both covariance and correlation tell us about the direction of any linear relationship?
Yes
What is the diff between correlation and covariance regarding what it shows with linear relationships?
Correlation shows not only the direction of a linear relationship but also its STRENGTH, which covariance cannot show in a comparable way.
What type of data is required for covariance/correlation?
Continuous (interval or ratio)
What is more important when looking at a correlation coefficient: whether the data points fit the line better or if positive/negative association?
If the data fit the line better, because this tells us there is a strong relationship, so a change in one variable is associated with a change in another variable (it says nothing about what caused it).
Why would we use a correlation matrix?
Because if we have a dataset with many variables, a correlation matrix gives the coefficient for each pairwise combination of variables across the whole dataset.
What is a problem for running correlations in exploratory analyses?
The more tests you run, the greater the chance of a false positive. Also, people can run these tests without a prior hypothesis and then try to come up with a justification afterwards.
Why would a person running a correlation without thinking of research questions need to be careful?
Because justifying a finding after the fact, without considering the research question, is not sound scientific practice.
True or false: variables should be normally distributed in a Pearson correlation.
True, for statistical inference.
If, for a pearson correlation, assumption of normality appears violated
(the sample size is less than 30, variables do not appear to be normally distributed)
what other test can be used?
Spearman correlation
How can normality be violated for a pearson correlation?
The sample size is less than 30, or the variables do not appear to be normally distributed.
If linearity is violated for a pearson correlation, what would this look like?
The relationship between the variables would be monotonic (or otherwise non-linear) rather than linear. Therefore use Spearman or try transforming the data.
What type of test is a pearson correlation?
Parametric. Parametric tests make assumptions about the data, such as the distribution being normal. You would prefer a parametric test where its assumptions hold, as parametric tests have more power.
What is the next step of a correlation?
Regression!
What is a non parametric correlation?
Spearman
What does a spearman correlation measure?
The association between TWO ORDINAL VARIABLES: X and Y both consist of ranks. Specifically, the degree of monotonic relationship between two variables, because the assumption of linearity does not need to be met.
It measures the consistency of the DIRECTION of the association between two interval/ratio variables. Therefore, interval/ratio data must be converted to ranks before conducting a Spearman correlation.
True or false: the Spearman correlation coefficient uses the same formula as Pearson, only the calculations are performed on rank data instead
TRUE
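That fact can be checked with a sketch: rank both variables, then run the Pearson formula on the ranks (illustrative data with no ties).

```python
# Spearman's rho as Pearson's r computed on ranks.
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]   # monotonic in x but not linear

def ranks(values):
    # Rank 1 = smallest value; assumes no ties for simplicity.
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = float(rank)
    return out

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b)) / n
    sa = (sum((p - ma) ** 2 for p in a) / n) ** 0.5
    sb = (sum((q - mb) ** 2 for q in b) / n) ** 0.5
    return cov / (sa * sb)

rho = pearson(ranks(x), ranks(y))   # perfectly monotonic, so rho is 1
r_raw = pearson(x, y)               # below 1: the raw relationship is not linear
```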
The assumption of a spearman correlation is that the data is…
ORDINAL
Because the data is ordinal in spearman correlation, how does this suggest test is less powerful than pearson?
Converting data to ordinal means you lose variability and richness of the data.
What does it mean for a relationship to be monotonic?
The variables have a monotonic relationship or association, which means the relationship is consistently one-directional, but not necessarily linear.
Whereas a Pearson correlation assumes linearity, a Spearman correlation assumes not that the data points fit a straight line but that one variable is consistently increasing or decreasing as the other increases.
Alongside data being ordinal in spearman, what is the assumption regarding the data?
That the data is monotonic.
What is a non monotonic relationship between two variables?
A relationship whose direction changes: at some points, as one variable increases the other increases, but at other points, as it increases the other decreases (the association goes up and down).
Part two: shared variance/partial correlations?
What does shared variance mean and why would we want to calculate partial correlations?
How can we assess the relationship strength in correlation?
Using the COEFFICIENT OF DETERMINATION (r2)
Which is an effect size
What is the R in spearman correlation?
Spearman’s rho
How do we calculate the coefficient of determination (an effect size) which assess the relationship strength?
By simply squaring r (the correlation coefficient)
the coefficient of determination tell us:
A. the proportion of variability in Y that is accounted for by variability in X
B. How accurately one variable predicts another
C. A and B
C
In a spearman correlation, the coefficient of determination tells us:
A. the proportion of variance in the RANKS that the two variables share
B. How accurately one variable predicts another
C. A and B
C
what is this? r2
the coefficient of determination (calculated by squaring the correlation coefficient); an effect size
It tells us how strongly the two variables are associated
How would we find the proportion of overlapping variance?
BY r2 (coefficient of determination).
To find the proportion of overlapping variance amongst variables, and r = .5, how would we work this out?
By squaring it: (.5)² = .25, so 25% of the variance is shared.
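As a one-line sketch:

```python
# Coefficient of determination: square the correlation coefficient.
r = 0.5
r_squared = r ** 2   # .25, i.e. 25% of the variance is shared
```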
You’re looking at a table, specifically at r squared (r²). You have the variables CHEESE and BREAD and can see that the coefficient of determination for these is .6. What is this saying?
That 60% of the variability in cheese can be explained by variability in bread.
If a correlation is measuring the degree of overlap between variables, what is shared variance really showing?
How much the VARIANCES of each variable overlap. That is what coefficient of determination is showing us.
If 6% of the variance in the outcome variable is accounted for by the predictor variable, does this mean one variable is causing another?
NO. it is about being accounted for, not caused by.
If you have three variables (X1, X2, Y) and C shows the variance shared by ALL three variables, to see the UNIQUE associations between, say, x1 and Y, what needs to be removed?
C, by investigating partial correlations.
You need to control for that third variable when looking at associations among more than two variables.
Why would we do partial correlations?
Because a partial correlation measures the association between two variables while controlling for the effect that a third variable has on them both.
What is a partial correlation
The correlation between two variables when you hold constant the effects of a third variable on both of the other variables
If we wanted to examine the unique effect of revision time on exam performance while controlling for the effects of another variable, like exam anxiety, on both revision time and exam performance, what type of correlation would we be looking at?
Partial correlation because it allows us to control for exam anxiety
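A first-order partial correlation can be computed from the three zero-order correlations with the standard formula; the values below are made up for illustration.

```python
import math

# Zero-order correlations among X (revision time), Y (exam performance)
# and Z (exam anxiety) -- illustrative values, not real data.
r_xy, r_xz, r_yz = 0.60, 0.40, 0.50

# Partial correlation of X and Y, controlling for Z on both.
r_xy_given_z = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Here the .60 zero-order correlation drops to about .50 once the shared influence of Z is removed.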
*** What does a SEMI partial correlation do compared to a partial correlation?
Where a partial correlation controls for the effect of a third variable on BOTH of the correlated variables, a semi-partial correlation controls for the effect the third variable has on only ONE of them.
When is a semi-partial correlation used?
It is used in multiple regression because if you square the semi-partial correlation, it tells you the variability in the outcome uniquely accounted for by one specific predictor variable
Which does the below - a partial correlation or a semi partial correlation?
Controls for the relationships between the predictors, so the outcome variable's relationship with the other predictors is still taken into account.
E.g. it tells us the unique effect of revision time on exam performance whilst controlling for the effect of exam anxiety on revision time only.
Semi-partial correlation
What happens when we square a semi-partial correlation? especially pertaining to multiple regression
It tells us how much of the TOTAL variability in Y is uniquely accounted for by ONE SPECIFIC PREDICTOR VARIABLE, controlling for the associations between the predictor variables.
So it takes the X1 and X2 association into account whilst also giving the unique effect of, say, X1 on Y.
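A sketch of the corresponding semi-partial (part) correlation formula, removing the second predictor from the first predictor only; the zero-order correlations are made up.

```python
import math

# Zero-order correlations: Y with X1, Y with X2, and X1 with X2
# (illustrative values only).
r_y1, r_y2, r_12 = 0.60, 0.50, 0.40

# Semi-partial correlation of Y with X1, partialling X2 out of X1 only.
sr = (r_y1 - r_y2 * r_12) / math.sqrt(1 - r_12 ** 2)

# Squared, it gives the proportion of TOTAL variance in Y uniquely
# accounted for by X1.
sr_squared = sr ** 2
```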
What are zero order correlations and which analyses would we see them in
Zero order is the correlation between two variables when you DO NOT Control for any other variables.
So we do this in pearson and spearman correlations - just looking at relationship between two variables without controlling for anything.
Do partial correlations control for the effects of one or more variables?
Yes.
1st order correlation: a partial correlation that controls for ONE variable
2nd order correlation: a partial correlation that controls for TWO variables
What is the directionality problem?
The idea that it is not possible to determine which variable is the cause and which is the effect; research often supports bidirectional effects between variables.
Therefore correlations with cross sectional data are limited
What is the third variable problem?
A relationship established between two variables DOES NOT mean there is a DIRECT relationship as a third variable may be responsible for the relationship
If we wanted to test the correlation strength between exam performance and revision time between males and females, what test would we use?
A multiple regression with interactions because we are looking at whether an association between two variables differs by group.
True or false: Partial correlations can be performed on non-parametric data
True. Spearman’s partial rank order correlation can be used :)
When dealing with missing values in correlations, what does exclude cases pairwise mean?
When, for each correlation, we exclude participants who do not have a score for both variables in that pair. If more than one correlation is reported, the sample size may vary across the different correlations.
When dealing with missing values in correlations, what does exclude cases listwise mean?
Across ALL correlations, exclude participants who do not have a score for every variable. The sample size WILL be the same for all reported correlations.
Listwise deletion is generally not recommended, as it removes a participant from the entire dataset even if they are missing only one value.
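A sketch of the two deletion strategies, assuming a small dataset with missing values coded as None:

```python
# Missing values coded as None; each dict is one participant.
rows = [
    {"x": 1.0, "y": 2.0, "z": None},
    {"x": 2.0, "y": None, "z": 3.0},
    {"x": 3.0, "y": 4.0, "z": 5.0},
    {"x": 4.0, "y": 5.0, "z": 6.0},
]

# Pairwise: for the x-y correlation, keep rows complete on x AND y only.
xy_pairs = [r for r in rows if r["x"] is not None and r["y"] is not None]

# Listwise: keep only rows complete on EVERY variable, for all correlations.
complete = [r for r in rows if all(v is not None for v in r.values())]

n_pairwise_xy = len(xy_pairs)   # 3 participants usable for the x-y pair
n_listwise = len(complete)      # only 2 participants usable everywhere
```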
What is a regression?
A way of modelling the association or relationship between the variables. We’re looking at modelling the data we have in a linear fashion.
It is a model used to predict the value of one variable from another. It is a linear one
Describing the relationship using the equation of a straight line
Which axis does the outcome or DV go on and which axis predictor?
Predictor: X
Outcome/DV: Y
What is the equation for describing a straight line or the line of best fit?
Y(i) = b0 + b1X(i) + e(i)
b0 = y intercept: the value of Y when X = 0.
b1 = regression coefficient for the predictor: the slope, giving the strength and direction of the relationship.
e(i) = error term: the difference between the ACTUAL (data point) and PREDICTED value of Y for the ith person.
What do we need to know about a line to predict outcome variable Y?
The intercept, the slope for a predictor variable, and what the error is
Why do we want small error terms?
Because smaller error terms mean the difference between actual scores and predicted scores are less, which mean the model is more accurately predicting scores. Less difference between ACTUAL scores and PREDICTED scores.
Why do we square residuals?
Because some are above and below the line and if didnt square, would cancel each other out. After squaring, sum to get sum of squares residual.
What does the method of least squares do?
It uses calculus to determine the regression line that minimises the sum of squared residuals.
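The least-squares slope and intercept have closed-form solutions: the slope is the covariance of X and Y divided by the variance of X, and the intercept is chosen so the line passes through the means. A sketch with made-up data:

```python
# Simple linear regression by least squares (illustrative data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.1, 8.0, 9.9]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Slope: covariance of X and Y over the variance of X.
b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
# Intercept: forces the line through (mean of X, mean of Y).
b0 = my - b1 * mx

# Residuals and the quantity least squares minimises.
predicted = [b0 + b1 * a for a in x]
ss_res = sum((b - p) ** 2 for b, p in zip(y, predicted))
```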
True or false: A regression line does not pass through the means of the predictor and the outcome.
False. A regression line ALWAYS passes through the point (mean of X, mean of Y): when X equals its mean, Y is predicted to equal the mean of Y.
What is the equation of the line of best fit?
Y(i) = b0 + b1X(i) + e(i)
In the coefficients table, we have unstandardised score (B), standard error, standardised, etc.
What are we looking at to find out what the modelling is predicting?
The unstandardised regression coefficient (B). Not the intercept, but the value next to the predictor variable.
The unstandardised value (B) is the slope. If positive, the slope is positive. It says that as X increases by 1 unit, Y increases by B (the unstandardised regression coefficient).
Rather than doing the linear equation manually, using statistical software we are looking at the coefficient table for what exactly?
- The direction
- Magnitude
- Whether a variable is a statistically significant predictor
we have unstandardised coefficient (B) and standardised coefficient (beta, little b). What is the difference between the two?
B and beta tell us similar information.
However, B says that for a ONE UNIT change in predictor variable X, this will be the change in Y, whereas beta says:
this is the expected STANDARD DEVIATION change in Y for a 1 SD change in X.
So standardised beta is per 1 SD change, and unstandardised B is per 1 unit change.
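For a single predictor, the two coefficients are linked by the SDs of the variables: beta = B × (SD of X / SD of Y). A sketch with made-up numbers:

```python
# Converting an unstandardised slope (B) to a standardised beta.
# Illustrative values: B = 2.0 units of Y per unit of X.
B = 2.0
sd_x, sd_y = 1.5, 5.0
beta = B * (sd_x / sd_y)   # 0.6 SD change in Y per 1 SD change in X
```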
If we wanted to look at the DIRECTION of relationship between variables, would we look at B or beta?
Look at B. Unstandardised. That tells us about the slope.
Unstandardised is looking at unit changes, standardised is looking at SD changes
If we wanted to look at the MAGNITUDE/STRENGTH of relationship between variables, would we look at B or beta?
Beta. Standardised tells us about effect sizes.
Unstandardised is looking at unit changes, standardised is looking at SD changes
When looking at the coefficients table, what does the p value tell us?
Significance: whether the variable is a statistically significant predictor.
Specifically, it indicates that our model explains an association between variables in a statistically significant way, meaning the two variables are related more than by chance.