correlation and regression Flashcards

1
Q

what may be related to each other? - give an example

A
  • two datasets may be related
    e.g., height, weight
2
Q

when can you see the relationship of the datasets?

A
  • when you look at them on a graph
3
Q

what were the first statistics invented for?

A
  • for analysing co-relationships
4
Q

when is there probably a mistake in data?

A
  • if your data shows a perfect straight line
  • if there’s more than one data point a long way away from all the others
5
Q

when might data be worth checking for mistakes?

A
  • if there’s no relationship at all between things you really expect to be related
6
Q

what is the definition of correlation?

A
  • a measure of how closely the data follow a best-fit line, which is found by minimising the difference between the data and the line
7
Q

what does a correlation report about a relationship?

A
  • strength and direction of a relationship
8
Q

what is a residual?

A
  • difference between an observed value and a predicted value in regression analysis
9
Q

what is a zero correlation?

A
  • no relationship between the variables
  • cluster of data points
10
Q

what is a positive correlation?

A
  • relationship between two variables that tend to move in the same direction
11
Q

what is a negative correlation?

A
  • two individual variables generally move in opposite directions
12
Q

what would you do to get the line of best fit?

A
  • could try adjusting the line manually, but that wouldn’t give the best fit
  • need to use maths instead
13
Q

what equation allows you to work out the line of best fit?

A

r = Sxy / (Sx · Sy)

14
Q

what does Sxy stand for?

A
  • how much x and y change together
15
Q

what is Sx · Sy?

A
  • how much x and y change separately
16
Q

what is the equation to work out r?

A

$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}$$
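
The sums in this formula can be worked through in a few lines of Python. This is only a sketch: the function name `pearson_r` and the height and weight values are made up for illustration and do not come from the course material.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r computed directly from the sums in the formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: how much x and y change together.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: how much x and y change separately.
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when one variable never changes")
    return s_xy / (s_x * s_y)


# Made-up height (cm) and weight (kg) values, for illustration only.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 74, 88]
r = pearson_r(heights, weights)
print(f"r = {r:.3f}")
```

Swapping the two arguments gives the same r, which is why X and Y are interchangeable in correlation.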

17
Q

what two aspects does an r value tell you?

A
  • direction
  • strength
18
Q

what value is r when the correlation is positive?

A
  • r is above zero
    0 < r ≤ 1
19
Q

what value is r when the correlation is negative?

A
  • r is below zero
    -1 ≤ r < 0
20
Q

what is the value of r when the correlation is strong?

A
  • r is close to +1 or -1
    r → ±1
21
Q

what is the r value when the correlation is weak?

A
  • r is close to zero
    r → 0
22
Q

when are r values especially useful?

A
  • useful for values in the middle, e.g., -0.4 to 0.4
23
Q

what does the r-squared value tell you?

A
  • how much of the variance is explained by your correlation
24
Q

what is the r-squared value when the correlation explains a lot of variance?

A
  • r² is close to one
    r² → 1
25
Q

what is the r-squared value when the correlation explains only a little variance?

A
  • r² is close to zero
    r² → 0
26
Q

what other name is r-squared given?

A
  • coefficient of determination
27
Q

what is 1 - r²?

A
  • amount of variance not explained
  • random noise
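
As a small worked example of the split between explained and unexplained variance, the sketch below squares a correlation coefficient and subtracts the result from one. The helper name `variance_explained` and the value 0.5 are hypothetical, chosen only for illustration.

```python
def variance_explained(r):
    """Return (r squared, 1 - r squared) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    r_squared = r * r              # coefficient of determination
    return r_squared, 1.0 - r_squared


# An illustrative r of 0.5: a quarter of the variance is explained.
explained, unexplained = variance_explained(0.5)
print(explained, unexplained)
```

Because r is squared, r = 0.5 and r = -0.5 explain the same share of the variance; the direction of the relationship is lost.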
28
Q

what is regression?

A
  • gives you the strengths, directions and equations of relationships
29
Q

what is the regression equation?

A

y = mx + c

30
Q

what is m in the equation?

A
  • slope
31
Q

what is c in the equation?

A
  • intercept
32
Q

what happens when x = 0?

A

y = c (the intercept)

33
Q

what happens to y when x increases by 1?

A
  • y changes by the slope (m), so it decreases if the slope is negative
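
The roles of the slope and intercept can be checked with a short least-squares fit. This sketch assumes the standard least-squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x), which the cards do not spell out. The names `fit_line` and `predict` and the data values are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept c for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    m = s_xy / s_xx
    c = mean_y - m * mean_x   # the fitted line passes through the two means
    return m, c


def predict(m, c, x):
    """Predicted y for a given x."""
    return m * x + c


# Made-up data, for illustration only.
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
m, c = fit_line(xs, ys)

# Residual = observed value minus predicted value.
residuals = [y - predict(m, c, x) for x, y in zip(xs, ys)]
print(f"y = {m:.2f}x + {c:.2f}")
```

Here `predict(m, c, 0)` returns the intercept, and moving x up by 1 changes the prediction by exactly `m`.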
34
Q

what do both correlation and regression involve?

A
  • both involve linear relationships between one or more input (predictor) variables and a single output (outcome) variable
35
Q

what data can both correlation and regression deal with?

A
  • categorical, ordinal, and non-linear predictors
36
Q

what relationship does correlation describe?

A
  • single relationship
37
Q

what relationship does regression describe?

A
  • multiple relationships
38
Q

what is the difference between X and Y in correlation compared to regression?

A
  • in correlation, X and Y are interchangeable, whereas in regression they are not
39
Q

do correlation and regression allow prediction?

A
  • correlation doesn’t allow prediction
  • regression allows prediction
40
Q

what symbols are used for correlations?

A
  • r and r²
41
Q

what symbols are used for regression?

A
  • R
  • R²
  • F
  • t
  • SE
  • B1 to Bn
42
Q

what does jamovi allow us to explore? what do you use?

A
  • multiple relationships in one go
  • use a correlation matrix
43
Q

what does correlation matrix include? what do we calculate?

A
  • includes all information we need but we must calculate df ourselves
44
Q

how do you calculate df?

A

df = n - 2

45
Q

how do you report correlations?

A

r([df]) = [Pearson’s r], p = [p-value]
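
Filling in this template is mostly bookkeeping, as the sketch below shows. The function name `format_correlation` and the numbers passed to it are hypothetical; r and p would come from the software output, and only df = n - 2 is computed here. The rounding (two decimals for r, three for p) is an assumption, not a rule from the cards.

```python
def format_correlation(n, r, p):
    """Build a report string in the form r(df) = value, p = value."""
    if n < 3:
        raise ValueError("need at least 3 observations for df = n - 2 to be positive")
    df = n - 2   # degrees of freedom for a correlation
    return f"r({df}) = {r:.2f}, p = {p:.3f}"


# Illustrative values only: 30 participants, r = 0.52, p = 0.003.
report = format_correlation(30, 0.52, 0.003)
print(report)
```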

46
Q

how do you report the overall regression?

A
  • R² = [R² value]
47
Q

how do you report the model fit of a regression?

A

F([df1], [df2]) = [F-value], p = [p-value]

48
Q

what is multiple linear regression?

A
  • single outcome variable (y) but multiple predictor variables (x1, x2)
49
Q

what do you find in multiple linear regression?

A
  • find the best-fitting surface
50
Q

where are residuals in multiple linear regression?

A
  • residuals are distance from the surface
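
With two predictors, the best-fitting surface is a plane, y = b0 + b1·x1 + b2·x2. The sketch below fits one by solving the least-squares normal equations in plain Python. The names `solve_linear_system` and `fit_plane` are made up, and the data are generated from a known plane so the result is easy to check. In practice a statistics package would do this step.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_plane(x1s, x2s, ys):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [[1.0, x1, x2] for x1, x2 in zip(x1s, x2s)]
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Made-up predictors; y is generated from the plane y = 3 + 2*x1 - x2.
x1s = [1, 2, 3, 4, 5, 6]
x2s = [2, 1, 4, 3, 6, 5]
ys = [3 + 2 * x1 - x2 for x1, x2 in zip(x1s, x2s)]

b0, b1, b2 = fit_plane(x1s, x2s, ys)
# Residuals: vertical distance of each observation from the fitted plane.
residuals = [y - (b0 + b1 * x1 + b2 * x2) for x1, x2, y in zip(x1s, x2s, ys)]
```

Since the data lie exactly on a plane, the residuals here are all (numerically) zero; real data would leave some scatter around the surface.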
51
Q

what can the predictors be in multiple linear regression?

A
  • predictors can be almost anything:
    continuous, ordinal, discrete
    normally distributed or not
    linear or non-linear
52
Q

what is multiple linear regression said to be?

A
  • flexible
    e.g., ChatGPT, fMRI, COVID, elections
53
Q

what does each predictor result in?

A
  • result in an estimate, a standard error, a t-score and a p-value
54
Q

what is the problem with correlation and regression?

A
  • extrapolation
55
Q

what do non-linear relationships cause?

A
  • they cause problems, because correlation and linear regression both assume a straight-line relationship
56
Q

what are the solutions to the problems?

A
  • look at the data
  • check for mistakes
  • perhaps transform the data: quadratic, cubic, logarithmic
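
To see why a transformation can help, the sketch below compares Pearson's r before and after a logarithmic transform. The data are made up to follow an exponential curve exactly, and the helper `pearson_r` is a hypothetical name. Real data would not become perfectly linear.

```python
from math import exp, log, sqrt


def pearson_r(xs, ys):
    """Pearson's r from the sums of products and squares of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return s_xy / (s_x * s_y)


# Made-up data following y = e^x, a strongly non-linear relationship.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [exp(x) for x in xs]

r_raw = pearson_r(xs, ys)                      # straight line fits poorly
r_log = pearson_r(xs, [log(y) for y in ys])    # log(y) is linear in x

print(f"raw r = {r_raw:.2f}, after log transform r = {r_log:.2f}")
```

The relationship between x and y is perfect in both cases; only after the transform does r reflect it, because r measures straight-line association.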
57
Q

does correlation equal causation?

A
  • no