Association Flashcards

1
Q

What is the Pearson Correlation coefficient?

A
  • Measure of linear correlation between two numeric variables.
  • It is a measure of how well the data fit a straight line.
  • The value lies between -1 and 1 (inclusive).
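As a minimal sketch (not part of the original card), the coefficient can be computed directly from its definition: the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations. The function name `pearson_r` and the sample data are illustrative assumptions.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y is an exact increasing linear function of x
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
r = pearson_r(x, y)
print(r)
```

Because the hypothetical points lie exactly on a rising straight line, `r` sits at the upper end of the range; negating `y` moves it to the lower end.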
2
Q

Describe the different Pearson correlation coefficient values that can be obtained and what they mean.

A
  • r > 0: positive correlation; as one variable increases, the other tends to increase.
  • r < 0: negative correlation; as one variable increases, the other tends to decrease.
  • r = 0: no linear correlation.
3
Q

When should the Pearson correlation coefficient not be used?

A
  • There is a non-linear relationship between variables
  • There are outliers
  • There are distinct sub-groups (if we mix two samples together such as healthy controls and disease cases)
  • One or both of the variables is not normally distributed.
  • One or both of the variables is non-numeric.
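To illustrate the point about outliers, here is a small sketch with made-up data: four points with no linear trend, then the same points plus one extreme value. The helper `pearson_r` and all numbers are illustrative assumptions, not part of the original card.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: the four corners of a square show no linear trend
x_clean = [1, 1, 3, 3]
y_clean = [1, 3, 1, 3]
r_clean = pearson_r(x_clean, y_clean)

# A single extreme point is enough to create a strong apparent correlation
r_outlier = pearson_r(x_clean + [20], y_clean + [20])

print(r_clean, r_outlier)
```

The first coefficient is zero, while the second is close to 1 even though only one point changed, which is why a scatterplot is worth checking before trusting r.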
4
Q

When should the Pearson correlation coefficient be calculated/used? What is the alternative?

A
  • Use it only between two normally distributed variables
  • Alternative: Spearman rank correlation coefficient

5
Q

When can Spearman’s rank correlation coefficient be calculated/used?

A
  • When the data is not normally distributed
  • When one or both of the variables are ordinal
  • When the sample size is small
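Spearman's coefficient is the Pearson coefficient applied to the ranks of the data rather than the raw values. The sketch below assumes tied values share the mean of their ranks; the function names and sample data are illustrative, not from the original card.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x cubed is monotonic but not linear
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

For this monotonic but curved relationship the rank-based coefficient reaches its maximum, while Pearson's r on the raw values falls short of it.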
6
Q

What are the 4 possibilities if two variables correlate?

A
  • The result occurred by chance
  • A influences (or ‘causes’) B
  • B influences (or ‘causes’) A
  • A and B are influenced by some other variable(s), C
7
Q

How can two variables A and B be influenced by some other variable(s) C?

A
  • C may ‘cause’ both A and B
  • A may lead to an increase in C, which in turn ‘causes’ B

8
Q

Define linear regression

A

Fitting a straight line to points on a scatterplot

9
Q

Where can the independent and dependent variables be found on a scatterplot?

A
  • Independent = x-axis
  • Dependent = y-axis

10
Q

What are residuals?

A

The differences between the observed values and the values predicted by the model
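A short sketch of how residuals arise, assuming the straight line is fitted by ordinary least squares (the cards do not name a fitting method). The function names and data are illustrative assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residuals(x, y, intercept, slope):
    """Observed value minus the value predicted by the fitted line."""
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical data: roughly linear with a little scatter
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
intercept, slope = fit_line(x, y)
res = residuals(x, y, intercept, slope)
print(intercept, slope, res)
```

With a least-squares fit that includes an intercept, the residuals sum to zero (up to rounding), so it is their spread and shape that matter, not their average.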

11
Q

What assumptions are made in regression analysis?

A
  • The relationship must be approximately linear
  • The residuals must be normally distributed

12
Q

What is a contingency table used for?

A

Examining the association between two categorical variables

13
Q

What hypothesis test can be used on a contingency table? Describe it

A
  • Chi-squared test
  • It compares the observed contingency table with the table expected if the null hypothesis (no association between the variables) were true.
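The comparison of observed and expected tables can be sketched as follows. Expected counts are taken as row total times column total divided by the grand total, and the statistic sums (observed - expected)² / expected over the cells. The p-value helper uses the fact that, for 1 degree of freedom only (a 2×2 table), the chi-squared tail probability equals `erfc(sqrt(statistic / 2))`. All names and counts are illustrative assumptions.

```python
import math

def expected_counts(table):
    """Expected cell counts if the row and column variables were independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_squared_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def p_value_df1(statistic):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# Hypothetical 2x2 table: rows are groups, columns are outcomes
observed = [[30, 10], [20, 40]]
expected = expected_counts(observed)
statistic = chi_squared_statistic(observed)
p_value = p_value_df1(statistic)
print(expected, statistic, p_value)
```

Larger tables need the chi-squared distribution with (rows - 1) × (columns - 1) degrees of freedom, which this stdlib-only sketch does not cover.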

14
Q

What are conditions for the Chi-squared test?

A

For a 2×2 table, the expected count in each of the four cells should be greater than 1, and in at least three of the four cells it should be greater than 5.
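The rule above can be checked mechanically. This sketch assumes a 2×2 table of observed counts and reads "three of the four cells" as "at least three"; the function names and example tables are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts if the row and column variables were independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_squared_conditions_met(table):
    """True if all four expected counts exceed 1 and at least three exceed 5."""
    expected = [e for row in expected_counts(table) for e in row]
    if len(expected) != 4:
        raise ValueError("this check is written for 2x2 tables only")
    return all(e > 1 for e in expected) and sum(e > 5 for e in expected) >= 3

# Hypothetical tables: the first is large enough, the second is too sparse
large_table = [[30, 10], [20, 40]]
sparse_table = [[1, 2], [3, 4]]
print(chi_squared_conditions_met(large_table))
print(chi_squared_conditions_met(sparse_table))
```

Note that the check is applied to the expected counts, not the observed ones.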

15
Q

What is continuity correction?

A
  • Also known as Yates’s correction
  • Applied because, for small sample sizes, the uncorrected chi-squared test is too likely to reject the null hypothesis
  • The chi-squared conditions still have to be met
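A minimal sketch of the correction for a 2×2 table: each absolute difference between observed and expected counts is reduced by 0.5 before squaring, which lowers the statistic. Clamping the reduced difference at zero is a common safeguard and an assumption here, as are the function name and the counts.

```python
def yates_chi_squared(table):
    """Chi-squared statistic for a 2x2 table with Yates's continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            # Shrink |observed - expected| by 0.5, never below zero
            adjusted = max(abs(observed - expected) - 0.5, 0.0)
            statistic += adjusted ** 2 / expected
    return statistic

# Hypothetical 2x2 table of observed counts
observed = [[30, 10], [20, 40]]
corrected = yates_chi_squared(observed)
print(corrected)
```

For this table the uncorrected statistic is 50/3, so the corrected value is somewhat smaller and the test is correspondingly less likely to reject the null hypothesis.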
16
Q

What is Fisher’s exact test?

A
  • Used when a contingency table fails to meet the chi-squared conditions
  • More robust with small sample sizes
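For a 2×2 table the test can be sketched with the hypergeometric distribution: with the row and column totals held fixed, each possible table has an exact probability. The two-sided p-value below sums the probabilities of all tables no more likely than the observed one, which is one common convention and an assumption here. The function name and counts are illustrative.

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties
    return sum(
        p
        for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )

# Hypothetical small table where chi-squared conditions would not be met
observed = [[3, 1], [1, 3]]
p_value = fisher_exact_two_sided(observed)
print(p_value)
```

Because the probabilities are computed exactly rather than approximated, no minimum expected count is required.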