Regression - Assumptions Flashcards

1
Q

There are four “assumptions” that underpin a Pearson’s correlation. If any of these four assumptions is not
met, analysing your data using a Pearson’s correlation might not lead to a valid result.
ASSUMPTION 1

A

The two variables should be measured at the continuous level (i.e. on an interval or ratio scale). Examples of
such continuous variables include height (measured in feet and inches), temperature (measured in °C), salary
(measured in dollars/INR), revision time (measured in hours), intelligence (measured using IQ score), reaction
time (measured in milliseconds), test performance (measured from 0 to 100), sales (measured in number of
transactions per month), and so forth.

2
Q

ASSUMPTION 2

A

There needs to be a linear relationship between your two variables. Whilst there are a number
of ways to check whether a linear relationship exists, we suggest creating a scatterplot using Stata, where
you can plot your two variables against each other and visually inspect the plot for linearity.
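
As a rough illustration of why the linearity check matters, the Python sketch below (not Stata; the function name `pearson_r` and the data are made up for this example) computes Pearson's r for a perfectly linear relationship and for a perfectly curved one. The curved data are strongly related, yet r comes out close to zero because r only measures linear association.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: x is symmetric around zero.
x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]   # straight-line relationship
y_curved = [v ** 2 for v in x]      # U-shaped relationship

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)
print(round(r_linear, 3), round(r_curved, 3))
```

A scatterplot of the curved data would show the U shape immediately, which is why plotting is suggested before trusting the coefficient.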

3
Q

ASSUMPTION 3

A

There should be no significant outliers. Outliers are simply single data points within your data
that do not follow the usual pattern (e.g. in a study of 100 students’ IQ scores, where the mean score was 108
with only a small variation between students, one student had a score of 156, which is very unusual, and may
even put her in the top 1% of IQ scores globally).

Pearson’s r is sensitive to outliers, which can have a great impact on the line of best fit and the Pearson
correlation coefficient, leading to very different conclusions regarding your data. Therefore, it is best if there
are no outliers or they are kept to a minimum. Fortunately, you can use Stata to detect possible outliers
using scatterplots.
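
To make the sensitivity concrete, here is a small Python sketch (not Stata) that computes r for a made-up data set with and without one extreme point. The helper `pearson_r` and all values are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data with a clear upward trend.
x_clean = [1, 2, 3, 4, 5, 6, 7, 8]
y_clean = [2, 1, 4, 3, 6, 5, 8, 7]

# The same data plus one point far from the usual pattern.
x_outlier = x_clean + [9]
y_outlier = y_clean + [-20]

r_clean = pearson_r(x_clean, y_clean)
r_outlier = pearson_r(x_outlier, y_outlier)
print(round(r_clean, 3), round(r_outlier, 3))
```

With these values, the single added point is enough to turn a strong positive correlation into a negative one, so the coefficient no longer describes the bulk of the data.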

4
Q

ASSUMPTION 4

A

Your variables should be approximately normally distributed. In order to assess the statistical
significance of the Pearson correlation, you need to have bivariate normality, but this assumption is difficult to
assess, so a simpler method is more commonly used: checking the normality of each variable separately
(for example, with a Shapiro-Wilk test).
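
One simple way to look at each variable on its own is sketched below in Python (not Stata). It computes sample skewness and excess kurtosis as a rough screen, both of which are near zero for normally distributed data. This is not a formal normality test such as Shapiro-Wilk, and the function name `skew_and_excess_kurtosis` and the data are hypothetical.

```python
def skew_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from central moments."""
    n = len(values)
    mean = sum(values) / n
    # Second, third, and fourth central moments (population form).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical variables: one symmetric, one with a long right tail.
revision_hours = [2, 3, 4, 4, 5, 5, 5, 6, 6, 7, 8]
salary_thousands = [20, 22, 23, 25, 26, 28, 30, 35, 60, 120]

skew_sym, kurt_sym = skew_and_excess_kurtosis(revision_hours)
skew_right, kurt_right = skew_and_excess_kurtosis(salary_thousands)
print(round(skew_sym, 3), round(skew_right, 3))
```

A skewness far from zero, as for the second variable here, suggests the variable departs from normality and deserves a closer look with a histogram or a formal test.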
