Lecture 7 - Non parametric statistics and measures of association Flashcards

Question 1

Q

Parametric vs Non parametric

Answer

A

• Parametric methods:
– Assume a normal distribution and homogeneous variance; data are typically ratio or interval.
– Can support stronger conclusions (more statistical power) when those assumptions hold.
• Non-parametric methods:
– Make no assumption about the distribution or the variance; data are typically ordinal or nominal.
– Simpler and less affected by outliers.
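
As a rough illustration of "less affected by outliers", the sketch below compares how the mean (used by parametric methods) and the median (used by many non-parametric methods) react when one value is replaced by an outlier. The numbers are made up for the example.

```python
from statistics import mean, median

# Hypothetical data: the second list replaces the last value with an outlier.
clean = [10, 11, 12, 13, 14]
with_outlier = [10, 11, 12, 13, 140]

# How far each summary moves when the outlier is introduced.
mean_shift = mean(with_outlier) - mean(clean)
median_shift = median(with_outlier) - median(clean)

print("mean shift:", mean_shift)
print("median shift:", median_shift)
```

The mean moves a long way while the median does not move at all, which is why median-based procedures are described as robust.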

Question 2

Q

Correlation and Correlation Coefficient

Answer

A

• Technique for investigating the relationship between two numerical variables
• A correlation coefficient is a measure of the relationship between two numerical measurements
– Magnitude of relationship
– Direction of relationship
– Bivariate distribution

Question 3

Q

Positive and negative correlation

Answer

A

• Positive Correlation (Direct)
– Present when high values of one variable are associated with high values of the other variable, and low values with low values
• Negative Correlation (Indirect)
– Present when high values of one variable are associated with low values of the other variable, or vice versa

Question 4

Q

Correlation: Scatterplot

Answer

A

• Scatterplot
– A two dimensional graph displaying the relationship between two numerical characteristics of variables
• Whether there is an association between variables
– What the association looks like (linear? nonlinear?)
– The trend of the association (positive, negative)

Question 5

Q

Pearson correlation coefficient (r)

Answer

A

• Measures the strength of linear association between two quantitative variables
– The r value has no units
• The level of measurement for both variables is interval or ratio scale
Interpretation of r:
r ranges from -1 to +1; values near 0 indicate little or no linear association
Negative correlation gets stronger as r approaches -1
Positive correlation gets stronger as r approaches +1
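
A minimal sketch of how r can be computed from raw data, using the usual sum-of-products formula. The function name `pearson_r` and the sample values are assumptions made for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied and test score.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)

# A perfectly decreasing straight line gives the strongest negative value.
r_negative = pearson_r([1, 2, 3], [6, 4, 2])

print(round(r, 4), r_negative)
```

Because the deviations are divided by their own spread, r carries no units, and the sign gives the direction of the relationship.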

Question 6

Q

Usefulness of scatterplot

Answer

A

We learn the truth by simply looking at the graphs:
The upper-left graph looks like what we might have expected from the regression output: a straight-line relationship with some scatter about the best-fit line.
The upper-right graph shows a strong relationship between x and y, but it is NOT linear.
In the lower-right graph, it doesn’t make any sense to fit a line since there is essentially no variability in the x values.
In the lower-left graph, there is a strong linear relationship with the exception of one outlier.
The moral of this example is: ALWAYS FIRST GRAPH YOUR DATA and don’t rely solely on summary output.

Question 7

Q

Spearman Correlation Coefficient (rs)

Answer

A

The non-parametric equivalent of the Pearson product-moment correlation
Measures the strength of association between two ranked variables
The Spearman correlation can be used when the assumptions of the Pearson correlation are markedly violated.
It does assume that there is a monotonic relationship between your variables.
It is calculated by first ranking the data for each variable and then applying the Pearson correlation coefficient formula to the ranks.
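
The rank-then-correlate procedure can be sketched as follows. The helper names (`average_ranks`, `spearman_rs`) and the data are hypothetical, and tied values are given the mean of the ranks they span, which is one common convention.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rs(x, y):
    """Spearman correlation: Pearson's formula applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical monotonic but non-linear relationship (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rs = spearman_rs(x, y)
r = pearson_r(x, y)
tied = average_ranks([10, 20, 20, 30])

print(rs, round(r, 3), tied)
```

For this data rs equals 1 because the relationship is perfectly monotonic, while Pearson's r is below 1 because the relationship is not a straight line.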

Question 8

Q

Correlation and causality

Answer

A

Correlation does not imply causation

• Example:
– MMR vaccination and autism spectrum disorders
– Gender and IQ
– Alcohol and lung cancer

Question 9

Q

Regression analysis

Answer

A

• It is a common way of estimating the relationship among variables
– E.g.: Given the age of an individual, can we estimate their income levels?
– Also, can we use the age of the individual to predict their income levels?
• Linear regression is the most basic and common type of predictive analysis
– At the centre of the regression analysis is the task of fitting a single straight line through a scatter plot
• Regression line
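
As a sketch of what "fitting a single straight line" means, the example below computes the least-squares slope and intercept for the age and income question. The function name `fit_line` and the figures (income in thousands) are invented for illustration.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for the line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the two means
    return slope, intercept

# Hypothetical data: age in years and income in thousands.
ages = [20, 30, 40, 50, 60]
income = [22, 33, 38, 52, 55]

slope, intercept = fit_line(ages, income)

# Use the fitted regression line to predict income at age 45.
predicted = intercept + slope * 45

print(slope, intercept, predicted)
```

The slope estimates how much income changes per additional year of age in this sample; the prediction is only meaningful within the range of ages observed.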

Question 10

Q

Non parametric statistics for hypothesis testing

Answer

A

• Hypotheses are stated about the population median instead of the population mean μ

Question 11

Q

Sign test (+,-)

Answer

A

• Tests hypotheses concerning the population median η

H0: η = η0
H1: η ≠ η0

• If the null hypothesis is true, approximately equal numbers of observations fall above (+) and below (-) the hypothesized median η0
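
A minimal sketch of the sign test under these hypotheses: count the observations above (+) and below (-) the hypothesized median, discard ties, and compare the smaller count with a Binomial(n, 0.5) distribution. The function name `sign_test`, the exact two-sided p-value calculation, and the sample are assumptions for illustration.

```python
from math import comb

def sign_test(sample, median0):
    """Two-sided exact sign test of H0: population median equals median0."""
    plus = sum(1 for v in sample if v > median0)   # observations marked "+"
    minus = sum(1 for v in sample if v < median0)  # observations marked "-"
    n = plus + minus                               # ties are discarded
    k = min(plus, minus)
    # P(X <= k) for X ~ Binomial(n, 0.5); doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return plus, minus, p_value

# Hypothetical sample, tested against a hypothesized median of 10.
sample = [12, 15, 9, 18, 20, 14, 17, 16, 10, 19]
plus, minus, p_value = sign_test(sample, 10)

print(plus, minus, p_value)
```

Here 8 observations lie above and 1 below the hypothesized median (one tie is dropped), so the counts are far from equal and the p-value is small.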

(11 cards)