6. Data Analysis And Ch. 4, 7 of E-book Flashcards
On a graph with a normal distribution, where is the mean
In the middle
- the normal distribution is symmetric about its mean
What is the standard deviation on a normal distribution
A measure of spread: it sets how broad the bell shape is
What is the total area under the normal distribution curve defined as
1.0 = One whole unit
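As a rough numerical check of the total-area card, the sketch below integrates the normal density with the trapezium rule. The function names, the integration range of 8 standard deviations either side of the mean, and the example means and standard deviations are assumptions chosen for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_under_curve(mu=0.0, sigma=1.0, width=8.0, steps=10_000):
    """Trapezium-rule estimate of the area from mu - width*sigma to mu + width*sigma."""
    a = mu - width * sigma
    b = mu + width * sigma
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

# A narrow bell and a broad bell (hypothetical values) enclose the same area.
narrow = area_under_curve(mu=0.0, sigma=1.0)
broad = area_under_curve(mu=5.0, sigma=3.0)
print(round(narrow, 6), round(broad, 6))
```

Changing the standard deviation changes how wide the curve is, but the area should stay at about 1.0 in both cases.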
How to work out a standard normal (Z) value
Z = (x - mean) / standard deviation
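A minimal sketch of the Z formula in code. The function name and the example numbers (a value of 130 from a distribution with mean 100 and standard deviation 15) are hypothetical and not taken from the flashcards.

```python
def z_score(x, mean, standard_deviation):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / standard_deviation

# Hypothetical example values.
print(z_score(130, 100, 15))  # 2.0
print(z_score(100, 100, 15))  # 0.0, a value equal to the mean
```

A value equal to the mean always gives Z = 0, and values below the mean give a negative Z.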
What values does the Pearson Correlation Coefficient, r, lie between
-1 and +1 (inclusive)
If r is larger than zero, what is the correlation
Positive correlation
If r is less than zero, what is the correlation
Negative correlation
If r=0 what is the correlation
No linear correlation
If r=1 what is the correlation
Perfect positive
If r=-1 what is the correlation
Perfect negative
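The sign cases above can be illustrated with a short sketch. It assumes the usual definition of r as the covariance of the two variables divided by the product of their spreads; the helper name and the small data sets are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

xs = [1, 2, 3, 4, 5]
r_perfect_pos = pearson_r(xs, [2, 4, 6, 8, 10])   # points on a rising line
r_perfect_neg = pearson_r(xs, [10, 8, 6, 4, 2])   # points on a falling line
r_positive = pearson_r(xs, [2, 1, 4, 3, 5])       # rising trend with scatter

print(round(r_perfect_pos, 3), round(r_perfect_neg, 3), round(r_positive, 3))
```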
When not to use Pearson correlation
There is a non-linear relationship between variables (see Figure (a) below).
There are outliers (see Glossary)
There are distinct sub-groups, for example, if we mix two samples together such as healthy controls and disease cases (see Figure (b)).
One or both of the variables is not normally distributed.
One or both of the variables is non-numeric.
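Two of these pitfalls can be sketched with made-up numbers: a perfect but non-linear relationship that gives r = 0, and a single outlier that produces a large r from otherwise scattered points. The helper and all data values are hypothetical, and the example is not a reconstruction of the figures referred to above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Non-linear: y = x squared is an exact relationship, yet r comes out as 0.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [x * x for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)

# Outlier: five scattered points, then the same points plus one extreme point.
base_x, base_y = [1, 2, 3, 4, 5], [3, 1, 4, 2, 3]
r_without = pearson_r(base_x, base_y)
r_with = pearson_r(base_x + [30], base_y + [30])

print(round(r_curve, 3), round(r_without, 3), round(r_with, 3))
```

In this sketch the outlier alone lifts r from a weak value to one close to 1, which is why a scatterplot is worth checking before trusting the coefficient.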
When can Spearman Rank Correlation Coefficient, Rho, be used
This correlation coefficient can be used when the data is not normally distributed, when one or both of the variables are ordinal, or when the sample size is small.
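A minimal sketch of one common way to compute rho: replace each value by its rank (tied values share the average rank), then apply the Pearson formula to the ranks. The helper names and the data are assumptions for illustration.

```python
import math

def average_ranks(values):
    """1-based ranks of the values; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y always rises with x, but along a curve, not a line.
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(round(rho, 3), round(r, 3))
```

Because only the ordering matters, rho reaches 1 for any consistently increasing relationship, while r stays below 1 when that relationship is curved.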
What is linear regression
A term used to describe fitting a straight line to points on a scatterplot.
What are residuals
The residuals are the differences between the observed values and the values predicted by the model.
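The two cards above can be sketched together: fit a straight line, then subtract the predicted values from the observed ones. The cards do not say how the line is fitted, so ordinary least squares is assumed here; the function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(xs, ys, slope, intercept):
    """Observed y minus the y predicted by the fitted line, for each point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Hypothetical observations that lie close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
res = residuals(xs, ys, slope, intercept)
print(round(slope, 3), round(intercept, 3))
print([round(e, 3) for e in res])
```

With a least-squares fit that includes an intercept, the residuals sum to approximately zero; plotting them is a common way to judge whether the relationship is roughly linear.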
Give an assumption and requirement of a regression analysis
Relationship must be approximately linear
The ‘residuals’ have to be normally distributed