eM2 Normality Flashcards
What is normality?
Normality describes whether data follow a normal (bell-shaped) distribution. It is used to decide how to describe the properties of large data-sets, i.e. which descriptive statistics (measures of central tendency and dispersion) are presented instead of the raw data.
What is the difference between normal, kurtosis and skewness?
What is a skewed graph?
A skewed graph shows data that are asymmetric, with most data points clustered at the high or low end of the range and an uneven tail. A left-skewed distribution has a long left tail. A right-skewed distribution has a long right tail.
What is a negative skew?
Left-skewed distributions are also called negatively-skewed distributions. That’s because there is a long tail in the negative direction on the number line. The mean and median are also to the left of the peak.
Briefly describe the placements of mean, median and mode on right and left skewed data
What is a positive skew?
Right-skewed distributions are also called positive-skew distributions. That’s because there is a long tail in the positive direction on the number line. The mean and median are also to the right of the peak.
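As a rough illustration of the two skew cards above, this Python sketch compares the mode (the peak), median and mean of a small right-skewed data set and its mirror image. The data values and the helper name `skewness` are made up for illustration; the coefficient used is the simple moment-based one, which is only one of several definitions.

```python
from statistics import mean, median, mode

def skewness(values):
    """Moment-based skewness: positive = long right tail, negative = long left tail."""
    n = len(values)
    m = mean(values)
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical right-skewed data: most values are low, a few are high.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 10, 15]
# Mirror image of the same data, which gives a long left tail.
left_skewed = [-x for x in right_skewed]

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(name, "mode:", mode(data), "median:", median(data),
          "mean:", round(mean(data), 2), "skewness:", round(skewness(data), 2))
```

For the right-skewed data the order is mode, then median, then mean; for the left-skewed data the order is reversed.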
What is kurtosis?
Kurtosis describes data that are heavy-tailed or light-tailed relative to a normal distribution.
What are heavy and light tails?
Data sets with high kurtosis tend to have heavy tails, i.e. more outliers than a normal distribution, which create a very wide distribution. Data sets with low kurtosis tend to have light tails, i.e. fewer outliers than a normal distribution, which create a very narrow distribution.
How can normality be assessed?
Normality can be visually assessed by evaluating a frequency bar-chart or histogram.
Shapiro-Wilk test: used to test for normality with small sample sizes (n<50)
Kolmogorov-Smirnov test: used to test for normality with large sample sizes (n>50)
What is the important p-value for normality tests?
A p-value <0.05 is considered to indicate a violation of normality (i.e. the data are NOT normally distributed).
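The decision rules on the cards above can be written out as a small Python sketch. Both function names are hypothetical helpers, not part of any library, and the p-values passed in are made-up numbers. In practice the p-value would come from statistical software, for example `scipy.stats.shapiro` in Python.

```python
def choose_normality_test(n):
    """Pick a test by sample size, following the n = 50 cut-off on the cards."""
    # Assumption: a sample of exactly 50 is treated as "large".
    return "Shapiro-Wilk" if n < 50 else "Kolmogorov-Smirnov"

def violates_normality(p_value, alpha=0.05):
    """True when the p-value indicates the data are NOT normally distributed."""
    return p_value < alpha

# Made-up examples: a small sample with p = 0.03 and a large sample with p = 0.41.
print(choose_normality_test(30), violates_normality(0.03))
print(choose_normality_test(200), violates_normality(0.41))
```

The first example would be reported as a violation of normality; the second would not.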
What are descriptive statistics?
Descriptive statistics are used to summarise large data-sets in a manageable format. The most basic type of descriptive statistic is a measure of central tendency, which can be the mean, the median or the mode.
How do you measure central tendency?
Mean, median and mode
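A minimal Python sketch of the three measures, using the standard `statistics` module on a made-up data set:

```python
from statistics import mean, median, mode

scores = [1, 2, 2, 3, 7]  # hypothetical data set

average = mean(scores)    # sum divided by count: 15 / 5
middle = median(scores)   # middle value once the data are sorted
most_common = mode(scores)  # most frequently occurring value

print(average, middle, most_common)
```

Here the single high value (7) pulls the mean above the median and the mode.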
How do you measure dispersion?
Standard deviation
Variance
Range
What is range?
The difference between the largest data value and the smallest data value. It measures the total spread of a set of numbers, from the lowest value to the highest.
What is variance?
A measure of the spread of the numbers away from the mean value. It is calculated by working out the average of the squared differences from the mean. You are not required to know how to calculate this for the RDS course.
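For reference only, the three dispersion measures above can be sketched in Python with the standard `statistics` module. The data set is made up. `pvariance` and `pstdev` are used because they match the "average of the squared differences" definition on the card; many software packages instead report the sample version, which divides by n - 1.

```python
from statistics import pvariance, pstdev

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set with a mean of 5

data_range = max(values) - min(values)  # largest minus smallest: 9 - 2
variance = pvariance(values)            # average squared difference from the mean: 32 / 8
std_dev = pstdev(values)                # square root of the variance

print(data_range, variance, std_dev)
```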