All stats Flashcards
Name the 2 broad categories data can be split into
- Categorical
- Quantitative
What can categorical data be split into?
- Binary
- Nominal
- Ordinal
What can quantitative data be split into?
- Discrete
- Continuous
What is binary data?
Categorical data with only 2 possible categories
Give an example of binary data
Success/failure
Yes/no
What is nominal data?
Categorical data with more than 2 categories and no natural order
Give an example of nominal data
Eye colour
Hair colour
Hair type
What is ordinal data?
Categorical data whose categories have a natural order
Give an example of ordinal data
Happiness rating on a scale of 1-10
Customer service rating of 1-5
What is discrete data?
Numerical data that can only take distinct, countable values (usually whole numbers)
Give examples of discrete data
- Number of kids
- Movie rating in stars
What is continuous data?
Numerical data that can take any value within a range (measured rather than counted)
Give examples of continuous data
Height
Time
Weight
Name the best way to represent categorical data
In a bar chart
Name the best way to represent continuous data
Histogram or box plot
Define skewness
Skewness is a measure of the asymmetry of a probability distribution around its mean
Name the 3 ways to describe skewness
- Left skew
- Symmetrical
- Right skew
Describe the relationship between median and mean in a data set that is left skewed
Mean < median
Describe the relationship between median and mean in a data set that is right skewed
Mean > median
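A minimal sketch of the two skew cards above, using made-up numbers (the lists `right_skewed` and `left_skewed` are illustrative, not from the course material). One very large value pulls the mean above the median; mirroring the data reverses the effect.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: most values are small, one is very large.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]

# Mirror image of the same sample, which makes it left-skewed.
left_skewed = [-x for x in right_skewed]

# Right skew: the long right tail drags the mean above the median.
print("right skew:", mean(right_skewed), median(right_skewed))

# Left skew: the long left tail drags the mean below the median.
print("left skew: ", mean(left_skewed), median(left_skewed))
```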
What is central tendency?
A measure of the centre, or typical value, of a data set
Give examples of central tendency measures
Mean
Median
Mode
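As a quick illustration, the three measures can be computed with Python's built-in `statistics` module. The `scores` list is a made-up example.

```python
from statistics import mean, median, mode

# Hypothetical data set of six scores.
scores = [2, 3, 3, 5, 7, 10]

print("mean:  ", mean(scores))    # sum of the values divided by how many there are
print("median:", median(scores))  # middle value (average of the two middle values here)
print("mode:  ", mode(scores))    # most frequent value
```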
What are variation measures?
Measures of spread or variability
Give examples of variation measures
- Variance
- Standard deviation
What is the standard deviation?
A measure of the average scatter around the mean
The greater the spread of the data, the greater the SD
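A small sketch of the idea, assuming two made-up data sets (`tight` and `wide`) with the same mean but different spread. It uses the population versions (`pvariance`, `pstdev`); for a sample, `variance` and `stdev` would be the usual choice.

```python
from statistics import mean, pstdev, pvariance

# Two hypothetical data sets with the same mean (5) but different spread.
tight = [4, 5, 5, 5, 6]
wide = [1, 3, 5, 7, 9]

for name, data in (("tight", tight), ("wide", wide)):
    # The standard deviation is the square root of the variance.
    print(name, "mean:", mean(data),
          "variance:", pvariance(data),
          "SD:", round(pstdev(data), 2))
```

The more spread-out set reports the larger variance and SD even though both sets share the same mean.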
What is normal distribution used to describe?
Used to describe continuous data that forms a bell shaped symmetrical curve
What is a key characteristic of normally distributed data?
Mean, median and mode are all equal
What symbol do we use to represent the mean?
μ
What symbol do we use to represent the SD?
σ
Give examples of data that could be normally distributed
- Height
- Age
- Weight
- Bone density
- Exam scores
- Blood pressure (BP)
How do we check for normality?
- Look at the histogram: does it appear bell-shaped?
- Are the mean, median and mode similar?
- Do about 2/3 (roughly 68%) of the data lie within 1 SD of the mean?
- Run numerical tests of normality
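Two of these checks can be sketched in a few lines. The data here are simulated, and the mean of 170 and SD of 10 are made-up values chosen only for illustration.

```python
import random
from statistics import mean, median, pstdev

random.seed(0)  # fixed seed so the run is repeatable

# Simulated "heights"; the parameters 170 and 10 are assumptions.
data = [random.gauss(170, 10) for _ in range(10_000)]

m, s = mean(data), pstdev(data)

# Share of observations that fall within one SD either side of the mean.
within_1sd = sum(m - s <= x <= m + s for x in data) / len(data)

print(f"mean={m:.1f}, median={median(data):.1f}")
print(f"share within 1 SD: {within_1sd:.2f}")  # close to 0.68 for normal data
```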
Describe a Q-Q plot for normally distributed data
- The points follow a straight diagonal line
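A rough sketch of what sits behind a Q-Q plot, without drawing it: the sorted sample values are paired with the quantiles a normal distribution would predict, and the closer the pairs lie to a straight line, the closer their correlation is to 1. The helper names (`qq_correlation`, `pearson_r`) and the plotting positions (i + 0.5) / n are assumptions made for this example.

```python
import random
from statistics import NormalDist, mean, pstdev

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (len(xs) * pstdev(xs) * pstdev(ys))

def qq_correlation(sample):
    """Correlation between sorted data and theoretical normal quantiles."""
    ordered = sorted(sample)
    n = len(ordered)
    # Quantiles of a standard normal at positions (i + 0.5) / n.
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return pearson_r(theoretical, ordered)

random.seed(1)
normal_sample = [random.gauss(0, 1) for _ in range(500)]
skewed_sample = [random.expovariate(1.0) for _ in range(500)]

r_normal = qq_correlation(normal_sample)  # points hug a straight line
r_skewed = qq_correlation(skewed_sample)  # points curve away from the line
print(f"normal: r={r_normal:.3f}, skewed: r={r_skewed:.3f}")
```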
Give examples of numerical tests we can use to assess normality
- Kolmogorov-Smirnov
- Shapiro-Wilk
What requirement must a quantitative data set fulfil before we can apply the central limit theorem to it?
Sample size must be larger than 30 (a common rule of thumb)
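A small simulation can make the central limit theorem concrete. This is a sketch under assumptions: the population is exponential (strongly right-skewed, with mean 1 and SD 1), and the number of repeated samples (5,000) is arbitrary. The means of samples of size 30 cluster around the population mean, with a spread close to σ/√n.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(42)    # fixed seed so the run is repeatable
n = 30             # size of each sample
num_samples = 5_000

# Draw many samples from a skewed population and record each sample's mean.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

print(f"mean of sample means: {mean_of_means:.3f}")  # near the population mean, 1
print(f"SD of sample means:   {sd_of_means:.3f}")    # near 1 / sqrt(30)
print(f"sigma / sqrt(n):      {1 / sqrt(n):.3f}")
```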
What do μ and σ stand for, and what do they determine on a curve for normally distributed data?
μ is the mean and σ is the standard deviation
Together they determine the shape of the curve: μ sets where it is centred and σ sets how wide it is
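To see how the mean and standard deviation control the curve, here is a minimal sketch using `statistics.NormalDist` from the Python standard library. The parameter values are made up for illustration.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)     # same centre, larger SD
shifted = NormalDist(mu=3, sigma=1)  # same SD, different mean

# A larger SD gives a flatter, wider curve: the peak at the mean is lower.
print(narrow.pdf(0), wide.pdf(0))

# Changing the mean only moves the curve: the peak height is unchanged.
print(narrow.pdf(0), shifted.pdf(3))

# Area under the curve between mean - SD and mean + SD (about 0.68).
area_1sd = narrow.cdf(1) - narrow.cdf(-1)
print(round(area_1sd, 4))
```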