Key Concepts Flashcards
What are the two different types of data?
Categorical and quantitative
Name three types of categorical data and give an example of each
Binary (two levels) - e.g. Are you a smoker? Yes or no
Nominal (no ranking) - e.g. ethnicity
Ordinal (ranked) - e.g. height category (short, medium, tall)
Name two types of quantitative data
Discrete (isolated values) e.g. number of therapy sessions completed 1, 2, 3
Continuous (any values in interval) e.g. age, clinical scales
Give 4 factors that define a normal distribution of continuous data and include an example
Symmetrical
Most data close to the middle
Extreme values are rare
Mathematically helpful
E.g. height of men
How does positive/right skewed distribution of continuous data appear?
Most values are clustered on the left side of the distribution, while the right tail is longer
How does negative/left skewed distribution of continuous data appear?
Most values are clustered on the right side of the distribution, while the left tail is longer
What is a fat-tailed distribution?
Where extreme values are more likely
E.g. Distribution of wealth, 80/20 rule
What can make classical statistics difficult?
Fat tailed distribution
What do descriptive statistics describe?
Data collected
What cannot be used to make inferences about the wider population, because values in the true population could differ due to chance?
Descriptive statistics
What is typically used to describe quantitative (continuous) data?
A measure of the average (mean or median)
A measure of variability (standard deviation, quartiles)
For symmetric data, the mean equals…
median
True or false:
For skewed data, the mean does not equal the median
True
What is sensitive to outliers?
Mean
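A minimal Python sketch of why the mean is sensitive to outliers while the median barely moves. The session counts below are made-up illustrative values, not data from the course.

```python
from statistics import mean, median

# Hypothetical number of therapy sessions completed by five people
sessions = [2, 3, 3, 4, 5]

# The same data with one extreme value added
sessions_with_outlier = sessions + [40]

mean_before = mean(sessions)                    # 17 / 5 = 3.4
median_before = median(sessions)                # middle value = 3

mean_after = mean(sessions_with_outlier)        # 57 / 6 = 9.5
median_after = median(sessions_with_outlier)    # (3 + 4) / 2 = 3.5

print(f"mean:   {mean_before} -> {mean_after}")
print(f"median: {median_before} -> {median_after}")
```

One extreme value nearly triples the mean here, while the median shifts by only half a session.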
What is on the same scale as your data?
Standard deviation
What is not on the same scale as your data?
Variance
What are the two main approaches to measuring variability?
- SD and variance
- Percentiles
What is the difference between standard deviation and variance?
Variance is the average of the squared deviations from the mean, while standard deviation is the square root of the variance
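A short worked sketch of the relationship between variance and standard deviation, using made-up values. It assumes the "average of squared deviations" form that divides by n; sample formulas that divide by n - 1 give slightly larger results.

```python
from math import sqrt

# Hypothetical measurements (illustrative only); their mean is 5
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = sum(values) / len(values)

# Squared deviation of each value from the mean
squared_deviations = [(x - mean_value) ** 2 for x in values]

# Variance: the average of the squared deviations (in squared units)
variance = sum(squared_deviations) / len(values)

# Standard deviation: square root of the variance (same units as the data)
standard_deviation = sqrt(variance)

print(variance, standard_deviation)
```

Taking the square root is what puts the standard deviation back on the same scale as the data, which is why the variance is not on that scale.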
What is the empirical rule?
The percentage of values that lie within a given number of standard deviations of the mean in a normal distribution: about 68%, 95%, and 99.7% of the values lie within one, two, and three standard deviations of the mean, respectively.
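The empirical rule can be checked with Python's standard library. This is a sketch using a standard normal distribution (mean 0, SD 1); the helper name share_within is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The proportions do not depend on the particular mean or SD, so the same percentages hold for any normal distribution.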
What descriptive statistics are used to describe categorical data?
Binary and multinomial data:
Number and proportion in each category
Ordinal data:
Small number of categories: Number and proportion in each category
Larger number of categories: Median and 25th and 75th percentiles
Mean (SD) - less common.
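A minimal sketch of these summaries in Python. Both datasets are hypothetical: a binary smoker variable and a 0-10 pain rating standing in for ordinal data with many categories. Quantile conventions differ between software packages, so the percentiles from other tools may not match exactly.

```python
from collections import Counter
from statistics import quantiles

# Binary data: number and proportion in each category
smoker = ["yes", "no", "no", "yes", "no", "no", "no", "yes"]
counts = Counter(smoker)
proportions = {level: n / len(smoker) for level, n in counts.items()}

for level in counts:
    print(f"{level}: n = {counts[level]}, proportion = {proportions[level]:.3f}")

# Ordinal data with many categories: median and 25th/75th percentiles
pain = [1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9]
q25, q50, q75 = quantiles(pain, n=4)  # default "exclusive" method

print(f"median = {q50}, 25th percentile = {q25}, 75th percentile = {q75}")
```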
What is statistical inference?
Making statements about the population from the sample
What does statistical inference not address?
- If a study is biased
- If observed associations are causal