Key Concepts Flashcards
What are the two different types of data?
Categorical and quantitative
Name three types of categorical data and give an example of each
Binary (two levels) - e.g. Are you a smoker yes or no
Nominal (no ranking) - e.g. ethnicity
Ordinal (ranked) - e.g. pain severity (mild, moderate, severe)
Name two types of quantitative data
Discrete (isolated values) e.g. number of therapy sessions completed 1, 2, 3
Continuous (any values in interval) e.g. age, clinical scales
Give 4 factors that define a normal distribution of continuous data and include an example
Symmetrical
Most data close to the middle
Extreme values are rare
Mathematically helpful
E.g. height of men
How does positive/right skewed distribution of continuous data appear?
Most values are clustered towards the left side of the distribution, while the right tail of the distribution is longer
How does negative/left skewed distribution of continuous data appear?
Most values are clustered towards the right side of the distribution, while the left tail of the distribution is longer
What is a fat-tailed distribution?
Where extreme values are more likely
E.g. Distribution of wealth, 80/20 rule
What can make classical statistics difficult?
Fat tailed distribution
What do descriptive statistics describe?
Data collected
What cannot be used to make inference about the wider population as values in the true population could differ due to chance?
Descriptive statistics
What is typically used to describe quantitative (continuous) data?
A measure of the average (mean or median)
A measure of variability (standard deviation, quartiles)
In a symmetric distribution, the mean equals...
median
True or false:
In skewed data, the mean does not equal the median
True
What is sensitive to outliers?
Mean
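A small sketch of why the mean is sensitive to outliers while the median is not. The scores are made-up illustrative values, not data from the course.

```python
from statistics import mean, median

# Hypothetical scores; the second list adds a single extreme value.
scores = [10, 12, 13, 14, 16]
scores_with_outlier = scores + [100]

# Without the outlier the mean and median agree.
print(mean(scores), median(scores))

# One extreme value pulls the mean upwards; the median barely moves.
print(mean(scores_with_outlier), median(scores_with_outlier))
```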
What is on the same scale as your data?
Standard deviation
What is not on the same scale as your data?
Variance
What are the two main approaches to measuring variability?
- SD and variance
- Percentiles
What is the difference between standard deviation and variance?
Variance is the average squared deviations from the mean, while standard deviation is the square root of this number
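The relationship can be checked with a short sketch. The values are hypothetical. The functions used here divide by n, matching the "average squared deviation" definition above; in practice the sample versions (`statistics.variance`, `statistics.stdev`) divide by n - 1 instead.

```python
from math import isclose, sqrt
from statistics import pstdev, pvariance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

variance = pvariance(values)  # average squared deviation from the mean
sd = pstdev(values)           # square root of the variance

print(variance, sd)
print(isclose(sd, sqrt(variance)))
```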
What is the empirical rule?
The percentage of values that lie within an interval estimate in a normal distribution: 68%, 95%, and 99.7% of the values lie within one, two, and three standard deviations of the mean, respectively.
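The three percentages can be recovered from the normal cumulative distribution function. This is a minimal sketch using the standard normal distribution; `share_within` is an illustrative helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```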
What descriptive statistics are used to describe categorical data?
Binary and multinomial data:
Number and proportion in each category
Ordinal data:
Small number of categories: Number and proportion in each category
Larger categories for ordinal data: Median and 25th and 75th percentile
Mean (SD) - less common.
What is statistical inference?
Making statements about the population from the sample
What does statistical inference not address?
- If a study is biased
- If observed associations are causal
What are two different approaches to statistical inference?
- Frequentist
- Bayesian
What approach to statistical inference is more common in medical and psychological research at the moment?
Frequentist
What features define a frequentist approach to statistical inference?
- Using p-values, confidence intervals, maximum likelihood
- Inference is based on the observed data.
- Make probability statements about the data, given the value of a parameter: "The probability of observing data as extreme as this, given there is no treatment effect, is 3%."
- Different people will get the same results applying the same analysis to the same data.
What features define a Bayesian approach to statistical inference?
- Credible intervals, priors, posterior probability
- Incorporates prior beliefs into statistical inference
- Allows probability statements about parameters, given the data (and prior beliefs), e.g. "Given the data we have observed, there is a 97% chance the treatment is effective"
- Different people will get different results depending on their prior beliefs
In Bayesian and frequentist statistics, conclusions will be similar if...
The sample size is large enough and the prior beliefs are weak.
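One way to see this is a sketch with a proportion and a Beta prior. The Beta-binomial model, the two priors, and the 30% observed proportion are all assumptions chosen for illustration. With a small sample the strong prior pulls the estimate away from the data; with a large sample both analysts end up close to the observed proportion.

```python
def posterior_mean(prior_a, prior_b, successes, n):
    """Posterior mean of a proportion under a Beta(prior_a, prior_b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + n)

# Two hypothetical analysts with different prior beliefs.
weak_prior = (1, 1)     # Beta(1, 1): uniform, carries little information
strong_prior = (5, 45)  # Beta(5, 45): prior mean 0.10, worth 50 prior observations

for n in (20, 20000):
    successes = 3 * n // 10  # observed proportion fixed at 30%
    weak = posterior_mean(*weak_prior, successes, n)
    strong = posterior_mean(*strong_prior, successes, n)
    print(n, round(weak, 3), round(strong, 3))
```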
What is used as a measure of uncertainty when using a frequentist approach?
- Confidence interval
- p-value
What is a (1 - α) level confidence interval?
An interval of uncertainty around an estimate for a parameter
Confidence intervals are...
Intervals that, under repeated sampling, would contain the true value (1 - α) × 100% of the time (e.g. 95% of the time when α = 0.05).
What do we typically calculate?
95% confidence intervals
What is the standard error?
The standard deviation of an estimate's sampling distribution
Often the standard error can be calculated from the standard deviation of the population the statistic is being calculated on
True or false
True
How can you calculate the standard error for a mean?
Divide standard deviation by the square root of the sample size
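A minimal sketch of this calculation. The function name and the numbers are illustrative. Holding the standard deviation fixed at 10, quadrupling the sample size halves the standard error.

```python
from math import sqrt

def standard_error_of_mean(sd, n):
    """Standard error of a mean: standard deviation divided by the square root of n."""
    return sd / sqrt(n)

print(standard_error_of_mean(10, 25))
print(standard_error_of_mean(10, 100))
```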
The standard error will, in most cases..
Get smaller as n increases
The standard deviation does change systematically with sample size
True or false
False
The standard deviation does not change systematically with sample size
If we assume our estimate is from a normal distribution how can we calculate the confidence interval?
95% CI = estimate ± 1.96 × standard error
E.g. for a mean: 95% CI = estimate ± 1.96 × (standard deviation) / √n
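The calculation for a mean can be sketched as follows, assuming the estimate is normally distributed as stated above. The sample is hypothetical, and `mean_ci_95` is an illustrative name.

```python
from math import sqrt
from statistics import mean, stdev

def mean_ci_95(sample):
    """Normal-approximation 95% confidence interval for a mean."""
    estimate = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    return estimate - 1.96 * standard_error, estimate + 1.96 * standard_error

sample = [4, 6, 8, 10, 12]  # hypothetical measurements
lower, upper = mean_ci_95(sample)
print(round(lower, 2), round(upper, 2))
```

With only five observations a t-distribution multiplier would normally be preferred over 1.96; the sample is kept small here only so the arithmetic is easy to follow.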
As sample size increases confidence interval becomes what?
Smaller
For means what is often used instead of a normal distribution and what does this often lead to?
t-distribution
This leads to a multiplier for the standard error other than 1.96, usually fairly close to 2
What is a p-value?
The probability of observing the data, or data more extreme, given that the parameter of interest takes a given value.
What is a null hypothesis?
The value the parameter is set to take
Typically the null hypothesis is for no effect or association.
p-values reported from models are typically..
For the null hypothesis that the parameter equals zero
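A sketch of how a two-tailed p-value can be computed from an estimate and its standard error, assuming the estimate is normally distributed. The function name and the default null value of zero are illustrative choices.

```python
from statistics import NormalDist

def two_sided_p_value(estimate, standard_error, null_value=0.0):
    """Two-tailed p-value for an estimate under a normal approximation."""
    z = (estimate - null_value) / standard_error
    # Probability of a result at least this far from the null value, in either direction.
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(round(two_sided_p_value(1.96, 1.0), 3))
print(two_sided_p_value(0.0, 1.0))
```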
What is used to make decisions (or inference) about the value of a population parameter?
Statistical test of hypothesis
What does a statistical test of hypothesis consist of?
A statistical test of hypothesis consists of five parts
- The null hypothesis, denoted by H0
- The alternative hypothesis, denoted by H1
- One tailed: H1: parameter > H0
- Two tailed: H1: parameter ≠ H0
Two tailed p-values are almost always used
- The p-value
- A significance threshold (0.05)
When would we reject the null hypothesis?
If the p-value is below the significance threshold we reject the null hypothesis and conclude there is evidence for the alternative hypothesis
If the p-value is not below the significance threshold we do not have evidence to reject the null hypothesis.
Why is this?
This does not mean the null hypothesis is true
A non-significant p-value tells us we do not know much
What does "p < 0.05" mean?
There is evidence that there is a difference: If there was no difference we'd have been unlikely to see the data we did.
What does p > 0.05 mean?
There is insufficient evidence to conclude there is a difference. If there was no difference our results would not be unexpected. But we cannot rule out a difference.
If a p-value is not statistically significant we cannot conclude that there is no difference.
What are two errors from hypothesis tests?
Type 1 error (α)
Type 2 error (β)
What is a Type 1 error (α)?
Falsely conclude there is a difference
Controlled with significance threshold
If the significance threshold is 0.05 we expect a type 1 error rate of 5%
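The 5% figure can be checked by simulation. This sketch repeatedly draws data with no true effect and counts how often a test is significant. The z-test with known standard deviation, the number of simulated studies, and the seed are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulate_type1_rate(n_studies=2000, n=30, alpha=0.05, seed=0):
    """Share of simulated null studies that are (falsely) significant."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_studies):
        # The null hypothesis is true: data come from a normal with mean 0, SD 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) / (1 / sqrt(n))
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        if p_value < alpha:
            false_positives += 1
    return false_positives / n_studies

print(simulate_type1_rate())
```

The printed rate should sit close to 0.05, with some simulation noise.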
What is a Type 2 error (β)?
Fail to conclude that there is evidence for a difference when there is a true difference
Sample size, magnitude of true difference, and variability of data affect type 2 error rates
Power = 1 - β
Power is the probability of concluding there is a difference, when true.
Low powered test: Unlikely to be significant even if there is a difference
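A sketch of how sample size, the true difference, and variability feed into power. It assumes a two-sided one-sample z-test with a known standard deviation, which is a simplification; `power_z_test` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def power_z_test(true_difference, sd, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (standard deviation assumed known)."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)          # about 1.96 for alpha = 0.05
    shift = true_difference / (sd / sqrt(n))     # true difference in standard errors
    return z.cdf(shift - critical) + z.cdf(-shift - critical)

print(round(power_z_test(0.5, 1.0, 10), 2))   # small sample: low power
print(round(power_z_test(0.5, 1.0, 32), 2))   # larger sample: higher power
print(round(power_z_test(0.5, 2.0, 32), 2))   # more variable data: power drops again
```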
What can you determine when given a 1 - α level confidence interval?
Whether the p-value is statistically significant at the α level.
i.e. given a 95% confidence interval you can tell if the p-value would be significant at the 5% level
If the confidence interval contains the null value, p > 0.05
If the confidence interval does not contain the null value, p < 0.05
What is an example of this?
For example if the null hypothesis is 0:
95% CI of -1.1 to -0.1 would correspond to a statistically significant result
-1.1 to 0.1 would correspond to a result that was not statistically significant.
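The two intervals above can be checked with a short sketch. The second function recovers an approximate p-value from the interval, which only works under the assumption that the interval is symmetric and was built as estimate ± 1.96 × standard error. Both function names are illustrative.

```python
from statistics import NormalDist

def ci_excludes_null(lower, upper, null_value=0.0):
    """True if the interval does not contain the null value."""
    return not (lower <= null_value <= upper)

def p_value_from_ci(lower, upper, null_value=0.0):
    """Approximate two-sided p-value from a symmetric, normal-based 95% CI."""
    estimate = (lower + upper) / 2
    standard_error = (upper - lower) / (2 * 1.96)
    z = (estimate - null_value) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

for lower, upper in [(-1.1, -0.1), (-1.1, 0.1)]:
    print(lower, upper, ci_excludes_null(lower, upper),
          round(p_value_from_ci(lower, upper), 3))
```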
What increases the risk of type 1 error?
Multiple testing & p-hacking
What enhances the issue of multiple testing and p-hacking?
- Selective reporting enhances the problem, e.g.:
- Only report significant results and ignore non-significant results
- Place more emphasis on significant results
- Selective reporting can occur at the study level: studies with non-significant findings are less likely to be published
What are solutions to multiple testing and p hacking?
- Bonferroni correction: divide significance threshold by number of tests
- This can be conservative
- Leads to larger sample sizes being required
- Pre-specification of outcomes, analysis methods, and studies
- Can specify primary outcome - stops emphasis being shifted to significant results
- Makes visible the number of tests conducted
- Compulsory in randomised controlled trials e.g. All trials campaign http://www.alltrials.net/
- Harder to do in more exploratory studies
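The Bonferroni correction listed above can be sketched in a few lines, together with the reason it is needed. The second function assumes the tests are independent and that every null hypothesis is true; both function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after a Bonferroni correction."""
    return alpha / n_tests

def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests of true nulls."""
    return 1 - (1 - alpha) ** n_tests

n_tests = 10
print(round(familywise_error_rate(0.05, n_tests), 3))   # uncorrected: far above 5%
corrected = bonferroni_threshold(0.05, n_tests)
print(corrected, round(familywise_error_rate(corrected, n_tests), 3))
```

With ten uncorrected tests the chance of at least one false positive is roughly 40%. After correction it falls back to just under 5%, at the cost of a much stricter threshold for each individual test.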
What are some reasons for banning the p-value?
- Can be manipulated with multiple testing
- They are often misinterpreted
People often interpret p > 0.05 as meaning "no effect"
- Over-reliance on significance thresholds
p = 0.04 given wildly different interpretation to p = 0.06
- Bayesian argument:
p-value tells us probability of observing the data given no effect
What we want to know is probability of an effect. This can only be achieved with Bayesian inference.