Statistics Flashcards
Risk definition
The probability of an event happening in a given period of time
Odds definition
The ratio of the probability an event will happen to the probability of it not happening
Absolute risk definition
The likelihood of an event occurring under specific conditions (risk of developing disease over time)
Relative risk definition
Likelihood of an event occurring in one group compared with another (e.g. the likelihood of getting a disease when exposed vs not exposed)
Prevalence definition
Proportion of a population with a characteristic (disease) at a particular point in time
What information do you need to measure the outcome of absolute risk?
Incidence
Prevalence
Odds
Hazard rate
What information do you need to measure the outcome of relative risks?
Risk ratio
Odds ratio
Hazard ratio
Risk example question: “the risk of obesity in bull terriers: total 334 sampled & 20 confirmed obese”
Work out the risk.
20/334 = 0.0599 ≈ 6% of bull terriers obese
Odds example question: “20 obese SBT out of 334 dogs” work out the odds.
334-20 = 314 not obese
Odds = 20/314 = 0.064 (about 0.064 to 1, or roughly 1 to 16)
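The two worked answers above can be checked with a short Python sketch. The counts (334 sampled, 20 obese) come from the cards; the function and variable names are illustrative only.

```python
# Counts taken from the bull terrier example in the cards.
total_dogs = 334
obese_dogs = 20


def risk(cases: int, total: int) -> float:
    """Risk = cases divided by everyone sampled."""
    return cases / total


def odds(cases: int, total: int) -> float:
    """Odds = cases divided by non-cases."""
    return cases / (total - cases)


bt_risk = risk(obese_dogs, total_dogs)
bt_odds = odds(obese_dogs, total_dogs)

print(f"risk = {bt_risk:.4f}")  # 0.0599
print(f"odds = {bt_odds:.4f}")  # 0.0637

# Odds can also be derived from risk: odds = risk / (1 - risk).
print(f"odds from risk = {bt_risk / (1 - bt_risk):.4f}")
```

When the event is rare, as here, risk and odds are close in value; they diverge as the event becomes more common.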
What are the pros and cons to using risk ratios?
More accurate reflection of population
Easier to interpret
- harder to calculate
What are the pros and cons to using odds ratio?
Simple to calculate
Can make decisions from results
Info of one outcome vs another
- can’t estimate prevalence of disease
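To see how a risk ratio and an odds ratio differ in practice, here is a minimal sketch for a 2x2 table. The counts are hypothetical (invented for illustration, not from the cards), and the variable names are assumptions.

```python
# Hypothetical 2x2 table (made-up counts for illustration only).
exposed_cases, exposed_healthy = 30, 70
unexposed_cases, unexposed_healthy = 10, 90

# Risk in each group = cases / group total.
risk_exposed = exposed_cases / (exposed_cases + exposed_healthy)
risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_healthy)
risk_ratio = risk_exposed / risk_unexposed

# Odds in each group = cases / non-cases.
odds_exposed = exposed_cases / exposed_healthy
odds_unexposed = unexposed_cases / unexposed_healthy
odds_ratio = odds_exposed / odds_unexposed

print(f"risk ratio = {risk_ratio:.2f}")  # 3.00
print(f"odds ratio = {odds_ratio:.2f}")  # 3.86
```

With these made-up counts the odds ratio is further from 1 than the risk ratio, which is one reason the two should not be read interchangeably when the outcome is common.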
Confidence interval definition
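Usually reported at the 95% level
Range of values likely to contain the true parameter value
Roughly the estimate ± 2 SE (standard errors); 1.96 SE for 95%
If the study were repeated many times, about 95% of the intervals would contain the true value
The ± 2 SE shortcut assumes an approximately normal sampling distribution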
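As a rough illustration, a 95% confidence interval can be put around the bull terrier obesity risk from the earlier card (20 of 334). This sketch assumes the simple normal-approximation (Wald) interval; the cards do not name a method, and other intervals exist for proportions.

```python
import math

# Counts from the bull terrier example in the cards.
cases, n = 20, 334

p_hat = cases / n                                   # sample proportion
std_error = math.sqrt(p_hat * (1 - p_hat) / n)      # standard error of a proportion
z = 1.96                                            # multiplier for a 95% interval

ci_lower = p_hat - z * std_error
ci_upper = p_hat + z * std_error

print(f"estimate = {p_hat:.3f}")
print(f"95% CI   = {ci_lower:.3f} to {ci_upper:.3f}")
```

The interval comes out at roughly 3.4% to 8.5%, so the 6% estimate carries a fair amount of uncertainty at this sample size.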
What is the P value to be considered statistically significant?
<0.05 (i.e. less than 5%)
What graph should be used to display categorical data?
Bar chart
What graph should be used to display continuous data?
Histogram
What graph should be used to display median and interquartile ranges?
Box & whisker plot
What can graphs be used to identify?
General shape of data (bell curve; positive correlation etc)
Centre of distribution (avg.)
Its spread
Outliers
Relationships between 2 variables
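What percentage of normally distributed data falls within 1 SD of the mean?
About 68%
What percentage of normally distributed data falls within 2 SD of the mean?
About 95%
What percentage of normally distributed data falls within 3 SD of the mean?
About 99.7%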
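The SD percentages on these cards are properties of the normal distribution, roughly 68%, 95% and 99.7%. A small sketch using only the standard library recovers them; the function name is illustrative.

```python
import math


def coverage_within(k_sd: float) -> float:
    """Probability that a normally distributed value lies within k_sd SDs of the mean."""
    return math.erf(k_sd / math.sqrt(2))


for k in (1, 2, 3):
    # Prints 68.3%, 95.4% and 99.7% respectively.
    print(f"within {k} SD: {coverage_within(k):.1%}")
```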
What is a dependent variable? & what axis is it plotted on?
The thing you measure
Y axis
What is an independent variable? & what axis is it plotted on?
The thing you are changing
X axis
What is the Null hypothesis?
States there is no effect or difference (and is assumed to be true until the data give enough evidence to reject it)
If we have normally distributed data, what test do we use to find out if they are from two different populations?
Student's t-test
If we have not normally distributed data, what test do we use to find out if they are from two different populations?
Mann-Whitney rank test
What test do we use for parametric data?
Student's t-test
What test do we use for non-parametric data?
Mann-Whitney rank test
What is parametric data?
Normally distributed data
What is non-parametric data?
Not normally distributed data
What is a type I error?
Rejecting the null hypothesis when it is true
(Thinking something is happening when it isn’t)
What is a type II error?
Failing to reject the null hypothesis when it is false
(Thinking nothing is happening when it is)
What test do you use when the independent variable is categorical and the dependent variable is continuous?
t-test (parametric)
Mann-Whitney (non-parametric)
What test do you use for observations that are paired?
Paired t test (for 2 groups)
General linear model (more than 2 groups)
What test do you use where both dependent and independent variables are continuous?
Linear regression (parametric)
Spearman rank correlation (non-parametric)
What test do you use where the dependent variable is categorical?
Chi-squared test
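For the categorical case, the chi-squared statistic can be worked out by hand: compare each observed count with the count expected if exposure and outcome were unrelated. The table below is hypothetical (made-up counts), and this sketch computes only the statistic, without a continuity correction or an exact p-value.

```python
# Hypothetical 2x2 table: rows = exposed / unexposed, columns = diseased / healthy.
observed = [[30, 70],
            [10, 90]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_squared = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if rows and columns were independent.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_squared += (obs - expected) ** 2 / expected

# A 2x2 table has 1 degree of freedom; the 5% critical value is about 3.84.
CRITICAL_VALUE_DF1 = 3.841
print(f"chi-squared = {chi_squared:.2f}")  # 12.50
print("significant at 0.05" if chi_squared > CRITICAL_VALUE_DF1 else "not significant at 0.05")
```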
What is the incidence rate?
New cases that occur over time
(Number of new cases/total animal time at risk)
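The incidence rate formula above can be sketched as follows. The follow-up times and case count are hypothetical, invented only to show how animal-time at risk is summed.

```python
# Hypothetical follow-up: years each cow was observed while at risk (made-up values).
years_at_risk = [1.0, 2.0, 0.5, 1.5, 2.0, 1.0]
new_cases = 2  # hypothetical number of new cases seen during follow-up


def incidence_rate(cases: int, animal_time: float) -> float:
    """New cases divided by total animal-time at risk."""
    return cases / animal_time


total_cow_years = sum(years_at_risk)  # 8.0 cow-years
rate = incidence_rate(new_cases, total_cow_years)

print(f"{rate:.2f} cases per cow-year")          # 0.25
print(f"{rate * 100:.0f} cases per 100 cow-years")  # 25
```

Because the denominator is time at risk rather than a head count, animals followed for different lengths of time contribute in proportion to how long they were observed.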
What is the denominator population?
The number of individuals in the population at the start of the observation period, i.e. how many cows do you have in total?
What is prevalence?
A proportion of a population with a certain disease at a particular point in time
What is passive surveillance?
Uses existing data
No defined population
No defined unit of measure
Misses subclinical cases
Not all owners will take their animal to a vet, so some cases go unreported
Owners may not allow samples
Relies on reporting and routine data
What is active surveillance?
Active looking/collection of disease info
May miss new diseases - looking for specific ones
Screen unwell & healthy ones
Systematic detection of cases
Data comparable over time or between areas
Expensive and time consuming
What is random sampling?
A sampling approach in which every member of the population has an equal chance of being included
What is stratified random sampling?
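Sampling randomly within each defined stratum (subgroup) of the population, e.g. a random sample of cats from each age group (taking every 7th cat would be systematic sampling)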
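The difference between simple and stratified random sampling can be sketched with Python's standard `random` module. The population of 100 cats, the age groups, and the 10% sampling fraction are all hypothetical, and `stratified_sample` is an illustrative helper rather than a library function.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 30 kittens and 70 adult cats.
population = [
    {"id": i, "age_group": "kitten" if i < 30 else "adult"}
    for i in range(100)
]

# Simple random sampling: every cat has the same chance of selection.
simple_sample = rng.sample(population, k=10)


def stratified_sample(units, key, fraction, rng):
    """Draw a random sample of the given fraction from within each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        n = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k=n))
    return sample


strat_sample = stratified_sample(population, "age_group", 0.1, rng)
kittens = sum(1 for cat in strat_sample if cat["age_group"] == "kitten")
adults = sum(1 for cat in strat_sample if cat["age_group"] == "adult")
print(f"stratified sample: {kittens} kittens, {adults} adults")
```

The stratified sample always contains 3 kittens and 7 adults here, matching the population split, whereas the simple random sample may over- or under-represent either group by chance.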
What is standard error?
A measure of uncertainty in an estimate from a sample
What is the standard error of the mean?
How close the mean of your sample is to the true mean of the population
(Standard error gets smaller as sample size increases)
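The standard error of the mean is the sample standard deviation divided by the square root of the sample size. The sketch below uses hypothetical weights (made-up values) and illustrative function names to show the calculation and the effect of sample size.

```python
import math
import statistics


def sem_from_sd(sd: float, n: int) -> float:
    """Standard error of the mean = SD / sqrt(n)."""
    return sd / math.sqrt(n)


def standard_error_of_mean(values) -> float:
    """Standard error of the mean computed directly from a sample."""
    return sem_from_sd(statistics.stdev(values), len(values))


# Hypothetical body weights in kg (made-up values for illustration).
weights = [18.2, 19.5, 20.1, 21.0, 19.2]
print(f"SEM of sample = {standard_error_of_mean(weights):.3f}")

# With the same SD, quadrupling the sample size halves the standard error.
print(sem_from_sd(4.0, 16))  # 1.0
print(sem_from_sd(4.0, 64))  # 0.5
```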
What is bias?
A systematic error that leads to results that are consistently too large or too small
What is the confidence interval?
Range of values believed to contain the true parameter value (usually at the 95% level)
Why design a study?
To be sure about efficacy
To determine a risk factor
Applicable to rest of the population
Avoid bias
What are the 4 main study types?
Cross sectional
Cohort
Case control
Randomised controlled trials
What is a cross sectional study?
Surveys, lab experiments
Snapshot of information at one point in time
Can calculate prevalence, relative risk and attributable risk
Cannot differentiate cause and effect
What is a cohort study?
Follow target group for period of time
Compare outcomes in exposed and non-exposed environment
Measures incidence rate, relative risk, attributable risk
Monitor several diseases simultaneously
Estimate disease incidence
Determines causality
Need large population
Long time
Costly
What is a case control study?
2 groups: cases & controls
Accurate & consistent case definition
Calculate using odds ratio
Can study rare diseases
Get background info quickly
Liable to bias
Can’t estimate disease incidence
What is a randomised controlled trial?
Planned experiment
See if treatment has an effect
Population must be cases
2 groups: treated or non treated
What are the 3 types of randomised controlled trials?
Single blind: participants (or owners) don’t know which treatment they receive
Double blind: operator also doesn’t know
Triple blind: statistician also doesn’t know
What is the hierarchy of evidence tool?
It ranks study types by the strength of the evidence they provide
What are two types of bias?
Selection bias
Confounding bias
What is selection bias?
Occurs before the study begins
Sample selection doesn’t represent target population
What are examples of selection bias?
Choice of comparison groups
Non response bias
Missing data
Loss to follow up
Healthy worker effect
What is confounding bias?
Mixing together the effects of two or more factors that are related to each other and the outcome.
Will cause incorrect classification of outcome and exposure
What is an example of confounding bias?
Diagnostic test with imperfect sensitivity or specificity
How are diagnostic tests’ performance measured?
Sensitivity
Specificity
What is sensitivity?
The probability that an animal with the disease is identified by the test
The proportion of true positives detected
What is specificity?
The probability that an animal without the disease tests negative
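Sensitivity and specificity can both be read off a 2x2 table of test result against true disease status. The counts below are hypothetical (made up for illustration), and the function names are assumptions.

```python
# Hypothetical test results against true disease status (made-up counts).
true_positives = 45   # diseased, test positive
false_negatives = 5   # diseased, test negative
true_negatives = 90   # not diseased, test negative
false_positives = 10  # not diseased, test positive


def sensitivity(tp: int, fn: int) -> float:
    """Proportion of diseased animals that the test correctly identifies."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of non-diseased animals that the test correctly clears."""
    return tn / (tn + fp)


sens = sensitivity(true_positives, false_negatives)
spec = specificity(true_negatives, false_positives)

print(f"sensitivity = {sens:.0%}")  # 90%
print(f"specificity = {spec:.0%}")  # 90%
```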