Statistics & Data Analysis Flashcards
Define statistics
The practice of collecting and analysing numerical data in large quantities, especially to infer proportions in a whole from those in a representative sample
Why do we need statistics?
To distinguish between randomness and systematic features in large datasets
What are the main tasks of statistical analysis?
Design of experiments and data collection, data description, tests of hypothesis, model fitting
What can probability often be interpreted as?
A population relative frequency (the long-run proportion of times the event occurs)
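A small simulation can make the relative-frequency reading concrete. This is only a sketch: the function name `relative_frequency`, the true probability of 0.3, and the fixed seed are all assumptions chosen for illustration.

```python
import random


def relative_frequency(n_trials: int, p_true: float, seed: int = 0) -> float:
    """Simulate n_trials yes/no events and return the observed proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = sum(1 for _ in range(n_trials) if rng.random() < p_true)
    return successes / n_trials


# With more trials, the observed proportion tends to settle near p_true.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n, p_true=0.3))
```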
What does probability theory use to assess uncertain situations mathematically?
Random variables and probability distributions
Define odds
The probability that the event will occur divided by the probability that it will not
What are the equations for converting between odds and probability (p)?
odds = p / (1 - p); p = odds / (1 + odds)
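The two conversions can be checked with a short sketch. The function names are hypothetical and the value p = 0.75 is just an example.

```python
def odds_from_probability(p: float) -> float:
    """Convert a probability p (0 <= p < 1) to odds: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def probability_from_odds(odds: float) -> float:
    """Convert non-negative odds back to a probability: odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


p = 0.75
odds = odds_from_probability(p)        # 0.75 / 0.25 = 3.0, i.e. odds of 3 to 1
p_back = probability_from_odds(odds)   # 3 / 4 = 0.75
print(odds, p_back)
```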
What is sensitivity?
The probability that a test is positive when the condition is truly present (the true positive rate)
Equation for sensitivity
Sensitivity = TP / (TP + FN)
What is specificity?
The probability that a test is negative when the condition is truly absent (the true negative rate)
Equation for specificity
Specificity = TN / (TN + FP)
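Both formulas can be applied directly to the four counts of a 2x2 table. The sketch below uses made-up counts (TP = 90, FN = 10, TN = 160, FP = 40), and the function names are hypothetical.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical screening results: 100 people with the condition, 200 without.
tp, fn, tn, fp = 90, 10, 160, 40
print(sensitivity(tp, fn))  # 90 / 100
print(specificity(tn, fp))  # 160 / 200
```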
What are the different study designs?
Experimental, observational, prospective vs retrospective, cross-sectional vs longitudinal, representative vs random sample
What are some key points of an experimental study?
Randomised controlled trial, intervention group vs placebo group
What are some key points about an observational study?
No intervention, cohort study (group of subjects linked in some way), case-control study
What are two types of qualitative data?
Nominal and Ordinal
What is the purpose of descriptive statistics?
To describe the distribution of a phenomenon such as height in a population
What do descriptive statistics provide measures of?
Location, dispersion (spread) of the data, association (for two variables)
What does the standard error of the mean (SEM) measure?
How much the sample mean varies across repeated samples from the same population
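In practice the SEM is usually estimated from a single sample as the sample standard deviation divided by the square root of the sample size. A minimal sketch, with a hypothetical function name and made-up heights:

```python
import math
import statistics


def standard_error_of_mean(sample: list[float]) -> float:
    """Estimate the SEM as s / sqrt(n), with s the sample standard deviation."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(n)


heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]  # made-up sample
sem = standard_error_of_mean(heights_cm)
print(round(sem, 3))  # sqrt(62.5 / 5) = sqrt(12.5), about 3.536
```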
What does skewness quantify?
The degree of asymmetry of a distribution, i.e. its distortion from the symmetric shape of a normal distribution
For skewed distributions, is it better to report the mean or the median?
Median
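The sketch below shows why the median is preferred: one extreme value drags the mean upward while the median barely moves. The data are made up, and `moment_skewness` is a hypothetical helper using the simple moment-based formula m3 / m2^1.5 (other skewness estimators exist).

```python
import statistics


def moment_skewness(data: list[float]) -> float:
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


incomes = [20, 22, 25, 27, 30, 32, 35, 200]  # made-up values with one large outlier
print(statistics.mean(incomes))       # 391 / 8 = 48.875, pulled up by the outlier
print(statistics.median(incomes))     # (27 + 30) / 2 = 28.5
print(moment_skewness(incomes) > 0)   # positive skew: long right tail
```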
What is kurtosis and what does it describe?
A measure of how peaked a distribution is; it describes how heavily the tails differ from those of a normal distribution
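One common convention reports excess kurtosis, m4 / m2^2 - 3, so that a normal distribution scores 0. The helper name and both datasets below are assumptions for illustration only.

```python
def excess_kurtosis(data: list[float]) -> float:
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (0 for a normal distribution)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2 - 3


flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]           # evenly spread, light tails
heavy = [-10, 0, 0, 0, 0, 0, 0, 0, 10]       # sharp peak with extreme tail values
print(excess_kurtosis(flat) < 0)   # lighter tails than a normal distribution
print(excess_kurtosis(heavy) > 0)  # heavier tails than a normal distribution
```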
What term describes dependencies between two variables?
Correlation
Two commonly used measures of correlation
Pearson and Spearman
When is Pearson correlation used?
For quantitative data with a linear relationship
When is Spearman correlation used?
Quantitative or ordinal data, monotonic relationship, good for non-normal data, based on ranks of the data
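The difference between the two measures shows up on data that are monotonic but not linear. The sketch below implements both from scratch with hypothetical helper names; Spearman is computed as the Pearson correlation of the ranks, with ties given their average rank.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # monotonic but clearly non-linear
print(pearson(x, y))     # below 1, because the relationship is not a straight line
print(spearman(x, y))    # 1 up to rounding, because the ordering is preserved
```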