Stats definitions Flashcards
Define ‘Statistical Inference’
To make inferences about a population from data contained within a sample
Define ‘Medical Statistics’
Assesses the size and strength of the influence of one or more exposure variables (risk factors or treatments) on the outcome variable of interest (such as occurrence of disease or survival)
Define ‘Evidence-Based medicine’
Appraises the evidence based on the average effect of a treatment assessed on a large number of people and judges its relevance to the management of a particular patient.
Name the 5 steps in the PPDAC cycle in order
- Problem
- Plan
- Data
- Analysis
- Conclusion
What is the difference between a ‘Treatment group’ and a ‘Control group’
The treatment group receives the intervention being tested while the control group does not, e.g. one group has a heart valve implanted while the other does not.
Define a ‘Variable’
Characteristic or attribute that can take on different values.
Define a ‘random variable’
A variable whose values occur due to some random process
Define ‘Data’
Observed values that the variable takes on
Define ‘Datasets’
Collections of data on several variables
Define a ‘Population’
The complete group of subjects that are being studied
Define a ‘Sample’
A group of subjects chosen from the population, i.e. a subset of the population
Name the 2 subsets of Variables
- Numerical
- Categorical
Name the 2 types of numerical variable explaining each with an example
- Continuous - A variable that can take on any value within a range, e.g. temperature, time, distance
- Discrete - A variable that is counted in distinct steps, e.g. counting sheep before you sleep or the amount of money collected
Name the 2 types of Categorical variable explaining each with an example
- Ordinal - Categories with a natural order, but the gaps between them can’t be compared arithmetically, e.g. education level: masters > bachelors > Nat 5
- Nominal - Categories with no natural order, e.g. ethnicity: no category ranks above another
Define a ‘population parameter’
A quantity or statistical measure that is fixed for a given population. It is used as the value of a variable in some general distribution or frequency function to make that function descriptive of the population.
Define a ‘Sample Statistic’ (also known as an estimator)
A value calculated from sample data. It varies from sample to sample, but is known once a sample has been observed.
Define an ‘explanatory’ and ‘response’ variable
- Explanatory variable - a variable, often treated as fixed, that is used to explain changes in the response
- Response variable - a random variable that might be affected by the explanatory variable
Define a ‘Confounding Variable’
A variable that is correlated with both explanatory and response variables
Define ‘Simple Random Sampling (SRS)’
SRS is where every unit within the population has, in theory, an equal chance of being included in the sample.
Define ‘Stratified sampling’
Divide the population into smaller groups (‘strata’) based on some shared characteristic. Another sampling method, such as SRS, is then applied within each stratum.
Define ‘Systematic Sampling’
Take every kth unit when sampling.
Define ‘Cluster Sampling’
The population is split into many groups, called clusters, each of which is representative of the population, and a fixed number of clusters are sampled.
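The four sampling methods above can be compared side by side in a short sketch. Everything here is an assumption made for illustration: the population of 100 numbered units, the two strata, the ten clusters, the sample sizes, and the choice to keep every unit of a selected cluster.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 unit IDs, each tagged with a stratum and a cluster.
population = list(range(100))
stratum_of = {u: "A" if u < 60 else "B" for u in population}  # assumed strata
cluster_of = {u: u // 10 for u in population}                 # assumed 10 clusters of 10

# Simple random sampling: every unit has the same chance of selection.
srs = random.sample(population, 10)

# Stratified sampling: apply SRS separately within each stratum.
stratified = []
for name in ("A", "B"):
    members = [u for u in population if stratum_of[u] == name]
    stratified += random.sample(members, 5)

# Systematic sampling: random start, then every k-th unit.
k = 10
start = random.randrange(k)
systematic = population[start::k]

# Cluster sampling: pick a fixed number of whole clusters at random
# and (by assumption here) keep all units in the chosen clusters.
chosen_clusters = random.sample(range(10), 2)
cluster_sample = [u for u in population if cluster_of[u] in chosen_clusters]
```

Note that stratified sampling draws from every stratum, whereas cluster sampling draws from only some of the clusters.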
Define an ‘Observational Study’
A study where researchers simply observe a variable of interest with no intervention.
Explain the difference between a ‘Cross-Sectional study’ and a ‘Longitudinal study’
A cross-sectional study studies a group of individuals at a specific point in time.
A longitudinal study studies a group of subjects over a period of time, with measurements recorded at set time points.
Define a ‘Cohort Study’
Also known as a prospective study. A cohort is divided into groups by the factor of interest and other factors, then inspected after some time to see what has changed in these groups.
Define a ‘case control study’
Also known as a retrospective study. A group that has had a disease is compared with a group that has not, to ascertain what may have caused it.
Name the 3 types of ‘Categorical graphs’ and explain how many groups can be studied for each
- Bar chart - single sample
- Grouped bar chart - two or more groups
- Pie chart - single sample
Name the 3 types of ‘Numerical graphs’ and explain how many groups can be studied for each
- Histogram - single sample
- Box plot/Dot plot - single or groups
- Scatter plot - relationship between variables
Define a ‘Partition’
A partition is a set of disjoint events/outcomes that together cover the whole sample space, so their probabilities sum to 1
Define an ‘Experiment’
Any process that requires some action to be performed and has an outcome that can be recorded
Define an ‘Outcome’
Any single result of an experiment
Define a ‘Sample space’
The set (collection) of all possible outcomes of an experiment
Define an ‘Event’
A collection or set of outcomes from the sample space, i.e. a subset of the sample space
Define a ‘Null event’
An event that doesn’t have any outcomes
Define the ‘Complement of an event’
The set of all outcomes in the sample space that are not in the event, i.e. the outcomes where the event did not occur.
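The sample space, event, null event and complement can all be pictured as sets. A minimal sketch, assuming one roll of a six-sided die as the experiment (the die and the variable names are illustrative only):

```python
# One roll of a six-sided die (an assumed example experiment).
sample_space = {1, 2, 3, 4, 5, 6}

even = {2, 4, 6}                # an event: a subset of the sample space
null_event = set()              # the null event contains no outcomes
not_even = sample_space - even  # complement: outcomes that are not in the event

print(even <= sample_space)  # True, the event is a subset of the sample space
print(sorted(not_even))      # [1, 3, 5]
print(len(null_event))       # 0
```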
Explain the difference between an ‘independent’ set of events and a ‘dependent’ set of events
Independent events have no influence on each other’s probability: knowing that one has occurred does not change the probability of the other, so P(x|y) = P(x). Dependent events are the opposite, so P(x|y) != P(x).
Define ‘Mutually exclusive’ events and state if independent events can be described as this
Mutually exclusive events are 2 events that cannot both occur in 1 outcome, e.g. rolling an even AND an odd number on a single roll.
Independent events (each with non-zero probability) are NEVER mutually exclusive.
Mutually exclusive events (each with non-zero probability) are therefore always dependent, BUT NOT ALL dependent events are mutually exclusive.
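The link between mutual exclusivity and dependence can be checked numerically. This is only a sketch, assuming a fair six-sided die; the helper `prob` and the event names are made up for the example. It uses the product rule P(A and B) = P(A) P(B), which is equivalent to P(A|B) = P(A) when P(B) > 0.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed example)

def prob(event):
    # Equally likely outcomes: P(event) = |event| / |sample space|
    return Fraction(len(event & sample_space), len(sample_space))

even = {2, 4, 6}
odd = {1, 3, 5}
at_most_two = {1, 2}

# Mutually exclusive: no shared outcome, so P(even and odd) = 0.
print(prob(even & odd))  # 0

# Independent: P(A and B) equals P(A) * P(B).
print(prob(even & at_most_two) == prob(even) * prob(at_most_two))  # True

# Mutually exclusive events are dependent: 0 is not equal to 1/2 * 1/2.
print(prob(even & odd) == prob(even) * prob(odd))  # False
```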
Define a ‘partition’
A partition of a sample space is a set of disjoint events/outcomes whose union is the whole sample space
Define a ‘Numerical random variable’
A numeric representation of the result from an experiment
State the 3 rules of a ‘Probability Mass function’
- The outcomes in the sample space are disjoint
- 0 <= p(x) <= 1 for each outcome x
- sum of p(x) = 1
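These rules can be turned into a quick check. The sketch below assumes a fair six-sided die as the example distribution; `is_valid_pmf` is a hypothetical helper name. Storing the PMF as a dictionary keeps the outcomes distinct, which stands in for the disjointness rule.

```python
from fractions import Fraction

# PMF of a fair six-sided die (an assumed example).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def is_valid_pmf(p):
    # Each probability lies between 0 and 1, and the probabilities sum to 1.
    in_range = all(0 <= value <= 1 for value in p.values())
    return in_range and sum(p.values()) == 1

print(is_valid_pmf(pmf))  # True
print(is_valid_pmf({0: Fraction(1, 2), 1: Fraction(1, 3)}))  # False, sums to 5/6
```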
State the 3 properties of the ‘cumulative distribution function’
- F(- inf) = P(X <= - inf) = 0
- F(inf) = P(X <= inf) = 1
- If a <= b then F(a) <= F(b), i.e. F is non-decreasing
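The three properties can be seen on a small discrete example. A minimal sketch, again assuming a fair six-sided die; `cdf` is an illustrative helper that sums the PMF over outcomes not exceeding its argument.

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die, assumed example

def cdf(a):
    # F(a) = P(X <= a): add up the PMF over outcomes not exceeding a.
    return sum((p for x, p in pmf.items() if x <= a), Fraction(0))

print(cdf(float("-inf")))  # 0
print(cdf(3))              # 1/2
print(cdf(float("inf")))   # 1

# Non-decreasing: F never drops as its argument grows.
values = [cdf(a) for a in range(0, 8)]
print(all(lo <= hi for lo, hi in zip(values, values[1:])))  # True
```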
Define the ‘Discrete uniform distribution’
Assigns equal probabilities to each outcome in the sample space
When can the Bernoulli Distribution be applied
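When a single trial has exactly two possible outcomes, i.e. the sample space can be divided into success and failure
Define a ‘continuous random variable’
A random variable that can take any value within an interval, so its sample space has uncountably many outcomes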
What are the properties of a good estimator
- Unbiasedness - The expected value of the sample statistic is equal to the parameter being estimated
- Consistency - The value of the estimator tends towards the value of the parameter as the sample size increases
What is ‘Sampling Error’ a measure of?
Measures how much the estimator tends to vary from sample to sample
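A small simulation can make unbiasedness, consistency and sampling error concrete. This is a sketch under stated assumptions: the population is synthetic (10,000 normal draws with mean 50 and standard deviation 10), the estimator is the sample mean, and the sample sizes and number of repeats are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population; its mean is the parameter being estimated.
population = [random.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)

def sample_means(n, repeats=500):
    # Draw `repeats` simple random samples of size n; record each sample mean.
    return [statistics.mean(random.sample(population, n)) for _ in range(repeats)]

small = sample_means(10)
large = sample_means(200)

# Unbiasedness: the average of the sample means sits close to mu.
print(abs(statistics.mean(small) - mu))

# Sampling error: the spread of the estimator from sample to sample.
# It shrinks as n grows, which is what consistency predicts.
print(statistics.stdev(small), statistics.stdev(large))
```

The exact numbers depend on the seed, so no expected output is shown; the point is that the second spread is clearly smaller than the first.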
Define a ‘population proportion’
A fraction of the population that has a certain characteristic