Summary statistics I Flashcards
What is a variable?
Anything that varies within a dataset; can be categorical or numeric
Difference between categorical and numeric values?
Categorical: values fall into categories (binary/ ordinal/ nominal)
Numerical: values are numbers (discrete/ continuous)
What is binary data?
Type of categorical data
Data that only has two categories
e.g. positive/negative
What is ordinal data?
Type of categorical data
Categories with natural order
e.g. stages of cancer/ levels of pain
What is nominal data?
Type of categorical data
Categories with no natural or universally agreed order
e.g. blood group
What is discrete data?
Type of numeric data
Observations that can only take certain numerical values
e.g. number of children
What is continuous data?
Type of numeric data
Observations can take any value within a range
e.g. height/ temperature
Why is categorisation of continuous variables sometimes frowned upon? e.g. grouping age in years into age categories
May lead to loss of information
Especially if using arbitrary thresholds
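A minimal sketch of the information loss, assuming made-up ages and arbitrary cut-offs (40 and 65 are illustrative only, not recommended thresholds). The function name age_category is hypothetical.

```python
# Hypothetical ages in years; not real data.
ages = [18, 39, 40, 64, 65, 90]

def age_category(age):
    """Map an age in years to an illustrative age band (arbitrary thresholds)."""
    if age < 40:
        return "under 40"
    if age < 65:
        return "40-64"
    return "65 and over"

categories = [age_category(a) for a in ages]
print(categories)

# Six distinct ages collapse into three categories.
print(len(set(ages)), "distinct ages ->", len(set(categories)), "distinct categories")
```

Here 18 and 39 end up in the same category even though they are 21 years apart, while 39 and 40 are split despite being 1 year apart.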
What is prevalence? (context categorical data)
Number of existing cases in a population at a defined time point, usually expressed as a proportion of that population
What is incidence? (context categorical data)
Number of new cases in a population over a defined period
What is used to describe the prevalence of a condition or the probability of an event?
Proportion
Number experiencing the event divided by the total, often reported as a percentage or given per quantity of people e.g. per 1000
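The calculation above can be sketched as follows. The counts (50 cases among 2000 people) are invented for illustration, and the helper name proportion is hypothetical.

```python
def proportion(cases, total):
    """Number experiencing the event divided by the total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return cases / total

# Hypothetical counts, for illustration only.
cases, total = 50, 2000
p = proportion(cases, total)

print(f"{p:.3f}")                  # as a proportion
print(f"{p * 100:.1f}%")           # as a percentage
print(f"{p * 1000:.0f} per 1000")  # per 1000 people
```

All three lines report the same quantity; only the scale changes.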
Difference between prevalence and incidence?
Prevalence: existing cases, defined time point
Incidence: new cases, defined period of time
Prevalence includes all cases but incidence only includes new cases
Prevalence is dependent on both incidence and duration of the event
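A rough sketch of that dependence, using the common steady-state approximation prevalence ≈ incidence rate × mean duration. This approximation is an addition to the card; it only holds reasonably well when prevalence is low and incidence and duration are stable. All numbers and the function name approx_prevalence are hypothetical.

```python
def approx_prevalence(incidence_rate, mean_duration):
    """Steady-state approximation: incidence rate x mean duration.

    Assumes low prevalence and stable incidence and duration;
    both arguments must use the same time unit (here, years).
    """
    return incidence_rate * mean_duration

# Same hypothetical incidence (0.01 new cases per person-year), different durations.
short_illness = approx_prevalence(0.01, 0.5)  # lasts about half a year
long_illness = approx_prevalence(0.01, 5)     # lasts about five years

print(short_illness, long_illness)
```

With the same incidence, the longer-lasting condition has the higher prevalence because cases accumulate in the population.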