Two-way tables, risks, & odds Flashcards
visualizing distributions
two-way tables
a table that displays the frequency counts for two categorical variables simultaneously, with one as the row variable and the other as the column variable
marginal distribution
marginal distribution of one of the categorical variables in a two-way table is the distribution of values of that variable among all individuals in the table
- the percents for marginal distributions are found by dividing each row total or column total by the table total
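As a rough illustration of the rule above, the sketch below computes both marginal distributions from a small two-way table. The counts, the category labels, and the helper name `marginal_distributions` are all made up for this example and do not come from the original cards.

```python
# Hypothetical counts (not from the source):
# class size (rows) by year of study (columns).
table = {
    "small":  {"1st year": 10, "2nd year": 30},
    "medium": {"1st year": 20, "2nd year": 20},
    "large":  {"1st year": 70, "2nd year": 50},
}


def marginal_distributions(table):
    """Return (row_marginals, col_marginals) as proportions of the table total."""
    # Row totals: sum across each row.
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}

    # Column totals: accumulate each column's counts over all rows.
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count

    # Divide each row total or column total by the table total.
    grand_total = sum(row_totals.values())
    row_marginals = {r: t / grand_total for r, t in row_totals.items()}
    col_marginals = {c: t / grand_total for c, t in col_totals.items()}
    return row_marginals, col_marginals


row_marginals, col_marginals = marginal_distributions(table)
print("class size:", row_marginals)
print("year of study:", col_marginals)
```

Each marginal distribution sums to 1 because every individual in the table falls into exactly one row and one column.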
conditional distributions
the distribution of one variable among only those individuals who share a given value of the other variable
the percents for conditional distributions are found by dividing each entry in a row (or column) by that row's (or column's) total
(class size is conditional on year of study)
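A minimal sketch of the class-size-given-year idea, assuming a hypothetical table of counts (the numbers and the function name `conditional_distribution` are invented for illustration): the conditional distribution uses a single column and divides each entry by that column's total.

```python
# Hypothetical counts (not from the source):
# class size (rows) by year of study (columns).
table = {
    "small":  {"1st year": 10, "2nd year": 30},
    "medium": {"1st year": 20, "2nd year": 20},
    "large":  {"1st year": 70, "2nd year": 50},
}


def conditional_distribution(table, given_column):
    """Distribution of the row variable among individuals in `given_column`."""
    # Keep only the entries in the chosen column.
    column_counts = {row: cells[given_column] for row, cells in table.items()}
    # Divide each entry by the column total, not the table total.
    column_total = sum(column_counts.values())
    return {row: count / column_total for row, count in column_counts.items()}


first_year = conditional_distribution(table, "1st year")
second_year = conditional_distribution(table, "2nd year")
print("class size given 1st year:", first_year)
print("class size given 2nd year:", second_year)
```

If the two conditional distributions differ noticeably, that suggests the two variables are associated in this table.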
conditional distributions in tables
in a two-way table, a conditional distribution is read from a single row or column of the table, with each entry divided by that row or column total
(frequency of 1st year courses - 1st year column)
(medium class size - medium class size row)
visualizing conditional distributions
variable 1: bald
variable 2: happy
you can select all happy people and take the distribution of baldness
or take all unhappy people and take the distribution of baldness
defining risk
the risk of an event is:
- proportion of times it happens in a group
- probability of it happening in a population
- percentage of the group for which it happens
Risk = Number of occurrences in group / Total size of group
calculating risk
risk of complication overall (A and B combined) = 12/110 = 0.11
risk of complication for A = 10/100 = 0.10
risk of complication for B = 2/10 = 0.2
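The three risks on this card can be checked with a short sketch. The function name `risk` is an assumption made for the example; the counts are the ones given above (10 of 100 for A, 2 of 10 for B).

```python
def risk(occurrences, group_size):
    """Risk = number of occurrences in group / total size of group."""
    return occurrences / group_size


risk_a = risk(10, 100)                 # procedure A
risk_b = risk(2, 10)                   # procedure B
risk_overall = risk(10 + 2, 100 + 10)  # both procedures pooled

print(f"risk for A  = {risk_a:.2f}")
print(f"risk for B  = {risk_b:.2f}")
print(f"overall     = {risk_overall:.2f}")
```

Note that the overall risk is not the average of the two group risks, because group A is ten times larger than group B.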
relative risk (RR)
ratio of risk of an event for two categories or groups of an explanatory variable; a comparison of risk
RR = Risk of event for the group of interest / Risk of event for the baseline group
calculating relative risk
risk of complications for A = 10/100 = 0.10
risk of complications for B = 2/10 = 0.2
relative risk of complications = risk for B / risk for A = 0.20/0.10 = 2 (B carries twice the risk of A)
(unless stated otherwise, the lower risk is usually taken as the baseline)
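A small sketch of the calculation above, with procedure A as the baseline. The function name `relative_risk` and its parameter names are assumptions for illustration.

```python
def relative_risk(risk_group, risk_baseline):
    """Ratio of the risk in the group of interest to the baseline risk."""
    return risk_group / risk_baseline


risk_a = 10 / 100  # baseline group (procedure A)
risk_b = 2 / 10    # group of interest (procedure B)

rr = relative_risk(risk_b, risk_a)
print(f"relative risk of B compared to A = {rr:.1f}")
```

A relative risk of 1 would mean the two groups have the same risk; swapping the baseline gives the reciprocal (here 0.5).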
increased risk (IR)
percentage change in risk for one group in comparison to another (e.g. baseline) group
Increase in risk = (change in risk / comparison group’s risk) x 100%
calculating increased risk
risk of complications for A = 0.10
risk of complications for B = 0.20
increased risk of procedure B compared to A = ((risk B - risk A) / risk A) x 100%
= ((0.20 - 0.10) / 0.10) x 100%
= 100%
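The same arithmetic as a hedged sketch; `increased_risk_percent` is a name chosen for this example. The subtraction is done before the division, which is the step that matters in the formula.

```python
def increased_risk_percent(risk_group, risk_baseline):
    """Percentage change in risk relative to the baseline group's risk."""
    # Subtract first, then divide by the baseline risk.
    return (risk_group - risk_baseline) / risk_baseline * 100


risk_a = 0.10  # baseline (procedure A)
risk_b = 0.20  # comparison (procedure B)

ir = increased_risk_percent(risk_b, risk_a)
print(f"increased risk of B compared to A = {ir:.0f}%")

# Equivalent form: increased risk = (relative risk - 1) x 100%.
ir_from_rr = (risk_b / risk_a - 1) * 100
```

Both forms give 100% here, which matches a relative risk of 2.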
odds
compare the number of units with a quality to the number without the quality, expressed as “to 1”
Odds = Number of occurrences in group / Number of non-occurrences in group
calculating odds
risk of complications for A = 10/100 = 0.10
odds of complications for A = 10/90 = 0.11 (0.11 to 1)
odds of no complications for A = 90/10 = 9 (9 to 1)
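To make the difference between risk and odds concrete, here is a minimal sketch using the counts for procedure A. The function names `odds` and `odds_from_risk` are assumptions for this example; the conversion odds = risk / (1 - risk) follows directly from the two definitions.

```python
def odds(occurrences, non_occurrences):
    """Odds = number of occurrences / number of non-occurrences."""
    return occurrences / non_occurrences


def odds_from_risk(risk):
    """Convert a risk (proportion) into odds."""
    return risk / (1 - risk)


complications, total = 10, 100
no_complications = total - complications  # 90

odds_complication = odds(complications, no_complications)
odds_no_complication = odds(no_complications, complications)

print(f"odds of complications    = {odds_complication:.2f} to 1")
print(f"odds of no complications = {odds_no_complication:.0f} to 1")
print(f"odds from risk 0.10      = {odds_from_risk(complications / total):.2f} to 1")
```

The risk divides by the whole group (100), while the odds divide by the part of the group without the event (90), so the odds are always at least as large as the risk.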
Simpson’s paradox
when studying the relationship between two variables, there may be a lurking variable such that the direction of the relationship when the lurking variable is ignored is the reverse of the direction when the lurking variable is considered
lurking variables create subgroups, and failure to take these subgroups into consideration can lead to misleading conclusions regarding the association between the two variables
an association that holds for all of several groups can reverse direction when the data are combined to form a single group - this reversal is called Simpson’s paradox
example: road and helicopter, lurking variable: seriousness of injury
the direction of the association reverses when each level of seriousness is examined separately
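The reversal can be seen with a small numeric sketch. The counts below are invented for illustration only and are not taken from the original cards; `combined_rate` and `death_rate` are hypothetical helper names. Each entry is (deaths, total victims) for a transport mode and a level of injury seriousness.

```python
# Hypothetical counts for illustration only (not from the source):
# (deaths, total victims) by transport mode and seriousness of injury.
counts = {
    "helicopter": {"serious": (48, 100), "less serious": (16, 100)},
    "road":       {"serious": (60, 100), "less serious": (200, 1000)},
}


def death_rate(deaths, total):
    """Risk of death within one subgroup."""
    return deaths / total


def combined_rate(groups):
    """Risk of death when the seriousness subgroups are pooled."""
    deaths = sum(d for d, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return deaths / total


for mode, groups in counts.items():
    print(f"{mode}: combined rate = {combined_rate(groups):.3f}")
    for level, (deaths, total) in groups.items():
        print(f"  {level}: {death_rate(deaths, total):.3f}")
```

With these made-up counts, the helicopter has the lower death rate within each seriousness level but the higher rate once the levels are pooled, because the helicopter carries a much larger share of seriously injured victims.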