Lecture 1 - Mean, Mode, etc. Flashcards
What is continuous data?
Data that can be divided up an infinite number of times. e.g. age decades -> years -> months -> days, etc.
What is discrete data?
Data that cannot be divided up. aka integer data. e.g. # of fractures someone in a car accident has
What is nominal data?
Data given arbitrary numeric labels. Used merely for identification. e.g. race. white = 0, black = 1, etc.
What is ordinal data?
Ordered rankings with numeric labels. Numbers have meaning in relation to one another, but no intrinsic value. e.g. APGAR score.
What is categorical data?
A type of ordinal data where higher numbers are in fact better, but the actual size of the difference between values is not known. e.g. pain scores.
How is mean denoted in equations?
"X bar" (an X with a line on top of it)
What is a major weakness of the mean?
It is strongly influenced by outlying values
What are strengths of the median?
Impervious to outlying values
What is a disadvantage of the median?
It only uses a small portion of the data set and does not give you an idea of the full range of values
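A minimal sketch of how the mean and median respond to an outlier. The data values are hypothetical and chosen only for illustration:

```python
from statistics import mean, median

values = [2, 3, 3, 4, 5]       # hypothetical data set
with_outlier = values + [50]   # same data plus one extreme value

# The mean is pulled strongly toward the outlier...
mean_before, mean_after = mean(values), mean(with_outlier)

# ...while the median barely moves.
median_before, median_after = median(values), median(with_outlier)

print("mean:  ", mean_before, "->", round(mean_after, 2))
print("median:", median_before, "->", median_after)
```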
What is variance?
A measure of variability around the mean
When should you use the mean vs. mode?
Generally use the mean when variance is low and the mode when variance is high. If the values are well spread out, report both.
What is the standard deviation, numerically?
The square root of variance (converts variance back to a meaningful number for our actual data)
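A short sketch of the relationship between variance and standard deviation. The data are hypothetical, and the sample formula (dividing by n - 1) is an assumption, since the card does not say whether sample or population variance is meant:

```python
import math
from statistics import mean, stdev, variance

data = [4, 8, 6, 5, 7]  # hypothetical values

x_bar = mean(data)

# Sample variance by hand: average squared distance from the mean,
# dividing by n - 1 (assumed; the population formula divides by n).
s2_manual = sum((x - x_bar) ** 2 for x in data) / (len(data) - 1)

s2 = variance(data)  # library version of the same sample variance
s = stdev(data)      # standard deviation, in the same units as the data

# The SD is the square root of the variance.
print(s2, s, math.isclose(s, math.sqrt(s2)))
```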
What is a large s?
A large spread. There is no hard and fast rule, but generally an SD that is 50% (or more) of the mean.
What is a small s?
Generally, an SD that is less than 25% of the mean.
What is the coefficient of variation (CV)?
Used to describe the SD in relation to the mean, as a percentage. CV = (SD/mean) x 100
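The CV formula can be sketched as follows. The function name and data are hypothetical, and the sample standard deviation is assumed:

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Return the SD as a percentage of the mean: (SD / mean) x 100."""
    return stdev(data) / mean(data) * 100

data = [4, 8, 6, 5, 7]  # hypothetical values
cv = coefficient_of_variation(data)
print(f"CV = {cv:.1f}%")
```

Because the CV is a percentage, it can be compared directly against rule-of-thumb cutoffs for a large or small spread.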
What direction is this data skewed?

Right, because the outlier is on the right.
What are the significant values portrayed in this box plot?

Bottom of box = 25th percentile
Middle of box = 50th percentile
Top of box = 75th percentile
Whiskers = typically extend up to 1.5 x IQR beyond the box, or to the min/max
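A rough sketch of how the box plot values could be computed. The data are hypothetical, and the "inclusive" quantile method is an assumption; different software uses slightly different percentile conventions, so the box edges can differ a little between packages:

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30]  # hypothetical values

# 25th, 50th and 75th percentiles: bottom, middle and top of the box
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range = height of the box

# 1.5 x IQR convention: whiskers stop at the most extreme
# data points that still lie within the fences.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inside = [x for x in data if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

# Points beyond the fences are drawn individually as outliers.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3, whisker_low, whisker_high, outliers)
```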
What are sample probabilities?
Probabilities calculated from sets of data that we hope are unbiased and can be extrapolated to a larger population of interest.
What is the probability that either of 2 events will occur?
P (event 1 OR event 2) =
P (event 1) + P (event 2) - P (both events occur)
If both events CANNOT occur together (e.g. being both male and female), do not subtract P (both events occur), because it is 0
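A small sketch of the addition rule, using a fair six-sided die as a hypothetical example. The function name `p_either` is made up for illustration:

```python
from fractions import Fraction

def p_either(p_a, p_b, p_both=Fraction(0)):
    """P(A OR B) = P(A) + P(B) - P(A AND B)."""
    return p_a + p_b - p_both

# Overlapping events: "even" = {2, 4, 6}, "greater than 4" = {5, 6}.
# The roll 6 belongs to both, so it must not be counted twice.
p_even_or_gt4 = p_either(Fraction(3, 6), Fraction(2, 6), Fraction(1, 6))

# Mutually exclusive events: "roll a 1" and "roll a 2" cannot both occur,
# so there is nothing to subtract.
p_one_or_two = p_either(Fraction(1, 6), Fraction(1, 6))

print(p_even_or_gt4, p_one_or_two)
```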
What does this phrase mean:
P (reduced FEV1| elevated IL-8) = 11/22 = 0.50?
The probability of reduced FEV1 GIVEN elevated IL-8, i.e. among only the people who have elevated IL-8 (this usually limits the total sample size to a subset of the population). Can also be phrased "conditioned on".
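The arithmetic on this card can be sketched as below. The counts 11 and 22 come from the card; the function and variable names are hypothetical:

```python
def conditional_probability(n_both, n_condition):
    """P(A | B) = (# with both A and B) / (# with B)."""
    return n_both / n_condition

n_elevated_il8 = 22       # people with elevated IL-8 (the conditioning subset)
n_reduced_and_elevated = 11  # of those, people who also have reduced FEV1

p = conditional_probability(n_reduced_and_elevated, n_elevated_il8)
print(p)
```

Note that the denominator is the size of the subset with elevated IL-8, not the whole sample.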
When are two probabilities independent?
When the probability of something happening alone is the same as under a certain condition. e.g. the probability of seeing a blue car after having seen a red car is just the probability of seeing a blue car.
If two events are independent, what is the probability of both occurring (e.g. getting the same result twice in a row)?
Multiply the probabilities together.
P (event1 AND event2)=P (event1) x P (event2)
For a simple example, the probability of getting two heads in two tosses of a fair coin is just P (heads) x P (heads) = 0.50 x 0.50 = 0.25
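As a quick check of the coin example, this sketch lists every equally likely outcome of two fair tosses and compares the count against the multiplication rule. Variable names are hypothetical:

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two tosses: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))

# Probability of two heads by direct counting
n_two_heads = sum(1 for o in outcomes if o == ("H", "H"))
p_two_heads = Fraction(n_two_heads, len(outcomes))

# Probability of two heads by the multiplication rule for independent events
p_heads = Fraction(1, 2)
p_product = p_heads * p_heads

print(p_two_heads, p_product, p_two_heads == p_product)
```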