Lecture 5 - Biostatistics Part 1-2 Flashcards
What are the Measures of Central Tendency?
Mean - the “average” – sum of the set divided by the number in the set
Median – the middle point (arrange the data smallest to largest, then find the middle point)
Mode – the score that occurs most frequently in a set of data
- May have two most common values = “bimodal distribution”
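The three measures above can be sketched with Python's standard `statistics` module. The `scores` list is made-up illustration data, not taken from the lecture; it is chosen so that two values tie for most frequent, which gives a bimodal result.

```python
import statistics

# Hypothetical scores for illustration only (not from the lecture slides).
scores = [2, 3, 3, 5, 8, 8, 13]

mean_value = statistics.mean(scores)      # sum of the set divided by the number in the set
median_value = statistics.median(scores)  # middle point once the data are sorted
modes = statistics.multimode(scores)      # every value tied for most frequent

print("mean:", mean_value)
print("median:", median_value)
print("mode(s):", modes)  # two entries here, so this data set is bimodal
```

`statistics.multimode` is used rather than `statistics.mode` because it returns every tied value, which makes a bimodal set visible.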
the “average” – sum of the set divided by the number in the set
Mean -
– the middle point (arrange the data smallest to largest, then find the middle point)
Median
– the score that occurs most frequently in a set of data
Mode
- May have two most common values = “bimodal distribution”
Mode
- May have two most common values =
“bimodal distribution”
- The point/score at which 50% of scores fall below it and 50% fall above it
Median
This is the most general and least precise measure of central tendency
When two values occur the same number of times – Bimodal distribution
Mode - most frequently occurring value
Look at the mean, median, and mode images on slide 10
Standard deviation
a measure of variation of scores about the mean
Variance compared to…
standard deviation
quantifies the amount of variability, or spread, around the mean of the measurements.
To calculate: take each difference from the mean, square it, and then average the result
Variance (σ²)
To calculate the variance:
take each difference from the mean, square it, and then average the result
a measure of variation of scores about the mean
Standard deviation (σ)
To calculate the standard deviation:
take the square root (√) of the variance
(the “average distance” to the mean)
In practice, the standard deviation is used more frequently than the variance.
Primarily because the standard deviation is expressed in the same units as the measurements and the mean, whereas the variance is in squared units.
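The variance and standard deviation steps described in these cards can be sketched as follows. The `scores` list is hypothetical. The code divides by the number of scores, matching the "average the result" wording (a population variance); a sample variance would divide by n - 1 instead.

```python
import math

# Hypothetical scores for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = sum(scores) / len(scores)

# Take each difference from the mean and square it.
squared_diffs = [(x - mean_value) ** 2 for x in scores]

# Average the squared differences: this is the (population) variance.
variance = sum(squared_diffs) / len(scores)

# The standard deviation is the square root of the variance,
# so it is back in the same units as the original scores.
std_dev = math.sqrt(variance)

print("mean:", mean_value)
print("variance:", variance)
print("standard deviation:", std_dev)
```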
When comparing two groups, the group with the larger standard deviation exhibits a greater amount of _____ while the group with the smaller standard deviation has less _______.
variability (heterogeneous)
variability (homogeneous)
Empirical rule for data (68-95-99.7) - only applies to a set of data having a distribution that is approximately bell-shaped:
Approximately 68% of all scores fall within 1 standard deviation of the mean
Approximately 95% of all scores fall within 2 standard deviations of the mean
Approximately 99.7% of all scores fall within 3 standard deviations of the mean
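As a rough check of these percentages, the sketch below assumes an exactly normal (bell-shaped) distribution and uses `statistics.NormalDist` from the Python standard library. The helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of scores within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k) * 100:.1f}%")
```

Real data that are only approximately bell-shaped will give percentages close to, but not exactly, these values.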
Scatterplots:
A useful summary of a set of ________
bivariate data (two continuous variables)
Scatterplots:
Gives a good visual picture of the relationship between the two variables, and aids the interpretation of the _____
correlation coefficient or regression model.
the statistic that summarizes the relationship between the variable on the x axis and the variable on the y axis
correlation coefficient
Perfect Positive correlation
X increases and Y increases at a constant rate (all points fall on a straight line; correlation coefficient = +1)
Perfect Negative correlation
X increases and Y decreases at a constant rate (all points fall on a straight line; correlation coefficient = -1)
X increases and Y increases =
positive correlation
X increases and Y decreases =
negative correlation
0 value for correlation coefficient means there is
no linear correlation
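The correlation cards above can be illustrated with a small sketch. It assumes the Pearson correlation coefficient, which the cards do not name explicitly; the function name `pearson_r` and all data lists are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)

# Hypothetical bivariate data for illustration only.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # Y increases as X increases, on a straight line
falling = [10, 8, 6, 4, 2]  # Y decreases as X increases, on a straight line
u_shaped = [4, 1, 0, 1, 4]  # Y depends on X, but not along a line

print("rising:", pearson_r(x, rising))      # perfect positive correlation
print("falling:", pearson_r(x, falling))    # perfect negative correlation
print("u_shaped:", pearson_r(x, u_shaped))  # coefficient of zero
```

The `u_shaped` case shows why a scatterplot aids interpretation: the coefficient is zero even though Y clearly depends on X, because the relationship is not linear.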