Chapter 12 - Statistical Analysis of Quantitative Data Flashcards
Descriptive Statistics
Used to synthesize and describe data
Parameters
Indexes (averages, percentages) calculated with data from a population
Statistic
Descriptive index from a sample
Inferential Statistics
Used to help make inferences about the population
Frequency distribution
Systematic arrangement of values from lowest to highest, together with a count or percentage of how many times the data occurred
- easy to see highest and lowest scores, most common scores, where data clusters, and how many patients were in the sample
- can be displayed in a “frequency polygon” where scores are graphed on the horizontal axis and frequencies on the vertical axis
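A minimal Python sketch of a frequency distribution, assuming a small set of made-up pain scores (the data and the name `freq_table` are illustrative only). It sorts the values from lowest to highest and reports a count and percentage for each.

```python
from collections import Counter

# Hypothetical pain scores (0-10) from a sample of 10 patients
scores = [3, 5, 5, 2, 7, 5, 3, 8, 2, 5]

counts = Counter(scores)
n = len(scores)

# One row per distinct value, ordered lowest to highest:
# (value, count, percentage of the sample)
freq_table = [(value, counts[value], 100 * counts[value] / n)
              for value in sorted(counts)]

print(f"{'score':>5} {'count':>5} {'pct':>7}")
for value, count, pct in freq_table:
    print(f"{value:>5} {count:>5} {pct:>6.1f}%")
```

Reading down the table shows the lowest and highest scores, the most common score, and the sample size (the sum of the counts).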
Symmetric Distribution
occurs if, when folded over, the two halves of a frequency polygon would line up
Positive Skew
when the longer tail points to the right
–>ex. personal income
Negative Skew
when the longer tail points to the left
–>ex. age of death
Unimodal vs. Multi-modal
one peak vs. multiple peaks
Normal distribution
“bell-shaped curve”
- symmetrical
- unimodal
- not very peaked
–>ex. height, intelligence
Central Tendency
measures of “typicalness”
- mode
- median
- mean
Mode
number that occurs most frequently in the distribution
Median
the point in a distribution that divides the scores in half, the middle point
-preferred when data is highly skewed
Mean
the sum of all values divided by the number of participants
“average”
-most stable index
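A short Python sketch of the three measures, using made-up scores with one extreme value. It is only an illustration of why the median is often preferred for highly skewed data: the single extreme score pulls the mean upward but leaves the median unchanged.

```python
import statistics

# Hypothetical scores; the last value is an extreme high score
scores = [2, 3, 3, 4, 5, 6, 40]

mode = statistics.mode(scores)      # most frequent value
median = statistics.median(scores)  # middle value of the sorted scores
mean = statistics.mean(scores)      # sum of values / number of values

print("mode:", mode)
print("median:", median)
print("mean:", mean)
```

Here the mean lands above six of the seven scores, while the median stays near the bulk of the data.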
Variability
how spread out the data are; two distributions with identical means can still differ in spread and shape
Range
highest score minus the lowest score in a distribution
- easy to compute
- unstable
- “gross descriptive index”
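A small Python sketch of the range, using hypothetical scores. The helper `score_range` is an illustrative name, not a library function. It also shows why the range is called unstable: it depends on only two values, so one extreme score changes it sharply.

```python
def score_range(values):
    """Highest score minus lowest score."""
    return max(values) - min(values)

# Hypothetical scores
scores = [70, 72, 75, 78, 80]
with_outlier = scores + [120]  # same sample plus one extreme score

range_original = score_range(scores)
range_with_outlier = score_range(with_outlier)

print("range:", range_original)
print("range with one extreme score:", range_with_outlier)
```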
Standard Deviation
summarizes the AVERAGE amount of deviation of values from the mean
- most widely used variability index
- calculated based on every value in the distribution
- nearly all values fall within 3 SDs above and below the mean in a normal/near-normal distribution
- lower SD = more homogeneous
+/- 1 SD: 68% of data
+/- 2 SD: 95% of data
+/- 3 SD: 99.7% of data
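A Python sketch of both ideas, under stated assumptions: the two groups are made-up data with the same mean, and the sample SD (dividing by n - 1) is used, which may differ from the formula in a given textbook. The second part checks the 68 / 95 / 99.7 percentages against a standard normal curve using the standard library's `NormalDist`.

```python
from statistics import NormalDist, mean, stdev

# Two hypothetical groups with identical means but different spread
group_a = [48, 49, 50, 51, 52]  # tightly clustered (more homogeneous)
group_b = [30, 40, 50, 60, 70]  # widely spread

sd_a = stdev(group_a)  # sample SD, divides by n - 1
sd_b = stdev(group_b)

print("means:", mean(group_a), mean(group_b))
print("SDs:", round(sd_a, 2), round(sd_b, 2))

def proportion_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"+/- {k} SD: {100 * proportion_within(k):.1f}% of data")
```

The group with the lower SD is the more homogeneous one, even though both groups share the same mean.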
Crosstabs (contingency) table
Two-dimensional frequency distribution in which the frequencies of two variables are crosstabulated
–>ex. differentiating between men and women in categories of non-smoker, light smoker, and heavy smoker
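A minimal Python sketch of a crosstab for the smoking example, assuming ten made-up participants (the records and variable names are hypothetical). Each cell counts how many participants fall into one combination of the two variables.

```python
from collections import Counter

# Hypothetical (sex, smoking status) pairs, one per participant
participants = [
    ("women", "non-smoker"), ("women", "non-smoker"), ("women", "light smoker"),
    ("women", "heavy smoker"), ("men", "non-smoker"), ("men", "light smoker"),
    ("men", "light smoker"), ("men", "heavy smoker"), ("men", "heavy smoker"),
    ("women", "non-smoker"),
]

cell_counts = Counter(participants)
rows = ["women", "men"]
columns = ["non-smoker", "light smoker", "heavy smoker"]

# crosstab[row][column] = number of participants in that cell
crosstab = {row: {col: cell_counts[(row, col)] for col in columns}
            for row in rows}

print(f"{'':<8}" + "".join(f"{col:>14}" for col in columns))
for row in rows:
    print(f"{row:<8}" + "".join(f"{crosstab[row][col]:>14}" for col in columns))
```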
Correlation
To what extent are two variables related to each other?
–>ex. anxiety scores and BP
Correlation Coefficient
calculation that describes intensity and direction of a relationship
-how “perfect” a relationship is (ex. tallest person also weighs the most)
Positive Correlation
when an increase in one variable is accompanied by an increase in the other (.01 to 1.00)
Negative (Inverse) Correlation
when an increase in one variable is accompanied by a decrease in the other (-.01 to -1.00)
–>ex. depression and self-esteem
Pearson’s r
product-moment correlation coefficient
computed with interval or ratio measurements
-no clear guidelines for interpretation
- descriptive: summarizes the magnitude and direction of a relationship between two variables
- inferential: tests hypotheses about population correlations
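A Python sketch of the descriptive use of Pearson's r, written out by hand from the usual product-moment formula. The function name `pearson_r` and all data values are made up for illustration. The sign of r gives the direction of the relationship and its size gives the magnitude.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: anxiety scores and systolic BP (expected positive)
anxiety = [10, 20, 30, 40, 50]
bp = [110, 118, 121, 130, 141]

# Hypothetical data: depression and self-esteem scores (expected negative)
depression = [5, 10, 15, 20, 25]
self_esteem = [40, 35, 33, 25, 20]

r_positive = pearson_r(anxiety, bp)
r_negative = pearson_r(depression, self_esteem)

print("anxiety vs. BP:", round(r_positive, 2))
print("depression vs. self-esteem:", round(r_negative, 2))
```

This sketch covers only the descriptive role. Testing a hypothesis about a population correlation would need a significance test, which is not shown here.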
Correlation Matrix
variables are displayed in both rows and columns, with each cell showing the correlation between the row variable and the column variable
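A Python sketch of a correlation matrix for three made-up variables. The helper `pearson_r` is a hand-written, hypothetical function and the data are illustrative. Every variable appears as both a row and a column, so the diagonal is 1.00 and the matrix is symmetric.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for five participants
data = {
    "anxiety": [10, 20, 30, 40, 50],
    "bp": [110, 118, 121, 130, 141],
    "self_esteem": [40, 35, 33, 25, 20],
}
names = list(data)

# matrix[row][column] = correlation between the two named variables
matrix = {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

print(f"{'':<12}" + "".join(f"{name:>12}" for name in names))
for a in names:
    print(f"{a:<12}" + "".join(f"{matrix[a][b]:>12.2f}" for b in names))
```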
Absolute Risk
the proportion of people who experienced an undesirable outcome in each group
Absolute Risk Reduction Index
comparison of the two risks
–>computed by subtracting the absolute risk for the exposed group from the absolute risk for the unexposed group
- *it is the proportion of people who would be spared the undesirable outcome through exposure to an intervention/protective factor
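A worked Python sketch of absolute risk and absolute risk reduction, using made-up counts (100 people per group, 30 events among the unexposed and 20 among the exposed). The function name `absolute_risk` is illustrative.

```python
def absolute_risk(events, total):
    """Proportion of a group that experienced the undesirable outcome."""
    return events / total

# Hypothetical counts
ar_unexposed = absolute_risk(events=30, total=100)  # no intervention
ar_exposed = absolute_risk(events=20, total=100)    # received intervention

# ARR = absolute risk (unexposed) - absolute risk (exposed)
arr = ar_unexposed - ar_exposed

print("absolute risk, unexposed:", round(ar_unexposed, 2))
print("absolute risk, exposed:", round(ar_exposed, 2))
print("absolute risk reduction:", round(arr, 2))
```

With these numbers, about 10 in every 100 people would be spared the undesirable outcome through exposure to the intervention.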