Chapter 12 - Descriptive Statistics Flashcards
1
Q
purpose of descriptive statistics
A
- Summarize a mass of data points to aid understanding and interpretation (visual displays, appropriate calculations)
- In experiments, calculate them within each cell (means, standard deviations, etc.)
- In correlational designs, calculate the mean, standard deviation, etc. for each variable (the correlation coefficient is itself a descriptive statistic)
2
Q
3 measures of central tendency
A
- median
- mode
- mean
3
Q
median
A
- Midpoint; score that divides group in half (50% of scores lie below this point, 50% lie above)
- Put scores in order from smallest to largest, then count the number of scores. If it’s an odd number, the median is the middle score. If it’s an even number, the median is the average of the two middle scores
4
Q
mode
A
- Most frequently occurring score
- Sometimes no mode; sometimes more than one
- Put scores in order or create a frequency distribution (e.g. a table or histogram of how often each score occurs); identify the score that occurs most frequently
5
Q
mean
A
- Arithmetic average
- The measure we use (and the most frequently used overall); uses information from every score
- Add up all scores, divide by total number of scores
- Cons: affected by outliers (i.e. extreme scores)
- Pros: with increasing sample size, each extreme score has less effect on the mean; maximizes use of data points; has mathematical properties that allow us to do further analyses
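A minimal sketch of the three measures of central tendency, using Python's standard `statistics` module. The scores are invented for illustration and include one extreme value (40) so the pull on the mean is visible.

```python
from statistics import mean, median, multimode

# Hypothetical scores, invented for illustration; 40 is an extreme score.
scores = [2, 3, 3, 5, 7, 8, 40]

central = {
    "mean": mean(scores),        # sum of all scores / number of scores
    "median": median(scores),    # middle score once the scores are sorted
    "modes": multimode(scores),  # every most-frequent score (can be several)
}
print(central)
```

Here the median is 5 and the only mode is 3, while the mean (68 / 7, about 9.7) is dragged upward by the single extreme score.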
6
Q
What happens when you have outlying values?
A
- Outlying values pull the mean toward them, so it may no longer reflect the typical score in the sample
- If a mode is readily observable, it may be a more appropriate representation
- If no mode is readily available, you can go with the median
- The bigger your sample size, the smaller the effect of each outlier -> check your data for outliers, but the first step should always be to get as large a sample as possible
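The effect of an outlier can be sketched as follows. All numbers are hypothetical: the same outlier (95) is added to a sample of 5 scores and to a sample of 100 scores with the same centre, and the shift in the mean is compared with the shift in the median.

```python
from statistics import mean, median

# Hypothetical scores; 95 is an invented outlier.
small = [10, 11, 12, 13, 14]
small_out = small + [95]

large = small * 20            # 100 scores with the same centre as `small`
large_out = large + [95]

shift_small = mean(small_out) - mean(small)        # mean shift, n = 5
shift_large = mean(large_out) - mean(large)        # mean shift, n = 100
median_shift = median(small_out) - median(small)   # median shift, n = 5

print(round(shift_small, 2), round(shift_large, 2), median_shift)
```

In this example the outlier moves the small-sample mean by almost 14 points but the large-sample mean by less than 1 point, and it moves the small-sample median by only 0.5.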
7
Q
variability
A
- the spread (dispersion) of scores in a distribution
- measures: range, variance, standard deviation
8
Q
range
A
maximum value minus minimum value
9
Q
variance
A
- s^2
- Sum of squared deviations around the mean, divided by N - 1: s^2 = Σ(X - M)^2 / (N - 1), where M is the mean
- Need it for later analyses
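A short worked example of the range and the sample variance, computed step by step and then checked against `statistics.variance` (which also divides by N - 1). The scores are hypothetical.

```python
from statistics import variance

# Hypothetical scores, invented for illustration.
scores = [4, 6, 8, 10, 12]

value_range = max(scores) - min(scores)   # maximum minus minimum

n = len(scores)
m = sum(scores) / n                       # mean = 8.0
ss = sum((x - m) ** 2 for x in scores)    # sum of squared deviations = 40.0
s2 = ss / (n - 1)                         # sample variance = 10.0

print(value_range, s2, variance(scores))
```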
10
Q
standard deviation
A
- s or SD = sqrt(s^2) (square root of the variance)
- Roughly the average distance of the scores from the mean
- i.e. a typical person deviates from the mean by about 1 standard deviation
- We consider outliers to be scores 3+ standard deviations away from the mean
- In a normal distribution, most of the scores (about 68%) fall within 1 standard deviation of the mean, about 95% fall within 2 SDs of the mean, and almost 100% (99.7%) fall within 3 SDs of the mean
- On each side of the mean, roughly: 34% between 0 and 1 SD -> 14% between 1 and 2 SDs -> 2% between 2 and 3 SDs -> almost 0% beyond 3 SDs
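The percentages above can be checked with a small sketch, assuming a perfectly normal distribution. `NormalDist` from the standard library gives the proportion of scores within 1, 2 and 3 SDs of the mean; the scores used for the SD itself are hypothetical.

```python
from statistics import NormalDist, stdev

# Hypothetical scores; their sample variance is 10, so SD = sqrt(10).
scores = [4, 6, 8, 10, 12]
sd = stdev(scores)

# Proportion of a normal distribution within k SDs of the mean.
z = NormalDist()  # standard normal: mean 0, SD 1
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

print(round(sd, 3))
for k, p in within.items():
    print(f"within {k} SD: {p:.1%}")
```

The proportions come out at about 68.3%, 95.4% and 99.7%. Real samples only approximate these figures, and only when the scores are roughly normally distributed.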
11
Q
correlation coefficient
A
- aka bivariate correlation
- Numerical index that reflects the degree of linear relationship between two variables
- Usually calculated as Pearson’s r, which ranges from -1 to +1
- An r of zero doesn’t necessarily mean no relationship – it could be masking a curvilinear relationship
- Ex. Delay of gratification study (Marshmallow test)
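A minimal sketch of Pearson's r, written out by hand so each term is visible. The helper name `pearson_r` and both data sets are made up for illustration. The second data set is perfectly related to x (y = x^2), yet r is zero because the relationship is curvilinear rather than linear.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # co-deviation
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for illustration.
x = [-2, -1, 0, 1, 2]
linear = [2 * v + 1 for v in x]   # perfectly linear in x
curved = [v ** 2 for v in x]      # perfectly U-shaped in x

r_linear = pearson_r(x, linear)
r_curved = pearson_r(x, curved)
print(r_linear, r_curved)
```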
12
Q
coefficient of determination
A
- r^2
- How much variability is shared between variables?
- % of variability in y accounted for by variability in x
- % of variability in y predictable by variability in x
- If r^2 = 0, there’s no overlap -> no shared variance -> changes in one won’t predict changes in the other
- If r^2 = 1, there’s complete overlap (but this essentially never happens with real data)
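As a quick illustration, squaring a few hypothetical correlations shows how fast shared variance drops off: a correlation of .50 accounts for only 25% of the variability.

```python
# Hypothetical correlations, chosen only for illustration.
correlations = [0.0, 0.3, 0.5, 0.7, 1.0]

# Coefficient of determination: proportion of shared variance.
shared = {r: round(r ** 2, 2) for r in correlations}

for r, r2 in shared.items():
    print(f"r = {r:.1f} -> r^2 = {r2:.2f} ({r2:.0%} shared variance)")
```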
13
Q
regression
A
- Use score on one variable (“Predictor”) to PREDICT changes in another variable (“Criterion”)
- Can look at how MULTIPLE predictors together predict variability in the criterion (multiple regression)
- Extension of correlation: Both measure relationships among variables; neither implies causation
- Uses different terms (predictor and criterion), not just “variable 1” and “variable 2”
- Regression line: Y = a + bX (a is the y-intercept; b is the slope, i.e. rise over run; X is the known score; Y is the predicted score)
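A sketch of fitting the regression line Y = a + bX by ordinary least squares, which is the usual (but here assumed) way of choosing a and b. The helper name `fit_line` and the hours/marks data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))   # slope: rise over run
    a = my - b * mx                          # line passes through the means
    return a, b

# Hypothetical predictor (hours studied) and criterion (exam mark).
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 68]

a, b = fit_line(hours, marks)
predicted = a + b * 6   # predicted mark for a known score of 6 hours
print(round(a, 1), round(b, 1), round(predicted, 1))
```

With these numbers a is 47.7 and b is 4.1, so a student who studies 6 hours is predicted to score 72.3. As with correlation, the prediction does not imply that studying causes the higher mark.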
14
Q
partial correlation
A
A correlation between X and Y that identifies and statistically removes the effect of a third variable (Z)
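A sketch of the standard first-order partial correlation formula, r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The function name `partial_r` and the three input correlations are hypothetical.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of X and Y with the effect of Z statistically removed."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical correlations: X and Y correlate .60, but both are
# strongly related to a third variable Z (.70 and .80).
r_xy_given_z = partial_r(0.6, 0.7, 0.8)
print(round(r_xy_given_z, 3))
```

In this made-up case the X-Y correlation falls from .60 to about .09 once Z is removed, suggesting that most of the original relationship was carried by Z.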
15
Q
multiple correlation
A
- R
- How strongly a combined set of predictors is related to the criterion (R^2 is the proportion of criterion variance the set accounts for)
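For the two-predictor case, R can be sketched from the three pairwise correlations using the standard formula R^2 = (r_y1^2 + r_y2^2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12^2). The function name `multiple_r` and the input correlations are hypothetical.

```python
from math import sqrt

def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: each predictor's correlation with the criterion.
    r_12: correlation between the two predictors.
    """
    r_squared = ((r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12)
                 / (1 - r_12 ** 2))
    return sqrt(r_squared)

# Hypothetical correlations, invented for illustration.
big_r = multiple_r(0.5, 0.4, 0.3)
print(round(big_r, 3))
```

Here R is about .56, a little higher than the best single predictor (.50), because the second predictor adds some information that the first does not already carry.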