Chapter 12 - Descriptive Statistics Flashcards
Depict four graphs of data representation
- Bar Graphs
- Pie Charts
- Histograms
- Frequency Polygons
Differentiate between descriptive and inferential statistics
Descriptive
Procedures for depicting the main aspects of sample data, without necessarily drawing inferences about a larger population.
- Measures of Central Tendency
* Mean
* Median
* Mode
- Measures of Variation
* Range
* Standard Deviation
Inferential
A broad class of statistical techniques that allow inferences about characteristics of a population to be drawn from a sample of data from that population while controlling (at least partially) the extent to which errors of inference may be made.
- Z-test
- t-test
- ANOVA
- Chi-Square
- (and many more!)
Define the normal distribution
A theoretical distribution in which values pile up in the center at the mean and fall off into tails at either end.
When plotted, it gives the familiar bell-shaped curve expected when variation about the mean value is random.
Many statistical models are based on the assumption that data follow a normal distribution.
The normal distribution has several primary characteristics: It is symmetrical, it is asymptotic in both tails (the curve approaches but never touches the horizontal axis), and its mean, median, and mode are the same value.
Perhaps most important, however, fixed proportions of values fall within defined sections of the distribution.
For example, 34.13% of values fall between the mean and one standard deviation above the mean, and a corresponding 34.13% of values fall between the mean of the distribution and one standard deviation below the mean.
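The fixed proportions above can be checked numerically. The sketch below assumes a standard normal distribution (mean 0, standard deviation 1) and computes areas from the error function; the helper names `normal_cdf` and `proportion_between` are illustrative, not from the chapter.

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative proportion of a standard normal distribution below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def proportion_between(z_low: float, z_high: float) -> float:
    """Proportion of values falling between two z-scores."""
    return normal_cdf(z_high) - normal_cdf(z_low)

# Area between the mean (z = 0) and one standard deviation above it (z = 1).
mean_to_plus_one_sd = proportion_between(0.0, 1.0)
# By symmetry, the same area lies between z = -1 and the mean.
minus_one_sd_to_mean = proportion_between(-1.0, 0.0)

print(f"{mean_to_plus_one_sd:.4f}")   # 0.3413
print(f"{minus_one_sd_to_mean:.4f}")  # 0.3413
```

Adding the two areas gives roughly 68% of values within one standard deviation of the mean.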
Calculate three measures of central tendency
Central Tendency: the middle or center point of a set of scores.
- Mean (M): the numerical average of a set of scores, computed as the sum of all scores divided by the number of scores.
- Median (Mdn): the midpoint of a distribution, that is, the score or value that divides it into two equal-sized halves.
- Mode (Mo): the most frequently occurring score in a set of data.
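A minimal sketch of the three calculations, using Python's standard `statistics` module. The score list is made up for illustration.

```python
import statistics

# Hypothetical set of six scores (not from the chapter).
scores = [2, 3, 3, 5, 7, 10]

# Mean: sum of all scores (30) divided by the number of scores (6).
mean = statistics.mean(scores)

# Median: with an even number of scores, the average of the two
# middle values of the sorted list (3 and 5).
median = statistics.median(scores)

# Mode: the most frequently occurring score (3 appears twice).
mode = statistics.mode(scores)

print(mean, median, mode)
```

In this example the mean (5) is larger than the median (4) because the single high score of 10 pulls the average upward.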
Calculate three measures of variation
Variability: the amount of dispersion for scores around some central value.
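This card does not list the three measures. The sketch below assumes they are the range, the variance, and the standard deviation (the range and standard deviation appear earlier in this set; the variance is an assumption). It uses the population formulas, dividing by N; a sample formula would divide by N - 1 instead. The scores are made up.

```python
import statistics

# Hypothetical scores (not from the chapter); their mean is 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest score minus lowest score (9 - 2).
score_range = max(scores) - min(scores)

# Variance: average squared deviation from the mean.
# Squared deviations sum to 32, and 32 / 8 scores = 4.
variance = statistics.pvariance(scores)

# Standard deviation: square root of the variance.
standard_deviation = statistics.pstdev(scores)

# Sample versions (dividing by N - 1) for comparison.
sample_variance = statistics.variance(scores)
sample_sd = statistics.stdev(scores)

print(score_range, variance, standard_deviation)
```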
Cohen’s d
A measure of effect size based on the standardized difference between two means: It indicates the number of standard deviation units by which the means of two data sets differ.
Effect Size Interpretation:
0.20 = small
0.50 = medium
0.80 = large
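A minimal sketch of the calculation, assuming the pooled standard deviation of the two groups is used as the standardizer (one common choice; the card does not say which is intended). Treating the benchmarks above as lower bounds in `interpret_d` is also an assumption, and both groups of scores are made up.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (N - 1)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

def interpret_d(d):
    """Label |d| using the benchmarks on this card as lower bounds."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"

# Hypothetical scores: means are 8 and 6, each group has variance 2.5.
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]

d = cohens_d(treatment, control)
print(round(d, 2), interpret_d(d))
```

Here the means differ by 2 points and the pooled standard deviation is about 1.58, so the groups differ by roughly 1.26 standard deviation units.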
Describe, calculate, and interpret a correlation coefficient
A numerical index reflecting the degree of linear relationship between two variables.
It is scaled so that +1 indicates a perfect positive relationship, -1 indicates a perfect negative relationship, and 0 indicates no linear relationship.
Equation in doc
0 ≤ |r| < .30 = small
.30 ≤ |r| < .50 = medium
.50 ≤ |r| ≤ 1.00 = large
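The equation itself is not reproduced on this card. The sketch below assumes the standard Pearson product-moment formula, r = Σ(x - Mx)(y - My) / √[Σ(x - Mx)² · Σ(y - My)²], and labels the result with the ranges listed above. The data are made up.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from sums of deviation products."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def interpret_r(r):
    """Label |r| using the ranges on this card."""
    size = abs(r)
    if size < 0.30:
        return "small"
    if size < 0.50:
        return "medium"
    return "large"

# Hypothetical paired scores (not from the chapter).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
print(round(r, 2), interpret_r(r))
```

For these data the deviation products sum to 6 and the two sums of squares are 10 and 6, so r = 6 / √60, about .77.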
Describe and interpret a regression equation
Regression: statistical techniques that are used to describe, explain, or predict (or all three) the variance of an outcome or dependent variable using scores on one or more predictor or independent variables
Regression Equation: the mathematical expression of the relationship between a dependent (outcome or response) variable and one or more independent (predictor) variables that results from conducting a regression analysis.
It often takes the form y = a + bx + e, in which y is the dependent variable, x is the independent variable, a is the intercept, b is the regression coefficient, and e is the error term.
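A minimal sketch of how a and b can be estimated, assuming ordinary least squares with a single predictor (the card does not name an estimation method). The function name `fit_line` and the data are illustrative.

```python
def fit_line(x, y):
    """Least-squares intercept (a) and slope (b) for y = a + bx + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: sum of deviation products divided by sum of squared x deviations.
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    # Intercept: the fitted line passes through the point (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b

# Hypothetical paired scores (not from the chapter).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

a, b = fit_line(hours, scores)

# Predicted scores and the error term e (observed minus predicted).
predicted = [a + b * xi for xi in hours]
errors = [yi - yhat for yi, yhat in zip(scores, predicted)]

print(round(a, 2), round(b, 2))
```

Interpretation for these data: each additional unit of x is associated with a predicted increase of about 0.6 in y, and the predicted y when x = 0 is about 2.2.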
Coefficient of Determination (r²)
A numerical index that reflects the proportion of variation in an outcome or response variable that is accounted for by its relationship with a predictor variable.
It is a measure of the percentage of variance in a dependent variable that is accounted for by its linear relationship with a single independent variable.
Obtained by multiplying the value of the correlation coefficient (r) by itself (doc)
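A short sketch of the squaring step, showing how quickly the proportion of explained variance shrinks as r gets smaller. The r values are arbitrary illustrations.

```python
def coefficient_of_determination(r):
    """Proportion of variance accounted for: r multiplied by itself."""
    return r * r

# Arbitrary example correlations; the sign of r does not affect r squared.
for r in (0.10, 0.30, 0.50, -0.80):
    r_squared = coefficient_of_determination(r)
    print(f"r = {r:+.2f}  r squared = {r_squared:.2f}  ({r_squared:.0%} of variance)")
```

A correlation of .30, for example, corresponds to only 9% of the variance in the outcome being accounted for.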
Contrast mediating and moderating variables
Mediator: an intermediary or intervening variable that accounts for an observed relation between two other variables.
X → Z → Y
Moderator: an independent variable that changes the nature of the relationship between other variables.
X → Y
  |
  Z
State the function of a partial correlation
The association between two variables, x and y, with the influence of one or more other variables statistically removed, controlled, or held constant; the effect of the z variable is removed from both x and y.
It is often of interest to learn whether a correlation is significantly reduced in magnitude once a third variable is removed.
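A minimal sketch for the case of one control variable, assuming the standard first-order formula r_xy.z = (r_xy - r_xz · r_yz) / √[(1 - r_xz²)(1 - r_yz²)]. The three input correlations are made up.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z removed from both."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical zero-order correlations (not from the chapter).
r_xy = 0.50  # x with y
r_xz = 0.60  # x with the third variable z
r_yz = 0.70  # y with the third variable z

r_xy_given_z = partial_correlation(r_xy, r_xz, r_yz)
print(round(r_xy_given_z, 2))
```

In this example the correlation drops from .50 to about .14 once z is controlled, which suggests that much of the x-y association is shared with z.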
Describe and interpret a multiple correlation and multiple regression
Multiple Correlation: a numerical index of the degree of relationship between a particular variable and two or more other variables.
Coefficient of Multiple Determination: a numerical index that reflects the degree to which variation in a response or outcome variable is accounted for by its relationship with two or more predictor variables.
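A minimal sketch for the two-predictor case, assuming the standard formula R² = (r_y1² + r_y2² - 2 · r_y1 · r_y2 · r_12) / (1 - r_12²), where r_12 is the correlation between the two predictors. The correlations are made up.

```python
import math

def multiple_r_squared(r_y1, r_y2, r_12):
    """Coefficient of multiple determination for two predictors."""
    numerator = r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12
    return numerator / (1 - r_12 ** 2)

# Hypothetical correlations (not from the chapter).
r_y1 = 0.50  # outcome with predictor 1
r_y2 = 0.40  # outcome with predictor 2
r_12 = 0.30  # predictor 1 with predictor 2

big_r_squared = multiple_r_squared(r_y1, r_y2, r_12)
big_r = math.sqrt(big_r_squared)  # multiple correlation

print(round(big_r_squared, 2), round(big_r, 2))
```

Because the two predictors overlap (r_12 = .30), together they account for about 32% of the variance, less than the 41% that would result from simply adding their separate r² values (.25 + .16).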
Provide four potential issues with describing data
- Skewness (positive and negative skew)
- Multimodal Distribution
- Range Restriction
- Non-Linear Relationships
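As one illustration of the first issue, the sketch below computes a skewness index, assuming the moment-based coefficient (third central moment divided by the second central moment raised to the 1.5 power). Other skewness formulas exist, and the data are made up.

```python
def skewness(scores):
    """Moment-based skewness: positive = right tail, negative = left tail."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((s - mean) ** 2 for s in scores) / n  # second central moment
    m3 = sum((s - mean) ** 3 for s in scores) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical distributions (not from the chapter).
positive_skew = [1, 2, 2, 3, 12]           # one extreme high score
negative_skew = [-s for s in positive_skew]  # mirror image: extreme low score
symmetric = [1, 2, 3, 4, 5]

print(skewness(positive_skew) > 0)  # True
print(skewness(negative_skew) < 0)  # True
print(skewness(symmetric))          # 0.0
```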