Chapter 12: Descriptive Statistics* Flashcards
Descriptive statistics
statistics that describe and summarize the data collected, which includes measures of central tendency, variability, and covariation
Effect size
the magnitude of an observed effect: either the extent to which two variables are associated or the size of the difference in scores between groups
Cohen’s d
an effect size estimate that is the standardized mean difference in scores between two groups (expressed in units of standard deviations)
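A minimal sketch of how Cohen's d could be computed, assuming the pooled standard deviation is used as the standardizer (one common choice; the card does not say which one is intended). The function name `cohens_d` and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference, using the pooled SD (an assumption)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance: sample variances (N-1 denominator) weighted by degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical scores for two groups.
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]

d = cohens_d(treatment, control)
print(round(d, 2))  # 1.26, i.e. the group means differ by about 1.26 standard deviations
```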
Correlation coefficient
a statistic that describes how strongly two variables are related to one another or the degree to which they covary
When is Pearson's r used?
when both variables have interval or ratio-scale properties (i.e., continuous variables) and the relationship is linear; it does NOT detect curvilinear relationships (r can be close to 0 even when a strong non-linear relationship exists)
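To illustrate the point about curvilinear relationships, here is a small sketch with made-up data: a perfectly linear relationship gives r = 1, while a perfectly U-shaped one gives r = 0. The helper `pearson_r` is written out by hand for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariation divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

x = [-2, -1, 0, 1, 2]
linear_y = [2 * v + 1 for v in x]  # straight-line relationship
curved_y = [v ** 2 for v in x]     # U-shaped relationship, fully determined by x

r_linear = pearson_r(x, linear_y)
r_curved = pearson_r(x, curved_y)
print(r_linear)  # 1.0
print(r_curved)  # 0.0, even though y depends entirely on x
```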
Restriction of range
when only a subset of a variable’s possible values is sampled or observed, which can lead to misleading null or attenuated correlations
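A rough demonstration of attenuation, using invented data: the same x and y values give a clearly weaker correlation once only the middle of the x range is kept. The data, the cutoffs (4 to 6), and the helper `pearson_r` are all assumptions made for this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data: y rises with x, plus some alternating noise.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [3, 0, 5, 2, 7, 4, 9, 6, 11, 8]

# Keep only cases whose x value falls in a narrow middle band.
restricted = [(x, y) for x, y in zip(xs, ys) if 4 <= x <= 6]
restricted_xs = [x for x, _ in restricted]
restricted_ys = [y for _, y in restricted]

r_full = pearson_r(xs, ys)
r_restricted = pearson_r(restricted_xs, restricted_ys)
print(round(r_full, 2), round(r_restricted, 2))  # the restricted r is noticeably smaller
```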
Regression equation
Y= a + bX; an equation that represents a line drawn to best fit a set of data points, allowing one to predict values of one variable based on another variable
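As a sketch of how a and b in Y = a + bX might be estimated, the example below uses ordinary least squares, the usual way of finding the best-fitting line (the card does not name the method). The variable names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariation of X and Y divided by the variation in X.
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum((x - mean_x) ** 2 for x in xs)
    # Intercept: forces the line through the point (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data: hours studied (predictor) and exam score (criterion).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 59, 61, 64]

a, b = fit_line(hours, scores)
predicted = a + b * 6  # predicted score for someone who studies 6 hours
print(round(a, 1), round(b, 1), round(predicted, 1))  # 48.7 3.1 67.3
```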
Criterion variable
outcome variable that is being predicted in a regression analysis
Predictor variable
variable used to predict changes in the criterion (or outcome) variable in regression analysis
Multiple correlation (R)
a correlation between a combined set of predictor variables and one criterion variable
Multiple regression
extension of the correlation technique that models the extent to which 1 or more predictor variables are related to one criterion variable
Frequency distribution
a representation of how often each score was observed, arranged from lowest to highest score
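A frequency distribution can be built in a few lines; this sketch uses Python's `collections.Counter` on made-up scores and lists them from lowest to highest.

```python
from collections import Counter

# Hypothetical scores on a 1-5 rating scale.
scores = [3, 1, 2, 3, 5, 3, 2, 1, 3, 4]

freq = Counter(scores)       # maps each score to how often it was observed
table = sorted(freq.items())  # arrange from lowest to highest score

for score, count in table:
    print(f"{score}: {count}")
```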
Outliers
scores that are very different from the rest of the scores in the dataset, also known as extreme scores; they have a larger effect in small samples
Bar graph
a graph using bars to depict frequencies of responses, percentages, or means in 2 or more groups
Pie chart
a circular graph in which frequencies or percentages are represented as slices of a pie
Histogram
type of bar graph used when the variable on the x-axis is continuous, with each bar touching adjacent bars
Mean or arithmetic average
obtained by summing scores then dividing this sum by the number of scores; for interval and ratio scale data; not outlier robust
Normal distribution
distribution of scores for continuous variables in which the majority of scores cluster around the mean, with fewer scores the further they fall from the mean
Standard deviation (s)
roughly the average deviation of scores from the mean; calculated as the square root of the variance
Frequency polygons
graphs of frequencies for continuous variables, in which the frequency of each score is plotted on the vertical axis and these points are connected by straight lines
Central tendency
a single number or value that attempts to summarize all of the data, describing the typical score or where most of the scores fall
Median
a measure of central tendency defined as the middle score in a distribution that divides the distribution in half (or an average of 2 middle scores); calculated for continuous and ordinal variables; outlier robust
Mode
a measure of central tendency defined as the most frequent score in a distribution of scores; calculated for variables in an interval, ratio, ordinal, or nominal scale
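The three measures of central tendency can be compared with Python's standard `statistics` module. The scores are invented; adding one extreme score shows why the mean is not outlier robust while the median and mode are.

```python
from statistics import mean, median, mode

scores = [2, 3, 3, 4, 5]
with_outlier = scores + [40]  # same scores plus one extreme value

print(mean(scores), median(scores), mode(scores))                    # 3.4 3 3
print(mean(with_outlier), median(with_outlier), mode(with_outlier))  # 9.5 3.5 3
```

The mean jumps from 3.4 to 9.5, while the median moves only from 3 to 3.5 and the mode stays at 3.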
Variability
the amount of dispersion for scores (for continuous variables) around some central value
Variance (s^2)
a measure of the variability of scores about a mean; sum of squared deviations around mean divided by N-1; higher variance = greater variability
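The calculation can be written out step by step. This sketch uses hypothetical scores, computes the variance with the N-1 denominator, takes its square root to get the standard deviation, and checks the result against `statistics.variance`.

```python
from math import sqrt
from statistics import variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
mean_score = sum(scores) / n
# Sum of squared deviations around the mean.
sum_sq_dev = sum((s - mean_score) ** 2 for s in scores)

var = sum_sq_dev / (n - 1)  # variance, s^2
sd = sqrt(var)              # standard deviation, s

print(round(var, 2), round(sd, 2))       # 4.57 2.14
print(abs(var - variance(scores)) < 1e-9)  # True: matches the library's sample variance
```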
Range
maximum score minus minimum score
Why use Cohen’s d?
because it is expressed in standard deviation units, it allows direct comparison of effects across studies or measures that use different units of measurement
Coefficient of determination (r^2)
squared correlation coefficient; a measure of shared variance i.e. proportion of variability in y accounted for/predicted by variability in x; 0 if no overlap and 1 if complete overlap
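A short sketch of the arithmetic, with hypothetical data: compute r, then square it to get the proportion of variability in y that is shared with x. The helper `pearson_r` is written by hand for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data: hours studied (x) and exam score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 59, 61, 64]

r = pearson_r(hours, scores)
r_squared = r ** 2
print(round(r, 2), round(r_squared, 2))  # 0.99 0.98: about 98% shared variance
```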
Partial correlation
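the correlation between two variables after statistically controlling for a third variable; if controlling for the third variable changes the correlation coefficient, then the third variable partially explains the relationship between the original two variables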
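One way to make this concrete is the standard first-order partial correlation formula, which removes the influence of a third variable z from the correlation between x and y. The function name `partial_r` and the three input correlations are hypothetical.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z statistically controlled."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical correlations: x and y look related (r = .60),
# but both are strongly related to a third variable z.
r_after_control = partial_r(r_xy=0.6, r_xz=0.7, r_yz=0.8)
print(round(r_after_control, 2))  # 0.09: z accounts for most of the x-y relationship

# If z is unrelated to both variables, controlling for it changes nothing.
r_unchanged = partial_r(r_xy=0.5, r_xz=0.0, r_yz=0.0)
print(r_unchanged)  # 0.5
```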
Regression models
a set of theoretically relevant predictors predicting a criterion variable; can look at how 1 or more predictors uniquely predict variability in criterion
What is the most important benefit of regression?
can investigate the role of multiple predictors in independently predicting the criterion