Chapters 12 and 13 Flashcards
Types of statistics
Descriptive and inferential
Define both types of statistics
Descriptive statistics: used to describe the characteristics of a sample or population.
Ex: class average
Inferential statistics: used to infer (estimate) population parameters (value within a population) from a subgroup (sample) of the population.
Technical assumptions: parametric vs non-parametric statistics
Parametric statistics: have built-in assumptions about the data distribution that must be met if the statistic is to be used.
For example, a parametric statistic may assume a normal distribution of the underlying populations.
Non-parametric statistics: have no built-in assumptions about the data distribution.
Raw vs relative frequency
Raw: the results indicate the actual number of cases that take on each value
Relative: the results are expressed as a percentage of the cases that take on each value
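A minimal sketch of the difference between raw and relative frequency. The survey responses below are made-up illustrative data, not from the chapters.

```python
from collections import Counter

# Hypothetical survey responses (illustrative data only)
responses = ["yes", "no", "yes", "yes", "undecided", "no", "yes", "no"]

# Raw frequency: the actual number of cases that take on each value
raw = Counter(responses)

# Relative frequency: each count as a percentage of all cases
total = sum(raw.values())
relative = {value: 100 * count / total for value, count in raw.items()}

for value in raw:
    print(f"{value}: raw={raw[value]}, relative={relative[value]:.1f}%")
```

The relative frequencies always add up to 100%, which makes it easier to compare groups of different sizes.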
What are the measures of central tendency?
A group of statistics that present a single value that best represents the distribution of responses
Mean, mode, median
Measures of dispersion
A group of statistics that indicate how well the measure of central tendency represents the distribution
Variation ratio, range, and standard deviation
Measures of central tendency and dispersion of nominal variables
Mode: measure of central tendency used with nominal variables
Most frequent
Variation ratio: proportion of cases that do not fit within the modal category
Larger values indicate more variation, meaning the mode does not represent the distribution well
Smaller values indicate less variation, indicating the mode does a good job of representing the distribution
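The mode and variation ratio can be sketched as follows. The function name and the list of majors are hypothetical, chosen only for illustration.

```python
from collections import Counter

def mode_and_variation_ratio(values):
    """Return the modal category and the proportion of cases outside it."""
    counts = Counter(values)
    # most_common(1) gives the most frequent value and its count;
    # if two categories tie, the one seen first is returned
    modal_value, modal_count = counts.most_common(1)[0]
    variation_ratio = 1 - modal_count / len(values)
    return modal_value, variation_ratio

# Hypothetical nominal variable: declared major of ten students
majors = ["psych", "psych", "bio", "psych", "econ",
          "bio", "psych", "psych", "econ", "psych"]

modal_major, vr = mode_and_variation_ratio(majors)
print(modal_major, round(vr, 2))
```

Here 6 of the 10 cases fall in the modal category, so 4 of 10 do not, giving a variation ratio of 0.4.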
Measures of central tendency and dispersion of ordinal variables
Median: the most appropriate measure of central tendency. Value of observation that splits the distribution of cases in half
Range: the measure of dispersion used with ordinal-level variables. The span of values that the variable encompasses, from the lowest score to the highest. Ignores all information except for the two most extreme scores
Interquartile range is more commonly used. The range between the 25th and 75th percentiles. Not influenced by outliers
What is an outlier?
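A rough sketch of the median, range, and interquartile range, using Python's standard `statistics` module. The ratings are made-up data, and `statistics.quantiles` uses one of several common quartile conventions, so other software may report slightly different quartiles.

```python
import statistics

def describe_ordinal(values):
    """Return (median, range, interquartile range) for a list of scores."""
    median = statistics.median(values)
    value_range = max(values) - min(values)
    # quantiles(n=4) returns the 25th, 50th, and 75th percentile cut points
    q1, _, q3 = statistics.quantiles(values, n=4)
    return median, value_range, q3 - q1

# Hypothetical ratings on an ordinal scale (illustrative data only)
ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5]
# Same data, but the highest score is replaced by an outlier
ratings_with_outlier = [1, 2, 2, 3, 3, 3, 4, 4, 50]

print(describe_ordinal(ratings))
print(describe_ordinal(ratings_with_outlier))
```

The outlier changes the range a great deal, while the median and interquartile range stay the same for these two lists.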
Outlier: a case that differs significantly from the others
Measures of central tendency and dispersion for interval/ratio variables
Arithmetic mean: calculated by adding all of the values and then dividing by the total number of cases
When the distribution contains extreme cases, the median is a better measure than the mean because it is not influenced by them
Standard deviation: estimates the average amount that each observation differs from the mean.
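The mean and standard deviation can be worked out step by step as in the sketch below. The scores are made-up data. The sketch divides by the number of cases (the population form); this is an assumption, since many textbooks divide by the number of cases minus one when working with a sample (`statistics.stdev` in Python).

```python
import math
import statistics

# Hypothetical interval/ratio scores (illustrative data only)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Arithmetic mean: add all of the values, then divide by the number of cases
mean = sum(scores) / len(scores)

# Standard deviation (population form, an assumption of this sketch):
# the square root of the average squared difference from the mean
squared_diffs = [(x - mean) ** 2 for x in scores]
std_dev = math.sqrt(sum(squared_diffs) / len(scores))

print(mean, std_dev)

# The standard library should agree with the hand calculation
assert math.isclose(mean, statistics.mean(scores))
assert math.isclose(std_dev, statistics.pstdev(scores))
```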
Positive vs negative skew
Extreme scores affect the mean, pulling it in the direction of the extreme scores
Positively skewed: extreme scores pull the mean above the median
Negatively skewed: extreme scores pull the mean below the median
The greater the difference between the mean and median…
the more skewed the distribution is.
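A small sketch of the mean-versus-median comparison described on these cards. The function name and data are hypothetical, and this comparison is a rule of thumb for the direction of skew, not a formal skewness statistic.

```python
import statistics

def skew_direction(values):
    """Compare the mean to the median to judge the direction of skew."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positive"   # extreme high scores pull the mean up
    if mean < median:
        return "negative"   # extreme low scores pull the mean down
    return "none"

# Hypothetical data: one very high income, one very low exam score
incomes = [30, 32, 35, 36, 38, 40, 200]
exam_scores = [10, 80, 82, 85, 88, 90, 92]
symmetric = [1, 2, 3, 4, 5]

print(skew_direction(incomes))      # mean is above the median
print(skew_direction(exam_scores))  # mean is below the median
print(skew_direction(symmetric))    # mean equals the median
```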
What is the standard deviation?
Standard deviation: estimates the average amount that each observation differs from the mean.
Characteristics of standard deviation
The size of the standard deviation depends on how clustered the scores are around the mean
Smaller deviation if the scores are closer to the mean
The value of a standard deviation is never negative
If all scores are identical, there is no deviation, meaning the standard deviation is equal to zero.
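These characteristics can be checked with a quick sketch. The three lists are made-up data, and the population form of the standard deviation (`statistics.pstdev`) is used as an assumption.

```python
import statistics

# Three hypothetical sets of scores, all with a mean of 50
clustered = [49, 50, 50, 50, 51]   # scores close to the mean
spread = [10, 30, 50, 70, 90]      # scores far from the mean
identical = [50, 50, 50, 50, 50]   # no variation at all

sd_clustered = statistics.pstdev(clustered)
sd_spread = statistics.pstdev(spread)
sd_identical = statistics.pstdev(identical)

print(round(sd_clustered, 2), round(sd_spread, 2), sd_identical)
```

The clustered scores give the smaller standard deviation, and the identical scores give a standard deviation of zero.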
Standardized scores
Scores expressed as the number of standard deviations they fall from the mean of the total distribution of scores.
Standardized scores can be positive or negative depending on whether they fall above or below the mean
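A minimal sketch of standardizing a set of scores. The function name and data are hypothetical, and the population standard deviation is assumed.

```python
import statistics

def standardize(scores):
    """Convert raw scores to standardized scores (z-scores)."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population form, an assumption here
    if sd == 0:
        raise ValueError("all scores are identical, so they cannot be standardized")
    # Each result is the number of standard deviations from the mean
    return [(x - mean) / sd for x in scores]

# Hypothetical scores with a mean of 5 and a standard deviation of 2
scores = [2, 4, 4, 4, 5, 5, 7, 9]
z_scores = standardize(scores)

for raw, z in zip(scores, z_scores):
    print(raw, z)
```

Scores below the mean get negative standardized scores, scores above the mean get positive ones, and a score equal to the mean gets zero.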
Contingency tables
Contingency tables: used when working with nominal or ordinal variables. Tables in which the cell where an individual case is located is contingent upon its scores for each of the variables.
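A contingency table can be sketched by counting how many cases land in each combination of categories. The two variables and the cases below are made-up data for illustration.

```python
from collections import Counter

# Hypothetical cases: (year in school, preferred study method)
cases = [
    ("first-year", "flashcards"), ("first-year", "flashcards"),
    ("first-year", "rereading"), ("senior", "flashcards"),
    ("senior", "rereading"), ("senior", "rereading"),
    ("senior", "rereading"), ("first-year", "flashcards"),
]

# Each case falls in exactly one cell, determined by its two scores
table = Counter(cases)

rows = sorted({row for row, _ in cases})
cols = sorted({col for _, col in cases})

print("year".ljust(12) + "".join(col.ljust(12) for col in cols))
for row in rows:
    print(row.ljust(12) + "".join(str(table[(row, col)]).ljust(12) for col in cols))
```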
Scatter plots
Used when working with interval/ratio variables. Graphs in which the point where an individual case lies is contingent upon its scores for each of the variables
What is a perfect correlation?
When knowing the value of one variable always allows us to predict the exact value of the other.
Measures of association
Indicate the strength of the relationship with a single numerical value
What is the range of measures of association for each type of variable?
Nominal: 0 to 1
The closer the coefficient is to 0, the weaker the relationship
Ordinal and interval/ratio: -1 to +1
The closer the coefficient is to 0, the weaker the relationship; the closer it is to +1 or -1, the stronger the relationship
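As one illustration of a measure of association for interval/ratio variables, the sketch below computes Pearson's r by hand. The cards do not name a specific coefficient, so the choice of Pearson's r is an assumption, and the data are made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables vary together
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(var_x * var_y)

hours = [1, 2, 3, 4, 5]
perfect_positive = [2 * h + 1 for h in hours]   # rises exactly with hours
perfect_negative = [20 - 3 * h for h in hours]  # falls exactly with hours
noisy = [2, 1, 4, 3, 5]                         # rises, but not perfectly

print(pearson_r(hours, perfect_positive))
print(pearson_r(hours, perfect_negative))
print(pearson_r(hours, noisy))
```

The two perfect relationships give coefficients at the ends of the -1 to +1 range, while the noisy relationship gives a value between 0 and +1.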