descriptive statistics Flashcards
what is population
an entire group of people we are interested in
what is a sample
a subset of our population; its size is usually represented with n
what is categorical data
- 2 or more categories
- nominal or ordinal
what is discrete data
can only take fixed, separate values (such as whole numbers) with a logical order
usually ordinal, ratio or interval
what is continuous data
- can take any fractional value
- usually ratio or interval
how can we present frequency in categorical data
- raw frequency
- bar graphs
how can we present frequency in discrete data
- raw frequency or percentage
- bar graph
- cumulative frequency
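A minimal sketch of how raw frequency, percentage and cumulative frequency could be tabulated for discrete data. The `scores` list and the `frequency_table` helper are made up for illustration.

```python
from collections import Counter

# Hypothetical discrete data: number of siblings reported by 10 people
scores = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1]

def frequency_table(data):
    """Return rows of (value, raw frequency, percentage, cumulative frequency)."""
    counts = Counter(data)
    total = len(data)
    rows = []
    cumulative = 0
    for value in sorted(counts):  # keep the logical order of the values
        cumulative += counts[value]
        rows.append((value, counts[value], 100 * counts[value] / total, cumulative))
    return rows

table = frequency_table(scores)
for value, freq, pct, cum in table:
    print(f"{value}: frequency={freq}, percentage={pct:.0f}%, cumulative={cum}")
```

The last cumulative frequency always equals the number of scores, which is a quick check that nothing was missed.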
when would we use measures of central tendency
- when we want to condense the entire frequency distribution to a single number
what is the mode
- appears most often
- can be used with nominal data
- can be more than one value
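A short sketch of finding the mode for nominal data, including the case where more than one value ties. The `colours` list is hypothetical, and `statistics.multimode` needs Python 3.8 or later.

```python
from statistics import multimode

# Hypothetical nominal data: favourite colours
colours = ["red", "blue", "blue", "green", "red"]

# multimode returns every value tied for most frequent
modes = multimode(colours)
print(modes)
```

Here "red" and "blue" both appear twice, so the data has two modes.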
what is the median
- the middle value of the dataset once the scores are put in order
what is the mean
- the sum of the data points divided by the number of data points
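A minimal sketch of both calculations, assuming the usual definitions: the median is the middle of the ordered scores (the average of the two middle scores when there is an even number of them), and the mean is the sum of the scores divided by how many there are. The data is made up.

```python
def median(data):
    """Middle value of the ordered data."""
    ordered = sorted(data)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    # Even number of scores: average the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2

def mean(data):
    """Sum of the scores divided by the number of scores."""
    return sum(data) / len(data)

scores = [4, 8, 6, 5, 3]  # hypothetical scores
print(median(scores), mean(scores))
```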
what are pros of using median
- insensitive to outliers
- gives a real, meaningful data value
- useful for ordinal data and skewed interval/ratio data
what are cons of using median
- ignores a lot of data
- difficult to calculate without a computer
- can’t use with nominal data
what are the pros of using the mean
- uses all the data
- most effective for normally distributed data
what are the cons of using mean
- sensitive to outliers
- values are not always meaningful
- only meaningful for ratio and interval data
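The outlier points above can be seen in a small made-up example: adding one extreme score moves the mean a long way but barely moves the median.

```python
from statistics import mean, median

# Hypothetical ages, then the same ages plus one extreme value
ages = [21, 22, 23, 24, 25]
ages_with_outlier = ages + [95]

print(mean(ages), median(ages))
print(mean(ages_with_outlier), median(ages_with_outlier))
```

With the outlier, the mean becomes 35, an age nobody in the sample is close to, while the median only shifts from 23 to 23.5.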
what measures of spread are used with the median
- range
- interquartile range
what measures of spread are used with the mean
- variance
- standard deviation
what is interquartile range
- similar to range but ignores most extreme values
- is the range of scores within the middle 50% of scores
- lower quartile - median of lower half of the data
- upper quartile - median of upper half of the data
- interquartile range = upper quartile - lower quartile
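A sketch of the median-of-halves method described above. It assumes that, for an odd number of scores, the overall median is left out of both halves; other quartile conventions exist and can give slightly different answers. The `quartiles` helper and the data are hypothetical, and the helper needs at least two scores.

```python
from statistics import median

def quartiles(data):
    """Return (lower quartile, upper quartile) as medians of each half."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[len(ordered) - half:]
    return median(lower_half), median(upper_half)

scores = [1, 3, 4, 6, 7, 9, 10, 12, 50]  # 50 is an extreme value
lower_q, upper_q = quartiles(scores)
iqr = upper_q - lower_q
data_range = max(scores) - min(scores)
print(lower_q, upper_q, iqr, data_range)
```

The range is 49 because of the single extreme score, while the interquartile range is only 7.5.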
what is deviance
- the difference between each score and the mean (score minus mean)
- can be 0, when a score is equal to the mean
what is the sum of squared errors (SS)
- each deviance is squared and all the squared deviances are summed
- more data points = a bigger SS
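A small sketch of deviance and the sum of squares, taking deviance as score minus mean (the sign does not matter once the values are squared). The scores and the `sum_of_squares` helper are made up.

```python
def deviances(data):
    """Difference between each score and the mean."""
    m = sum(data) / len(data)
    return [x - m for x in data]

def sum_of_squares(data):
    """Square each deviance, then add the squares together."""
    return sum(d ** 2 for d in deviances(data))

scores = [2, 4, 6, 8]  # hypothetical scores with a mean of 5
print(deviances(scores))          # the raw deviances cancel out to 0
print(sum_of_squares(scores))
print(sum_of_squares(scores * 2)) # same spread, twice as many scores
```

Doubling the number of scores doubles SS even though the spread is unchanged, which is why SS on its own is hard to compare across datasets.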
what is variance
the average of our squared deviances: the sum of squares divided by the number of scores (or by n - 1 for a sample)
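A sketch of variance as an averaged sum of squares. It assumes the sample version, which divides by n - 1, and compares it with Python's `statistics.variance`; `statistics.pvariance` divides by n instead. The data is hypothetical.

```python
from statistics import pvariance, variance

def sample_variance(data):
    """Sum of squares divided by n - 1 (sample variance)."""
    m = sum(data) / len(data)
    ss = sum((x - m) ** 2 for x in data)
    return ss / (len(data) - 1)

scores = [2, 4, 6, 8]  # hypothetical scores, SS = 20
print(sample_variance(scores), variance(scores))  # both divide by n - 1
print(pvariance(scores))                          # divides by n
```

Unlike SS, the variance does not grow just because more scores were collected.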
what are pros of variance
- uses all the data
- forms the basis of several other tests
what are cons of variance
- requires a normal distribution
- sensitive to outliers
- units are squared, so they are hard to interpret
what is standard deviation
- a measure of spread expressed in the same units as the dependent variable
- calculated using the square root of the variance
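A short sketch showing that the standard deviation is the square root of the variance. The reaction times are made up, and the sample versions (dividing by n - 1) are assumed.

```python
import math
from statistics import stdev, variance

# Hypothetical reaction times in milliseconds
reaction_times_ms = [2, 4, 4, 4, 5, 5, 7, 9]

var_ms_squared = variance(reaction_times_ms)  # units: milliseconds squared
sd_ms = stdev(reaction_times_ms)              # units: milliseconds

print(var_ms_squared, sd_ms, math.sqrt(var_ms_squared))
```

Taking the square root brings the measure back from squared milliseconds to milliseconds, so it can be read on the same scale as the scores.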