Descriptive Statistics Flashcards
State the definition of population
-An entire group of people we are interested in
State the definition of sample
-A subset of the population; its size is usually denoted n
What are the 3 different types of data?
-Categorical
-Discrete
-Continuous
What is categorical data?
-Nominal or ordinal
-Has two or more categories; nominal categories have no ordering to them e.g. hair colour
-Can be presented as its raw frequency or as percentage frequency e.g. what is your least favourite subject
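Raw and percentage frequency can be sketched in a few lines of Python. The hair colour answers below are made-up data for illustration only, and the variable names are assumptions rather than anything from the course material.

```python
from collections import Counter

# Hypothetical survey answers (made-up data for illustration)
hair_colours = ["brown", "black", "brown", "blonde",
                "brown", "black", "red", "brown"]

# Raw frequency: how many times each category occurs
raw_freq = Counter(hair_colours)

# Percentage frequency: each count as a share of all responses
n = len(hair_colours)
pct_freq = {category: 100 * count / n for category, count in raw_freq.items()}

print(raw_freq["brown"])  # 4
print(pct_freq["brown"])  # 50.0
```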
What is discrete data?
-Ordinal, ratio or interval
-Has a fixed value with logical order e.g. shoe size
-Can be presented as its raw frequency or as percentage frequency
-Can also be presented as a cumulative frequency or percentage e.g. how did students score on a test
-If there are lots of values, use frequency ranges to present them instead
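A minimal sketch of cumulative frequency, cumulative percentage and frequency ranges, assuming a made-up set of test scores out of 10. The bin width of 4 is an arbitrary choice for illustration.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical test scores out of 10 (made-up data)
test_scores = [3, 5, 5, 6, 7, 7, 7, 8, 9, 10]

freq = Counter(test_scores)
values = sorted(freq)                   # distinct scores in order
counts = [freq[v] for v in values]      # raw frequency of each score

# Cumulative frequency: running total of the raw frequencies
cum_freq = list(accumulate(counts))

# Cumulative percentage: running total as a share of all scores
n = len(test_scores)
cum_pct = [100 * c / n for c in cum_freq]

# Frequency ranges: group scores into ranges 0-3, 4-7, 8-11,
# keyed by the lower edge of each range
bin_width = 4
binned = Counter((s // bin_width) * bin_width for s in test_scores)
```

Here `cum_freq[-1]` always equals the number of scores, which is a quick check that nothing was dropped.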
What are the 3 measures of central tendency?
-Mode
-Median
-Mean
Describe the mode
-Most common value in the data set
-Used for nominal data
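As a quick illustration, Python's standard `statistics` module can find the mode of nominal data. The lists below are hypothetical examples.

```python
from statistics import mode, multimode

# Hypothetical nominal data (made-up for illustration)
hair_colours = ["brown", "black", "brown", "blonde", "black", "brown"]

# mode() returns the single most common value
most_common = mode(hair_colours)

# multimode() returns every value tied for most common
tied = multimode(["a", "b", "a", "b", "c"])

print(most_common)  # brown
print(tied)         # ['a', 'b']
```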
Describe the median (and state evaluation points)
-Middle score in data set
-(+) insensitive to outliers
-(+) often gives real, meaningful data value
-(+) useful for ordinal data and skewed interval/ratio data
-(-) ignores a lot of data
-(-) difficult to calculate without a computer
-(-) can’t use this with nominal data
Describe the mean (and state evaluation points)
-Sum of data points, divided by how many there are
-(+) uses all of the data
-(+) most effective for normally distributed datasets
-(-) sensitive to outliers
-(-) values aren’t always meaningful
-(-) only meaningful for ratio and interval data
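The difference in outlier sensitivity between the median and the mean can be seen in a small sketch. Both score lists are made up; the second swaps the top score for an extreme value.

```python
from statistics import mean, median

# Hypothetical scores (made-up data)
scores = [2, 3, 4, 5, 6]
with_outlier = [2, 3, 4, 5, 60]   # same scores, top value replaced by an outlier

print(mean(scores), median(scores))              # 4 4
print(mean(with_outlier), median(with_outlier))  # 14.8 4
```

The median stays at 4, while the mean is pulled up to 14.8, a value that no participant actually scored.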
Describe the measures of spread used with the mean
-‘Centre based’ measures of spread such as variance and standard deviation
Describe the measures of spread used with the mode
-None: the mode has no associated measure of spread
Describe the measures of spread used with the median
-‘Distance based’ measures such as range and interquartile range
-Interquartile range is similar to the range but ignores the most extreme values
-It is the range of the middle 50% of scores (upper quartile - lower quartile)
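A minimal sketch of the range and interquartile range, using made-up scores with one extreme value. Note the assumption: `statistics.quantiles` uses its default "exclusive" method here, and other quartile conventions can give slightly different values.

```python
from statistics import quantiles

# Hypothetical scores; 100 is an extreme value (made-up data)
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]

# Range: highest score minus lowest score
score_range = max(scores) - min(scores)

# quantiles(..., n=4) returns the three quartile cut points
lower_q, middle_q, upper_q = quantiles(scores, n=4)

# Interquartile range: upper quartile minus lower quartile
iqr = upper_q - lower_q

print(score_range)  # 99
print(iqr)          # 6.0
```

The extreme score inflates the range to 99, but the interquartile range is unaffected by it.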
What is deviance?
-The difference between each score and the mean (score - mean)
-Could see a deviance of 0
What is the sum of squared errors?
-Each deviance is squared, then all the squared deviances are summed
-The more data points there are, the bigger the SS tends to be
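Deviance and the sum of squared errors can be worked through by hand or in code. The sketch below takes deviance as score minus mean and uses made-up scores.

```python
from statistics import mean

# Hypothetical scores (made-up data)
scores = [2, 3, 4, 5, 6]
m = mean(scores)                        # 4

# Deviance: each score's distance from the mean
deviances = [x - m for x in scores]     # [-2, -1, 0, 1, 2]

# Raw deviances cancel out, so their total is 0
total_deviance = sum(deviances)

# Sum of squared errors (SS): square each deviance, then add them up
ss = sum(d ** 2 for d in deviances)     # 4 + 1 + 0 + 1 + 4 = 10
```

Squaring removes the negative signs, which is why SS is used instead of the total deviance.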
What is variance?
-An average of our sum of squares: SS divided by N - 1 for a sample (or by N for a whole population)
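A short sketch of variance as an average of the sum of squares, on made-up scores. It assumes the sample version (dividing by N - 1), which is what `statistics.variance` computes; `statistics.pvariance` divides by N instead.

```python
from statistics import mean, pvariance, variance

# Hypothetical scores (made-up data)
scores = [2, 3, 4, 5, 6]
m = mean(scores)

# Sum of squares, then divide by N - 1 for the sample variance
ss = sum((x - m) ** 2 for x in scores)      # 10
sample_variance = ss / (len(scores) - 1)    # 10 / 4 = 2.5

# The standard library gives the same sample variance
assert sample_variance == variance(scores)

# Population variance divides by N instead: 10 / 5
population_variance = pvariance(scores)
```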
State positives and negatives of using deviance and variance
-(+) Uses all the data
-(+) Forms the basis of several other tests
-(-) Requires a normal distribution
-(-) Sensitive to outliers
-(-) Units are not sensible (variance is in squared units)
What is standard deviation?
-Measure of spread expressed in the same unit of measurement as the DV
-Calculated using square root of variance
-Equation (sample version): $s = \sqrt{\dfrac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$
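As an illustration, the standard deviation can be computed as the square root of the sample variance. The scores are made up, and the N - 1 (sample) version is an assumption.

```python
from math import isclose, sqrt
from statistics import stdev, variance

# Hypothetical scores (made-up data)
scores = [2, 3, 4, 5, 6]

# Standard deviation: square root of the sample variance (2.5 here)
sd = sqrt(variance(scores))     # about 1.58

# statistics.stdev gives the same sample standard deviation
assert isclose(sd, stdev(scores))
```

Taking the square root brings the value back from squared units to the original units of the scores.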