Descriptive Statistics Flashcards
what are discrete variables
variables that can only take certain separate values (often whole numbers), such as number of objects or shoe size
what are continuous variables
variables that can take any value, including fractions, within their range, e.g. amount of time, distance
why do we use frequency ranges
it's often not sensible to calculate frequencies for each possible score, as there may be a lot of possibilities
can condense data while still retaining a lot of information
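worked example (a minimal Python sketch, not a standard recipe; the scores, the range width of 10 and the helper name frequency_ranges are all made up for illustration, and it assumes whole-number scores): ten scores condensed into four frequency ranges

```python
from collections import Counter

def frequency_ranges(scores, width):
    """Count how many whole-number scores fall into each range of the given width."""
    # Map every score to the lower bound of its range, e.g. 12 -> 10 when width is 10.
    counts = Counter((score // width) * width for score in scores)
    # Report each range as (lowest score, highest score, frequency), lowest range first.
    return [(low, low + width - 1, counts[low]) for low in sorted(counts)]

scores = [3, 7, 12, 15, 18, 21, 22, 29, 34, 38]  # made-up test scores
table = frequency_ranges(scores, 10)

for low, high, freq in table:
    print(f"{low}-{high}: {freq}")
```

ten separate scores become four rows (0-9, 10-19, 20-29, 30-39), and the overall shape of the data is still visible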
what are the three common measures of central tendency
mean mode median
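worked example (a small sketch using Python's built-in statistics module; the scores are made up): all three measures for the same data

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # made-up scores, already in order

mean = statistics.mean(scores)      # sum of scores / number of scores = 30 / 6
median = statistics.median(scores)  # middle of the ordered scores; here the average of 3 and 5
mode = statistics.mode(scores)      # most frequent score

print(mean, median, mode)
```

here the mean is 5, the median is 4.0 and the mode is 3, so the three measures do not have to agree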
pros of using the mode
can be used for nominal (categorical) data (the mean and median cannot)
always gives a real data value
cons of using the mode
sometimes gives multiple values (e.g. bimodal distributions)
varies depending on group size (how the scores are grouped into ranges)
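worked example (a sketch with made-up scores; statistics.multimode needs Python 3.8 or later): a data set where the mode is not a single value

```python
import statistics

scores = [1, 2, 2, 3, 4, 4, 5]  # made-up scores: 2 and 4 both appear twice

# multimode returns every score that shares the highest frequency.
modes = statistics.multimode(scores)

print(modes)
```

both 2 and 4 come back as modes, so there is no single "most typical" score to report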
when do we use mode most
usually just for nominal data
pros of the median
insensitive to outlying data, so it isn't pulled towards extreme scores
often gives a real data value
cons of the median
ignores a lot of the data, as it doesn't take the actual values of the other scores (including outliers) into account
difficult to calculate for lots of data without a computer
when do we commonly use median
for ordinal data and sometimes for skewed interval or ratio data
1 pro of the mean
uses all the data
cons of the mean
very sensitive to anomalous results
doesn't always give a meaningful value (2.4 children??)
only meaningful for ratio and interval data
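worked example (a sketch with made-up scores): how one anomalous result moves the mean but barely moves the median

```python
import statistics

scores = [4, 5, 5, 6, 7]       # made-up scores
with_outlier = scores + [60]   # same scores plus one anomalous result

mean_before = statistics.mean(scores)            # 27 / 5 = 5.4
mean_after = statistics.mean(with_outlier)       # 87 / 6 = 14.5
median_before = statistics.median(scores)        # 5
median_after = statistics.median(with_outlier)   # average of 5 and 6 = 5.5

print(mean_before, mean_after)
print(median_before, median_after)
```

the mean jumps from 5.4 to 14.5, a value larger than all but one of the scores, while the median only moves from 5 to 5.5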
name 4 measures of data spread
range
interquartile range
variance
standard deviation
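worked example (a sketch with made-up scores; it assumes the population formulas, which divide by N, while the sample versions statistics.variance and statistics.stdev divide by N - 1): range, variance and standard deviation for one data set

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores with a mean of 5

data_range = max(scores) - min(scores)   # highest score minus lowest score
variance = statistics.pvariance(scores)  # mean of the squared distances from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance

print(data_range, variance, std_dev)
```

the range is 7, the variance is 4 and the standard deviation is 2, so the standard deviation is back in the same units as the scores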
con of the range
very sensitive to outliers - the highest and lowest score are not likely to represent the majority of data
how do you find the interquartile range and semi IQR
a quartile is the score below which a given quarter (25%, 50% or 75%) of the population falls
Q1 is the score that cuts off the lowest 25% of the data, Q2 is the median (50%), Q3 is the score that cuts off the lowest 75%
IQR is Q3-Q1
semi IQR is (Q3-Q1)/2
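worked example (a sketch only: textbooks and software use several slightly different quartile conventions, and this one assumes the "median of each half" method; the helper name quartiles and the scores are made up):

```python
import statistics

def quartiles(scores):
    """Return (Q1, Q2, Q3) using the 'median of each half' convention (an assumption)."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower = ordered[:half]    # lower half (the middle score is left out when the count is odd)
    upper = ordered[-half:]   # upper half
    return statistics.median(lower), statistics.median(ordered), statistics.median(upper)

scores = [1, 3, 4, 6, 7, 8, 10, 12]  # made-up scores
q1, q2, q3 = quartiles(scores)

iqr = q3 - q1        # interquartile range
semi_iqr = iqr / 2   # semi-interquartile range

print(q1, q2, q3, iqr, semi_iqr)
```

for these scores Q1 = 3.5, Q2 = 6.5 and Q3 = 9.0, giving an IQR of 5.5 and a semi IQR of 2.75; a different quartile convention could give slightly different numbers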