U1 One-Variable Statistics Flashcards
continuous variable
can take any value within a given range
discrete variable
can only take certain values (usually whole numbers)
simple random sampling
every member of the population has an equal chance of being chosen
systematic sampling
selecting members at regular intervals
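A minimal sketch of simple random and systematic sampling, assuming a made-up population of 100 numbered members and a sample size of 10. The fixed seed is only there so the example is repeatable.

```python
import random

population = list(range(1, 101))  # hypothetical population: IDs 1..100
rng = random.Random(0)            # fixed seed, for repeatability only

# Simple random sampling: every member has the same chance of being chosen.
simple_sample = rng.sample(population, 10)

# Systematic sampling: random starting point, then every k-th member.
k = len(population) // 10         # sampling interval (here 10)
start = rng.randrange(k)          # random start within the first interval
systematic_sample = population[start::k]
```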
stratified sampling
surveying members from each subgroup (stratum) in proportion to that subgroup's share of the population
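A rough sketch of stratified sampling, assuming each subgroup is sampled in proportion to its size. The grades and their sizes are invented for illustration.

```python
import random

# Hypothetical strata: grade -> student IDs (sizes 50, 30, 20).
strata = {
    "grade 9": list(range(0, 50)),
    "grade 10": list(range(50, 80)),
    "grade 11": list(range(80, 100)),
}
total = sum(len(members) for members in strata.values())
sample_size = 20
rng = random.Random(0)  # fixed seed, for repeatability only

sample = {}
for name, members in strata.items():
    # Each stratum contributes its share of the overall sample.
    n = round(sample_size * len(members) / total)
    sample[name] = rng.sample(members, n)

sizes = {name: len(chosen) for name, chosen in sample.items()}
# sizes -> {"grade 9": 10, "grade 10": 6, "grade 11": 4}
```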
cluster sampling
- surveying every member of randomly chosen groups to represent the population
- groups are naturally divided (ex. classes, neighbourhoods)
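One way to picture cluster sampling, assuming the clusters are homerooms and that two of them are picked at random. All names are hypothetical.

```python
import random

# Hypothetical clusters: homeroom -> students in that room.
clusters = {
    "room A": ["Ana", "Ben", "Cy"],
    "room B": ["Dee", "Eli"],
    "room C": ["Fay", "Gus", "Hal"],
    "room D": ["Ivy", "Jo"],
}
rng = random.Random(0)  # fixed seed, for repeatability only

# Choose whole clusters at random, then survey everyone inside them.
chosen = rng.sample(sorted(clusters), 2)
sample = [student for room in chosen for student in clusters[room]]
```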
multi-stage sampling
- several levels of random sampling
- ex. country > province > district > high school > classes
voluntary-response
open invitation for the population to participate
convenience sample
members selected because they are easy to access
bias
when data does not reflect the population
unintentional bias
introduced by poor methodology
intentional bias
used to manipulate statistics to support a certain point of view
sampling bias
sample does not accurately represent the population
response bias
survey elicits false or misleading answers
measurement bias
- methodology underestimates or overestimates results
- may contain unnecessary info
non-response bias
underrepresents groups because they choose not to respond or cannot respond
measures of spread
quantities that illustrate how closely a set of data clusters around its center
quartiles
- Q1 is the median of the lower half of the data (25th percentile)
- Q2 is the median of the full set of data (50th percentile)
- Q3 is the median of the upper half of the data (75th percentile)
What does a box-and-whisker plot graph?
the minimum, Q1, Q2 (median), Q3, and the maximum
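A short sketch of the five values a box-and-whisker plot displays. It assumes the median-of-halves convention for quartiles (the median itself is left out of both halves when the count is odd); other conventions can give slightly different Q1 and Q3. The data set is made up.

```python
from statistics import median

def five_number_summary(data):
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]    # values below the median position
    upper = s[-half:]   # values above the median position
    return s[0], median(lower), median(s), median(upper), s[-1]

summary = five_number_summary([2, 4, 4, 5, 7, 8, 9, 10, 12])
# summary -> (2, 4.0, 7, 9.5, 12)
```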
What are the measures of central tendency?
- mean
- median
- mode
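As a quick illustration, the three measures can be computed with Python's standard `statistics` module. The data values are invented.

```python
from statistics import mean, median, mode

data = [3, 5, 5, 6, 8, 9]

data_mean = mean(data)      # sum of the values divided by the count: 36 / 6 = 6
data_median = median(data)  # middle of the sorted data: (5 + 6) / 2 = 5.5
data_mode = mode(data)      # most frequent value: 5
```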
deviation
difference between an individual data value and the mean
variance
mean of the squared deviations of the data values from the mean
standard deviation
square root of the variance
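A worked sketch of deviation, variance, and standard deviation on a made-up data set. It assumes the population form (divide by n); the sample form divides by n - 1 instead.

```python
from math import sqrt

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)                              # 5.0
deviations = [x - mean for x in data]                     # each value minus the mean
variance = sum(d ** 2 for d in deviations) / len(data)    # 32 / 8 = 4.0
std_dev = sqrt(variance)                                  # 2.0
```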
z-score
number of standard deviations a particular data point lies from the mean
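A minimal z-score sketch, assuming the population standard deviation is used. The helper name `z_score` and the data are hypothetical.

```python
from statistics import mean, pstdev

def z_score(x, data):
    # (value - mean) / standard deviation
    return (x - mean(data)) / pstdev(data)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean 5, standard deviation 2
z = z_score(9, data)              # (9 - 5) / 2 = 2.0
```

A positive z-score means the value is above the mean, a negative one means it is below.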
range
difference between the min and max value in a data set
Interquartile Range (IQR) indicates
IQR = Q3 - Q1; the larger the number, the larger the spread of the central half of the data
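Range and IQR side by side on a made-up data set, assuming the median-of-halves convention for Q1 and Q3 (other quartile conventions can shift the IQR slightly).

```python
from statistics import median

data = sorted([2, 4, 4, 5, 7, 8, 9, 10, 12])

data_range = data[-1] - data[0]   # max - min = 12 - 2 = 10

half = len(data) // 2
q1 = median(data[:half])          # median of the lower half: 4.0
q3 = median(data[-half:])         # median of the upper half: 9.5
iqr = q3 - q1                     # 5.5
```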
percentiles
- divide the ordered data set into 100 equal parts
- 80th percentile means 80% of data is less than or equal to the value
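A small sketch of the "less than or equal to" reading of percentiles. The helper `percentile_rank` and the scores are hypothetical.

```python
def percentile_rank(data, value):
    # Percentage of data values that are less than or equal to `value`.
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

scores = [55, 60, 62, 68, 70, 74, 78, 81, 85, 93]
rank = percentile_rank(scores, 81)   # 8 of 10 scores are <= 81, so 80.0
```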