Descriptive Stats Flashcards
what is a univariate stat?
observed distribution of cases for a single variable at a time; shows the number of times each value of a variable is observed in a sample
what is the calculation for percentage in a grouped frequency table?
(responses in the category) / (total responses) x 100
what is the calculation for a valid percentage in a grouped frequency table?
(responses in the category) / (total responses - invalid responses) x 100
what is the calculation for a cumulative percentage in a grouped frequency table?
add valid percentages as you go down the table
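A minimal sketch of the three percentage calculations for a frequency table. The survey responses and the function name `frequency_table` are made up for illustration, and `None` is assumed to mark an invalid response.

```python
from collections import Counter

# Hypothetical survey responses; None marks an invalid (missing) response.
responses = ["agree", "agree", "neutral", "disagree", None,
             "agree", "neutral", None, "disagree", "agree"]

def frequency_table(values, order):
    """Return rows of (category, count, percent, valid percent, cumulative percent)."""
    counts = Counter(v for v in values if v is not None)
    total = len(values)                # all responses, valid or not
    valid_total = sum(counts.values()) # total responses - invalid responses
    rows = []
    cumulative = 0.0
    for category in order:
        n = counts[category]
        percent = 100 * n / total
        valid_percent = 100 * n / valid_total
        cumulative += valid_percent    # add valid percentages going down the table
        rows.append((category, n, percent, valid_percent, cumulative))
    return rows

table = frequency_table(responses, ["disagree", "neutral", "agree"])
for category, n, pct, valid_pct, cum_pct in table:
    print(f"{category:<10}{n:>3}{pct:>8.1f}{valid_pct:>8.1f}{cum_pct:>8.1f}")
```

The last row's cumulative percentage should reach 100 because it is built from valid percentages only.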
who is Florence Nightingale?
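pioneer of descriptive stats and statistical graphics; popularized the polar area (coxcomb) diagram, a relative of the pie chart (the pie chart itself is credited to William Playfair)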
which variables are best for bar charts?
nominal/ordinal (discrete)
which variables are best for pie charts?
discrete variables
which variables are best for histograms?
continuous variables (interval/ratio) whose values are grouped into intervals (bins), so they are displayed like discrete categories
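A rough sketch of what a histogram does behind the scenes: it groups values into equal-width intervals and counts the cases in each. The ages, the bin width of 10, and the helper name `bin_counts` are assumptions for illustration.

```python
# Hypothetical ages, recorded in whole years.
ages = [18, 21, 22, 25, 29, 31, 34, 38, 41, 47, 52, 58]

def bin_counts(values, start, width, n_bins):
    """Count the values in each interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # ignore values outside the binned range
            counts[index] += 1
    return counts

counts = bin_counts(ages, start=10, width=10, n_bins=5)
for i, c in enumerate(counts):
    low = 10 + i * 10
    print(f"{low}-{low + 9}: {'#' * c}")  # text version of the histogram bars
```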
which variables are best for line charts?
continuous variables
what are 3 ways people lie in stats using graphs/charts?
- low density charts
- make aesthetics top priority
- poorly organized categories
what are the qualities of nominal measurements?
qualitative, can’t be ranked, mutually exclusive, used as categories
what are the qualities of ordinal measurements?
can be either qualitative/quantitative, can be ranked/ordered, distance between data points isn’t meaningful
what are the qualities of interval measurements?
quantitative, distance between data points is meaningful, no “true” zero, can (+)/(-) values
what are the qualities of ratio measurements?
quantitative, distance between data points is meaningful, “true” zero, can (+)/(-) and (x)/(divide) values
what are central tendencies?
avg. or typical values of a distribution
what is a mean and which variables is it best for?
an average value, best for interval and ratio variables
what are pros and cons of means?
pro = simplifies presentation of data
con = vulnerable to outliers
what is a median and which variables is it best for?
value of middle cases, good for all variables except nominal but best for ordinal
what are pros and cons of medians?
pro = not affected by outliers
con = doesn’t take into account value of all cases in a distribution
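A small sketch of the outlier trade-off between means and medians, using Python's standard `statistics` module. The income figures are invented for illustration.

```python
import statistics

incomes = [30, 32, 35, 38, 40]   # hypothetical incomes, in thousands
with_outlier = incomes + [500]   # add one extreme case

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

One extreme case pulls the mean from 35 to 112.5, while the median only moves from 35 to 36.5.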
what is a mode and which variables is it best for?
most frequent value, can be used for all variables but best for nominal (the only central tendency that works for nominal)
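A short sketch showing that the mode can be found even for nominal categories, where a mean or median can't be calculated. The data are hypothetical; `statistics.multimode` (Python 3.8+) is used to handle ties.

```python
import statistics

# Nominal data: categories with no order, so only the mode applies.
colors = ["blue", "red", "blue", "green", "red", "blue"]
most_common = statistics.mode(colors)

# A distribution can have more than one mode; multimode returns all of them.
tied = ["a", "a", "b", "b", "c"]
all_modes = statistics.multimode(tied)

print(most_common)
print(all_modes)
```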
what is dispersion?
ways values are distributed around some central value (usually the mean)
what is a range and which measures is it used for?
distance between the highest and lowest value, used in ordinal levels and up
what are the pros and cons of range?
pro = get initial sense of dispersion
cons = vulnerable to outliers and less statistically sophisticated (uses only the two most extreme values)
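A quick sketch of the range and its sensitivity to outliers. The test scores are made up for illustration.

```python
scores = [62, 70, 71, 75, 78, 80]   # hypothetical test scores

value_range = max(scores) - min(scores)

# Adding a single extreme score changes the range dramatically,
# even though the other cases have not moved.
with_outlier = scores + [5]
range_with_outlier = max(with_outlier) - min(with_outlier)

print(value_range, range_with_outlier)
```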
what is a variance / standard deviation and which measures is it used for?
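variance = avg. squared deviation from the mean value; standard deviation = square root of the variance (typical distance from the mean, in the original units); used for interval and ratio measures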
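A step-by-step sketch of variance and standard deviation. It assumes the population formulas (divide by n); sample versions divide by n - 1 instead, which is what `statistics.variance` and `statistics.stdev` do. The values are invented for illustration.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical interval/ratio data

mean = sum(values) / len(values)
squared_deviations = [(v - mean) ** 2 for v in values]

# Population variance: average squared deviation from the mean.
variance = sum(squared_deviations) / len(values)

# Standard deviation: square root of the variance, back in the original units.
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)

# The standard library's population functions should give the same results.
print(statistics.pvariance(values), statistics.pstdev(values))
```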