Representing and Describing Data Flashcards
univariate data
data with only 1 variable
qualitative data
categorical data
non-numerical data
quantitative data
numerical data
classifications of quantitative data
discrete data
continuous data
discrete data
data that can be counted
able to take only specific values

continuous data
data that can be measured
able to take any value within a range

mode
value that occurs most frequently in set of data
median
value that lies in middle when set of data is arranged by size
mean
average
sum of all values divided by number of values in set
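The three measures above can be checked on a small data set. This is a minimal sketch using Python's standard `statistics` module; the values in `data` are made up for illustration.

```python
from statistics import mean, median, multimode

data = [2, 3, 3, 5, 7, 10]  # hypothetical values, not from the flashcards

modes = multimode(data)  # list of the most frequent value(s)
middle = median(data)    # middle value once sorted; averages the two middle values when n is even
average = mean(data)     # sum of all values divided by number of values

print(modes, middle, average)
```

For this data the mode is 3, the median is 4 (halfway between 3 and 5), and the mean is 30 / 6 = 5.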
outlier
extreme value in set of data that can distort results of statistical processes
able to drastically increase or decrease mean
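A quick way to see this effect is to compare the mean with and without one extreme value. The sketch below uses invented numbers; the median is included only as a contrast, since it barely moves.

```python
from statistics import mean, median

values = [4, 5, 5, 6, 7]      # hypothetical data without an outlier
with_outlier = values + [60]  # same data plus one extreme value

mean_before = mean(values)        # 27 / 5
mean_after = mean(with_outlier)   # 87 / 6
median_before = median(values)
median_after = median(with_outlier)

print(mean_before, mean_after)
print(median_before, median_after)
```

Here the single value 60 raises the mean from 5.4 to 14.5, while the median only shifts from 5 to 5.5.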
range
simplest measurement of dispersion of set of data
found by subtracting smallest value from largest value
standard deviation (σx)
measure of amount of variation or dispersion of set of values
variance (σx²)
measure of how far set of values is spread from mean
equivalent to square of standard deviation
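The range, variance, and standard deviation can be computed side by side. This sketch assumes σ refers to the population version (dividing by n rather than n - 1), which is what `pvariance` and `pstdev` in Python's `statistics` module compute; the data values are invented.

```python
from statistics import pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values with mean 5

data_range = max(data) - min(data)  # largest value minus smallest value
variance = pvariance(data)          # mean of squared deviations from the mean
std_dev = pstdev(data)              # square root of the variance

print(data_range, variance, std_dev)
```

For this data the range is 7, the variance is 4, and the standard deviation is 2, so squaring the standard deviation gives back the variance.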
interquartile range (IQR)
difference between upper quartile and lower quartile
upper quartile (Q3)
data point at 75th percentile
lower quartile (Q1)
data point at 25th percentile
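Several conventions exist for locating quartiles, and calculators and software can give slightly different answers. The sketch below assumes the "median of each half" convention (the overall median is excluded from both halves when n is odd); the function name `quartiles` and the data are illustrative only.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]                            # values below the median
    upper = s[half + 1:] if n % 2 else s[half:]  # values above the median
    return median(lower), median(s), median(upper)

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # hypothetical values
q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # upper quartile minus lower quartile

print(q1, q2, q3, iqr)
```

With this data Q1 is 3.5, the median is 7, Q3 is 11, and the IQR is 7.5.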
population
entire group from which one may collect data
sample
small group chosen from population
simple random sampling (SRS)
selection of sample completely at random
systematic sampling
selection of sample by taking members at fixed intervals (every kth member) from ordered sampling frame
convenience sampling
selection of sample by selecting those who are easy to reach
limitations of convenience sampling
does not include random sample of participants (may lead to biased results)
biased sampling
selection of sample in which some members of population are more likely to be chosen than others, so sample is not representative
quota sampling
selection of sample by setting certain quotas for participants
stratified sampling
selection of sample wherein numbers of certain categories are proportional to their number in population
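Three of the sampling methods above can be sketched with Python's standard `random` module. Everything here is hypothetical: the population is a list of 100 ID numbers, the two strata ("junior" and "senior") are invented, and systematic sampling is taken to mean every kth member after a random start.

```python
import random

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical population of 100 ID numbers
sample_size = 10

# Simple random sampling: every member is equally likely to be chosen.
srs = rng.sample(population, sample_size)

# Systematic sampling: random start, then every kth member of the ordered list.
k = len(population) // sample_size
start = rng.randrange(k)
systematic = population[start::k]

# Stratified sampling: each stratum contributes in proportion to its size.
strata = {"junior": population[:60], "senior": population[60:]}
stratified = []
for members in strata.values():
    share = round(sample_size * len(members) / len(population))
    stratified.extend(rng.sample(members, share))

print(srs, systematic, stratified, sep="\n")
```

Because the invented strata hold 60% and 40% of the population, the stratified sample of 10 contains 6 members from the first stratum and 4 from the second.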
histogram
approximate representation of distribution of numerical data

box-and-whisker plot
box plot
graphical depiction of groups of numerical data through their quartiles

five-number summary needed to draw a box-and-whisker plot
minimum
lower quartile
median
upper quartile
maximum
outlier in relation to lower quartile
less than Q1 - 1.5 * IQR
outlier in relation to upper quartile
greater than Q3 + 1.5 * IQR
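The two outlier rules can be applied together as a pair of "fences". In this sketch the function name `find_outliers` and the data are illustrative, and Q1 = 3.5 and Q3 = 11 are assumed to have been found already (here by taking the median of each half of the data).

```python
def find_outliers(data, q1, q3):
    """Return values below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

data = [1, 3, 4, 6, 7, 8, 10, 12, 40]  # hypothetical values
outliers = find_outliers(data, q1=3.5, q3=11)

print(outliers)
```

Here the IQR is 7.5, so the fences sit at -7.75 and 22.25, and only the value 40 is flagged as an outlier.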
dispersion of values in box-and-whisker plot
25% of values are between minimum and lower quartile
25% of values are between lower quartile and median
25% of values are between median and upper quartile
25% of values are between upper quartile and maximum
cumulative frequency
sum of all frequencies up to particular value
requirements for drawing cumulative frequency curve
creation of cumulative frequency table with upper boundary of each class interval in one column and corresponding cumulative frequency in another
requirements for finding percentile
reading value on cumulative frequency curve corresponding to percentile of total frequency
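Building the cumulative frequency table and reading a percentile from it can be sketched as follows. The class boundaries and frequencies are invented, the first class is assumed to start at 0, and joining the plotted points with straight lines is only an approximation of reading a smooth hand-drawn curve.

```python
from itertools import accumulate

upper_bounds = [10, 20, 30, 40, 50]  # hypothetical upper class boundaries
frequencies = [4, 8, 15, 9, 4]       # hypothetical class frequencies

# Cumulative frequency: running total of frequencies up to each upper boundary.
cumulative = list(accumulate(frequencies))

def percentile_from_curve(p, lower_bound=0):
    """Estimate the p-th percentile by linear interpolation between plotted points."""
    target = p / 100 * cumulative[-1]  # percentile of total frequency
    xs = [lower_bound] + upper_bounds
    ys = [0] + cumulative
    for i in range(len(xs) - 1):
        x0, x1, y0, y1 = xs[i], xs[i + 1], ys[i], ys[i + 1]
        if y0 <= target <= y1:
            if y1 == y0:  # empty class: no interpolation needed
                return x0
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("p must be between 0 and 100")

median_estimate = percentile_from_curve(50)
q1_estimate = percentile_from_curve(25)

print(cumulative, q1_estimate, median_estimate)
```

With a total frequency of 40, the median is read at a cumulative frequency of 20 and the lower quartile at 10, which gives estimates of about 25.3 and 17.5 for this data.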
bivariate data
data with 2 variables
purpose of bivariate data
comparison of paired data on 2 variables to determine if there is any correlation between them
positive correlation
correlation between variables wherein dependent variable increases as independent variable increases

negative correlation
correlation between variables wherein independent variable increases while dependent variable decreases

descriptions for strength of correlations
strong
moderate
weak