Lecture 4 Flashcards
discrete = ?
a variable that can take only a finite (or countable) set of distinct values
how can data be described?
frequency tables
pie charts
bar charts
measures of central tendency encompass…?
mean, median, mode
continuous variable = ?
can take infinitely many values
can take on any value within an interval
(e.g., any number between 0 and infinity)
levels of measurement for continuous = ?
interval or ratio
distribution = ?
collection of values of a particular variable
bin = ?
an interval of values in a histogram, drawn as a rectangle whose height shows how many values fall in it
what’s the graphical difference between discrete & continuous distributions?
continuous is a smooth curve, no gaps
discrete has noticeable gaps in between values
are discrete/continuous variables measured or counted?
discrete = counted
continuous = measured
frequency distribution = ?
a summary of a dataset, showing the frequency of items in several classes
objective is to provide insight
frequency distribution for qualitative data = ?
counting the number of times each value occurs
frequency distribution for quantitative data = ?
either counting or grouping values
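A minimal sketch of both kinds of frequency distribution, using only the Python standard library. The sample data and the bin width of 10 are made up for illustration:

```python
from collections import Counter

# Qualitative data: count how often each category occurs.
colours = ["red", "blue", "red", "green", "blue", "red"]
colour_freq = Counter(colours)

# Quantitative data: group values into classes (bins) of width 10, then count.
ages = [21, 25, 34, 38, 39, 47, 52]
bin_width = 10  # illustrative choice, not a rule
age_freq = Counter((age // bin_width) * bin_width for age in ages)

print(dict(colour_freq))
for lower in sorted(age_freq):
    print(f"{lower}-{lower + bin_width - 1}: {age_freq[lower]}")
```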
symmetric frequency distribution = ?
a distribution whose two halves are mirror images of each other around the centre
skewness in a frequency distribution = ?
degree of asymmetry in a distribution
kurtosis in a frequency distribution = ?
degree of peakedness or steepness in a distribution
positively skewed = ?
hump is on the left side, with a long tail to the right
negatively skewed = ?
hump is on the right side, with a long tail to the left
steeply peaked = ?
sharp, high, middle curve
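One common way to put numbers on skewness and kurtosis is through standardised moments. The lecture does not name a formula, so the moment-based definition, the helper name `standardised_moment` and the sample data below are assumptions for illustration:

```python
import statistics

def standardised_moment(values, k):
    """Average of the k-th power of the z-scores (k=3: skewness, k=4: kurtosis)."""
    centre = statistics.fmean(values)
    spread = statistics.pstdev(values)  # population standard deviation
    return sum(((x - centre) / spread) ** k for x in values) / len(values)

hump_left = [1, 1, 2, 2, 3, 10]       # most values low, long tail to the right
hump_right = [-x for x in hump_left]  # mirror image: long tail to the left
symmetric = [1, 2, 3, 4, 5]

print(standardised_moment(hump_left, 3))   # positive: positively skewed
print(standardised_moment(hump_right, 3))  # negative: negatively skewed
print(standardised_moment(symmetric, 3))   # close to zero: symmetric
print(standardised_moment(symmetric, 4))   # fourth moment: one kurtosis measure
```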
shape of the distribution influences…
all statistical descriptive measures
when a distribution is symmetrical…
the mean & median values are the same
when a distribution is skewed…
the mean and median no longer coincide
what is more representative of a dataset in the case of a skewed distribution: mean or median?
median, not the mean
why is the median more representative in an asymmetrical distribution?
outliers barely affect the median, but they pull the mean towards the tail
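A small sketch of the effect of one outlier on the mean and the median. The salary figures are invented for illustration:

```python
import statistics

salaries = [30, 32, 35, 36, 40]   # in thousands, no extreme values
with_outlier = salaries + [400]   # one extreme value added

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

# The mean jumps towards the outlier; the median barely moves.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```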
mean = ?
AKA average or expected value
harmonic mean = ?
the reciprocal of the arithmetic mean of the reciprocals; useful for numbers defined in relation to some unit, such as rates or speeds
geometric mean = ?
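the nth root of the product of n values; indicates the central tendency or typical value of numbers that are multiplied together, such as growth rates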
arithmetic mean = ?
sum of all values divided by the number of values
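The harmonic, geometric and arithmetic means can be compared on the same data with Python's `statistics` module. This sketch assumes Python 3.8 or later (for `geometric_mean`), and the values are made up:

```python
import statistics

values = [2, 4, 8]  # positive values, as the geometric and harmonic means require

arithmetic = statistics.mean(values)           # sum divided by count
geometric = statistics.geometric_mean(values)  # cube root of 2 * 4 * 8
harmonic = statistics.harmonic_mean(values)    # 3 / (1/2 + 1/4 + 1/8)

print(arithmetic, geometric, harmonic)
```

For positive values that are not all equal, the harmonic mean is the smallest of the three and the arithmetic mean the largest.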
median = ?
the value that separates a set of values into two perfectly equal halves
the middle value in an ordered list of data
mode = ?
the most commonly occurring value in a dataset
bimodal = ?
dataset with two modes
shows two peaks when plotted
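A short sketch of finding one mode versus two, assuming Python 3.8 or later for `statistics.multimode`. The datasets are invented:

```python
import statistics

one_mode = [1, 2, 2, 3, 4]
two_modes = [1, 2, 2, 3, 5, 5, 6]  # 2 and 5 both occur twice: bimodal

single = statistics.mode(one_mode)
modes = statistics.multimode(two_modes)  # list of all most-common values

print(single)
print(modes)
```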
density curve = ?
a smooth curve giving an idealised description of a data distribution; the total area under it is 1
measures of variability = ?
help communicate the shape & spread of the dataset
describe the dispersion of the values in a dataset
e.g., variance, SD, quartiles
variance = ?
measure of how far a set of numbers is spread out from their average value
standard deviation = ?
measure of the amount of variation or dispersion of the values in a dataset
approximately the average distance between all individual values in a dataset and its centre
how do you calculate standard deviation?
square root of variance
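A minimal sketch of the variance and standard deviation relationship. It assumes the population versions (dividing by n); the sample versions `statistics.variance` and `statistics.stdev` divide by n - 1 instead. The data are made up:

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5

variance = statistics.pvariance(data)  # average squared distance from the mean
sd = statistics.pstdev(data)           # population standard deviation

# The standard deviation is the square root of the variance.
print(variance, sd, math.sqrt(variance))
```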
range = ?
difference between the smallest and largest values
quartiles = ?
the 25th, 50th and 75th percentiles, which divide the ordered data into 4 equal parts
interquartile range = ?
the difference between the third and the first quartile
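The range, quartiles and interquartile range can be sketched as follows, assuming Python 3.8 or later. Quartile conventions differ between textbooks and tools; the `method="inclusive"` setting here is one choice among several, and the data are invented:

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

value_range = max(data) - min(data)

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # third quartile minus first quartile

print(value_range, q1, q2, q3, iqr)
```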
why is standard deviation preferred over variance?
the standard deviation is expressed in the same units as the measurements, whereas the variance is in squared units