Lecture 4 REVISED Flashcards
continuous variable
can take on any value in an interval
e.g., worker’s hourly income can take on any value between 0 and infinity
discrete variable
can only take on specific, distinct values within an interval
e.g., the number of people who chose blue as their favourite colour can only be a whole number
what levels of measurement are required for continuous variables?
interval or ratio
rectangle in a histogram is called a…
bin
how does a discrete/continuous data distribution look on a graph?
discrete: bars
continuous: curve
frequency distribution is…
a tabular summary of a dataset showing the frequency of items in each class
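A minimal sketch of building such a table in Python. The survey answers and the helper name frequency_distribution are made up for illustration:

```python
from collections import Counter

def frequency_distribution(values):
    """Return (class, frequency) pairs, sorted by class."""
    counts = Counter(values)
    return sorted(counts.items())

# Hypothetical data: favourite colour of 8 people
colours = ["blue", "red", "blue", "green", "blue", "red", "green", "blue"]

# Print one row per class: the class label, then its frequency
for colour, freq in frequency_distribution(colours):
    print(f"{colour:<6} {freq}")
```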
symmetric, skewness, kurtosis in frequency distributions?
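symmetric: the two halves of the distribution are mirror images of each other
skewness: degree of asymmetry, where an elongated tail extends to one side (tail to the right = positive skew, tail to the left = negative skew)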
kurtosis: degree of peakedness/steepness in a distribution
when a distribution is perfectly symmetrical, what is the relationship between the mean and median?
the mean and median are equal
when a distribution is skewed, this isn’t the case: the mean is pulled toward the longer tail
why does the median tend to be more representative than the mean?
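because the mean uses every value, so in an asymmetrical distribution an outlier can pull it away from the typical values; the median depends only on the middle position, so it is barely affected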
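A small illustration of this effect, using made-up hourly incomes (the numbers are assumptions, not lecture data):

```python
from statistics import mean, median

# Hypothetical hourly incomes for five workers
incomes = [18, 20, 21, 22, 24]
# The same data with one extreme value added
incomes_with_outlier = incomes + [200]

# Without the outlier the mean and median agree (both 21)
print(mean(incomes), median(incomes))

# With the outlier the mean jumps to about 50.8,
# while the median only moves to 21.5
print(mean(incomes_with_outlier), median(incomes_with_outlier))
```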
where is the mode in a frequency distribution?
the peak
what formula is used to find the position of the median value?
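(n+1) / 2, where n is the number of values and the values are sorted in order
if the position is not a whole number (n is even), the median is the average of the two values either side of it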
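A sketch of the position formula in Python. The function name median_by_position is hypothetical, and the averaging of the two middle values for an even n is the usual convention rather than something stated on the card:

```python
def median_by_position(values):
    """Find the median using the (n + 1) / 2 position rule."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)          # the rule applies to sorted data
    n = len(ordered)
    position = (n + 1) / 2            # 1-based position of the median

    if position.is_integer():
        # Odd n: the position points at a single value
        return ordered[int(position) - 1]

    # Even n: the position falls halfway between two values,
    # so average the values just below and just above it
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2

print(median_by_position([7, 1, 5, 3, 9]))  # n = 5, position 3
print(median_by_position([4, 1, 3, 2]))     # n = 4, position 2.5
```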
what is the formula to calculate standard deviation?
- subtract the mean from each value to get the deviations
- square all the deviations and add them together
- divide this sum by (n-1) (this gives the sample variance)
- take the square root of this figure
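The four steps above can be sketched in Python as follows. The function name and the example data are assumptions for illustration; dividing by (n-1) makes this the sample standard deviation:

```python
import math

def sample_standard_deviation(values):
    """Follow the four flashcard steps for the sample standard deviation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to divide by n - 1")

    mean = sum(values) / n
    # Step 1: subtract the mean from each value
    deviations = [x - mean for x in values]
    # Step 2: square the deviations and add them together
    sum_of_squares = sum(d ** 2 for d in deviations)
    # Step 3: divide by (n - 1)
    variance = sum_of_squares / (n - 1)
    # Step 4: take the square root
    return math.sqrt(variance)

# Hypothetical data: mean is 5, squared deviations sum to 32
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_standard_deviation(data))  # square root of 32 / 7
```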
what does standard deviation tell us about the dataset?
how far the values typically lie from the mean
small standard deviation = low variability, values are close to the mean
large standard deviation = high variability, values are far from the mean
variance relationship with standard deviation?
standard deviation is the square root of the variance
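A quick check of this relationship with Python's standard statistics module, on made-up data. Both stdev and variance here are the sample versions, which divide by (n-1):

```python
import math
from statistics import stdev, variance

# Hypothetical data: mean is 3, squared deviations sum to 10
scores = [1, 2, 3, 4, 5]

sample_variance = variance(scores)   # 10 / 4 = 2.5
sample_sd = stdev(scores)

# The standard deviation equals the square root of the variance
print(math.isclose(sample_sd, math.sqrt(sample_variance)))
```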
density curve
an idealised description of a data distribution
describes the overall pattern of a distribution