Descriptive Statistics (Lecture 8) Flashcards
What is the aim of descriptive statistics?
To summarise the key features of data to make it understandable for humans, identifying characteristics/patterns.
What are our measures of central tendency?
Mean (x̄)
Median (M)
Mode (Z)
What are our measures of dispersion?
Interquartile Range (IQR)
Variance
Standard Deviation (SD)
What are our measures of association?
Chi-Squared (χ²)
Correlation (r)
What’s central tendency?
A single number that aims to represent the ‘typical’ value of a variable (the average), somewhere between the highest and lowest values of the observations.
What’s the mean?
Calculated by summing all values of a variable and dividing by the number of observations.
What’s the Median (M)?
The middle value when the values of a variable are arranged in order from smallest to largest.
What’s the Mode (Z)?
The most commonly occurring value (a single variable may have more than one mode).
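A minimal sketch of the three definitions above, using Python's standard `statistics` module. The `scores` list is a made-up sample for illustration, not data from the lecture.

```python
import statistics

# Hypothetical sample, chosen only to illustrate the definitions.
scores = [2, 3, 3, 5, 7, 10]

# Mean: sum of all values divided by the number of observations (30 / 6).
mean = statistics.mean(scores)

# Median: middle value of the sorted data; with an even count,
# the average of the two middle values (3 and 5).
median = statistics.median(scores)

# Mode: most commonly occurring value(s); multimode returns a list
# because a variable can have more than one mode.
modes = statistics.multimode(scores)

print(mean, median, modes)
```

For this sample the mean is 5, the median is 4.0 and the only mode is 3.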
What data is the mean used for?
Scale (interval/ratio) data; it is sometimes applied to ordinal data, but the median is usually the safer choice there
What data is the Median used for?
Ordinal and scale data
What data is the Mode used for?
Nominal data
What data is the mode useful for?
Categorical data
What two types of visuals are used for illustrating the central tendency?
Bar charts (for categorical data) and histograms (for continuous data)
What type of data are bar charts used for?
Categorical data
What type of data are histograms used for?
Continuous data
What does a concentration of lower values (with a tail of high values) indicate about skewness?
A positive skew: the median is lower than the mean.
What’s a negative skew?
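A distribution where more high values are concentrated at the top, with a tail of low values; the median is higher than the mean.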
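The link between skew and the mean/median gap can be checked with a rough sketch. The helper `skew_direction` and both samples are hypothetical. Comparing the mean with the median is only a rule of thumb, not a formal skewness coefficient.

```python
import statistics

def skew_direction(values):
    """Rule-of-thumb skew direction from comparing mean and median.

    Hypothetical helper for illustration only.
    """
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positive"   # a few high values pull the mean above the median
    if mean < median:
        return "negative"   # a few low values pull the mean below the median
    return "none"

mostly_low = [1, 1, 2, 2, 3, 20]        # one large value in the right tail
mostly_high = [1, 18, 19, 19, 20, 20]   # one small value in the left tail

print(skew_direction(mostly_low))
print(skew_direction(mostly_high))
```

The first sample has a mean of about 4.8 against a median of 2, so it is reported as positively skewed; the second has a mean of about 16.2 against a median of 19, so it is reported as negatively skewed.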
What do lower values in the measures of dispersion suggest about the measure of central tendency?
It is a better representation of the ‘typical’ value of a variable.
What do the range and interquartile range provide as measures of dispersion?
A basic measure, useful for visualisation and identifying outliers.
Which measures of dispersion are preferred for further analysis?
Variance and standard deviation
What’s the interquartile range (IQR)?
The range of the middle 50% of values, i.e. the distance between the median of the lower half (Q1) and the median of the upper half (Q3).
What plot is the most useful for the IQR?
Box plots
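A small sketch of the IQR using the 'median of each half' definition from the card above. The function name `iqr_by_halves` and the sample are assumptions for illustration. Other quartile conventions (for example the default in `statistics.quantiles`) can give slightly different values.

```python
import statistics

def iqr_by_halves(values):
    """Return (Q1, Q3, IQR) using the medians of the lower and upper halves.

    Hypothetical helper: when the count is odd, the middle value is
    left out of both halves.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    lower = data[:half]    # smallest half of the values
    upper = data[-half:]   # largest half of the values
    q1 = statistics.median(lower)
    q3 = statistics.median(upper)
    return q1, q3, q3 - q1

sample = [1, 3, 4, 6, 8, 9, 12]   # made-up data
print(iqr_by_halves(sample))
```

Here the lower half is 1, 3, 4 (median 3) and the upper half is 8, 9, 12 (median 9), so the IQR is 6. These are the values a box plot would draw as the edges of the box under this convention.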
What is Variance?
The mean of the squared differences between each value and the mean.
What is Standard Deviation?
Square root of the variance
What does SD represent?
How far, on average, we can expect an individual observation to deviate above or below the mean.
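The variance and standard deviation cards can be worked through step by step. This sketch follows the definition as stated (dividing by the number of observations, i.e. the population form). The `values` list is made up. Note that the sample versions, `statistics.variance` and `statistics.stdev`, divide by n - 1 instead.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations

# Step 1: the mean (40 / 8).
mean = sum(values) / len(values)

# Step 2: squared difference between each value and the mean.
squared_diffs = [(x - mean) ** 2 for x in values]

# Step 3: variance is the mean of those squared differences (32 / 8).
variance = sum(squared_diffs) / len(values)

# Step 4: standard deviation is the square root of the variance.
sd = math.sqrt(variance)

# Cross-check against the standard library's population functions.
assert math.isclose(variance, statistics.pvariance(values))
assert math.isclose(sd, statistics.pstdev(values))

print(mean, variance, sd)
```

For this sample the mean is 5.0, the variance is 4.0 and the SD is 2.0, so a typical observation sits about 2 units above or below the mean.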
What is kurtosis?
A measure of the shape of a distribution: how peaked or flat it is, and how heavy its tails are, compared with a normal distribution.
What does a large standard deviation look like on a graph?
A flat, wide distribution
What does a small standard deviation look like on a graph?
A narrow, peaked distribution
What does it mean when the SD is small in a graph?
The mean is a better representation of the average value
What do we use descriptive statistics for?
To summarise our sample data; statistical inference then builds on these summaries to generalise about population parameters.