Types of Data Summary and Data Presentation Flashcards
What is the term given to the data generated by an unordered categorical variable?
Nominal data
What is the term given to the data generated by an ordered categorical variable?
Ordinal data
What is a quantitative variable?
A variable representing a quantity that is counted or measured numerically (it may be discrete or continuous)
How do we visualise categorical data?
Bar chart or pie chart
How do we visualise quantitative data?
Histogram or frequency polygon
Key points about histograms
The x (horizontal) axis must be uniform with no breaks, and there are no spaces between the bars.
The y (vertical) axis always begins at zero - this is important because relative comparisons are being made.
The area of each bar represents the frequency (or relative frequency) in each group.
The width of each bar is the size of the interval for each group.
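A minimal Python sketch of the area rule, assuming made-up ages and unequal interval widths (the data, the edges and the helper name histogram_bars are hypothetical, not from the course material). The bar height is taken as frequency divided by interval width, so height times width recovers the frequency.

```python
# Hypothetical observations and interval edges, for illustration only.
ages = [2, 3, 7, 8, 9, 12, 15, 18, 22, 25, 31, 38, 44, 52, 67]
edges = [0, 10, 20, 40, 80]  # intervals of unequal width


def histogram_bars(values, edges):
    """Return (lower, upper, frequency, height) for each interval [lower, upper)."""
    bars = []
    for lower, upper in zip(edges, edges[1:]):
        frequency = sum(lower <= v < upper for v in values)
        width = upper - lower
        height = frequency / width  # frequency density, so area equals frequency
        bars.append((lower, upper, frequency, height))
    return bars


bars = histogram_bars(ages, edges)
for lower, upper, frequency, height in bars:
    area = height * (upper - lower)
    print(f"[{lower}, {upper}): frequency={frequency}, height={height:.3f}, area={area:.1f}")
```

With equal interval widths the heights are proportional to the frequencies, which is why the distinction only becomes visible when the widths differ.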
Describe the properties of the mean
The mean is sensitive to outliers; the median and mode are not. For example, a single data point with a very high value may increase the mean by an appreciable amount.
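A small illustration in Python, using invented values (an assumption for this sketch only): adding one very high value shifts the mean a long way while the median barely moves.

```python
from statistics import mean, median

# Hypothetical values, chosen only to illustrate the point.
values = [4, 5, 5, 6, 7]
with_outlier = values + [60]  # one very high data point added

mean_before, median_before = mean(values), median(values)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)      # 5.4 -> 14.5
print("median:", median_before, "->", median_after)  # 5 -> 5.5
```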
Describe the mode
The mode may be changed substantially by small changes in the data; the mean and median are much less affected. If two data values are almost equally common, slight changes in the data may affect which is the mode.
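A short Python sketch with invented data (hypothetical, for illustration): the values 2 and 7 are almost equally common, so changing a single observation flips the mode from one to the other, while the mean shifts only modestly.

```python
from statistics import mean, mode

# Hypothetical data: 2 occurs four times, 7 occurs three times.
original = [2, 2, 2, 2, 3, 4, 7, 7, 7]

# Change a single observation from 2 to 7.
changed = [2, 2, 2, 7, 3, 4, 7, 7, 7]

mode_before, mode_after = mode(original), mode(changed)
mean_before, mean_after = mean(original), mean(changed)

print("mode:", mode_before, "->", mode_after)  # 2 -> 7
print("mean:", mean_before, "->", round(mean_after, 2))
```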
Describe how the mode and median can be found
The mode and median may be found graphically. For example, the median can be found from a cumulative frequency plot.
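Reading the median off a cumulative frequency plot can be imitated numerically. The sketch below assumes grouped data with made-up interval edges and counts, and assumes straight-line interpolation within the interval where the cumulative frequency reaches half the total. The function name median_from_cumulative is hypothetical.

```python
# Hypothetical grouped data: interval edges and the frequency in each interval.
edges = [0, 10, 20, 30, 40]
counts = [4, 10, 12, 4]


def median_from_cumulative(edges, counts):
    """Estimate the median as the value where cumulative frequency reaches n/2."""
    target = sum(counts) / 2
    cumulative = 0
    for lower, upper, count in zip(edges, edges[1:], counts):
        if count > 0 and cumulative + count >= target:
            # Interpolate linearly inside this interval, as on the plotted line.
            fraction = (target - cumulative) / count
            return lower + fraction * (upper - lower)
        cumulative += count
    raise ValueError("counts must contain at least one observation")


estimate = median_from_cumulative(edges, counts)
print(f"estimated median: {estimate:.2f}")
```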
Describe the properties of mean, mode and median in symmetric versus skewed data
All three measures of location are equal for a symmetric distribution; in a skewed distribution they differ (see Figure 10).
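A quick check in Python with two invented data sets (both hypothetical): one symmetric about 3, one with a long right tail.

```python
from statistics import mean, median, mode

# Hypothetical data sets, for illustration only.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 4, 6, 16]

sym_summary = (mean(symmetric), median(symmetric), mode(symmetric))
skew_summary = (mean(right_skewed), median(right_skewed), mode(right_skewed))

print("symmetric    (mean, median, mode):", sym_summary)   # all equal to 3
print("right-skewed (mean, median, mode):", skew_summary)  # 4, 2, 1
```

In this right-skewed example the long tail pulls the mean above the median, which in turn lies above the mode.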
What is the better measure of location for skewed data?
Median
What is the better measure of location for statistical analysis and inference?
Mean
What is the best approach for significantly skewed data?
Transformations or non-parametric techniques
List three ways to summarise the variability of a set of data
Range
Percentiles
Standard deviation
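The three summaries can be computed with Python's standard library. This is a sketch on invented data. It assumes the sample standard deviation (n - 1 denominator) and the default "exclusive" method of statistics.quantiles for the quartiles, since other percentile conventions give slightly different values.

```python
from statistics import median, quantiles, stdev

# Hypothetical data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)   # range: largest minus smallest value
q1, q2, q3 = quantiles(data, n=4)    # 25th, 50th and 75th percentiles
sd = stdev(data)                     # sample standard deviation

print("range:", data_range)
print("quartiles:", q1, q2, q3)
print("interquartile range:", q3 - q1)
print("standard deviation:", round(sd, 3))
```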
What are the issues with using range as a measure of spread?
It depends only on the two most extreme values, and it can only stay the same or increase as the sample size grows (unlike the standard deviation)
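A small simulation sketch of the sample-size effect, assuming normally distributed values with an arbitrary mean of 50 and SD of 10 (all hypothetical). Each larger sample contains the smaller one, so its range can never be smaller, whereas the standard deviation is free to move in either direction.

```python
import random
from statistics import stdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population values, for illustration only.
population = [random.gauss(50, 10) for _ in range(1000)]

running_ranges = []
for n in (10, 100, 1000):
    sample = population[:n]  # each sample extends the previous one
    sample_range = max(sample) - min(sample)
    running_ranges.append(sample_range)
    print(f"n={n:4d}  range={sample_range:6.2f}  sd={stdev(sample):5.2f}")
```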