Week 9 - Presenting Data Flashcards
What are the three data presentation formats?
Text
Tables
Graphs
What is nominal data?
Provides data as counts and/or percentages (table sums to 100%)
Nominal data is often summarised using tables
Raw counts depend on sample size, which can make interpretation ambiguous
Converting to percentages can make comparisons among cells easier
Nominal data can be summarised graphically using bar graphs (when categories are nominal, include gaps between the bars)
Include brief informative axis labels
Label categories meaningfully
Categories not in any particular order
Distribution shape is meaningless because the categories have no inherent order, so the data cannot be described as normally distributed
Can be summarised in text
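The count-to-percentage conversion described above can be sketched in a few lines of Python. This is only an illustration: the function name `summarise_nominal` and the transport responses are made up for the example and are not part of the original notes.

```python
from collections import Counter

def summarise_nominal(observations):
    """Return {category: (count, percentage)} for a list of category labels."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}

# Hypothetical sample: preferred transport mode of 8 respondents
responses = ["car", "bus", "car", "bike", "car", "bus", "car", "bike"]
nominal_summary = summarise_nominal(responses)

for category, (count, percent) in nominal_summary.items():
    print(f"{category}: {count} ({percent:.1f}%)")
```

Because the percentages always sum to 100, two samples of different sizes can be compared cell by cell, which raw counts do not allow.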
Stem and leaf display
Displays all the data with full precision
Frequency column allows us to determine how many participants there are
Stems are used to group observations into narrow ranges
Leaves show data for every observation in the data set
Combining a stem and leaf gives the value for an observation
Stems can be repeated to divide into smaller ranges
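Easy to determine distributional shape, identify the range at a glance, determine the median and modes relatively easily, and estimate the mean
To find the position of the median, add one to the number of scores and divide by two: (n + 1) / 2. The median is the score at that position in the ordered data (if the position falls between two scores, average them).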
Can work out the 25th and 75th percentile and the IQR
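A minimal sketch of a stem and leaf display and the (n + 1) / 2 median rule, assuming two-digit whole-number scores so that the tens digit is the stem and the units digit is the leaf. The scores and the function names `stem_and_leaf` and `median_by_position` are hypothetical.

```python
import math
from collections import defaultdict

def stem_and_leaf(scores):
    """Group scores by tens digit (stem); the units digits are the leaves."""
    groups = defaultdict(list)
    for score in sorted(scores):
        groups[score // 10].append(score % 10)
    return dict(groups)

def median_by_position(scores):
    """Find the median using the (n + 1) / 2 position rule."""
    ordered = sorted(scores)
    position = (len(ordered) + 1) / 2          # 1-based position in the ordered data
    low = ordered[math.floor(position) - 1]
    high = ordered[math.ceil(position) - 1]
    return (low + high) / 2                    # averages two scores when n is even

# Hypothetical scores for 9 participants
sl_scores = [23, 25, 31, 34, 34, 38, 42, 47, 51]
display = stem_and_leaf(sl_scores)

# Print frequency column, stem, then leaves
for stem, leaves in display.items():
    print(f"{len(leaves):>2} | {stem} | {''.join(str(leaf) for leaf in leaves)}")

print("Median:", median_by_position(sl_scores))
```

With 9 scores the median position is (9 + 1) / 2 = 5, so the median is the fifth score in the ordered list.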
What are histograms?
Used with ordinal and interval/ratio data
Grouped into intervals (bins)
No gaps between bars
Height of bars indicates how many values are within each bin
Easy to assess distributional shape (for example, whether it is normal)
Easy to detect outliers
Lacks the exactness that stem and leaf plots have
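To illustrate how values are grouped into bins, here is a rough text-only sketch. The bin start, width, and scores are assumptions chosen for the example, and `bin_counts` is a hypothetical helper rather than a library function.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each interval [lo, lo + width)."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_bins:        # values outside the bins are ignored
            counts[index] += 1
    return counts

# Hypothetical scores, binned into four intervals of width 10 starting at 20
hist_scores = [23, 25, 31, 34, 34, 38, 42, 47, 51]
hist_counts = bin_counts(hist_scores, start=20, width=10, n_bins=4)

for i, count in enumerate(hist_counts):
    lo = 20 + 10 * i
    print(f"{lo} to <{lo + 10}: {'#' * count}")
```

Only the count per bin survives, which is why a histogram lacks the exactness of a stem and leaf plot: the individual scores can no longer be read back.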
What are box plots?
A graphical method for summarising multiple statistics
Median (sometimes mean)
Interquartile range
Max and min plus outliers
If fences are drawn then values more extreme are indicated as individual points and are considered outliers
Useful for assessing distributional shape (if the median isn’t centred then distribution is skewed. Unequal whiskers can also indicate skew)
How to construct a box plot
First draw a line for the median. Then draw lines at the 25th and 75th percentiles to show the IQR, and connect them to form the box. Finally, draw the whiskers out to the max and min points, provided there are no outliers.
They conceal detail.
Including error bars (for example, the standard deviation) adds to the information available
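The statistics behind a box plot can be computed as in the sketch below. Two things here are assumptions not stated in the notes: the fences are placed 1.5 × IQR beyond the quartiles (a common convention), and the quartiles use the "inclusive" method of Python's `statistics.quantiles`, which may differ slightly from a textbook's percentile rule. The data are invented.

```python
import statistics

def box_plot_summary(values, fence_multiplier=1.5):
    """Median, quartiles, whisker ends, and outliers for a box plot."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: fences sit 1.5 * IQR beyond the quartiles
    lower_fence = q1 - fence_multiplier * iqr
    upper_fence = q3 + fence_multiplier * iqr
    inside = [v for v in ordered if lower_fence <= v <= upper_fence]
    outliers = [v for v in ordered if v < lower_fence or v > upper_fence]
    return {
        "median": median, "q1": q1, "q3": q3, "iqr": iqr,
        "whisker_low": min(inside),    # whiskers stop at the most extreme non-outliers
        "whisker_high": max(inside),
        "outliers": outliers,          # plotted as individual points
    }

# Hypothetical data with one extreme value
box_data = [2, 4, 5, 6, 7, 8, 9, 10, 30]
box_summary = box_plot_summary(box_data)
print(box_summary)
```

In this example the value 30 lies beyond the upper fence, so the upper whisker stops at 10 and 30 would be drawn as a separate point.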
What are error bars?
The most common way to visually present data is to plot a measure of central tendency with an accompanying error bar
Error bars represent one of the following:
Standard deviation
Standard error
Confidence interval (usually 95%)
Always show an error bar with a measure of central tendency
Be sceptical of figures that don’t include error bars
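The three quantities an error bar can represent are related, as this sketch shows. It assumes the sample standard deviation (n - 1 denominator) and a normal approximation (z = 1.96) for the 95% confidence interval; for small samples a t-based interval would usually be wider. The sample values are invented.

```python
import math
import statistics

def error_bar_summaries(sample, z=1.96):
    """Mean with SD, SE, and an approximate 95% CI (normal approximation)."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample SD, n - 1 denominator
    se = sd / math.sqrt(n)             # standard error of the mean
    ci = (mean - z * se, mean + z * se)
    return {"mean": mean, "sd": sd, "se": se, "ci95": ci}

# Hypothetical sample of 5 scores
eb_sample = [4, 6, 8, 10, 12]
eb = error_bar_summaries(eb_sample)
print(eb)
```

Because the three bars differ in length for the same data, a figure should always state which one is being shown.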
What is one way to mislead audiences using a bar graph?
By omitting the baseline, differences are often exaggerated
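A small arithmetic sketch of why a missing baseline exaggerates differences: it compares the ratio of the drawn bar heights when the y axis starts at zero with the ratio when it starts just below the smaller value. The numbers and the helper `apparent_ratio` are hypothetical.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn heights of bars b and a when the y axis starts at baseline."""
    return (b - baseline) / (a - baseline)

# Hypothetical values: group A = 50, group B = 55
true_ratio = apparent_ratio(50, 55)                      # axis starts at 0
truncated_ratio = apparent_ratio(50, 55, baseline=45)    # axis starts at 45

print(f"Zero baseline: B looks {true_ratio:.2f} times as tall as A")
print(f"Baseline at 45: B looks {truncated_ratio:.2f} times as tall as A")
```

A 10% difference appears as a bar twice as tall once the axis starts at 45.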
What is another way to manipulate graphs that leads to misleading information?
By manipulating the y axis.
What is a third way of misleading audiences with a graph?
Cherry picking data (eg only picking certain months to graph)
How does using the wrong graph lead to misinterpreting data?
Pie charts are meant to show parts of a whole, so using one for data that are not parts of a whole invites misinterpretation.
How does going against conventions lead to a misleading graph?
Because people interpret a graph according to how things are generally represented, so breaking those conventions leads them to misread it.