Displaying Data Flashcards
1
Q
how to draw a good graph (4)
A
- show the data
- make patterns in the data easy to see (avoid unnecessary clutter)
- represent magnitudes honestly (have a baseline)
- draw graphical elements clearly (appropriate font and text size)
2
Q
frequency and frequency distribution (2)
A
- frequency: number of observations having a particular measurement in a sample
- frequency distribution: number of occurrences of each value in the data
3
Q
relative frequency
A
- proportion of observations having a given measurement, calculated as the frequency divided by the total number of observations
4
Q
relative frequency distribution
A
- proportion/fraction of occurrences of each value in a data set
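The definitions of frequency, relative frequency, and their distributions can be checked with a few lines of Python. This is only a minimal sketch: the sample and the variable names are made up for illustration.

```python
from collections import Counter

# Hypothetical sample: one categorical measurement per observation.
sample = ["A", "B", "A", "O", "A", "B", "O", "A"]

# Frequency distribution: number of occurrences of each value.
frequency = Counter(sample)

# Relative frequency: frequency divided by the total number of observations.
n = len(sample)
relative_frequency = {value: count / n for value, count in frequency.items()}

print(frequency["A"])           # 4
print(relative_frequency["A"])  # 0.5
```

The relative frequencies of all values add up to 1, which is a quick sanity check on any relative frequency distribution.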
5
Q
frequency table (2)
A
- text display of the number of occurrences of each category in a data set
- categorical data for one variable
6
Q
bar graph (2)
A
- uses the height of rectangular bars to display frequency distribution (or relative frequency distribution)
- categorical data for one variable
7
Q
how to make a good bar graph (6)
A
- the bars must have equal widths to represent magnitude correctly
- baseline of y-axis is at 0
- bars should stand apart, spaces between bars
- nominal data: order categories based on frequency of occurrence
- ordinal data: present values in natural order
- total # of observations (n) in figure legend
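The two ordering rules above can be sketched in Python before any plotting is done. The data, category names, and variable names below are hypothetical, and sorting from most to least frequent is an assumed reading of "order categories based on frequency of occurrence".

```python
from collections import Counter

# Hypothetical nominal data: no natural order, so order bars by frequency.
causes = ["tiger", "snake", "tiger", "bear", "tiger", "snake"]
counts = Counter(causes)
nominal_order = [category for category, _ in counts.most_common()]

# Hypothetical ordinal data: keep the natural order, whatever the counts are.
natural_order = ["low", "medium", "high"]
ratings = ["high", "low", "high", "medium", "high"]
rating_counts = Counter(ratings)
ordinal_bars = [(level, rating_counts[level]) for level in natural_order]

print(nominal_order)         # ['tiger', 'snake', 'bear']
print(ordinal_bars)          # [('low', 1), ('medium', 1), ('high', 3)]
print(f"n = {len(causes)}")  # total number of observations for the legend
```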
8
Q
bar graph vs pie chart (2)
A
- bar graph is usually better than a pie chart
- pie chart: more difficult to compare frequencies, supplementary labelling required
9
Q
histogram (3)
A
- uses area of rectangular bars to display the frequency distribution (or relative frequency distribution)
- data values split into consecutive bins/intervals of equal width
- used for single numerical variable
10
Q
mode
A
- interval corresponding to the highest peak in the frequency distribution
11
Q
bimodal
A
- frequency distribution having two distinct peaks
12
Q
symmetric
A
- frequency distribution in which the frequencies on the left half of the histogram mirror those on the right half
13
Q
skewed
A
- frequency distribution of a numerical variable that is not symmetric
14
Q
uniform
A
- frequency distribution that is level (all values occur with roughly the same frequency)
15
Q
outliers
A
- observation well outside of the range of values of other observations in a data set
16
Q
how to draw a good histogram (6)
A
- each bar must rise from baseline of 0
- no spaces between each bar
- “left closed” intervals: value 70 falls into the interval 70-72 rather than 68-70
- number of intervals should best show patterns and exceptions in the data
- use readable numbers for breakpoints (0.5 rather than 0.486)
- include total number of individuals in legend
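The "left closed" rule can be made concrete with a small binning helper. This is a sketch under assumptions: the function name `left_closed_bin`, the starting point 68, the bin width 2, and the sample values are all made up for illustration.

```python
import math

def left_closed_bin(value, start, width):
    """Return the interval [lower, upper) that contains value."""
    index = math.floor((value - start) / width)
    lower = start + index * width
    return lower, lower + width

# Hypothetical measurements, binned into equal-width intervals.
heights = [68, 69.9, 70, 71.5, 72]
bins = {}
for h in heights:
    interval = left_closed_bin(h, start=68, width=2)
    bins[interval] = bins.get(interval, 0) + 1

print(left_closed_bin(70, start=68, width=2))  # (70, 72), not (68, 70)
print(bins)  # {(68, 70): 2, (70, 72): 2, (72, 74): 1}
```

Each interval includes its lower breakpoint and excludes its upper one, so every value lands in exactly one bin.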
17
Q
contingency table (2)
A
- used for multiple associated categorical variables
- gives the frequency of occurrence of all combinations of 2+ categorical variables
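A contingency table is just a count of every combination of categories. The sketch below builds one from made-up paired observations; the group and response names are hypothetical.

```python
from collections import Counter

# Hypothetical observations: (explanatory category, response category).
observations = [
    ("treated", "survived"), ("treated", "survived"), ("treated", "died"),
    ("control", "survived"), ("control", "died"), ("control", "died"),
]

# Frequency of occurrence of each combination of the two variables.
table = Counter(observations)

groups = ["treated", "control"]    # columns: explanatory variable
responses = ["survived", "died"]   # rows: response variable
for response in responses:
    row = [table[(group, response)] for group in groups]
    print(response, row)
# survived [2, 1]
# died [1, 2]
```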
18
Q
grouped bar graph (3)
A
- uses height of rectangular bars to display frequency distributions of 2+ categorical variables
- different categories of response variable are indicated by different colours
- bars are grouped by category of the explanatory variable (e.g. treatment)
19
Q
mosaic plot (3)
A
- uses the area of rectangles to display the relative frequency of occurrence of all combinations of 2 categorical variables
- bar area and height indicate the relative frequencies of the responses
- width of each vertical stack is proportional to the number of observations in that group
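The geometry of a mosaic plot follows directly from a contingency table. The sketch below computes the width of each stack and the height of each rectangle within it; the counts and names are hypothetical, and no plotting library is used.

```python
# Hypothetical contingency table: counts[group][response].
counts = {
    "treated": {"survived": 30, "died": 10},
    "control": {"survived": 20, "died": 40},
}
total = sum(sum(row.values()) for row in counts.values())

widths = {}   # width of each vertical stack
heights = {}  # height of each rectangle within its stack
for group, row in counts.items():
    group_n = sum(row.values())
    # Stack width is proportional to the number of observations in the group.
    widths[group] = group_n / total
    # Rectangle height is the relative frequency of the response in the group.
    heights[group] = {resp: c / group_n for resp, c in row.items()}

print(widths["treated"])               # 0.4
print(heights["treated"]["survived"])  # 0.75
```

Width times height gives the rectangle's area, which equals that combination's share of all observations (here 30 of 100 for treated and survived).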
20
Q
scatter plot (3)
A
- graphical display of two numerical variables where each observation is represented as a point on a graph with two axes
- position on x-axis indicates measurement of explanatory variable
- position on y-axis indicates measurement of response variable
21
Q
positive association
A
- points tend to run from lower left to upper right
22
Q
negative association
A
- points tend to run from upper left to lower right
23
Q
absent association
A
- no discernible pattern in points
24
Q
strip chart
A
- graphical display of a numerical variable and a categorical variable in which each observation is represented as a dot
25
Q
violin plot
A
- graph showing, for each group, an approximation of the frequency distribution of a numerical variable together with its mirror image; displays the association between a numerical variable and a categorical variable
26
Q
line graph (2)
A
- uses dots connected by line segments to display trends over time in a summary measurement, such as mean, or other ordered series
- steepness of line segment reflects speed of change between values
27
Q
map
A
- spatial equivalent of the line graph, using colour gradient to display a numerical response variable at multiple locations on a surface
- explanatory variable: location in space
28
Q
how to make a good table (3)
A
- make patterns in the data easy to see (avoid clutter and arrange values to facilitate pattern detection)
- represent magnitudes honestly (intervals of equal width)
- draw table elements clearly (labels, units)
29
Q
what graph do you use for categorical data?
A
- bar graph
30
Q
what graph do you use for numerical data?
A
- histogram
31
Q
what graph do you use for multiple numerical variables? (2)
A
- scatter plot
- line graph
32
Q
what graph do you use for multiple categorical variables? (3)
A
- grouped bar graph
- mosaic plot
- contingency table
33
Q
what graph do you use for one numerical variable and one categorical variable? (4)
A
- multiple histograms
- cumulative frequency diagrams
- violin plot/box plot (categorical explanatory, numerical response)
- strip chart (categorical explanatory, numerical response)
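The "what graph do you use" cards can be collapsed into a single lookup. This is only a sketch: the names `GRAPH_CHOICES` and `suggest_graphs` are hypothetical, and "multiple" is assumed here to mean exactly two variables.

```python
# Lookup built from the cards above; keys are sorted tuples of variable types.
GRAPH_CHOICES = {
    ("categorical",): ["bar graph"],
    ("numerical",): ["histogram"],
    ("numerical", "numerical"): ["scatter plot", "line graph"],
    ("categorical", "categorical"): [
        "grouped bar graph", "mosaic plot", "contingency table",
    ],
    ("categorical", "numerical"): [
        "multiple histograms", "cumulative frequency diagrams",
        "violin plot/box plot", "strip chart",
    ],
}

def suggest_graphs(*variable_types):
    """Return the display options for the given variable types, in any order."""
    key = tuple(sorted(variable_types))
    return GRAPH_CHOICES.get(key, [])

print(suggest_graphs("numerical"))                 # ['histogram']
print(suggest_graphs("numerical", "categorical"))  # four options
```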