Visualizing Data Flashcards

1
Q

what are two purposes of graphs

A
  • analyze data
  • communicate/present data
2
Q

what are four ways to draw a BAD graph

A
  • have the graph hide data (not showing all data points)
  • have patterns hard to see
    (having a 3d graph skews the data and makes it hard to read)
  • magnitudes are distorted
    (not having y-axis starting at zero)
  • graphical elements are not clear
    (text/figure elements are too SMALL to read)
3
Q

what makes a good graph

A
  • showing the data
    (showing individual data points reveals the shape of the data and makes patterns easier to see)
  • make patterns in the data easy to see
    (the right graph for the data will allow the main pattern to be seen right away)
  • represent the magnitude honestly
    (always start at zero for baselines)
  • draw graphical elements clearly
    (including labeled axes, units, graphical symbols for more than one data set…)
4
Q

what must the graph axes always start at

A

0

5
Q

why should graph axes start at zero

A

it keeps the graph honest: the reader compares magnitudes against a zero baseline rather than an arbitrary starting value
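
A small worked example may help show why the baseline matters. The sketch below compares how tall two bars look relative to each other when the axis starts at zero and when it starts higher. The function name `apparent_ratio` and the values 50, 55 and 45 are made up for illustration only.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b
    when the axis starts at `baseline` instead of zero."""
    return (a - baseline) / (b - baseline)

# Hypothetical values, chosen only for illustration.
low, high = 50.0, 55.0

# Bars drawn from zero: the taller bar looks 1.1 times as tall.
true_ratio = apparent_ratio(high, low)

# Axis truncated at 45: the same data now looks like a two-fold difference.
truncated_ratio = apparent_ratio(high, low, baseline=45.0)

print(true_ratio, truncated_ratio)
```

With these numbers a 10% difference is drawn as a 100% difference once the axis is truncated, which is the distortion the card warns about.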

6
Q

should graphs be 3d? why or why not?

A

NO - obscures the pattern in the data

7
Q

how much data should be put in the graph

A

just enough to get the point across without overcrowding the graph

8
Q

what graphs should be used for showing categorical data

A

frequency table and bar graph

9
Q

frequency table

A

text display of the number of occurrences of each category in the data set
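
As a rough illustration, a frequency table can be built by tallying each category. This is a minimal sketch using Python's standard `collections.Counter`; the `observations` list is hypothetical data, not taken from the deck.

```python
from collections import Counter

# Hypothetical observations of one categorical variable.
observations = ["oak", "pine", "oak", "birch", "oak", "pine"]

# Counter tallies occurrences; most_common() sorts from greatest to least.
frequency_table = Counter(observations).most_common()

for category, count in frequency_table:
    print(f"{category:<8}{count}")
print(f"{'total':<8}{len(observations)}")
```

Sorting from greatest to least frequency is the same ordering a bar graph of nominal data would normally use.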

10
Q

bar graph

A

uses the height of rectangular bars to visualize the frequency of occurrences of each category

11
Q

does a bar graph show exact numbers of the data

A

NO but it does give a picture of how steeply the numbers change between categories

12
Q

what makes a good bar graph

A
  • baseline of y axis is zero
  • bars are equal width
  • nominal data is organized by frequency of occurrence (greatest to least)
  • bars are not fused together
  • total number of observations should be recorded in figure legend
13
Q

histogram

A

uses area of rectangular bars to display frequency

14
Q

what is a histogram used for

A

showing data of a single numerical variable

15
Q

what does a peak in a histogram refer to

A

an interval of the frequency distribution that is noticeably more frequent than surrounding intervals

16
Q

what is an example of a histogram with a peak?

A

bell shaped

17
Q

what does bimodal refer to with histograms

A

frequency distribution having TWO distinct peaks

18
Q
A

Histogram

19
Q
A

bar graph

20
Q
A

frequency table

21
Q

what does skew refer to in a histogram

A

when the frequency distribution is NOT symmetrical

22
Q

what types of skew can a histogram have

A

negative and positive
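
One way to tell the two directions apart numerically is by the sign of the third central moment: a long tail toward larger values gives a positive sign, a long tail toward smaller values a negative one. The sketch below uses that criterion as an assumption (the deck itself does not give a formula); `skew_direction` and the sample lists are hypothetical.

```python
from statistics import fmean

def skew_direction(values):
    """Classify skew by the sign of the third central moment.
    This criterion is an assumption made for this sketch."""
    m = fmean(values)
    third_moment = fmean([(x - m) ** 3 for x in values])
    if third_moment > 0:
        return "positive"
    if third_moment < 0:
        return "negative"
    # Real data rarely give exactly zero; this branch is for tidy examples.
    return "symmetric"

# Hypothetical samples: one long right tail, one long left tail, one symmetric.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 12]
left_tailed = [-x for x in right_tailed]
balanced = [1, 2, 3, 4, 5]

print(skew_direction(right_tailed), skew_direction(left_tailed), skew_direction(balanced))
```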

23
Q

what is an outlier

A

extreme data points lying well away from the rest of the data

24
Q

how can outliers occur in data

A
  • Mistakes in recording the data
  • Real phenomenon that CANNOT be dropped from the data
25
Q

what makes a good histogram

A
  • bars rise from zero
  • bars are contiguous and NOT spaced out
  • using readable numbers for the break point between data intervals (using 5 and not 4.998)
  • total number of individuals in the legend
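
The points above can be sketched as a small binning routine. It is only an illustration: the function name `histogram_counts`, the bin width of 5 and the data are all hypothetical. It uses readable break points (0, 5, 10), keeps empty intervals so the bars stay contiguous, and reports the total number of observations.

```python
from collections import Counter

def histogram_counts(values, start, width):
    """Count values in intervals [start + k*width, start + (k+1)*width).
    Assumes every value is >= start."""
    bins = Counter(int((v - start) // width) for v in values)
    # Include empty intervals (count 0) so the bars stay contiguous.
    return {start + k * width: bins[k] for k in range(max(bins) + 1)}

# Hypothetical measurements of one numerical variable.
values = [1.2, 3.7, 4.9, 5.0, 6.3, 7.8, 9.9, 12.4]

counts = histogram_counts(values, start=0, width=5)
for left_edge, n in counts.items():
    print(f"[{left_edge}, {left_edge + 5}): {'#' * n}")
print(f"n = {len(values)}")
```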
26
Q

what graphs should be used for showing ASSOCIATIONS between categorical variables

A

contingency table

mosaic plot

grouped bar graph

27
Q

contingency table

A

a frequency table for two or more categorical variables

28
Q

why is a contingency table used

A

to show how the frequencies of the categories in a response variable are contingent upon the value of the explanatory variable

29
Q

what does a cell refer to in a contingency table

A

one combination of categories of the row and column variables in the table

30
Q

what variable goes in the columns of a contingency table

A

explanatory variable

31
Q

what variable goes in the row of a contingency table

A

response variable
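
The layout described in the cards above (explanatory variable in the columns, response variable in the rows, one count per cell) can be sketched as follows. The `pairs` data and the category names are hypothetical, chosen only to make the table small.

```python
from collections import Counter

# Hypothetical (explanatory, response) pairs, one per individual.
pairs = [
    ("treated", "survived"), ("treated", "survived"), ("treated", "died"),
    ("control", "survived"), ("control", "died"), ("control", "died"),
]

cells = Counter(pairs)                    # one count per cell
columns = sorted({e for e, _ in pairs})   # explanatory categories
rows = sorted({r for _, r in pairs})      # response categories

# Header row of explanatory categories, then one row per response category.
print(f"{'':<10}" + "".join(f"{c:>10}" for c in columns))
for r in rows:
    print(f"{r:<10}" + "".join(f"{cells[(c, r)]:>10}" for c in columns))
```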

32
Q
A

contingency table

33
Q

how does a mosaic plot differ from a grouped bar graph

A

the bars within treatment groups are stacked on top of one another

34
Q

how to read a mosaic plot

A

the bar area and height relate to relative frequencies of the responses

35
Q

how to read whether there are associations between treatment and response variables in a MOSAIC PLOT

A

Yes association: vertical position where the colours meet will differ between stacks

No association: the meeting point between colors will be at the same vertical position between stacks
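
The same reading can be done with numbers: the vertical meeting points in a mosaic plot are the relative frequencies of the response within each group, so comparing those proportions across groups is equivalent. A minimal sketch, with a hypothetical helper `relative_frequencies` and made-up data:

```python
from collections import Counter

def relative_frequencies(pairs):
    """Return {explanatory: {response: proportion}} from
    (explanatory, response) pairs."""
    cells = Counter(pairs)
    totals = Counter(e for e, _ in pairs)
    return {
        e: {r: cells[(e2, r)] / totals[e] for (e2, r) in cells if e2 == e}
        for e in totals
    }

# Hypothetical (explanatory, response) pairs.
pairs = [
    ("treated", "survived"), ("treated", "survived"), ("treated", "died"),
    ("control", "survived"), ("control", "died"), ("control", "died"),
]

rel = relative_frequencies(pairs)
for group, proportions in rel.items():
    print(group, proportions)
```

Here the proportion that survived differs between the two stacks (2/3 versus 1/3), so the colours would meet at different heights. That describes this sample only; whether the difference reflects a real association is a separate question.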

36
Q

do mosaic plots show absolute or relative frequencies in each combination of variables

A

RELATIVE

37
Q
A

Mosaic plot

38
Q

grouped bar graph

A

uses the heights of rectangles to graph the frequency of occurrences of all combinations of two or more categorical variables

39
Q
A

grouped bar graph

39
Q

how are bars grouped in a grouped bar graph

A

by the categories of explanatory variables

40
Q

what graph is used to show associations between numerical variables

A

scatter plot

41
Q

what variable is found on each axis in a scatter plot

A

x - explanatory variable

y - response variable

42
Q

what are possible associations in a scatter plot

A

positive (points run from lower left to upper right)

Negative (points run from upper left to lower right)

Absent (no easily seen pattern)
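
As a rough numerical counterpart to reading the plot by eye, the sign of the sample covariance indicates whether the points tend to run upward or downward. This is an assumed stand-in, not a method from the deck, and it only captures straight-line trends: a curved pattern can give a value near zero. The function `association_direction` and the data are hypothetical.

```python
from statistics import fmean

def association_direction(xs, ys):
    """Direction of a linear trend, from the sign of the sum of
    cross-products (proportional to the sample covariance)."""
    mx, my = fmean(xs), fmean(ys)
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if cross > 0:
        return "positive"
    if cross < 0:
        return "negative"
    return "absent"

# Hypothetical explanatory (x) and response (y) measurements.
xs = [1, 2, 3, 4]
rising = [2, 4, 5, 8]
falling = [8, 5, 4, 2]

print(association_direction(xs, rising), association_direction(xs, falling))
```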

43
Q
A

scatter plot

44
Q

what graphs are used to show associations between numerical and categorical variables

A

strip chart
violin plot
multi-histogram method

45
Q

strip chart

A

a graph in which each observation is represented as a dot

46
Q

how are the axes of a strip chart labeled

A

x axis - categorical measurements

y axis - numerical measurements

47
Q

how does a strip chart differ from a scatter plot

A

by the explanatory variable being categorical and NOT numerical

48
Q

when is a strip chart ideal

A

when there are only A FEW observations in each category, so the dots do not overcrowd the graph

49
Q
A

strip chart

50
Q

violin plot

A

displays data using compact visual symmetry

51
Q

how is a violin plot similar to and different from a histogram

A

similar: approximates the frequency distribution of each group

different: the distribution is smoothed and shown with its mirror image

52
Q

what does the dot in the center of each violin mean in a violin plot

A

mean of the data

53
Q

when is a violin plot ideally used

A

when the goal of the graph is to show the most important features of the frequency distribution; best suited to a large number of observations

54
Q
A

violin plot

55
Q

how should multi-histograms be positioned

A

stacked on top of each other so the spread of data between them is easier to compare

56
Q

when is the multi-histogram method best used

A

only when there are a FEW categories, as stacked histograms can take up a lot of room

57
Q
A

stacked histogram method

58
Q

what graphs are used to show trends in time and space

A

line graph and map

59
Q

line graph

A

displays trends over time using dots connected by line segments; each dot is a measurement or a summary measurement (e.g. the mean)

60
Q

what do the lines show in a line graph

A

the line segments connecting adjacent points show the temporal pattern

61
Q
A

line graph

62
Q

what is the spatial equivalent of a line graph

A

map

63
Q

map

A

graph that uses colour gradients to display numerical response variables at different locations

64
Q
A

map

65
Q

two types of tables for displaying data

A

display tables and data tables

66
Q

display tables

A

tables meant for communicating findings, where numerical detail is less important than the effective communication of results

67
Q

data tables

A

purpose is to store raw data for reference purposes NOT for communicating general findings