Chapter 2: Descriptive Statistics and Analytics: Tabular and Graphical Methods Flashcards
T or F: Visualizing data helps us understand it more easily and spot trends and outliers.
True
T or F: With qualitative data, we can count the observations and put them into different categories. These data can be summarized using a frequency distribution.
True
a table that summarizes the number of items in each of several non-overlapping classes; a useful summary that makes data easier to understand
frequency distribution (look at pic in camera roll of frequency distribution)
________ frequency summarizes the proportion of items in each class.
relative
What is the formula to calculate relative frequency?
Frequency / n (n = total # of observations)
What is the formula to calculate percent frequency?
Relative frequency x 100
What is the formula to calculate the angle (in degrees) of each slice in a pie chart?
Relative frequency x 360
The sum of RELATIVE frequency is always equal to ______.
1
The sum of PERCENT frequency is always equal to _______.
100%
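Worked example for the three formulas above. This is a minimal Python sketch; the sample responses and the names `responses` and `table` are made up for illustration and are not from the notes.

```python
from collections import Counter

# Hypothetical sample data (not from the notes): favorite drink of 10 people.
responses = ["coffee", "tea", "coffee", "juice", "coffee",
             "tea", "coffee", "juice", "tea", "coffee"]

n = len(responses)                  # total number of observations
frequency = Counter(responses)      # number of items in each class

table = {}
for category, count in frequency.items():
    relative = count / n            # relative frequency = frequency / n
    table[category] = {
        "frequency": count,
        "relative": relative,
        "percent": relative * 100,  # percent frequency = relative frequency x 100
        "degrees": relative * 360,  # pie slice angle = relative frequency x 360
    }

for category, row in table.items():
    print(f"{category:8s} {row['frequency']:3d} {row['relative']:.2f} "
          f"{row['percent']:5.1f}% {row['degrees']:6.1f} deg")

# The column totals should come out to 1, 100, and 360 (up to float rounding).
total_relative = sum(row["relative"] for row in table.values())
total_percent = sum(row["percent"] for row in table.values())
total_degrees = sum(row["degrees"] for row in table.values())
```

Summing each column is a quick self-check: if the relative frequencies do not add up to 1, a class was miscounted.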
From frequency data, we can make what two kinds of charts?
Bar charts and pie charts
T or F: Bar charts/graphs must have spaces between bars to indicate that each class is separate from each other.
True
a vertical or horizontal rectangle represents the frequency for each category; height can be frequency, relative frequency, or percent frequency
bar chart
a circle divided into slices where the size of each slice represents its relative frequency or percent frequency
pie chart
a bar chart having the different kinds of defects listed on the horizontal scale; tells us how many errors/defects we have in our data
Pareto chart (look at camera roll for pic of pareto chart)
T or F: A Pareto chart can be made from quantitative data.
False; QUALITATIVE data
In a Pareto chart:
- bar height represents the ______ of occurrence
- bars are arranged in (increasing/decreasing) height from left to right
- sometimes augmented by plotting a _________ percentage point for each bar
- frequency
- decreasing
- cumulative
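Worked example of the three Pareto chart properties above. This is a rough Python sketch: the defect log is hypothetical, and the text bars only stand in for a real chart.

```python
from collections import Counter

# Hypothetical defect log (illustrative only, not from the notes).
defects = ["scratch", "dent", "scratch", "misprint", "scratch",
           "dent", "scratch", "crack", "dent", "scratch"]

counts = Counter(defects)
total = sum(counts.values())

# most_common() orders categories by decreasing frequency,
# which is the left-to-right bar order of a Pareto chart.
pareto_rows = []
running = 0
for category, freq in counts.most_common():
    running += freq
    cumulative_percent = 100 * running / total   # cumulative percentage point
    pareto_rows.append((category, freq, cumulative_percent))

for category, freq, cum in pareto_rows:
    print(f"{category:10s} {'#' * freq:6s} {freq:2d}  cumulative {cum:5.1f}%")
```

The cumulative column shows how few categories account for most of the defects; here the first two cover 80%.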
A dot plot is used for (quantitative/qualitative data).
quantitative
A histogram can be made for (qualitative/quantitative) data.
quantitative
a graphical display of a frequency distribution, relative frequency distribution, or percentage frequency distribution. It divides measurements into classes and graphs the frequency, relative frequency, or percentage frequency for each class.
histogram
What are the 5 steps to build a histogram?
- Find the number of classes: the smallest whole number k such that 2^k ≥ n (n = total number of observations; k = number of classes, which is the number of bars in the histogram)
- Find the class length (length of our bars)
- Form non-overlapping classes of equal width (take lowest number and keep adding class length)
- Tally and count (frequency)
- Graph the histogram
What is the formula for finding the number of classes for building a histogram?
Choose the smallest whole number k such that 2^k ≥ n
n = total # of observations
k = # of classes (tells us the number of bars we need to build in histogram)
(look over example 2.2 in notes for histogram example notes)
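Worked example of the 2^k rule. This is a minimal Python sketch; the function name `number_of_classes` and the sample values of n are made up for illustration.

```python
def number_of_classes(n: int) -> int:
    """Return the smallest whole number k such that 2**k >= n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    k = 0
    while 2 ** k < n:   # keep raising k until 2^k reaches n
        k += 1
    return k

for n in (30, 50, 65, 200):
    print(f"n = {n:3d} -> k = {number_of_classes(n)}")
```

For n = 50 the loop stops at k = 6, because 2^5 = 32 is too small and 2^6 = 64 is the first power of 2 that is at least 50.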
T or F: Any nonzero number raised to the power of 0 is 1.
True
The number of classes (or bars in a histogram) is represented by the letter ____.
k
What is the formula for calculating class length (length of our bars in a histogram)?
Class length = (largest value in the data - smallest value in the data) / number of classes (k)
T or F: For class length, you always round down.
False; always round UP
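Worked example of the class length, class forming, and tally steps. This is a Python sketch under stated assumptions: the data are hypothetical, k = 5 is taken as given (2^5 = 32 ≥ 20 observations), and treating each class as "lower boundary included, upper boundary excluded" is my assumption, since the notes do not say which end is included.

```python
import math

# Hypothetical measurements (illustrative only); n = 20.
data = [10, 12, 13, 15, 16, 18, 19, 21, 22, 22,
        24, 25, 27, 28, 30, 31, 33, 35, 38, 41]
k = 5   # number of classes, taken as given for this sketch

# Class length = (largest - smallest) / k, always rounded UP.
class_length = math.ceil((max(data) - min(data)) / k)

# Form k non-overlapping classes of equal width:
# start at the smallest value and keep adding the class length.
lower = min(data)
classes = [(lower + i * class_length, lower + (i + 1) * class_length)
           for i in range(k)]

# Tally and count. Assumption: a value belongs to a class when low <= value < high.
# (If the division above came out to a whole number, the largest value would land
# exactly on the last upper boundary and would need separate handling.)
frequencies = [sum(1 for x in data if low <= x < high) for low, high in classes]

for (low, high), freq in zip(classes, frequencies):
    print(f"{low:2d} to < {high:2d}: {'|' * freq} ({freq})")
```

Here (41 - 10) / 5 = 6.2, which rounds up to a class length of 7. Rounding down to 6 would give classes that stop at 40 and leave out the value 41.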
T or F: With a histogram, there is space between the bars.
False; NO SPACE between bars (bars on a histogram touch to represent continuous data)
T or F: The base (x-axis) of a histogram represents the class length.
True
T or F: In a histogram, the height represents: a.) the frequency in a frequency histogram, or b.) the relative frequency in a relative frequency histogram.
True
With histograms, if you are making a PERCENT frequency histogram, make sure to take the ______ of each class's lower and upper boundaries (the class midpoint) for the base label.
average
(ex: if the class length is 3 and the smallest number in the data is 10, the first class runs from 10 to 13, so (10 + 13) / 2 = 11.5 is the base label for the first bar on the histogram)
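The same midpoint calculation extended to every bar. This is a small Python sketch; the smallest value 10 and class length 3 come from the example above, while k = 4 is a hypothetical number of classes.

```python
smallest = 10      # smallest value in the data (from the example)
class_length = 3   # class length (from the example)
k = 4              # hypothetical number of classes

midpoints = []
for i in range(k):
    low = smallest + i * class_length    # lower boundary of class i
    high = low + class_length            # upper boundary of class i
    midpoints.append((low + high) / 2)   # average of the two boundaries

print(midpoints)   # [11.5, 14.5, 17.5, 20.5]
```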
T or F: The frequencies on a bar graph represent the counts from categories (qualitative data) while the frequencies on a histogram represent the counts of the quantitative data values grouped into the “classes”.
True
the right tail of the histogram is longer than the left tail
right-skewed
the left tail of the histogram is longer than the right tail
left-skewed
the right and left tails of the histogram appear to be mirror images of each other
symmetrical
cumulative distributions are also called ________ _______.
running totals
Another way to summarize a distribution is to construct a cumulative distribution. To do this, use the same number of _______, class lengths, and class boundaries used for the ________ _______. Rather than the count in each class, we record the number of measurements that are ______ ______ the upper boundary of that class.
classes; frequency distribution; less than
(look over picture of cumulative distribution/frequency on camera roll)
a graph of a cumulative distribution
ogive
(plot a point above each upper class boundary at a height of the cumulative frequency; connect the points with line segments; can also be drawn using cumulative relative frequencies and cumulative percent frequencies)
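Worked example of a cumulative distribution and the points an ogive would plot. This is a minimal Python sketch; the class boundaries and frequencies are hypothetical.

```python
from itertools import accumulate

# Hypothetical frequency distribution (illustrative only).
upper_boundaries = [17, 24, 31, 38, 45]
frequencies = [5, 5, 5, 3, 2]

# Cumulative frequency = running total of the class frequencies,
# i.e. the number of measurements less than each upper boundary.
cumulative = list(accumulate(frequencies))
n = cumulative[-1]                                   # last running total = n
cumulative_percent = [100 * c / n for c in cumulative]

# Ogive points: (upper class boundary, cumulative frequency).
ogive_points = list(zip(upper_boundaries, cumulative))

for (boundary, cum), pct in zip(ogive_points, cumulative_percent):
    print(f"less than {boundary}: {cum:2d} ({pct:5.1f}%)")
```

Connecting the ogive points with line segments gives a curve that never goes down and ends at n (or at 100% for the cumulative percent version).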
Which type of distribution shape does this describe:
two humps, the left of which may or may not look like the right one, nor is each hump required to be symmetrical
double peaked
Which type of quantitative graph is useful in detecting outliers and easy to read?
Dot plots
(the horizontal axis spans the range of observations or measurements, and dots represent the observations)
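A text-only dot plot, to show why outliers stand out. This is a rough Python sketch with hypothetical whole-number data; a plotting library would normally draw this.

```python
from collections import Counter

# Hypothetical measurements (illustrative only); 12 sits far from the rest.
data = [3, 4, 4, 5, 5, 5, 6, 6, 7, 12]

counts = Counter(data)
axis_values = range(min(data), max(data) + 1)   # horizontal axis spans the range
tallest = max(counts.values())

# Build the plot from the top row down: one "*" per observation.
rows = []
for level in range(tallest, 0, -1):
    row = "".join(" * " if counts.get(v, 0) >= level else "   "
                  for v in axis_values)
    rows.append(row)

axis = "".join(f"{v:^3d}" for v in axis_values)
print("\n".join(rows + [axis]))
```

The empty stretch of axis between 7 and 12 is what flags 12 as a possible outlier.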
a graphical portrayal of a data set that shows the data set’s distribution by using stems consisting of leading digits and leaves consisting of trailing digits
stem-and-leaf display (for quantitative data)
(look in camera roll for pic of this)
T or F: Stem-and-leaf display shows the shape of distribution and shows the value of individual measurements.
True
Stem-and-leaf displays are best used for what size data distributions?
Small to moderately sized data distributions
What is the purpose of a stem-and-leaf display?
to see the overall pattern of the data, by grouping the data into classes
(the variation from class to class, the amount of data in each class, the distribution of the data within each class)
What are 3 advantages of stem-and-leaf displays?
- Displays all the individual measurements and puts data into numerical order.
- Simple to construct.
- Shows where the data start and end (the min. and max. values)
T or F: For stem-and-leaf displays, there are no rules that dictate the number of stem values, and you can split the stems as needed
True
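Worked example of building a stem-and-leaf display. This is a minimal Python sketch that assumes non-negative whole numbers, with the last digit as the leaf; the data are hypothetical.

```python
from collections import defaultdict

# Hypothetical two-digit measurements (illustrative only).
data = [23, 25, 31, 34, 34, 38, 42, 45, 47, 51, 56]

stems = defaultdict(list)
for value in sorted(data):            # sorting puts the leaves in numerical order
    stem, leaf = divmod(value, 10)    # leading digit(s) = stem, trailing digit = leaf
    stems[stem].append(leaf)

# Keep every stem between the smallest and largest, even if it has no leaves,
# so gaps in the data stay visible.
display = {}
for stem in range(min(stems), max(stems) + 1):
    display[stem] = "".join(str(leaf) for leaf in stems.get(stem, []))

for stem, leaves in display.items():
    print(f"{stem} | {leaves}")
```

Each row works like a histogram bar turned on its side, but the individual values can still be read back (stem 3 with leaves 1, 4, 4, 8 means 31, 34, 34, 38).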
a table consisting of rows and columns that is used to classify data on two dimensions
contingency table
(look at camera roll for example pic of a contingency table)
Contingency tables…
- classify data on _____ dimensions.
- Rows classify according to one dimension and columns classify according to a second dimension.
- two
(association between 2 variables: one gives us info about the rows, the other gives us info about the columns)
Contingency tables require what 3 variables?
- Row variable
- Column variable
- Variable counted in the cells
What is the purpose of contingency tables?
To investigate possible relationships between variables
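Worked example of a contingency table. This is a minimal Python sketch; the records and the labels ("fund A", "high", and so on) are hypothetical.

```python
from collections import Counter

# Hypothetical records: (row variable, column variable) for each item counted.
records = [("fund A", "high"), ("fund A", "low"), ("fund B", "high"),
           ("fund A", "high"), ("fund B", "low"), ("fund B", "low"),
           ("fund A", "high"), ("fund B", "low")]

cell_counts = Counter(records)                    # count in each (row, column) cell
row_labels = sorted({r for r, _ in records})
col_labels = sorted({c for _, c in records})

# Print the table with a row total at the end of each row.
print(f"{'':8s}" + "".join(f"{c:>6s}" for c in col_labels) + f"{'total':>7s}")
for r in row_labels:
    cells = [cell_counts[(r, c)] for c in col_labels]
    print(f"{r:8s}" + "".join(f"{x:6d}" for x in cells) + f"{sum(cells):7d}")
```

In this made-up table, fund A is mostly "high" and fund B is mostly "low", which is the kind of possible relationship the table is meant to reveal.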
_______ _____ are used to study the relationship between two quantitative variables.
Scatter plots
(place one variable on the x-axis, place a second variable on the y-axis, place dot on pair coordinates)
Types of Relationships– Scatter Plots…
a straight line relationship between the two variables
linear
Types of Relationships–Scatter Plots…
when one variable goes up, the other variable goes up
positive (ex: as x goes up, y goes up… OR as x goes down, y goes down)
Types of Relationships–Scatter Plots…
when one variable goes up, the other variable goes down
negative (ex: as x goes up, y goes down… OR as x goes down, y goes up)
Types of Relationships–Scatter Plots…
there is no coordinated linear movement between the two variables
no linear relationship
(the best-fit line is roughly flat: slope ≈ 0)
when all dots are in one line and going up
perfect positive correlation
when all dots are in one line and going down
perfect negative correlation
when dots are going up but not in one line (weak relationship)
low degree of positive correlation
when dots are going down but not in one line (weak relationship)
low degree of negative correlation
when dots are close to line and going up (strong positive relationship)
high degree of positive correlation
when dots are close to line and going down (strong negative relationship)
high degree of negative correlation
The purpose of constructing a scatter plot is to look at relationships between two quantitative variables. We can see what 3 things?
- Direction (positive or negative)
- Strength (strong or weak relationship)
- Form (linear or non-linear relationship)
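One way to put a number on direction and strength is the correlation coefficient r. These cards do not give a formula for it, so the sketch below uses the standard Pearson formula as an outside addition; the data and the function name `pearson_r` are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient (standard formula, not from the notes)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
perfect_up = [2, 4, 6, 8, 10]     # all dots on one line, going up
perfect_down = [10, 8, 6, 4, 2]   # all dots on one line, going down
noisy_up = [2, 5, 3, 8, 6]        # going up, but not on one line

r_up = pearson_r(x, perfect_up)
r_down = pearson_r(x, perfect_down)
r_noisy = pearson_r(x, noisy_up)
print(round(r_up, 3), round(r_down, 3), round(r_noisy, 3))
```

The sign of r gives the direction and its distance from 0 gives the strength. r only measures linear relationships, so a clear curved pattern can still produce an r near 0.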
T or F: A graph can be misleading if the vertical axis does not start from 0.
True
T or F: A graph is misleading if all bars are not of equal width.
True (all bars should be equal width)
T or F: Pie charts are misleading when the percentages do not equal 100 when added up.
True (the percentages must sum to 100%)
T or F: Graphs are misleading when they are missing titles or axis titles.
True (all graphs should be labeled)
T or F: Graphs can be misleading when data are inconsistent or missing (the graph no longer tells a clear story about the data).
True (ex: on the x-axis of a bar chart, some bars are labeled with the year only, then the labels switch to month and year)
provides a graphical presentation of the current status and historical trends of key performance indicators; gives us info about all descriptives on one page (tells us what has happened)
dashboard
What are the 4 things included in a dashboard?
- Gauges
- Bullet Graph
- Treemaps
- Sparklines
Included in dashboard…
charts that present data similar to a speedometer
Gauges
Included in dashboard…
features a single measure that extends into ranges representing qualitative measures of performance
bullet graph
Included in dashboard…
display information in a series of clustered rectangles
treemaps
Included in dashboard…
line chart drawn without axes to embed in text (makes data easier to see)
sparklines
T or F: With a bullet graph, the line tells you whether you've met your criterion or not… if the line doesn't reach the criterion line, then the criterion has not been met.
True