Chapter 3 - Representing Data Flashcards
what information do box plots give
the quartiles, maximum and minimum values, and outliers
how much of the data does one section of a box plot represent
25 percent
what is an outlier
an extreme value that lies outside of the overall pattern of data
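The card above does not say how far out a value must lie to count as an outlier. A minimal sketch, assuming the common convention that an outlier lies more than 1.5 × IQR below the lower quartile or above the upper quartile (your course may give a different rule, and the function names here are hypothetical):

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) limits beyond which a value is treated as an outlier.

    k = 1.5 is an assumed convention, not something stated on the card.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """List the values that fall outside the fences."""
    low, high = outlier_fences(q1, q3, k)
    return [v for v in values if v < low or v > high]


# Made-up data with quartiles Q1 = 10 and Q3 = 20
print(outlier_fences(10, 20))
print(find_outliers([2, 12, 15, 18, 40], 10, 20))
```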
what is cleaning data
removing any anomalies and outliers from the data before representing it visually and using it to do calculations
what must you include when comparing data from box plots
one statement about the spread (IQR preferably)
and one statement about location (median preferably)
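The two statements can be sketched as a small helper. This is only an illustration: the function name, the (Q1, median, Q3) tuple layout and the class data are all made up.

```python
def compare_box_plots(name_a, a, name_b, b):
    """a and b are (q1, median, q3) tuples read off each box plot.

    Returns one statement about location (median) and one about spread (IQR).
    """
    iqr_a = a[2] - a[0]
    iqr_b = b[2] - b[0]

    def describe(measure, va, vb):
        if va == vb:
            return f"{name_a} and {name_b} have the same {measure} ({va})"
        higher, lower = (name_a, name_b) if va > vb else (name_b, name_a)
        return f"{higher} has a higher {measure} than {lower} ({max(va, vb)} vs {min(va, vb)})"

    return describe("median", a[1], b[1]), describe("IQR", iqr_a, iqr_b)


for statement in compare_box_plots("class A", (40, 55, 70), "class B", (50, 60, 65)):
    print(statement)
```

In an exam answer each statement should also be interpreted in context, for example that a higher IQR means the results are more spread out.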
what does a cumulative frequency diagram show
the running total of items up to a point
it can be used to find the quartiles within data
how do you draw a c.f. diagram
the y axis is the c.f., the x axis is the data value (the variable being measured)
plot each point on the upper bound of the class
connect the points with straight lines
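A minimal sketch of the plotting steps, using made-up classes and frequencies. It builds the points at each upper class bound, then reads off the quartiles by following the straight-line segments. Starting the line at the lower bound of the first class with a cumulative frequency of 0 is an assumption, and both function names are hypothetical.

```python
from itertools import accumulate


def cf_points(classes, freqs):
    """Pair each class's upper bound with the running total of frequencies."""
    uppers = [upper for _, upper in classes]
    return list(zip(uppers, accumulate(freqs)))


def estimate_value(classes, freqs, target_cf):
    """Read off the data value at a given cumulative frequency.

    Uses linear interpolation, which matches joining the points with straight lines.
    """
    points = [(classes[0][0], 0)] + cf_points(classes, freqs)
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= target_cf <= c1:
            return x0 + (target_cf - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target cumulative frequency is outside the data")


classes = [(0, 10), (10, 20), (20, 30), (30, 40)]  # made-up class boundaries
freqs = [5, 15, 20, 10]                            # made-up frequencies, total 50

print(cf_points(classes, freqs))
for label, fraction in [("Q1", 0.25), ("median", 0.5), ("Q3", 0.75)]:
    print(label, estimate_value(classes, freqs, fraction * sum(freqs)))
```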
what is a histogram
a graph similar to a bar chart which shows the frequency density within classes
histograms show grouped continuous data
how can you draw a frequency polygon
mark points at the midpoint of the top of each class on a histogram then connect these points with straight lines
what is the relationship between the area and the frequency on a histogram, and what is the significance
they are directly proportional
this means that dividing the area of any bar by its frequency gives a constant scaling factor (area = k × frequency), which can then be used to work out the f.d. for each bar on the histogram
what does frequency density represent
the frequency per unit data in each class
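As a short illustration, frequency density can be worked out as frequency ÷ class width. The classes and frequencies below are made up, and the function name is hypothetical.

```python
def frequency_density(classes, freqs):
    """Frequency density = frequency / class width, for each (lower, upper) class."""
    return [f / (upper - lower) for (lower, upper), f in zip(classes, freqs)]


classes = [(0, 10), (10, 15), (15, 35)]  # unequal class widths: 10, 5, 20
freqs = [20, 15, 10]

densities = frequency_density(classes, freqs)
print(densities)

# Multiplying each density by its class width recovers the frequency,
# which is why bar area is proportional to frequency.
print([d * (upper - lower) for d, (lower, upper) in zip(densities, classes)])
```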
what do histograms show
how the data is distributed across the classes, so you can easily compare one class with another and see the overall shape of the distribution
what must you do when given the height and width of a bar on a histogram, and asked to calculate the height and width of another
- calculate the area of the bar
- using the given data establish scaling constants for the width and area
- use the scaling to determine the width and area of the required bar, then find the height by dividing the area by the width
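The steps above can be sketched with a worked example. All the measurements are made up, and the function and parameter names are hypothetical.

```python
def scale_bar(known_width_cm, known_height_cm, known_class_width, known_freq,
              class_width, freq):
    """Return (width_cm, height_cm) of a second bar, scaled from a known bar."""
    # Step 1: area of the known bar
    known_area = known_width_cm * known_height_cm
    # Step 2: scaling constants for width (cm per unit of data) and area (cm^2 per item)
    width_scale = known_width_cm / known_class_width
    area_scale = known_area / known_freq
    # Step 3: width and area of the required bar, then height = area / width
    width = class_width * width_scale
    area = freq * area_scale
    return width, area / width


# Known bar: class width 10, frequency 30, drawn 2 cm wide and 6 cm high.
# Required bar: class width 5, frequency 20.
width, height = scale_bar(2, 6, 10, 30, class_width=5, freq=20)
print(width, height)
```

Here the width scale is 0.2 cm per unit and the area scale is 0.4 cm² per item, so the required bar is 1 cm wide with area 8 cm², giving a height of 8 cm.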
advantages and disadvantages of box plots
they can be used to easily compare the quartiles and distribution
they cannot be used to find the mean or mode
advantages and disadvantages of histograms
allow you to quickly identify distribution patterns