Assignment 1 Flashcards
Exploratory Data Analysis
A method and philosophy of data analysis, begun by John Tukey, that is designed to uncover information in data without interference from outlying values.
Resistance
An EDA property in which a calculation is not highly affected by outlying data values.
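Example (a minimal sketch; the two batches are made-up numbers, not data from the assignment). Replacing one value with an outlier moves the mean a long way but leaves the median where it was, which is what makes the median a resistant calculation.

```python
from statistics import mean, median

# Hypothetical batches: identical except for the last value.
batch = [2, 3, 4, 5, 6]
with_outlier = [2, 3, 4, 5, 60]

# How far each summary moves when the outlier is introduced.
mean_shift = mean(with_outlier) - mean(batch)
median_shift = median(with_outlier) - median(batch)

print("mean shift:", mean_shift)
print("median shift:", median_shift)
```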
Re-expression
An EDA principle in which the display of data is aided by the use of nonlinear transformations, such as a logarithm or square root.
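Example (a sketch with made-up values). A base-10 logarithm turns values that are bunched at one end and stretched at the other into evenly spaced ones, which is easier to display.

```python
import math

# Hypothetical batch spanning several orders of magnitude.
raw = [1, 10, 100, 1000, 10000]

# Re-express each value with a base-10 logarithm.
logged = [math.log10(x) for x in raw]

# Gaps between neighbouring values, before and after re-expression.
raw_gaps = [b - a for a, b in zip(raw, raw[1:])]
log_gaps = [b - a for a, b in zip(logged, logged[1:])]

print("raw gaps:", raw_gaps)
print("log gaps:", log_gaps)
```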
Residuals
The difference between a measurement and the value that some mathematical model predicts for that measurement.
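Example (a sketch; the straight-line model `predict` and the measurements are hypothetical, chosen only to illustrate the subtraction). Each residual is the measured value minus the predicted value.

```python
def predict(x):
    """Hypothetical model: a straight line y = 2x + 1."""
    return 2 * x + 1

# Made-up measurements taken at x = 0, 1, 2, 3.
xs = [0, 1, 2, 3]
measured = [1.2, 2.9, 5.1, 6.8]

# Residual = measurement - value predicted by the model.
residuals = [y - predict(x) for x, y in zip(xs, measured)]

print([round(r, 2) for r in residuals])
```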
Revelation
The primary goal of EDA: seeing the information carried by one’s data.
Glyph
An image that communicates information without words.
Median
An average that is the middle number in an ordered set of data. The median has half the data below it and half above it.
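Example (a sketch; `median_of` is a name made up here). It assumes a non-empty batch and, when the count is even, takes the mean of the two middle values, which is a common convention rather than something the definition above specifies.

```python
def median_of(batch):
    """Return the middle value of a non-empty batch of numbers."""
    ordered = sorted(batch)          # the data must be put in order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]          # odd count: the single middle value
    # Even count (assumed convention): average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_median = median_of([7, 1, 5, 3, 9])
even_median = median_of([7, 1, 5, 3])
print(odd_median, even_median)
```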
Upper & Lower Hinges
An EDA term for the median of the upper half of a batch of data (upper hinge) and the median of the lower half of a batch of data (lower hinge).
Hinge Spread
An EDA term that is the difference between the upper and lower hinges. The hinge spread is often called the fourth spread.
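Example (a sketch; `hinges` is a hypothetical helper and the batch is made up). It assumes one common convention for odd-sized batches, in which the middle value is counted in both halves. Other conventions leave it out and can give slightly different hinges.

```python
from statistics import median

def hinges(batch):
    """Return (lower hinge, upper hinge) of a non-empty batch."""
    ordered = sorted(batch)
    n = len(ordered)
    # Size of each half; when n is odd the middle value is shared
    # by both halves (an assumed convention).
    half = (n + 1) // 2
    lower_half = ordered[:half]
    upper_half = ordered[n - half:]
    return median(lower_half), median(upper_half)

batch = [1, 3, 5, 7, 9, 11, 13]
lower_hinge, upper_hinge = hinges(batch)
hinge_spread = upper_hinge - lower_hinge

print(lower_hinge, upper_hinge, hinge_spread)
```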
Stem-and-Leaf Diagram
An EDA figure that displays the distribution of a batch of data by splitting each value into a stem (its leading digits) and a leaf (its final digit).
Side-by-side stem-and-leaf diagram
Two stem-and-leaf diagrams placed next to each other that use a common set of stems.
One-line summary
A stem-and-leaf diagram in which the leaves of each stem are shown on one line.
Two-line summary
A stem-and-leaf diagram in which the leaves of each stem are shown on two lines.
Five-line summary
A stem-and-leaf diagram in which the leaves of each stem are shown on five lines.
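Example (a sketch; `stem_and_leaf` and its `lines_per_stem` parameter are names made up here). It assumes non-negative whole numbers with the tens as the stem and the units as the leaf. For two lines per stem it assumes leaves 0-4 go on the first line and 5-9 on the second; for five lines, leaves are grouped in pairs (0-1, 2-3, and so on). For brevity, lines with no leaves are skipped rather than printed empty.

```python
from collections import defaultdict

def stem_and_leaf(batch, lines_per_stem=1):
    """Return display rows; lines_per_stem should be 1, 2, or 5."""
    width = 10 // lines_per_stem     # how many leaf digits share one line
    rows = defaultdict(list)
    for value in sorted(batch):
        stem, leaf = divmod(value, 10)
        # Key each row by its stem and by which line of that stem it is.
        rows[(stem, leaf // width)].append(str(leaf))
    return [f"{stem} | {' '.join(leaves)}"
            for (stem, _line), leaves in sorted(rows.items())]

batch = [12, 15, 21, 23, 23, 28, 34]   # made-up data
one_line = stem_and_leaf(batch)
two_line = stem_and_leaf(batch, lines_per_stem=2)

for row in one_line:
    print(row)
```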
Box plot
An EDA schematic diagram composed of a box and two lines that show the distribution of data.
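Example (a sketch; `box_plot_numbers` is a hypothetical helper and the batch is made up). It computes the numbers a simple box plot could be drawn from, assuming the box runs from the lower hinge to the upper hinge with the median marked inside, and the two lines run out to the lowest and highest values. Some conventions instead stop the lines short of outlying values. Hinges are computed with the middle value shared by both halves when the count is odd, which is also an assumption.

```python
from statistics import median

def box_plot_numbers(batch):
    """Return the five values used to draw a simple box plot."""
    ordered = sorted(batch)
    n = len(ordered)
    half = (n + 1) // 2              # middle value shared when n is odd
    return {
        "low": ordered[0],                          # end of lower line
        "lower_hinge": median(ordered[:half]),      # bottom of box
        "median": median(ordered),                  # mark inside box
        "upper_hinge": median(ordered[n - half:]),  # top of box
        "high": ordered[-1],                        # end of upper line
    }

summary = box_plot_numbers([2, 4, 4, 5, 7, 8, 9, 12, 30])
print(summary)
```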