Chapter 1 Flashcards
Association
Occurs between two variables if specific values of one variable tend to occur in common with specific values of the other
The 1.5 x IQR Rule for Outliers
Call an observation an outlier if it falls more than 1.5 x IQR above the third quartile or below the first quartile
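A minimal Python sketch of the rule, assuming the quartiles have already been worked out by hand. The function names and the travel-time data are made up for illustration.

```python
def iqr_fences(q1, q3):
    """Return the (lower, upper) cutoffs of the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def flag_outliers(values, q1, q3):
    """Return the observations that fall outside the cutoffs."""
    low, high = iqr_fences(q1, q3)
    return [x for x in values if x < low or x > high]


# Hypothetical travel times in minutes; Q1 = 10 and Q3 = 30 found by hand.
travel_times = [5, 10, 10, 10, 10, 12, 15, 20, 20, 25, 30, 30, 40, 40, 85]
q1, q3 = 10, 30

print(iqr_fences(q1, q3))                   # cutoffs below Q1 and above Q3
print(flag_outliers(travel_times, q1, q3))  # observations beyond the cutoffs
```

Here IQR = 20, so the cutoffs are 10 - 30 = -20 and 30 + 30 = 60, and only the 85-minute trip is flagged.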
Back-to-back stemplot
Used to compare the distribution of a quantitative variable for two groups
Stem
All digits of an observation except the final one; in a back-to-back stemplot, both groups share the same stems
Leaf
The final digit of an observation
Bar graph
Used to display the distribution of a categorical variable or to compare the sizes of quantities
The horizontal axis identifies categories
Bimodal
Describes a graph of quantitative data with two clear peaks
Box plot
A graph of the five-number summary
The box spans the quartiles & shows the spread of the central half of the distribution
Lines extend from the box to the extremes and show the full spread of data
Categorical variable
Places an individual into one of several groups or categories
Conditional distribution
Describes the values of one variable among individuals who have a specific value of another variable
There is a separate conditional distribution for each value of the other variable
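A small Python sketch of a conditional distribution, assuming the two-way table is stored as a dictionary of rows. The table, its labels, and the function name are invented for illustration.

```python
# Hypothetical two-way table of counts: rows are one variable (sex),
# columns are the other (opinion about the chance of being rich).
table = {
    "female": {"good chance": 60, "50-50": 100, "no chance": 40},
    "male": {"good chance": 60, "50-50": 30, "no chance": 10},
}


def conditional_distribution(table, row):
    """Proportions of the column variable among individuals in one row."""
    counts = table[row]
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


# One conditional distribution for each value of the row variable.
for row in table:
    print(row, conditional_distribution(table, row))
```

Both groups have 60 individuals answering "good chance", but that is 30% of the females and 60% of the males, which is why the comparison uses proportions within each group rather than raw counts.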
Data analysis
A process of describing data using graphs & numerical summaries
Dotplot
A simple graph that shows each data value as a dot above its location on a number line
Distribution
Tells what values a variable takes & how often it takes these values
First Quartile
If the observations in a data set are ordered from lowest to highest, the first quartile is the median of the observations whose position is to the left of the median
The Five Number Summary
Consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest
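A short Python sketch of the five-number summary, assuming the quartile definition from these cards (the median of the observations on each side of the overall median). Software packages often use slightly different quartile rules, so results may not match a calculator exactly. The function name and data are made up.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for two or more observations."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])    # observations to the left of the median
    q3 = median(data[-half:])   # observations to the right of the median
    return min(data), q1, median(data), q3, max(data)


# Hypothetical travel times in minutes.
travel_times = [5, 10, 10, 10, 10, 12, 15, 20, 20, 25, 30, 30, 40, 40, 85]
print(five_number_summary(travel_times))
```

With 15 observations the median is the 8th value (20), and each quartile is the median of the 7 values on one side of it.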
Frequency Table
Displays the count of observations in each category or class
Histogram
Displays the distribution of a quantitative variable
The bars touch, with no gaps between adjacent classes
Individuals
Objects described by a set of data
Inference
Drawing conclusions that go beyond the data at hand
Interquartile range
IQR = Q3 - Q1
Marginal distribution
The distribution of values of one of the categorical variables in a two-way table of counts, among all individuals described by the table
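A Python sketch of a marginal distribution, assuming the two-way table is stored as a dictionary of rows. The counts and names are hypothetical.

```python
# Hypothetical two-way table of counts (rows: sex, columns: opinion).
table = {
    "female": {"good chance": 60, "50-50": 100, "no chance": 40},
    "male": {"good chance": 60, "50-50": 30, "no chance": 10},
}


def marginal_distribution(table):
    """Proportions of the column variable among all individuals."""
    totals = {}
    for counts in table.values():
        for category, count in counts.items():
            totals[category] = totals.get(category, 0) + count
    grand_total = sum(totals.values())
    return {category: count / grand_total for category, count in totals.items()}


print(marginal_distribution(table))
```

The column totals are 120, 130, and 50 out of 300 individuals, so the row variable is ignored entirely.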
Mean
The arithmetic average: the sum of the observations divided by the number of observations
Median
The midpoint of the ordered data: half the observations are smaller and half are larger
Mode
The value or class in a statistical distribution having the greatest frequency
Multimodal
Describes a graph of quantitative data with more than two clear peaks
Outlier
An individual value that falls outside the overall pattern of a distribution
Overall pattern
In any graph of data, look for the overall pattern and for striking departures from that pattern
Shape, center, and spread describe the overall pattern of the distribution of a quantitative variable
Pie chart
Shows the distribution of a categorical variable as a pie whose slices are sized by the counts or percents for the categories
Quantitative variable
Takes numerical values for which it makes sense to find an average
Range
Maximum value minus the minimum value
Relative frequency table
Shows the percents of observations in each category
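A quick Python sketch that turns raw categorical observations into a relative frequency table. The color data and the function name are invented for illustration.

```python
from collections import Counter


def relative_frequency_table(observations):
    """Map each category to the percent of observations that fall in it."""
    counts = Counter(observations)   # frequency table: count per category
    n = len(observations)
    return {category: 100 * count / n for category, count in counts.items()}


# Hypothetical favorite colors for 10 students.
colors = ["red", "blue", "red", "green", "red",
          "blue", "red", "blue", "green", "red"]
print(relative_frequency_table(colors))
```

The counts are 5 red, 3 blue, and 2 green, so the percents are 50, 30, and 20 and add to 100.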
Resistant measure
A statistic that is not affected very much by extreme observations
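A small Python illustration with made-up salaries (in thousands), comparing how the mean and the median react when one extreme observation is added.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars.
salaries = [30, 32, 35, 38, 40]
with_outlier = salaries + [400]   # add one extreme observation

print(mean(salaries), median(salaries))          # both equal 35
print(mean(with_outlier), median(with_outlier))  # mean jumps, median barely moves
```

The mean rises from 35 to about 95.8, while the median only moves from 35 to 36.5, so the median is the resistant measure here.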
Roundoff error
The difference between the calculated approximation of a number and its exact mathematical value
Simpson's paradox
An association between two variables that holds for each individual value of a third variable can be changed or even reversed when the data for all values of the third variable are combined
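A Python sketch of the reversal using invented counts: two hypothetical hospitals, with patient condition on arrival as the third variable. None of these numbers come from real data.

```python
# Hypothetical (survived, treated) counts, split by patient condition.
results = {
    "Hospital A": {"good": (9, 10), "poor": (30, 100)},
    "Hospital B": {"good": (80, 100), "poor": (2, 10)},
}


def group_rate(hospital, condition):
    """Survival rate within one value of the third variable."""
    survived, treated = results[hospital][condition]
    return survived / treated


def combined_rate(hospital):
    """Survival rate after combining all patient conditions."""
    survived = sum(s for s, _ in results[hospital].values())
    treated = sum(t for _, t in results[hospital].values())
    return survived / treated


for condition in ("good", "poor"):
    print(condition, group_rate("Hospital A", condition),
          group_rate("Hospital B", condition))
print("combined", combined_rate("Hospital A"), combined_rate("Hospital B"))
```

Hospital A does better within each condition (90% vs 80%, 30% vs 20%), yet Hospital B does better overall (82 of 110 vs 39 of 110) because most of its patients arrive in good condition.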
Skewed to the right
The right side of the graph (the side with larger values) is much longer than the left side
Skewed to the left
The left side of the graph (the side with smaller values) is much longer than the right side
Splitting stems
A method for spreading out a stemplot that has too few stems; each stem is listed twice, with leaves 0 to 4 on the first and 5 to 9 on the second
Stemplot
A simple graphical display for a fairly small data set that gives a quick picture of the shape of a distribution while including the actual numerical values
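A rough Python sketch that builds a stemplot as text, assuming non-negative whole-number data where the final digit is the leaf and the remaining digits are the stem. The function name and the data are hypothetical.

```python
from collections import defaultdict


def stemplot(values):
    """Return one 'stem | leaves' line per stem, smallest stem first."""
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)   # leaf is the final digit
        leaves[stem].append(leaf)
    lines = []
    # Include every stem in the range, even those with no leaves,
    # so gaps in the distribution stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem} | {digits}".rstrip())
    return lines


# Hypothetical test scores.
scores = [12, 15, 21, 23, 23, 28, 40, 41]
print("\n".join(stemplot(scores)))
```

Each original value can be read back from the plot: stem 2 with leaves 1338 stands for 21, 23, 23, and 28.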
Symmetry
If the right & left sides of a graph are mirror images of each other
Third Quartile
If the observations in a data set are ordered from lowest to highest, the third quartile is the median of the observations whose position is to the right of the median
Unimodal
Describes a graph of quantitative data with a single peak
Variables
Any characteristic of an individual; can take different values for different individuals
Variance
The average squared distance of the observations in a data set from their mean
Standard deviation
The square root of the variance; measures the typical distance of the observations from the mean
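A Python sketch of both calculations. It assumes the sample version of the variance, which divides the sum of squared distances by n - 1 rather than n; the cards do not say which divisor is intended. Function names and data are made up.

```python
from math import sqrt
from statistics import mean


def sample_variance(values):
    """Sum of squared distances from the mean, divided by n - 1 (assumed)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


def sample_standard_deviation(values):
    """Square root of the sample variance."""
    return sqrt(sample_variance(values))


# Hypothetical data: number of pets owned by 5 students.
pets = [3, 3, 5, 7, 7]
print(sample_variance(pets))            # 16 / 4
print(sample_standard_deviation(pets))  # square root of the variance
```

The mean is 5, the squared distances are 4, 4, 0, 4, 4, so the variance is 16 / 4 = 4 and the standard deviation is 2.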
SOCS
Shape
Outliers
Center
Spread
Shape
Symmetric or skewed
Center
Mean
Median
Spread
Standard deviation
Range
IQR
Rule for outliers
Below Q1 - 1.5(IQR)
Above Q3 + 1.5(IQR)
Box plot
A central box is drawn from Q1 to Q3
Draw a number line
Mark the median inside the box
Lines extend from the box to the min and max
Stop each line at the last observation that is not an outlier, then mark each outlier with a dot
Mean & standard deviation
Symmetric
Median & IQR
Skewed
When mean & median are close
The distribution is likely to be roughly symmetric
Use bar graph for
Categorical data
Use histogram for
Quantitative data