CHAPTER 1 TERMS Flashcards

1
Q

The 1.5 × IQR Rule for Outliers

A

Call an observation an outlier if it falls more than 1.5 × IQR above the third quartile or below the first quartile.
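The rule can be sketched in a few lines of Python. This is only an illustration: the function names and the sample data are made up, the quartiles are taken as the medians of the lower and upper halves of the ordered data, and at least two observations are assumed.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]     # observations to the left of the median
    upper = data[-half:]    # observations to the right of the median
    return median(lower), median(upper)

def outliers(values):
    """Return the observations flagged by the 1.5 x IQR rule."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < low or x > high]

# Hypothetical data: Q1 = 7.5, Q3 = 12.5, IQR = 5, so the cutoffs are 0 and 20.
data = [5, 7, 8, 9, 10, 11, 12, 13, 40]
print(outliers(data))  # prints [40]
```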

2
Q

Association

A

Occurs between two variables if specific values of one variable tend to occur in common with specific values of the other.

3
Q

Back-to-back stemplot (also called a back-to-back stem-and-leaf plot)

A

Used to compare the distribution of a quantitative variable for two groups. Each observation in both groups is separated into a stem, consisting of all but the final digit, and a leaf, the final digit. The stems are arranged in a vertical column with the smallest at the top. The values from one group are plotted on the left side of the stem and the values from the other group are plotted on the right side of the stem. Each leaf is written in the row next to its stem, with the leaves arranged in increasing order out from the stem.

4
Q

Bar graph

A

Used to display the distribution of a categorical variable or to compare the sizes of different quantities. The horizontal axis of a bar graph identifies the categories or quantities being compared. Drawn with blank spaces between the bars to separate the items being compared.

5
Q

Bimodal

A

Describes a graph of quantitative data with two clear peaks.

6
Q

Boxplot

A

A graph of the five-number summary. The box spans the quartiles and shows the spread of the central half of the distribution. The median is marked within the box. Lines extend from the box to the extremes and show the full spread of the data.

7
Q

Categorical Variable

A

Places an individual into one of several groups or categories.

8
Q

Conditional distribution

A

Describes the values of one variable among individuals who have a specific value of another variable. There is a separate conditional distribution for each value of the other variable.
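As a rough illustration, the sketch below computes one conditional distribution per group from a small two-way table. The table, its labels, and the function name are hypothetical and exist only for the example.

```python
# Hypothetical two-way table of counts: rows are groups, columns are responses.
counts = {
    "group A": {"yes": 30, "no": 20},
    "group B": {"yes": 10, "no": 40},
}

def conditional_distribution(table, group):
    """Percent of each response among the individuals in one group."""
    row = table[group]
    total = sum(row.values())
    return {response: 100 * n / total for response, n in row.items()}

# One conditional distribution for each value of the grouping variable.
for group in counts:
    print(group, conditional_distribution(counts, group))
```

Each group's percents add to 100 because they are computed from that group's own total, not from the grand total.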

9
Q

Data analysis

A

A process of describing data using graphs and numerical summaries.

10
Q

Dotplot

A

A simple graph that shows each data value as a dot above its location on a number line.

11
Q

Distribution

A

Tells what values a variable takes and how often it takes these values.

12
Q

First quartile Q1

A

If the observations in a data set are ordered from lowest to highest, the first quartile Q1 is the median of the observations whose position is to the left of the median.

13
Q

The Five-Number Summary

A

Consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest. In symbols, the five-number summary is: Minimum, Q1, M, Q3, Maximum.

14
Q

Frequency table

A

Displays the count (frequency) of observations in each category or class.

15
Q

Histogram

A

Displays the distribution of a quantitative variable. The horizontal axis is marked in the units of measurement for the variable. The vertical axis contains the scale of counts or percents. Each bar in the graph represents an equal-width class. The base of the bar covers the class, and the bar height is the class frequency or relative frequency.

16
Q

Individuals

A

Objects described by a set of data. Individuals may be people, animals, or things.

17
Q

Inference

A

Drawing conclusions that go beyond the data at hand.

18
Q

Interquartile range

A

The distance between the first and third quartiles: IQR = Q3 − Q1.

19
Q

Marginal distribution

A

The marginal distribution of one of the categorical variables in a two-way table of counts is the distribution of values of that variable among all individuals described by the table.
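A minimal sketch of the computation, using a made-up two-way table: the marginal distribution of the column variable comes from the column totals divided by the grand total. The table and the function name are assumptions for illustration only.

```python
# Hypothetical two-way table of counts: rows are groups, columns are responses.
counts = {
    "group A": {"yes": 30, "no": 20},
    "group B": {"yes": 10, "no": 40},
}

def marginal_distribution(table):
    """Percent of all individuals in the table giving each response."""
    totals = {}
    for row in table.values():
        for response, n in row.items():
            totals[response] = totals.get(response, 0) + n
    grand_total = sum(totals.values())
    return {response: 100 * n / grand_total for response, n in totals.items()}

print(marginal_distribution(counts))  # prints {'yes': 40.0, 'no': 60.0}
```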

20
Q

Mean

A

The arithmetic average. To find the mean x̄ of a set of observations, add their values and divide by the number of observations.

21
Q

Median M

A

The midpoint of a distribution, the number such that half the observations are smaller and the other half are larger. To find the median of a distribution:
1. Arrange all observations in order of size, from smallest to largest.
2. If the number of observations n is odd, the median M is the center observation in the ordered list.
3. If the number of observations n is even, the median M is the average of the two center observations in the ordered list.
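The three steps translate almost directly into code. The function below is a sketch written for this card, not a library routine, and it assumes a non-empty list of numbers.

```python
def median(observations):
    """Median of a non-empty list of numbers, following the three steps."""
    ordered = sorted(observations)       # step 1: arrange in order of size
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                       # step 2: odd n, take the center value
        return ordered[mid]
    # step 3: even n, average the two center values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 1, 2]))     # prints 2
print(median([4, 1, 3, 2]))  # prints 2.5
```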

22
Q

Mode

A

The value or class in a statistical distribution having the greatest frequency.

23
Q

Multimodal

A

Describes a graph of quantitative data with more than two clear peaks.

24
Q

Outlier

A

An individual value that falls outside the overall pattern of a distribution.

25
Q

Overall pattern

A

In any graph of data, look for the overall pattern and for striking departures from that pattern. Shape, center, and spread describe the overall pattern of the distribution of a quantitative variable.

26
Q

Pie chart

A

Shows the distribution of a categorical variable as a “pie” whose slices are sized by the counts or percents for the categories. A pie chart must include all the categories that make up a whole.

27
Q

Quantitative Variable

A

Takes numerical values for which it makes sense to find an average.

28
Q

Range

A

The range of a set of quantitative data is the maximum value minus the minimum value.

29
Q

Relative frequency table

A

Shows the percents (relative frequencies) of observations in each category or class.
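For example, a relative frequency table can be built by counting each category and dividing by the number of observations. The data and the function name here are invented for the illustration.

```python
from collections import Counter

def relative_frequency_table(observations):
    """Map each category to its percent of all observations."""
    counts = Counter(observations)       # the frequency table
    n = len(observations)
    return {category: 100 * count / n for category, count in counts.items()}

# Hypothetical categorical data.
colors = ["red", "blue", "red", "green", "red", "blue", "red", "blue"]
print(relative_frequency_table(colors))
# prints {'red': 50.0, 'blue': 37.5, 'green': 12.5}
```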

30
Q

Resistant measure

A

A statistic that is not affected very much by extreme observations.

31
Q

Roundoff error

A

The difference between the calculated approximation of a number and its exact mathematical value.

32
Q

Segmented bar graph

A

Used to compare the distribution of a categorical variable in each of several groups. For each group, there is a single bar with “segments” that correspond to the different values of the categorical variable. The height of each segment is determined by the percent of individuals in the group with that value. Each bar has a total height of 100%.

33
Q

Side-by-side bar graph

A

Used to compare the distribution of a categorical variable in each of several groups. For each value of the categorical variable, there is a bar corresponding to each group. The height of each bar is determined by the count or percent of individuals in the group with that value.

34
Q

Simpson’s paradox

A

An association between two variables that holds for each individual value of a third variable can be changed or even reversed when the data for all values of the third variable are combined.
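A small numerical sketch may make the reversal concrete. The counts below are made up for illustration and do not come from any real study: hypothetical treatment A has the higher success rate within each group of cases, yet treatment B has the higher rate once the groups are combined, because the two treatments were applied to very different mixes of cases.

```python
# Made-up (successes, trials) counts for two hypothetical treatments.
results = {
    "mild cases":   {"A": (9, 10),   "B": (80, 100)},
    "severe cases": {"A": (50, 100), "B": (4, 10)},
}

def rate(successes, trials):
    """Success rate as a proportion."""
    return successes / trials

def combined_rate(treatment):
    """Success rate for one treatment after pooling both groups."""
    successes = sum(group[treatment][0] for group in results.values())
    trials = sum(group[treatment][1] for group in results.values())
    return successes / trials

for name, group in results.items():
    print(name, rate(*group["A"]), rate(*group["B"]))   # A is higher in each group
print("combined", combined_rate("A"), combined_rate("B"))  # B is higher overall
```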

35
Q

Skewness

A

A distribution is skewed to the right if the right side of the graph (containing the half of the observations with larger values) is much longer than the left side. It is skewed to the left if the left side of the graph is much longer than the right side.

36
Q

Splitting stems

A

A method for spreading out a stemplot that has too few stems.

37
Q

Standard deviation s_x

A

Measures the average distance of the observations from their mean. It is calculated by finding an average of the squared distances and then taking the square root.

38
Q

Stemplot (also called a stem-and-leaf plot)

A

A simple graphical display for fairly small data sets that gives a quick picture of the shape of a distribution while including the actual numerical values in the graph. Each observation is separated into a stem, consisting of all but the final digit, and a leaf, the final digit. The stems are arranged in a vertical column with the smallest at the top. Each leaf is written in the row to the right of its stem, with the leaves arranged in increasing order out from the stem.
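The construction can be sketched as follows for non-negative whole numbers. The function name and data are hypothetical, and the sketch assumes that stems with no leaves are still listed so that gaps in the distribution stay visible.

```python
from collections import defaultdict

def stemplot(values):
    """Return the rows of a stemplot for non-negative whole numbers."""
    leaves = defaultdict(list)
    for value in sorted(values):          # sorting keeps leaves in increasing order
        stem, leaf = divmod(value, 10)    # stem: all but the final digit; leaf: final digit
        leaves[stem].append(leaf)
    rows = []
    for stem in range(min(leaves), max(leaves) + 1):   # smallest stem at the top
        row = f"{stem:>2} | " + " ".join(str(leaf) for leaf in leaves[stem])
        rows.append(row.rstrip())
    return rows

# Hypothetical data set.
for row in stemplot([12, 15, 21, 21, 28, 40, 43]):
    print(row)
```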

39
Q

Symmetry

A

A distribution is roughly symmetric if the right and left sides of its graph are approximately mirror images of each other.

40
Q

Third quartile Q3

A

If the observations in a data set are ordered from lowest to highest, the third quartile Q3 is the median of the observations whose position is to the right of the median.

41
Q

Two-way table

A

A two-way table of counts organizes data about two categorical variables.

42
Q

Unimodal

A

Describes a graph of quantitative data with a single peak.

43
Q

Variables

A

Any characteristic of an individual. A variable can take different values for different individuals.

44
Q

Variance s_x^2

A

The average squared distance of the observations in a data set from their mean.
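The variance and the standard deviation can be computed together, as in the sketch below. It assumes the usual sample convention in which the "average" of the squared distances divides by n − 1 rather than n; the card itself does not state the divisor, and the function names and data are made up for the example.

```python
from math import sqrt

def variance(observations):
    """Sample variance: sum of squared distances from the mean over n - 1 (assumed divisor)."""
    n = len(observations)
    mean = sum(observations) / n
    return sum((x - mean) ** 2 for x in observations) / (n - 1)

def standard_deviation(observations):
    """Square root of the sample variance."""
    return sqrt(variance(observations))

# Hypothetical data: mean 5, squared distances sum to 32, so the variance is 32/7.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data), standard_deviation(data))
```

At least two observations are needed, since n − 1 would otherwise be zero.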