Unit 1 - Topics 6 and 7 Flashcards

1
Q

Distribution

A

The distribution of a quantitative variable slices up all the possible values of the variable into equal-width bins and gives the number of values (or counts) falling into each bin.

2
Q

Histogram (relative frequency histogram)

A

A histogram uses adjacent bars to show the distribution of a quantitative variable. Each bar represents the frequency (or relative frequency) of values falling in each bin.
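As an illustration of the binning behind a histogram, here is a minimal Python sketch that counts values into equal-width bins and converts the counts to relative frequencies. The function name `bin_counts`, its parameters, and the data values are made up for this example.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall into each equal-width bin.

    Bin k covers [start + k*width, start + (k+1)*width).
    Values outside the covered range are ignored in this sketch.
    """
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // width)  # index of the bin containing v
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts


data = [1, 2, 2, 3, 5, 7, 8, 8, 9]  # made-up values
counts = bin_counts(data, start=0, width=5, n_bins=2)

# Relative frequency = count in the bin divided by the total count.
rel_freq = [c / len(data) for c in counts]
```

Here the bins are 0 to under 5 and 5 to under 10. A frequency histogram would draw bars with heights from `counts`; a relative frequency histogram would use `rel_freq` instead.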

3
Q

Gap

A

A region of the distribution where there are no values.

4
Q

Stem-and-Leaf Display

A

A stem-and-leaf display shows quantitative data values in a way that sketches the distribution of the data. It’s best described in detail by example.
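One possible example, as a small Python sketch: it assumes non-negative whole numbers, uses the tens as stems and the ones digits as leaves, and prints every stem between the smallest and largest so gaps stay visible. The function name `stem_and_leaf` and the data are made up for illustration.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return display lines with tens as stems and ones digits as leaves.

    Assumes a non-empty list of non-negative integers.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lo, hi = min(leaves), max(leaves)
    lines = []
    for stem in range(lo, hi + 1):
        digits = "".join(str(d) for d in leaves[stem])
        lines.append(f"{stem} | {digits}".rstrip())
    return lines


data = [12, 15, 21, 23, 23, 40]  # made-up values
display = stem_and_leaf(data)
for line in display:
    print(line)
```

The line for stem 3 has no leaves, which shows the gap between the values in the 20s and the value 40.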

5
Q

Dot plot

A

A dot plot graphs a dot for each case against a single axis.

6
Q

Shape

A

To describe the shape of a distribution, look for the following:
-single vs. multiple modes
-symmetry vs. skewness
-outliers and gaps

7
Q

Center

A

The place in a distribution of a variable that you’d point to if you wanted to attempt the impossible by summarizing the entire distribution with a single number. Measures of the center include the mean and median.

8
Q

Spread

A

A numerical summary of how tightly the values are clustered around the center. Measures of spread include the IQR and standard deviation.

9
Q

Mode

A

A hump or local high point in the shape of the distribution of a variable. The apparent location of modes can change when the scale of a histogram is changed.

10
Q

Unimodal (Bimodal)

A

Having one mode. This is a useful term when describing the shape of a histogram when it’s generally mound-shaped. Distributions with two modes are called bimodal. Those with more than two are multimodal.

11
Q

Uniform

A

A distribution that’s roughly flat is said to be uniform.

12
Q

Symmetric

A

A distribution is symmetric if the two halves on either side of the center look approximately like mirror images of each other.

13
Q

Tails

A

The tails of a distribution are the parts that typically trail off on either side. Distributions can be characterized as having long tails (if they straggle off for some distance) or short tails (if they don’t).

14
Q

Skewed

A

A distribution is skewed if it’s not symmetric and one tail stretches out farther than the other. Distributions are said to be skewed left when the longer tail stretches to the left and skewed right when it goes to the right.

15
Q

Outliers

A

Outliers are extreme values that don’t appear to belong with the rest of the data. They may be unusual values that deserve further investigation, or they may just be mistakes; there’s no obvious way to tell. Don’t delete outliers automatically; you have to think about them. Outliers can affect many statistical analyses, so you should always be alert to them.

16
Q

Median

A

The median is the middle value, with half of the data above and half below it. If n is even, it is the average of the two middle values. It is usually paired with the IQR.
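The rule for odd and even n can be sketched in Python as follows. The function name `median_of` and the data values are made up for this example; the list is assumed to be non-empty.

```python
def median_of(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_case = median_of([3, 1, 7])      # sorted: 1, 3, 7
even_case = median_of([3, 1, 7, 5])  # sorted: 1, 3, 5, 7
```

With three values the median is the single middle value, 3. With four values it is the average of 3 and 5.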

17
Q

Range

A

The difference between the lowest and highest values in a data set. Range = max - min.

18
Q

Quartile

A

The lower quartile (Q1) is the value with a quarter of the data below it. The upper quartile (Q3) has three-quarters of the data below it. The median and quartiles divide data into four parts with equal numbers of data values.

19
Q

Interquartile Range (IQR)

A

The IQR is the difference between the first and third quartiles. IQR = Q3 - Q1. It is usually reported along with the median.

20
Q

Percentile

A

The ith percentile is the value that falls above i% of the data.
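Read the other way around, this definition says a value sits at the ith percentile when i% of the data lie below it. A minimal Python sketch of that check follows; the function name `percent_below` and the data are made up, and other percentile conventions exist.

```python
def percent_below(values, x):
    """Percentage of the data values that are strictly less than x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


data = [10, 20, 30, 40, 50]  # made-up values
rank = percent_below(data, 40)  # 3 of the 5 values lie below 40
```

In this made-up data set, 40 falls above 60% of the values, so under this definition it is at the 60th percentile.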

21
Q

5-Number Summary

A

The 5-number summary of a distribution reports the minimum value, Q1, the median, Q3, and the maximum value.
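A minimal Python sketch of the 5-number summary is shown here. It assumes one common convention for quartiles: Q1 is the median of the lower half of the sorted data and Q3 is the median of the upper half, leaving out the middle value when n is odd. Other conventions can give slightly different quartiles. The function name and data are made up for illustration.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves.

    Assumes at least two data values.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # values below the median position
    upper = ordered[half + n % 2:]  # skips the middle value when n is odd
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


data = [1, 3, 5, 7, 9, 11, 13, 15]  # made-up values
minimum, q1, med, q3, maximum = five_number_summary(data)

iqr = q3 - q1                   # interquartile range
data_range = maximum - minimum  # range
```

The IQR and the range both come straight from the summary: one subtracts the quartiles, the other subtracts the extremes.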

22
Q

Mean

A

The mean is found by summing all the data values and dividing by the count. It is usually paired with the standard deviation.

23
Q

Resistant

A

A calculated summary is said to be resistant if outliers have only a small effect on it.
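To illustrate, this short Python sketch compares the mean and the median on made-up data before and after one value is replaced by an outlier.

```python
from statistics import mean, median

data = [10, 12, 13, 15, 16]           # made-up values
with_outlier = [10, 12, 13, 15, 100]  # same data with 16 replaced by 100

mean_before, mean_after = mean(data), mean(with_outlier)
median_before, median_after = median(data), median(with_outlier)

print(mean_before, mean_after)      # the mean shifts a lot
print(median_before, median_after)  # the median does not move
```

In this example the median is resistant and the mean is not: a single extreme value pulls the mean from about 13 to 30 while the median stays at 13.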

24
Q

Variance

A

The variance is the sum of squared deviations from the mean divided by the count minus 1.

25
Q

Standard Deviation

A

The standard deviation is the square root of the variance. It is usually reported with the mean.
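The variance and standard deviation definitions can be sketched in Python as follows. The function names and the data are made up for this example, and at least two data values are assumed so that the count minus 1 is not zero.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by (count - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_sd(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5
var = sample_variance(data)      # squared deviations sum to 32, so 32 / 7
sd = sample_sd(data)
```

Python's built-in `statistics.variance` and `statistics.stdev` use the same count-minus-1 divisor, so they can be used to check results like these.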