Chapter 1 Flashcards

Question

Number of Peaks in a Histogram

Answer 1

The larger the number of class intervals, the more likely it is that bimodality or multimodality will manifest itself

Answer 2

A unimodal histogram whose left (lower) tail is stretched out more than its right (upper) tail

Answer 3

A smoothed histogram

Answer 4

* The number of observations in a sample
* Notation: *n*
* If two samples are simultaneously under consideration, either *m* and *n*, or *n*₁ and *n*₂

Answer 5

* The most familiar and useful measure of the center is the mean, or arithmetic average, of the set
  - The only point at which a fulcrum can be placed to balance the system of weights is the point corresponding to the value of x-bar.
* Notation: Sample Mean = x-bar
* Notation: Population Mean = µ
* µ = (sum of the *N* population values)/*N*

Answer 6

- Outliers can greatly affect the mean and make it an inappropriate measure of the center
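
To make the effect concrete, here is a minimal sketch using a hypothetical sample (the values and the helper name `sample_mean` are invented for illustration). Adding a single extreme observation pulls the mean well away from where most of the data sit.

```python
def sample_mean(xs):
    """Arithmetic average: (sum of the observations) / n."""
    return sum(xs) / len(xs)

# Hypothetical observations, chosen only for illustration.
typical = [21, 23, 24, 26, 26]
with_outlier = typical + [120]  # the same sample plus one extreme value

print(sample_mean(typical))       # 24.0
print(sample_mean(with_outlier))  # 40.0
```

With the outlier included, the mean (40.0) is larger than every observation except the outlier itself, so it no longer describes a typical value.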

Answer 7

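* Obtained by first ordering the *n* observations from smallest to largest (with any repeated values included so that every sample observation appears in the ordered list), then taking the single middle value if *n* is odd or the average of the two middle values if *n* is even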
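
A minimal sketch of the ordering step, followed by the standard middle-value rule for the sample median. The function name `sample_median` and the sample values are hypothetical.

```python
def sample_median(xs):
    """Order the observations, then pick the middle of the ordered list."""
    ordered = sorted(xs)  # repeated values are kept, so every observation appears
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average of the two middle values

print(sample_median([5, 1, 3]))     # 3
print(sample_median([2, 9, 2, 7]))  # 4.5
```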

Answer 8

* Quartiles divide the data set into 4 equal parts, with the observations above the **3rd quartile** constituting the upper quarter of the data set, the **2nd quartile** being identical to the median, and the **1st quartile** separating the lower quarter from the upper three quarters

Answer 9

* Similarly, a data set (sample or population) can be even more finely divided using percentiles; the 99th percentile separates the highest 1% from the bottom 99%

Answer 10

* A compromise between the sample mean and the sample median
* A 10% **trimmed mean**, for example, would be computed by eliminating the smallest 10% and the largest 10% of the sample and then averaging what remains
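
The trimming procedure can be sketched as follows. This is an illustration under stated assumptions: the function name `trimmed_mean` and the data are hypothetical, and the sketch assumes the trimming percentage corresponds to a whole number of observations at each end (it rounds down otherwise).

```python
def trimmed_mean(xs, trim_percent):
    """Drop the smallest and largest trim_percent% of the sample, then average the rest."""
    ordered = sorted(xs)
    n = len(ordered)
    k = n * trim_percent // 100  # observations removed from EACH end (rounded down)
    if 2 * k >= n:
        raise ValueError("trimming would remove every observation")
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)

# Hypothetical sample with one very large value.
sample = [2, 4, 5, 6, 7, 8, 9, 10, 13, 98]

print(sum(sample) / len(sample))  # 16.2  (ordinary mean)
print(trimmed_mean(sample, 10))   # 7.75  (2 and 98 removed before averaging)
```

For this sample the median is 7.5, so the 10% trimmed mean falls between the median and the mean.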

Answer 11

* A measure of variability
* Computed by finding the difference between the largest and smallest sample values
* Defect: the **range** depends only on the two most extreme observations and disregards the positions of the remaining *n* - 2 values

Answer 12

* A deviation will be positive if the observation is larger than the mean (to the right of the mean on the measurement axis) and negative if the observation is smaller than the mean
* If all the deviations are small in magnitude, then all the xᵢ's are close to the mean and there is little variability.
* Alternatively, if some of the deviations are large in magnitude, then some xᵢ's lie far from the mean, suggesting a greater amount of variability.

Answer 13

* The average of the deviations from the sample mean is always zero, because the positive and negative deviations cancel exactly
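
A short derivation of why this holds, writing x̄ for the sample mean so that the sum of the observations equals n·x̄:

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0
\quad\Longrightarrow\quad
\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x}) = 0
```

This is why the deviations are squared (or otherwise made nonnegative) before being averaged into a measure of variability.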

Answer 14

* denoted by *s*²

Answer 15

* Denoted by *s*
* Note that *s*² and *s* are both nonnegative. The unit for *s* is the same as the unit for each of the xᵢ's.

Answer 16

* Denoted by σ²
* For the population, the divisor is *N* and not *N* - 1
* Note that σ² involves squared deviations about the population mean µ.
* If we actually knew the value of µ, then we could define the sample variance as the average squared deviation of the sample xᵢ's about µ.
* However, the value of µ is almost never known, so the sum of squared deviations about x-bar must be used.
* But the xᵢ's tend to be closer to their own average x-bar than to the population average µ, so to compensate for this the divisor *n* - 1 is used rather than *n*
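
The sketch below illustrates the two divisors and the reason for the compensation. The sample values and the value used for µ are hypothetical; `variance` and `pvariance` come from Python's standard `statistics` module and are used only as a cross-check.

```python
from math import isclose
from statistics import pvariance, variance

def sum_sq_dev(xs, center):
    """Sum of squared deviations of the observations about a given center."""
    return sum((x - center) ** 2 for x in xs)

sample = [4.0, 7.0, 9.0, 12.0]   # hypothetical observations
n = len(sample)
x_bar = sum(sample) / n          # 8.0
mu = 9.0                         # hypothetical "true" mean, normally unknown

ss_about_xbar = sum_sq_dev(sample, x_bar)  # 34.0
ss_about_mu = sum_sq_dev(sample, mu)       # 38.0, never smaller than the sum about x_bar

s_squared = ss_about_xbar / (n - 1)  # divisor n - 1 (sample variance)
divide_by_n = ss_about_xbar / n      # divisor n (what a population formula would give)

print(s_squared, divide_by_n)
```

The sum of squared deviations about x-bar is never larger than the sum about any other value, including µ, so dividing it by *n* would tend to understate the variability. The smaller divisor offsets that.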

Answer 17

* Denoted by σ

Answer 18

* The **five number summary** consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest
  1. Minimum
  2. Q₁
  3. M
  4. Q₃
  5. Maximum

Answer 19

1. Arrange the observations in increasing order and locate the 50th percentile, or the median M, in the ordered list of observations
2. The 25th percentile, or the first quartile Q₁, is the median of the observations whose position in the ordered list is to the left of the overall median
3. The 75th percentile, or the third quartile Q₃, is the median of the observations whose position in the ordered list is to the right of the overall median
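
The three steps above can be sketched as follows. The function names and data are hypothetical, and the sketch assumes that when *n* is odd the overall median itself belongs to neither half. Other quartile conventions exist and can give slightly different values.

```python
def median_of_ordered(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(xs):
    """Return (Q1, M, Q3) using the median-of-halves rule; needs at least 2 observations."""
    ordered = sorted(xs)                      # step 1: arrange in increasing order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                    # positions to the left of the median
    upper = ordered[half + 1:] if n % 2 else ordered[half:]  # positions to the right
    return median_of_ordered(lower), median_of_ordered(ordered), median_of_ordered(upper)

def five_number_summary(xs):
    """Minimum, Q1, M, Q3, Maximum, in that order."""
    q1, m, q3 = quartiles(xs)
    return min(xs), q1, m, q3, max(xs)

print(five_number_summary([9, 1, 6, 12, 3, 8, 4]))  # (1, 3, 6, 9, 12)
```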

Answer 20

* A box-plot is a graph of the five number summary
* Box-plots are most useful for side-by-side comparison of several distributions
* Describes **location** and **spread** of a sample
* Procedure:
  1. Draw a rectangle with lower and upper edges at the 25th percentile, or 1st quartile, and the 75th percentile, or 3rd quartile
  2. Draw a horizontal line across the rectangle at the median
  3. Extend vertical lines, or whiskers, from the middle of the upper and lower edges of the rectangle to the minimum and maximum values