Chapter 2 - Descriptive Statistics Flashcards

Question 1

Q

John Tukey

Answer

A

1915 - 2000
pioneered exploratory data analysis (EDA), including boxplots and stem-and-leaf plots
credited with coining terms such as bit and software

Question 2

Q

Features of a good numeric or graphic form of data summarization

Answer

A

self-contained
understandable without reading the text
clearly labeled, with attributes described in well-defined terms
indicates the principal trends in the data

Question 3

Q

Measures of location

Answer

A

also known as measures of central tendency
data summarization is important before any inferences can be made
a measure of location summarizes the data by identifying the center or middle of the sample

Question 4

Q

Arithmetic mean limitation

Answer

A

oversensitive to extreme values

- in which case, it may not be representative of the location of the majority of sample points
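
A minimal sketch of this sensitivity, using a made-up sample (the values are hypothetical, chosen only for illustration): adding one extreme value moves the mean far from the bulk of the data, while the median barely changes.

```python
from statistics import mean, median

# Hypothetical sample: five typical values, then the same sample
# with one extreme value added.
typical = [2, 3, 3, 4, 5]
with_extreme = typical + [60]

mean_typical = mean(typical)
mean_extreme = mean(with_extreme)
median_typical = median(typical)
median_extreme = median(with_extreme)

print("mean:  ", mean_typical, "->", round(mean_extreme, 2))
print("median:", median_typical, "->", median_extreme)
```

Here five of the six points are at most 5, yet the mean of the second sample is above 12, so it no longer represents where most of the points lie.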

Question 5

Q

Symmetric distribution

Answer

A

arithmetic mean is approximately the same as the median

Question 6

Q

Positively skewed distribution

Answer

A

tail end is on the right side

- arithmetic mean tends to be larger than the median

Question 7

Q

Negatively skewed distribution

Answer

A

tail end is on the left side

- arithmetic mean tends to be smaller than the median
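
Both skewed cases can be checked on a small example. The sketch below uses a hypothetical right-skewed sample and mirrors it to obtain a left-skewed one; the data are invented for illustration.

```python
from statistics import mean, median

# Hypothetical sample with a long right tail (positively skewed).
right_skewed = [1, 2, 2, 3, 3, 4, 12]

# Mirroring the values around a constant flips the tail to the left.
left_skewed = [13 - x for x in right_skewed]

for name, sample in [("right", right_skewed), ("left", left_skewed)]:
    print(name, "mean =", round(mean(sample), 2), "median =", median(sample))
```

In the right-skewed sample the mean exceeds the median; in the mirrored sample the mean falls below it.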

Question 8

Q

Mode

Answer

A

the most frequently occurring value among all the observations in a sample
data distributions may have one or more modes (unimodal, bimodal, trimodal, etc.)
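
As an illustration, Python's standard library function `statistics.multimode` (Python 3.8+) returns every most-frequent value, so it separates unimodal from bimodal samples. The samples below are hypothetical.

```python
from statistics import multimode

# Hypothetical samples.
unimodal = [1, 2, 2, 3]
bimodal = [1, 2, 2, 3, 3, 4]

modes_unimodal = multimode(unimodal)   # one most-frequent value
modes_bimodal = multimode(bimodal)     # two values tie for most frequent

print(modes_unimodal, modes_bimodal)
```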

Question 9

Q

Range

Answer

A

the difference between the largest and smallest observations in a sample
range is very sensitive to extreme observations or outliers
the larger the sample size n, the larger the range tends to be, which makes it difficult to compare ranges from data sets of different sizes
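
A rough sketch of the sample-size effect, assuming simulated data (the normal distribution, its parameters, and the seed are arbitrary choices for illustration): the range of a sample can only stay the same or grow as more observations are added.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical measurements drawn from a normal distribution.
draws = [random.gauss(100, 15) for _ in range(1000)]

def sample_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)

range_small = sample_range(draws[:10])   # first 10 observations
range_large = sample_range(draws)        # all 1000 observations

print(round(range_small, 1), round(range_large, 1))
```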

Question 10

Q

Quantiles or percentiles

Answer

A

a better approach than range to quantifying the spread in data sets is percentiles or quantiles
percentiles are less sensitive to outliers and are not greatly affected by the sample size
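
A small sketch of this robustness, using `statistics.quantiles` from the Python standard library with its default settings (other quartile conventions exist and can give slightly different cut points). The data are hypothetical: the second sample replaces the largest value with an outlier.

```python
from statistics import quantiles

# Hypothetical samples: identical except for the largest value.
clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]

quartiles_clean = quantiles(clean, n=4)
quartiles_outlier = quantiles(with_outlier, n=4)

range_clean = max(clean) - min(clean)
range_outlier = max(with_outlier) - min(with_outlier)

print("quartiles:", quartiles_clean, quartiles_outlier)
print("range:    ", range_clean, range_outlier)
```

The quartiles are unchanged by the outlier, while the range jumps from 8 to 899.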

Question 11

Q

Standard deviation

Answer

A

standard deviation is a reasonable measure of spread if the distribution is bell-shaped
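
For reference, a minimal sketch of how the sample standard deviation is usually computed, assuming the common n - 1 divisor (the card itself does not state the formula). The data are hypothetical; the result is checked against `statistics.stdev`.

```python
from math import isclose, sqrt
from statistics import mean, stdev

# Hypothetical sample.
data = [2, 4, 4, 4, 5, 5, 7, 9]

x_bar = mean(data)
# Sum of squared deviations from the mean, divided by n - 1.
sample_variance = sum((x - x_bar) ** 2 for x in data) / (len(data) - 1)
manual_sd = sqrt(sample_variance)

print(round(manual_sd, 3), round(stdev(data), 3))
```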

Question 12

Q

Grouped data

Answer

A

when sample size is too large to display all the raw data, data are frequently collected in grouped form
the simplest way to display the data is to generate a frequency distribution using a statistical package

Question 13

Q

Frequency distribution

Answer

A

frequency distribution = ordered display of each value in a data set together with its frequency
if the number of unique sample values is large, then a frequency distribution may still be too detailed
when there are too many unique values, the data are categorized into broader groups
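
A sketch of both ideas using `collections.Counter`; the ages and the 10-year group width are hypothetical choices for illustration.

```python
from collections import Counter

# Hypothetical ages.
ages = [38, 25, 21, 38, 47, 33, 25, 38]

# Frequency distribution: each distinct value, in order, with its count.
frequency = sorted(Counter(ages).items())

# Broader groups: each age is mapped to the start of its decade.
grouped = sorted(Counter((age // 10) * 10 for age in ages).items())

for value, count in frequency:
    print(value, count)
for start, count in grouped:
    print(f"{start}-{start + 9}", count)
```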

Question 14

Q

Graphic methods for displaying data

Answer

A

bar graphs
stem-and-leaf plots
box-and-whisker plots
scatter plots
histograms

Question 15

Q

Bar graphs

Answer

A

identity of the sample points within the respective groups is lost

Question 16

Q

Stem and leaf plots

Answer

A

easy to compute the median and other quantiles
each data point is converted into stem and leaf
the collection of leaves indicates the shape of the data distribution
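
A minimal sketch of the conversion, assuming non-negative integers with the last digit as the leaf and the remaining digits as the stem (the function name and data are hypothetical).

```python
from collections import defaultdict
from statistics import median

def stem_and_leaf(values):
    """Map each stem to its sorted leaves.

    Assumes non-negative integers: the leaf is the units digit and
    the stem is everything to its left.
    """
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical sample.
data = [23, 41, 12, 28, 34, 15, 23, 37, 21]
plot = stem_and_leaf(data)

for stem in sorted(plot):
    print(f"{stem} | {''.join(str(leaf) for leaf in plot[stem])}")
```

With 9 leaves in total, the median is the 5th leaf counted from the top: stem 2, leaf 3, that is 23.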

Question 17

Q

Box and whisker plot

Answer

A

uses the relationships among the median, upper quartile, and lower quartile to describe the skewness or symmetry of a distribution
a vertical bar connects the upper quartile to the largest non-outlying value in the sample
a vertical bar connects the lower quartile to the smallest non-outlying value in the sample

Question 18

Q

Box and whisker plot (symmetric)

Answer

A

upper and lower quartiles should be approximately equally spaced from the median

Question 19

Q

Box and whisker plot (positively skewed)

Answer

A

upper quartile is farther from the median than the lower quartile

Question 20

Q

Box and whisker plot (negatively skewed)

Answer

A

lower quartile is farther from the median than the upper quartile
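
The quartile comparisons for symmetric, positively skewed, and negatively skewed box plots can be sketched as one function. The function name and the `tolerance` cutoff for "approximately equal" are assumptions made for illustration, not part of the source.

```python
def describe_shape(lower_q, median, upper_q, tolerance=0.1):
    """Classify a distribution from its quartiles, as read off a box plot.

    tolerance is a hypothetical cutoff: the two distances count as
    approximately equal when they differ by at most this fraction of
    the box length (upper_q - lower_q).
    """
    upper_gap = upper_q - median    # median to upper quartile
    lower_gap = median - lower_q    # lower quartile to median
    box_length = upper_q - lower_q
    if box_length == 0 or abs(upper_gap - lower_gap) <= tolerance * box_length:
        return "approximately symmetric"
    if upper_gap > lower_gap:
        return "positively skewed"
    return "negatively skewed"

print(describe_shape(10, 20, 30))
print(describe_shape(10, 12, 30))
print(describe_shape(10, 28, 30))
```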

Question 21

Q

Box and whisker plot (outlying value)

Answer

A

x > upper quartile + 1.5 × IQR, or
x < lower quartile - 1.5 × IQR
where IQR (interquartile range) = upper quartile - lower quartile

Question 22

Q

Box and whisker plot (extreme outlying value)

Answer

A

x > upper quartile + 3.0 × IQR, or
x < lower quartile - 3.0 × IQR
where IQR (interquartile range) = upper quartile - lower quartile
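
The outlying and extreme outlying rules can be sketched together as one check. The function name and the quartile values in the usage lines are hypothetical; IQR is taken as upper quartile minus lower quartile.

```python
def classify(x, lower_q, upper_q):
    """Label x using the 1.5 IQR and 3.0 IQR box plot rules."""
    iqr = upper_q - lower_q
    # Test the wider 3.0 IQR fences first, since any extreme
    # outlying value also lies beyond the 1.5 IQR fences.
    if x > upper_q + 3.0 * iqr or x < lower_q - 3.0 * iqr:
        return "extreme outlying"
    if x > upper_q + 1.5 * iqr or x < lower_q - 1.5 * iqr:
        return "outlying"
    return "non-outlying"

# Hypothetical quartiles: lower = 10, upper = 20, so IQR = 10.
# The 1.5 IQR fences are -5 and 35; the 3.0 IQR fences are -20 and 50.
for value in (30, 40, 55, -10, -25):
    print(value, classify(value, 10, 20))
```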

(22 cards)