HW3 CH2 - Measures of Center, Variation, 5-Number Summary, Box Plots Flashcards

1
Q

Define the measure of center

A

Descriptive measure that reveals the center or most typical values of a data set

2
Q

What is a sample mean?

A

sum of all values divided by the total number of observations in the data set

3
Q

how do you obtain the sample mean?

A

Add up all the observations and divide the sum by the number of observations
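The calculation can be sketched in a few lines of Python. The function name `sample_mean` and the data are made up for illustration.

```python
def sample_mean(values):
    """Return the sum of the observations divided by how many there are."""
    if not values:
        raise ValueError("the sample mean is undefined for an empty data set")
    return sum(values) / len(values)


scores = [4, 8, 6, 5, 7]          # hypothetical data: the sum is 30, n is 5
mean_score = sample_mean(scores)
print(mean_score)                 # 6.0
```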

4
Q

what is the symbol for sample mean?

A

x̄ (read "x-bar"): an x with a line above it

5
Q

what is the symbol for population mean?

A

μ, the Greek letter “mu” (it looks like a u with a tail)

6
Q

what is a median?

A

A number that divides the top 50% of the data from the bottom 50%

7
Q

how do you find the median?

A

Arrange the values from least to greatest. If the number of observations is odd, the median is the middle value; if it is even, the median is (sum of the two middle values)/2
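A minimal sketch of the odd and even cases, using a hypothetical helper named `median` and made-up data:

```python
def median(values):
    """Sort the data, then take the middle value (odd n) or the
    average of the two middle values (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median is undefined for an empty data set")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_median = median([7, 1, 5])       # sorted: 1, 5, 7 -> middle value is 5
even_median = median([7, 1, 5, 3])   # sorted: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```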

8
Q

what is mode?

A

The value that occurs most often in the data set; it must occur more than once (frequency > 1)

9
Q

Is it possible for a data set to have 2 or more modes? (T/F)

A

True: two or more values can tie for the highest frequency
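As an illustration, the sketch below counts frequencies and returns every value tied for the highest count. The helper name `modes` is hypothetical, and returning an empty list when no value repeats reflects the "frequency > 1" convention from the mode card.

```python
from collections import Counter


def modes(values):
    """Return all values tied for the highest frequency, or [] if
    no value occurs more than once."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest < 2:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


two_modes = modes([2, 3, 3, 5, 7, 7, 9])   # 3 and 7 both occur twice
no_mode = modes([1, 2, 3])                 # nothing repeats
```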

10
Q

What is a resistant measure?

A

a measure is robust (resistant) if extreme values have little to no influence on its outcome

11
Q

Which is a robust measure, the mean or the median?

A

median
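A quick way to see this is to add one extreme value to a small data set and compare the two measures. The salary figures below are invented for illustration.

```python
from statistics import mean, median

salaries = [30, 32, 35, 38, 40]      # hypothetical salaries, in $1000s
with_outlier = salaries + [400]      # add one extreme value

mean_before = mean(salaries)         # 35
median_before = median(salaries)     # 35

mean_after = mean(with_outlier)      # 575 / 6, about 95.83
median_after = median(with_outlier)  # (35 + 38) / 2 = 36.5

print(mean_before, mean_after)       # the mean jumps
print(median_before, median_after)   # the median barely moves
```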

12
Q

What are measures of variation (dispersion)?

A

descriptive measures that describe how much variation or “spread” there is in a data set

13
Q

what is range?

A

The difference between the largest observation and the smallest observation

14
Q

what are the disadvantages of range?

A
  1. measure is based only on 2 values
  2. not resistant: highly susceptible to outliers
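Both disadvantages show up in a small sketch: the range looks only at the largest and smallest observations, so one outlier changes it completely. The helper name `data_range` and the data are made up.

```python
def data_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)


typical = [10, 12, 13, 15, 16]
with_outlier = typical + [90]                # one extreme value added

range_typical = data_range(typical)          # 16 - 10 = 6
range_outlier = data_range(with_outlier)     # 90 - 10 = 80
```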
15
Q

what is deviation?

A

The difference between an observation and the mean

16
Q

what is a sample standard deviation?

A

Roughly, the typical distance between an observation and the sample mean; it is computed as the square root of the sample variance
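A minimal sketch of the usual computation, assuming the standard definition with an n - 1 divisor. The function name `sample_std` and the data are illustrative.

```python
import math


def sample_std(values):
    """Square root of (sum of squared deviations from the mean) / (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = sum(values) / n
    squared_deviations = [(x - xbar) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))


# mean = 3, squared deviations = 4, 0, 4, so s = sqrt(8 / 2) = 2.0
s = sample_std([1, 3, 5])
```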

17
Q

Is range resistant?

A

no

18
Q

Does range show how spread out the data is?

19
Q

Is the standard deviation robust?

A

No. Like the mean, it is affected by outliers

20
Q

Why transform data?

A

To change units, to make the shape of the distribution more symmetric, or to make the relationship between 2 variables linear
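The simplest of these reasons, changing units, can be sketched as follows. The temperatures are made up; the point is that a linear change of units carries the mean along with it.

```python
from statistics import mean

celsius = [10, 20, 30]                          # hypothetical readings
fahrenheit = [c * 9 / 5 + 32 for c in celsius]  # [50.0, 68.0, 86.0]

mean_c = mean(celsius)       # 20
mean_f = mean(fahrenheit)    # 68.0, the same as 20 * 9 / 5 + 32
```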

21
Q

define parameter

A

numerical summary of the population

22
Q

define statistic

A

numerical summary of the sample

23
Q

define quartiles

A

The values (Q1, Q2, Q3) that divide the ordered data set into 4 equal parts

24
Q

What is the interquartile range?

A

The difference between the third and the first quartiles (Q3 - Q1)

25
Q

What is the 5 number summary?

A

It consists of:
1. minimum value
2. first quartile
3. second quartile (median)
4. third quartile
5. maximum value
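The five values can be computed as in the sketch below. Quartile conventions differ between textbooks and software; this version assumes the "median of each half" method (leaving out the overall median when n is odd), which may not match the rule used in your course. The function name is hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]               # values below the median position
    upper = ordered[half + (n % 2):]     # values above it (skips the median when n is odd)
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


# sorted data: 1, 3, 4, 6, 7, 8, 10, 12, 15
summary = five_number_summary([7, 15, 1, 8, 3, 12, 4, 10, 6])
print(summary)   # (1, 3.5, 7, 11.0, 15)
```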

26
Q

What is an outlier?

A

A value that is distant from other observations in the data set

27
Q

define a boxplot

A

A graph that displays the distribution of a data set using the 5 number summary, which makes outliers easy to see

28
Q

What advantage does a histogram have over a boxplot?

A

displays more information about the distribution of a data set

29
Q

Define Dot plot

A

A graphical display of data using dots, where each dot represents one value in the data set, with limited grouping of values

30
Q

define stem and leaf plot

A

A table in which each data value is split into a “stem” (the leading digit or digits) and a “leaf” (the last digit)
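A rough sketch of how such a table is built, assuming non-negative whole numbers with the tens as the stem and the ones digit as the leaf. The function name and data are made up.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf plot as strings."""
    if not values:
        return []
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)     # stem = tens, leaf = ones digit
    rows = []
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        rows.append(f"{stem} | {leaves}".rstrip())
    return rows


plot = stem_and_leaf([24, 12, 38, 21, 15, 24])
for row in plot:
    print(row)
# 1 | 2 5
# 2 | 1 4 4
# 3 | 8
```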

31
Q

What are the advantages of stem-and-leaf plots and dot plots?

A

They display every individual value in the data set

32
Q

What are the disadvantages of stem-and-leaf plots and dot plots?

A

They are not informative when the data set is large; use a histogram instead

33
Q

what is a histogram?

A

A graph drawn using vertical bars, one per class of values.
bar height = frequency of the class
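The bar heights come from counting how many values fall in each class. The sketch below uses equal-width classes; the function name, class limits, and ages are illustrative, and the bars are printed sideways as text rather than drawn vertically.

```python
def frequency_table(values, start, width, bins):
    """Count how many values fall in each equal-width class."""
    counts = [0] * bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < bins:              # ignore values outside the classes
            counts[index] += 1
    return counts


ages = [21, 23, 25, 31, 34, 35, 38, 42]    # hypothetical data
counts = frequency_table(ages, start=20, width=10, bins=3)

for i, count in enumerate(counts):
    low = 20 + 10 * i
    print(f"{low}-{low + 9}: {'#' * count}")   # bar length = frequency
```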

34
Q

what does a frequency histogram

35
Q

what do outliers affect?

A

Mean and standard deviation (not resistant measures)

36
Q

What are the degrees of freedom?

A

n - 1, the divisor used in the sample variance
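One way to see where n - 1 comes from: the deviations from the sample mean always sum to zero, so once n - 1 of them are known the last one is fixed. A small sketch with made-up data:

```python
data = [2, 4, 9]
n = len(data)
xbar = sum(data) / n                        # 5.0

deviations = [x - xbar for x in data]       # [-3.0, -1.0, 4.0], which sum to 0
implied_last = -sum(deviations[:-1])        # 4.0, determined by the other two

# Sample variance divides by the n - 1 deviations that are free to vary.
sample_variance = sum(d ** 2 for d in deviations) / (n - 1)   # (9 + 1 + 16) / 2 = 13.0
```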

37
Q

Name the 4-step process to organize a statistical problem

A

state: what is the practical question?
plan: what specific statistical operations does this problem call for?
solve: analyze the data with graphs and computations
conclude: give your practical conclusion

38
Q

The mean is a measure of center whereas the standard deviation measures the ____________ of data about the mean.

A

variability

39
Q

The line in the box of a boxplot marks where the __________ is.

A

median

40
Q

Standard Deviation measures…

A

The variability of the data about the mean: roughly, the typical distance between an observation and the mean

41
Q

what is deviation?

A

The difference between an observation and the mean: x_i - x̄

42
Q

How do you determine whether an observation is an outlier?

A

It is a potential outlier if it falls outside the limits: below the lower limit (Q1 - 1.5 x IQR) or above the upper limit (Q3 + 1.5 x IQR)
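The 1.5 x IQR rule flags an observation when it lies below Q1 - 1.5 x IQR or above Q3 + 1.5 x IQR. A minimal sketch, with hypothetical function names; the quartiles are passed in because different quartile conventions give slightly different values.

```python
def outlier_limits(q1, q3):
    """Return (lower limit, upper limit) from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Return the observations that fall outside the limits."""
    lower, upper = outlier_limits(q1, q3)
    return [x for x in values if x < lower or x > upper]


data = [1, 3, 4, 6, 7, 8, 10, 12, 40]
q1, q3 = 3.5, 11.0                       # assumed quartiles (median of each half)

limits = outlier_limits(q1, q3)          # IQR = 7.5, so limits are (-7.75, 22.25)
outliers = find_outliers(data, q1, q3)   # only 40 lies outside the limits
```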

43
Q

How do you find the interquartile range?

A

Q3 - Q1 = IQR