Chapter 2 Flashcards

1
Q

What’s a variable?

A

characteristic of a person or thing that can be assigned a number or category

2
Q

What is a categorical variable?

A

-no obvious order
-blood type, gender, colors

3
Q

What is a numeric variable?

A

values are numbers that can be ordered

4
Q

What is a discrete numeric variable?

A

-no fractions, whole numbers
-number of children, length of a DNA sequence in base pairs
-number of classes

5
Q

What is a continuous numeric variable?

A

-does have fractions
-weight of a baby, cholesterol concentration in blood sample, height

6
Q

What is an observational unit?

A

-sometimes we sample n persons or things and collect multiple variables for each
-so each sampled person or thing is an observational unit

7
Q

How can a frequency distribution be displayed?

A

a table or even a bar chart

8
Q

When making figures and comparing multiple figures what should you do?

A

-always label the axes, and check the axes (scale and range) when comparing figures

9
Q

What is relative frequency?

A

count divided by the sample size
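
As a minimal sketch of this definition, the Python snippet below builds a frequency table and the matching relative frequencies. The blood-type sample is made up for illustration and is not taken from the course material.

```python
from collections import Counter

# Hypothetical sample of blood types (made-up data for illustration).
blood_types = ["O", "A", "O", "B", "A", "O", "AB", "O", "A", "O"]

counts = Counter(blood_types)   # frequency (count) of each category
n = len(blood_types)            # sample size

# Relative frequency = count divided by the sample size.
rel_freq = {category: count / n for category, count in counts.items()}

for category in sorted(counts):
    print(f"{category:>2}  count={counts[category]}  relative frequency={rel_freq[category]:.2f}")
```

The relative frequencies always add up to 1, which is a quick check on a frequency table.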

10
Q

CDC versus NYT figures

A

-the CDC figure shows a smooth transition over time but only two age groups, while the NYT figure shows a range of ages

11
Q

Dotplot example

A
12
Q

Histogram example

A
13
Q

What is the area of one or several bars proportional to in a histogram?

A

the corresponding frequency
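
One way to see why area matters is a histogram with unequal bin widths. The sketch below assumes the bar height is drawn as a density (relative frequency divided by bin width); the bin edges and counts are invented for illustration.

```python
# Hypothetical bin edges and counts (note the last bin is twice as wide).
edges = [0, 10, 20, 40]
counts = [4, 10, 6]

n = sum(counts)
widths = [high - low for low, high in zip(edges, edges[1:])]

# Assumed convention: bar height = relative frequency / bin width (a density).
heights = [count / n / width for count, width in zip(counts, widths)]

# The area of each bar (height * width) recovers the relative frequency.
areas = [height * width for height, width in zip(heights, widths)]

for low, high, height, area in zip(edges, edges[1:], heights, areas):
    print(f"[{low}, {high}): height={height:.3f}  area={area:.2f}")
```

With this convention the wide bin is drawn lower than its count alone would suggest, yet its area still matches its share of the sample.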

14
Q

What decision do we have to make with continuous numeric variables?

A

how to group the data into classes (bins)

15
Q

What are the characteristics of a bell-shaped curve? (Gaussian or normal)

A

symmetric and unimodal

16
Q

What does a bimodal figure look like?

A

two peaks (modes), e.g. the heights of men and women combined

17
Q

What does an asymmetric graph that is skewed to the right look like?

A
18
Q

What does an asymmetric graph that is skewed to the left look like?

A
19
Q

What does an exponential figure look like and what is an example?

A

e.g. wait times

20
Q

What is a statistic?

A

-a numeric measure calculated from sample data

21
Q

What is the median?

A

-a measure of center: the value that most nearly lies in the middle of the sample

22
Q

What is the mean?

A

the average: the sum of the observations divided by the sample size

23
Q

What does it mean if a statistic is robust and are the mean and median robust?

A

-relatively unaffected by changes in a small portion of the data
-median is unchanged meaning it is robust
-mean changes so it is not robust
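
A small sketch of this idea in Python, using made-up numbers: one observation in a sample of five is replaced by an extreme value, and the mean and median are recomputed.

```python
import statistics

# Made-up sample, then the same sample with one value changed to an extreme.
sample = [2, 3, 4, 5, 6]
contaminated = [2, 3, 4, 5, 60]

mean_before = statistics.mean(sample)
median_before = statistics.median(sample)
mean_after = statistics.mean(contaminated)
median_after = statistics.median(contaminated)

print("mean:  ", mean_before, "->", mean_after)      # the mean moves a lot
print("median:", median_before, "->", median_after)  # the median stays put
```

Here the mean jumps from 4 to 14.8 while the median stays at 4.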

24
Q

What is another measure of center?

A

trimmed mean
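
As an illustration, a trimmed mean drops the same number of observations from each end of the sorted data and averages what is left. The helper `trimmed_mean` and its parameter `k` are hypothetical names chosen for this sketch, and the data are made up.

```python
import statistics

def trimmed_mean(data, k):
    """Mean after removing the k smallest and k largest observations."""
    ys = sorted(data)
    if 2 * k >= len(ys):
        raise ValueError("k is too large: no observations would remain")
    return statistics.mean(ys[k:len(ys) - k])

# Made-up sample with one extreme value.
data = [2, 3, 4, 5, 60]

print("ordinary mean:", statistics.mean(data))
print("trimmed mean (k=1):", trimmed_mean(data, 1))
```

Trimming one value from each end removes the extreme 60, so the trimmed mean (4) is much closer to the median than the ordinary mean (14.8).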

25
Q

What are the characteristics of a box plot?

A

-the median splits the distribution into two parts (upper and lower)
-the quartiles split each of these parts in half
-the first quartile Q1 splits the lower, and
-the third quartile Q3 splits the upper

26
Q

What is the interquartile range? (IQR)

A

the difference between the third and first quartiles
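
The sketch below computes the quartiles and the IQR by taking the median of the lower half and the median of the upper half of the sorted data. Quartile conventions differ between textbooks and software, so this is one possible convention and not necessarily the one used in the course; it assumes the median itself is left out of both halves when the sample size is odd. The function name `quartiles` and the data are made up.

```python
import statistics

def quartiles(data):
    """Return (Q1, median, Q3) using medians of the lower and upper halves."""
    ys = sorted(data)
    n = len(ys)
    half = n // 2
    lower = ys[:half]
    # Assumption: with an odd sample size the middle value is excluded from both halves.
    upper = ys[half + 1:] if n % 2 else ys[half:]
    return statistics.median(lower), statistics.median(ys), statistics.median(upper)

data = [7, 1, 3, 9, 5, 11, 13]
q1, med, q3 = quartiles(data)
iqr = q3 - q1   # interquartile range: Q3 minus Q1

print("Q1 =", q1, " median =", med, " Q3 =", q3, " IQR =", iqr)
```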

27
Q

Boxplot (with no outliers) example

A
28
Q

What does the boxplot quickly show?

A

center, spread of total distribution, spread of middle 50% distribution, and skewness

29
Q

What is an outlier?

A

-any data point lower than the lower fence or higher than the upper fence is an outlier
-could be a mistake in measurement or in the experiment

30
Q

What is the lower fence?

A

Q1 - (1.5 x IQR)

31
Q

What is the upper fence?

A

Q3 + (1.5 x IQR)

32
Q

How far do whiskers extend?

A

only to the smallest and largest data points that are not outliers
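
Putting the fence and whisker rules together, here is a minimal sketch with made-up data. It assumes quartiles are computed as medians of the lower and upper halves of the sorted data, which is only one of several conventions.

```python
import statistics

# Made-up sample with one suspiciously large value.
data = [1, 3, 5, 7, 9, 11, 40]

ys = sorted(data)
half = len(ys) // 2
q1 = statistics.median(ys[:half])
q3 = statistics.median(ys[half + 1:] if len(ys) % 2 else ys[half:])
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Outliers lie beyond the fences; whiskers stop at the most extreme non-outliers.
outliers = [y for y in ys if y < lower_fence or y > upper_fence]
inside = [y for y in ys if lower_fence <= y <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
print("whiskers:", whisker_low, whisker_high)
```

For this sample the fences are -9 and 23, so 40 is flagged as an outlier and the whiskers run from 1 to 11. The fences themselves are not drawn at the whisker ends.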

33
Q

How are outliers identified in a boxplot?

A

dots (or other symbols)

34
Q

Why treat outliers differently?

A

they may not be representative of the data and could be errors

35
Q

Violin plot (combine boxplot and histogram) example

A
36
Q

How can you consider the relationship between multiple variables (multivariate data)?

A

stacked bar charts

37
Q

Stacked relative frequency charts example

A
38
Q

Side-by-side jittered dotplots example

A
39
Q

Side-by-side boxplots

A
40
Q

Scatterplot example

A
41
Q

What are some examples of measures of center?

A

median, mean, trimmed mean

42
Q

What are some measures of dispersion?

A

Range, IQR, Sample Standard Deviation

43
Q

Is the range robust?

A

No

44
Q

Is the IQR robust?

A

yes, more robust than the range, although it can still shift slightly

45
Q

In a sample standard deviation what does the sum of the deviations equal and what does the average of the deviations equal?

A

both equal zero

46
Q

What is the formula for the sample standard deviation?

A
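
The usual textbook definition, assuming the n - 1 divisor, is s = sqrt( sum of (y_i - y bar)^2 / (n - 1) ). The Python sketch below works through it step by step on made-up data and compares the result with `statistics.stdev` from the standard library, which uses the same divisor.

```python
import math
import statistics

# Made-up sample.
y = [2, 4, 4, 6, 9]
n = len(y)
y_bar = statistics.mean(y)

# Deviations from the mean; these always sum to zero.
deviations = [yi - y_bar for yi in y]
sum_of_deviations = sum(deviations)

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = sum(d ** 2 for d in deviations) / (n - 1)

# Sample standard deviation: square root of the sample variance.
s = math.sqrt(sample_variance)

print("sum of deviations:", sum_of_deviations)
print("sample variance s^2:", sample_variance)
print("sample SD s:", s, " statistics.stdev:", statistics.stdev(y))
```

Because the deviations always sum to zero, their plain average carries no information about spread, which is why the deviations are squared first.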
47
Q

What is the unit of the sample standard deviation?

A

the same units as the observations

48
Q

What is the sample variance?

A

s^2, the square of the sample standard deviation

49
Q

Is the standard deviation robust?

A

no, because it depends on the mean, which is not robust

50
Q

For normal distributions, what percent of observations are within ±1 SD of the mean?

A

68%

51
Q

For normal distributions, what percent of observations are within ±2 SD of the mean?

A

95%

52
Q

For normal distributions, what percent of observations are within ±3 SD of the mean?

A

99.7%
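
The 68%, 95% and 99.7% figures are rounded. As a sketch, they can be reproduced from the error function in Python's standard library, since for a normal distribution the probability of falling within k SDs of the mean is erf(k / sqrt(2)). The helper name `prob_within` is made up for this example.

```python
import math

def prob_within(k):
    """Probability that a normal observation lies within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within ±{k} SD: {100 * prob_within(k):.2f}%")
```

This prints roughly 68.27%, 95.45% and 99.73%.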

53
Q

How does a linear transformation affect the median?

A

it doesn’t change the order of the data
-if we multiply the data by a number, multiply the median by that number
-if we add a number to the data, add that number to the median

54
Q

How does a linear transform affect Q1 and Q3?

A

same as the median
-if we multiply the data by a (positive) number, multiply Q1 and Q3 by that number
-if we add a number to the data, add that number to Q1 and Q3

55
Q

How does a linear transform affect the IQR?

A

-if we add a number to the data, there is no change
-if we multiply the data by a number, the IQR is multiplied by that number (by its absolute value if the number is negative)

56
Q

How does linear transformation affect the mean?

A

-same as the median
-it follows the transformation: if you add a number to the data, add that number to the mean; if you multiply the data by a number, multiply the mean by that number

57
Q

How does a linear transform affect the SD?

A

-same as IQR
-adding or subtracting a number does not affect it, but multiplying or dividing by a number scales it by that number (by its absolute value if the number is negative)
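
These rules for linear transformations can be checked numerically. The sketch below converts made-up Celsius temperatures to Fahrenheit (multiply by 1.8, then add 32) and compares each statistic with its transformed counterpart. The `iqr` helper uses medians of the lower and upper halves, which is one quartile convention among several.

```python
import statistics

def iqr(data):
    """IQR from medians of the lower and upper halves (one convention among several)."""
    ys = sorted(data)
    half = len(ys) // 2
    lower = ys[:half]
    upper = ys[half + 1:] if len(ys) % 2 else ys[half:]
    return statistics.median(upper) - statistics.median(lower)

# Made-up temperatures in Celsius; Fahrenheit = 1.8 * Celsius + 32.
a, b = 1.8, 32
celsius = [10, 15, 20, 25, 40]
fahrenheit = [a * y + b for y in celsius]

# Measures of center follow the whole transformation (multiply, then add).
print("mean:  ", statistics.mean(fahrenheit), "vs", a * statistics.mean(celsius) + b)
print("median:", statistics.median(fahrenheit), "vs", a * statistics.median(celsius) + b)

# Measures of spread ignore the added constant and are only multiplied.
print("SD:    ", statistics.stdev(fahrenheit), "vs", a * statistics.stdev(celsius))
print("IQR:   ", iqr(fahrenheit), "vs", a * iqr(celsius))
```

Each pair of printed values should agree up to floating-point rounding.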

58
Q

What is the coefficient of variation and what is it a measure of?

A

SD/mean; a measure of dispersion relative to the mean, with no units
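
Because the SD and the mean are both multiplied by the same factor when the units change, their ratio does not depend on the unit of measurement. A short sketch with made-up heights, measured in centimetres and then in metres (the function name `coefficient_of_variation` is chosen for this example):

```python
import statistics

def coefficient_of_variation(data):
    """Sample SD divided by the sample mean."""
    return statistics.stdev(data) / statistics.mean(data)

# Made-up heights in centimetres, and the same heights in metres.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(f"CV in cm: {cv_cm:.4f}")
print(f"CV in m:  {cv_m:.4f}")
```

Both values come out the same (about 0.093), even though the SD itself differs by a factor of 100 between the two units.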

59
Q

How does scaling work with a nonlinear transformation like log?

A

-there is no simple scaling rule: take the log of all the data and recalculate the mean, median, and SD
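
A quick numerical illustration with made-up data: the mean of the log-transformed values is not the log of the original mean, so the statistic has to be recomputed from the transformed data.

```python
import math
import statistics

# Made-up data spanning several orders of magnitude.
values = [1, 10, 100, 1000]
logs = [math.log10(v) for v in values]

# Recomputed from the transformed data.
mean_of_logs = statistics.mean(logs)

# Naive shortcut: transform the original mean instead.
log_of_mean = math.log10(statistics.mean(values))

print("mean of the logs:", mean_of_logs)
print("log of the mean: ", log_of_mean)
```

The two results differ (1.5 versus about 2.44), which is why the shortcut that works for linear transformations does not carry over to the log.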

60
Q

What is another name for the sample value?

A

statistic

61
Q

What is another name for the population value?

A

parameter

62
Q

What is the sample value and population value for a proportion?

A

sample value = p̂ (p hat)
population value = p

63
Q

What is the sample value and population value for a mean?

A

sample value = ȳ (y bar)
population value = µ

64
Q

What is the sample value and population value for a standard deviation?

A

sample value = s
population value = σ

65
Q

What is statistical inference?

A

Drawing conclusions about a population based on observations from a sample