Chapter 2 Flashcards

1
Q

What’s a variable?

A

a characteristic of a person or thing that can be assigned a number or category

2
Q

What is a categorical variable?

A

-values are categories, often with no natural order (ordered categories such as low/medium/high are the exception)
-blood type, gender, colors

3
Q

What is a numeric variable?

A

-takes numerical values, so observations can be ordered and used in arithmetic

4
Q

What is a discrete numeric variable?

A

-no fractions, whole numbers
-number of children, length of a DNA sequence in base pairs
-number of classes

5
Q

What is a continuous numeric variable?

A

-can take fractional values (any value within an interval)
-weight of a baby, cholesterol concentration in blood sample, height

6
Q

What is an observational unit?

A

-sometimes we sample n persons or things and collect multiple variables for each
-each sampled person or thing is then an observational unit

7
Q

How can a frequency distribution be displayed?

A

a table or even a bar chart

8
Q

When making figures and comparing multiple figures what should you do?

A

-always label the axes, check the axes

9
Q

What is relative frequency?

A

count divided by the sample size
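
A minimal sketch of a frequency table with relative frequencies. The blood-type sample is made up for illustration, and the variable names are assumptions, not part of the course material.

```python
from collections import Counter

# Hypothetical sample of blood types (made-up data for illustration only).
blood_types = ["A", "O", "B", "O", "A", "O", "AB", "O", "A", "O"]

n = len(blood_types)           # sample size
counts = Counter(blood_types)  # frequency (count) of each category

# Relative frequency = count divided by the sample size.
relative_freq = {category: count / n for category, count in counts.items()}

for category in sorted(counts):
    print(f"{category:>2}  count={counts[category]}  relative frequency={relative_freq[category]:.2f}")
```

The relative frequencies always add up to 1, which makes samples of different sizes comparable.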

10
Q

CDC versus NYT figures

A

-the CDC figure shows a smooth transition over time but only two age groups, while the NYT figure shows a range of ages

11
Q

Dotplot example

A
12
Q

Histogram example

A
13
Q

What is the area of one or several bars proportional to in a histogram?

A

the corresponding frequency

14
Q

What decision do we have to make with continuous numeric variables?

A

how to group the data
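
To see why the grouping choice matters, here is a rough sketch that counts the same data with two different class widths. The weights (in grams) and the helper `bin_counts` are hypothetical, introduced only for this example.

```python
# Hypothetical baby weights in grams (made-up data).
weights_g = [2600, 3100, 3400, 2900, 3800, 4100, 3300, 3000, 3600, 2700]

def bin_counts(data, start, width, n_bins):
    """Count observations in n_bins classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for y in data:
        k = (y - start) // width   # index of the class that contains y
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts

# Two wide classes versus four narrow classes over the same range.
wide = bin_counts(weights_g, start=2500, width=1000, n_bins=2)
narrow = bin_counts(weights_g, start=2500, width=500, n_bins=4)

print("width 1000 g:", wide)
print("width  500 g:", narrow)
```

Wider classes hide detail, while narrower classes show more of the shape but can look noisy in small samples.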

15
Q

What are the characteristics of a bell-shaped curve? (Gaussian or normal)

A

symmetric and unimodal

16
Q

What does a bimodal figure look like?

A

-two peaks (modes), e.g. heights of a mixed group of males and females

17
Q

What does an asymmetric graph that is skewed to the right look like?

A
18
Q

What does an asymmetric graph that is skewed to the left look like?

A
19
Q

What does an exponential figure look like and what is an example?

A

e.g. wait times

20
Q

What is a statistic?

A

-a numeric measure calculated from sample data

21
Q

What is the median?

A

-a measure of center: the value that most nearly lies in the middle of the sample

22
Q

What is the mean?

A

the average: the sum of the observations divided by the sample size

23
Q

What does it mean if a statistic is robust and are the mean and median robust?

A

-a robust statistic is relatively unaffected by changes in a small portion of the data
-the median stays (nearly) unchanged, so it is robust
-the mean changes, so it is not robust
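
A small sketch of the idea, using made-up numbers: one observation is replaced by a much larger value (as a recording error might do), and the mean and median are compared before and after.

```python
from statistics import mean, median

# Made-up sample, and the same sample with one value changed from 6 to 60.
original = [2, 3, 4, 5, 6]
with_error = [2, 3, 4, 5, 60]

print("median:", median(original), "->", median(with_error))
print("mean:  ", mean(original), "->", mean(with_error))
```

Here the median stays at 4 while the mean moves from 4 to 14.8, even though only one of five observations changed.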

24
Q

What is another measure of center?

A

trimmed mean
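
The card gives only the name, so the following is a sketch under one common definition (an assumption here): drop the same number of observations, `k`, from each end of the ordered sample and average what is left. The function name `trimmed_mean` is hypothetical.

```python
def trimmed_mean(data, k):
    """Mean after dropping the k smallest and k largest observations."""
    if k < 0 or 2 * k >= len(data):
        raise ValueError("k must leave at least one observation")
    ordered = sorted(data)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Made-up sample with one extreme value.
sample = [2, 3, 4, 5, 60]
print(trimmed_mean(sample, 0))  # ordinary mean, pulled up by 60
print(trimmed_mean(sample, 1))  # drops 2 and 60, averages 3, 4, 5
```

Trimming makes the result less sensitive to extreme values than the ordinary mean.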

25
What are the characteristics of a box plot?
-the median splits the distribution into two parts (upper and lower)
-the quartiles split each of these parts in half
-the first quartile Q1 splits the lower part
-the third quartile Q3 splits the upper part
26
What is the interquartile range? (IQR)
the difference between the third and first quartiles
27
Boxplot (with no outliers) example
28
What does the boxplot quickly show?
center, spread of total distribution, spread of middle 50% distribution, and skewness
29
What is an outlier?
-any data point lower than the lower fence or higher than the upper fence
-could be a mistake in measurement or in the experiment
30
What is the lower fence?
Q1 - (1.5 x IQR)
31
What is the upper fence?
Q3 + (1.5 x IQR)
32
How far do whiskers extend?
only to the smallest and largest data points that are not outliers
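
The quartile, fence, outlier, and whisker rules from the last few cards can be sketched together. The data are made up, and the helper `quartiles` assumes one convention for odd sample sizes (the middle value is left out of both halves); other textbooks and software use slightly different quartile rules.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, median, Q3); Q1 and Q3 are the medians of the lower and upper halves."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    # Assumption: with an odd sample size, the middle value belongs to neither half.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

# Made-up data with one unusually large value.
data = [7, 8, 9, 10, 11, 12, 13, 14, 30]

q1, med, q3 = quartiles(data)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Outliers lie beyond the fences; whiskers stop at the most extreme non-outliers.
outliers = [y for y in data if y < lower_fence or y > upper_fence]
inside = [y for y in data if lower_fence <= y <= upper_fence]
whiskers = (min(inside), max(inside))

print("Q1, median, Q3:", q1, med, q3)
print("IQR:", iqr, "fences:", lower_fence, upper_fence)
print("outliers:", outliers, "whiskers:", whiskers)
```

For this sample the fences are 1.0 and 21.0, so 30 is the only outlier and the whiskers run from 7 to 14.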
33
How are outliers identified in a boxplot?
dots (or other symbols)
34
Why treat outliers differently?
they may not be representative of the rest of the data and could be errors
35
Violin plot (combine boxplot and histogram) example
36
How can you consider the relationship between multiple variables (multivariate data)?
stacked bar charts
37
Stacked relative frequency charts example
38
Side-by-side jittered dotplots example
39
Side-by-side boxplots
40
Scatterplot example
41
What are some examples of measures of center?
median, mean, trimmed mean
42
What are some measures of dispersion?
Range, IQR, Sample Standard Deviation
43
Is the range robust?
No
44
Is the IQR robust?
Yes, it is more robust than the range, though it can still shift slightly
45
In a sample standard deviation what does the sum of the deviations equal and what does the average of the deviations equal?
both equal zero
46
What is the formula for the sample standard deviation?
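A standard form of the definition, written with the ȳ (sample mean) notation used in these cards, where y_i are the observations and n is the sample size:

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}}
```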
47
What is the unit of the sample standard deviation?
the same units as the observations
48
What is the sample variance?
s^2, the square of the sample standard deviation
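
A short sketch, on made-up data, of the deviations from the mean, the sample variance, and the sample standard deviation. It assumes the usual definition with n - 1 in the denominator and cross-checks against Python's `statistics` module.

```python
from math import sqrt
from statistics import stdev, variance

data = [2, 3, 4, 5, 6]  # made-up observations
n = len(data)
y_bar = sum(data) / n   # sample mean

# Deviations from the mean; they always sum to zero.
deviations = [y - y_bar for y in data]

# Sample variance: sum of squared deviations divided by n - 1.
s_squared = sum(d ** 2 for d in deviations) / (n - 1)
s = sqrt(s_squared)     # sample standard deviation, same units as the data

print("sum of deviations:", sum(deviations))
print("s^2 =", s_squared, " s =", round(s, 4))
print("library check:", variance(data), round(stdev(data), 4))
```

Because the deviations always sum to zero, they are squared before averaging.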
49
Is the standard deviation robust?
No, because it depends on the mean, which is not robust
50
For normal distributions, what percentage of observations are within ±1 SD of the mean?
68%
51
For normal distributions, what percentage of observations are within ±2 SD of the mean?
95%
52
For normal distributions, what percentage of observations are within ±3 SD of the mean?
99.7%
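
The three percentages are rounded values. A quick check with the standard normal distribution from Python's `statistics.NormalDist` (the helper name `within` is just for this sketch):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {within(k):.4f}")
```

The exact proportions are roughly 0.6827, 0.9545, and 0.9973, which round to the 68%, 95%, and 99.7% on the cards.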
53
How does a linear transformation affect the median?
-it doesn't change which observation sits in the middle of the ordered data
-if we multiply the data by a number, multiply the median by that number
-if we add a number to the data, add that number to the median
54
How does a linear transform affect Q1 and Q3?
-same as the median
-if we multiply the data by a number, multiply Q1 and Q3 by that number
-if we add a number to the data, add that number to Q1 and Q3
55
How does a linear transform affect the IQR?
-if we add a number to the data, the IQR does not change
-if we multiply the data by a number, the IQR is multiplied by the absolute value of that number
56
How does linear transformation affect the mean?
-same as the median
-if we add a number to the data, add that number to the mean
-if we multiply the data by a number, multiply the mean by that number
57
How does a linear transform affect the SD?
-same as the IQR
-adding or subtracting a number does not affect it
-multiplying or dividing by a number multiplies or divides the SD by the absolute value of that number
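
The linear-transformation rules can be checked on a concrete case. This sketch converts made-up temperatures from Celsius to Fahrenheit (multiply by 1.8, then add 32). The quartiles come from `statistics.quantiles`, which uses one of several quartile conventions, and the helper `iqr` is hypothetical.

```python
from statistics import mean, median, quantiles, stdev

def iqr(data):
    """Interquartile range using the default convention of statistics.quantiles."""
    q1, _, q3 = quantiles(data, n=4)
    return q3 - q1

celsius = [10, 15, 20, 25, 40]                 # made-up temperatures
fahrenheit = [1.8 * c + 32 for c in celsius]   # linear transformation y' = 1.8y + 32

# Measures of center follow the full transformation (multiply, then add);
# measures of dispersion are only multiplied by 1.8.
for name, stat in [("mean", mean), ("median", median), ("SD", stdev), ("IQR", iqr)]:
    print(f"{name:>6}: C = {stat(celsius):7.2f}   F = {stat(fahrenheit):7.2f}")
```

The mean and median in Fahrenheit equal 1.8 times the Celsius value plus 32, while the SD and IQR are just 1.8 times the Celsius values.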
58
What is the coefficient of variation and what is it a measure of?
SD/mean; a measure of dispersion relative to the mean
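
A minimal sketch of the calculation on made-up heights. It also illustrates a known property: because the SD and the mean are both multiplied by the same factor when units change, the ratio stays the same. The function name is hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Sample SD divided by the sample mean (a unitless ratio)."""
    return stdev(data) / mean(data)

heights_cm = [160, 170, 180]                  # made-up heights
heights_m = [h / 100 for h in heights_cm]     # same heights in meters

print(round(coefficient_of_variation(heights_cm), 4))
print(round(coefficient_of_variation(heights_m), 4))
```

Both lines print the same value, so the coefficient of variation can be compared across variables measured in different units.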
59
How does scaling work with a nonlinear transformation such as log?
-there is no simple scaling rule; take the log of every observation and recalculate the mean, median, and SD
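
A short illustration with made-up data of why the statistics have to be recalculated: the mean of the logged data is not the log of the original mean.

```python
from math import log10
from statistics import mean, stdev

data = [1, 10, 100]                  # made-up observations
logged = [log10(y) for y in data]    # transform every observation first

mean_of_logs = mean(logged)          # statistic recalculated on the log scale
log_of_mean = log10(mean(data))      # NOT the same thing

print("mean of logs:", mean_of_logs)
print("log of mean: ", round(log_of_mean, 3))
print("SD on log scale:", stdev(logged))
```

Here the mean of the logs is 1.0, while the log of the mean (log10 of 37) is about 1.568.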
60
What is another name for the sample value?
statistic
61
What is another name for the population value?
parameter
62
What is the sample value and population value for a proportion?
sample value = p̂ (p-hat); population value = p
63
What is the sample value and population value for a mean?
sample value = ȳ (y-bar); population value = µ
64
What is the sample value and population value for a standard deviation?
sample value = s; population value = σ
65
What is statistical inference?
Drawing conclusions about a population based on observations from a sample