Final Exam Flashcards

1
Q

Every statistical conclusion is stated in terms of ________

A

Probability

2
Q

Statistical calculations do NOT yield DEFINITE _______

A

Conclusions

3
Q

Central tendency

A

Middle of the data

4
Q

How do you find the arithmetic mean/ what is the equation?

A

Add up all the values and divide by the number of observations: mean = sum(values)/n
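A minimal Python sketch of this formula, assuming the standard-library `statistics` module; the data values are made up for illustration.

```python
import statistics

values = [2, 4, 6, 8, 10]  # hypothetical data

# mean = sum(values) / n
mean_by_hand = sum(values) / len(values)

# The standard library should give the same result
mean_builtin = statistics.mean(values)

print(mean_by_hand, mean_builtin)
```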

5
Q

The arithmetic mean is also known as the?

A

Average

6
Q

Is the arithmetic mean tolerant to outliers?

A

NO!!!

7
Q

How do you find the median if there is an even number of values?

A

Rank the values from lowest to highest and average the middle two

8
Q

How do you find the median if there is an odd number of values?

A

Rank the values from lowest to highest and the center one is the median
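The even and odd cases can be sketched together in Python. The helper name `median_by_rank` and the data are hypothetical, chosen only for illustration; `statistics.median` is the standard-library equivalent.

```python
import statistics

def median_by_rank(values):
    """Rank the values, then pick the center one (odd n)
    or average the middle two (even n)."""
    ranked = sorted(values)
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2

odd_data = [7, 1, 5]       # ranked: 1, 5, 7 -> center value is 5
even_data = [7, 1, 5, 3]   # ranked: 1, 3, 5, 7 -> average of 3 and 5

print(median_by_rank(odd_data), median_by_rank(even_data))
```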

9
Q

Which is the better choice when quantifying the central tendency of a data set: Mean or median?

A

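The median, because it is more tolerant to outliers (when the data have no outliers and are roughly symmetric, the mean and median are close anyway)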
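A small Python illustration of that tolerance, using made-up numbers: replacing one value with an extreme outlier moves the mean a lot but leaves the median where it was.

```python
import statistics

clean = [10, 11, 12, 13, 14]          # hypothetical data
with_outlier = [10, 11, 12, 13, 140]  # same data, one extreme value

mean_clean = statistics.mean(clean)
mean_outlier = statistics.mean(with_outlier)
median_clean = statistics.median(clean)
median_outlier = statistics.median(with_outlier)

print("mean:  ", mean_clean, "->", mean_outlier)
print("median:", median_clean, "->", median_outlier)
```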

10
Q

What are the three main steps to finding the geometric mean? -> Think logs

A

1) Transform all the values into their logarithms
2) Compute the mean of these logarithms
3) Take the antilog of that mean
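The three steps above can be sketched in Python. Base-10 logs and the data are assumptions made for illustration (any log base works as long as the antilog uses the same base); `statistics.geometric_mean` needs Python 3.8 or later.

```python
import math
import statistics

values = [1, 10, 100]  # hypothetical data, all positive

# 1) Transform all the values into their logarithms
logs = [math.log10(v) for v in values]

# 2) Compute the mean of these logarithms
mean_log = sum(logs) / len(logs)

# 3) Take the antilog of that mean
geo_mean = 10 ** mean_log

print(geo_mean, statistics.geometric_mean(values))
```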

11
Q

Is the geometric mean tolerant to the presence of outliers?

A

YES

12
Q

When calculating the geometric mean, all the values must be _______

A

Positive

13
Q

What is the equation for the geometric mean?

A

Geometric mean = ((X1)(X2)(X3)...(XN))^(1/N)

14
Q

What are the three main steps to finding the harmonic mean? -> Think reciprocals

A

1) Transform each value to its reciprocal
2) Compute the arithmetic mean of those reciprocals
3) Take the reciprocal of this mean
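The three steps above can be sketched in Python. The two speeds are hypothetical values for illustration; `statistics.harmonic_mean` is the standard-library equivalent.

```python
import math
import statistics

speeds = [40, 60]  # hypothetical rates, all positive

# 1) Transform each value to its reciprocal
reciprocals = [1 / s for s in speeds]

# 2) Compute the arithmetic mean of those reciprocals
mean_reciprocal = sum(reciprocals) / len(reciprocals)

# 3) Take the reciprocal of this mean
harm_mean = 1 / mean_reciprocal

print(harm_mean, statistics.harmonic_mean(speeds))
```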

15
Q

The harmonic mean can’t be computed when the values are _______ or ______

A

Zero / negative

16
Q

Is the harmonic mean more stable when outliers are present?

A

YES

17
Q

What is the equation for the harmonic mean?

A

Harmonic mean = N/(1/X1 + 1/X2 + 1/X3 + ... + 1/XN)

18
Q

How do you find the trimmed mean?

A

Ignore or “trim off” the highest and lowest values and take the arithmetic mean of the remaining values
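A rough Python sketch of this idea. The helper `trimmed_mean` and its parameter `k` are hypothetical, and trimming one value from each end is an assumption; the card does not say how many values to trim.

```python
def trimmed_mean(values, k=1):
    """Drop the k lowest and k highest values, then average the rest."""
    ranked = sorted(values)
    kept = ranked[k:len(ranked) - k]
    if not kept:
        raise ValueError("not enough values left after trimming")
    return sum(kept) / len(kept)

values = [2, 4, 6, 8, 100]  # hypothetical data with one high outlier

plain = sum(values) / len(values)  # pulled upward by the outlier
trimmed = trimmed_mean(values)     # mean of 4, 6, 8

print(plain, trimmed)
```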

19
Q

What is the purpose of the trimmed mean?

A

It’s a simple/primitive way of getting rid of outliers

20
Q

Is the trimmed mean tolerant to outliers?

A

Kind of? It’s more tolerant than the arithmetic mean, but not as tolerant as the other methods

21
Q

What are the 5 ways of quantifying the central tendency of a data set?

A

1) Arithmetic mean
2) Median
3) Geometric mean
4) Harmonic mean
5) Trimmed mean

22
Q

How do you find the mode?

A

It’s the value that occurs most commonly in the data set
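A short Python sketch, assuming the standard-library `statistics` module (Python 3.8 or later for `multimode`); the data are made up. `multimode` is shown as well because a data set can have more than one most common value.

```python
import statistics

one_mode = [1, 2, 2, 3]      # hypothetical data: 2 occurs most often
two_modes = [1, 2, 2, 3, 3]  # hypothetical data: 2 and 3 tie

single = statistics.mode(one_mode)
tied = statistics.multimode(two_modes)

print(single, tied)
```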

23
Q

Does the mode always assess the center of the distribution?

A

No, not always

24
Q

Can you use the mode() function in R to find the mode?

A

NO: In R, mode() tells you what kind of data an object holds (its storage mode, such as "numeric" or "character"). To find the statistical mode you’d have to use other functions

25
What is the difference between a continuous and discrete variable?
A continuous variable can take any value within a range (time is a typical example), while a discrete variable takes only separate, countable values -> Think of a continuous variable as a straight line and a discrete variable as individual spots or data points
26
What is an interval variable? (3)
1) An interval variable is a type of continuous variable
2) You can calculate a meaningful difference between two values, but the ratio is not meaningful because it changes when the units change
3) The zero on the scale is defined arbitrarily
27
When should you use the harmonic mean?
When you are dealing with proportion, rates, and ratios
28
What are examples of an interval variable?
Temp in degrees C and F, pH, credit score
29
What is a ratio variable? (3)
1) A type of continuous variable
2) You can calculate both the difference and the ratio of the data and have it make sense/be meaningful
3) Zero on the scale is not defined arbitrarily
30
What are examples of a ratio variable?
Temp in Kelvin, distance, length, height, weight (in any units, metric or English system)
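A small Python illustration of the interval vs ratio distinction, using two made-up temperatures. The difference is the same in degrees C and in Kelvin, but the ratio is only meaningful on the Kelvin scale, where zero is not arbitrary. The helper name `celsius_to_kelvin` is hypothetical.

```python
import math

def celsius_to_kelvin(c):
    return c + 273.15

cold_c, warm_c = 10.0, 20.0  # hypothetical temperatures in degrees C
cold_k, warm_k = celsius_to_kelvin(cold_c), celsius_to_kelvin(warm_c)

# Differences agree across the two scales
diff_c = warm_c - cold_c
diff_k = warm_k - cold_k

# Ratios do not: "twice as hot" in degrees C is an artifact of the arbitrary zero
ratio_c = warm_c / cold_c
ratio_k = warm_k / cold_k

print(diff_c, diff_k, ratio_c, ratio_k)
```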
31
What are the two types of continuous variables we've discussed?
Interval and ratio variables
32
What is an ordinal variable? (3)
1) It is not a type of continuous variable
2) It MUST express rank
3) The order of the values matters, but not the exact number
33
2 star vs 4 star hotel rankings, credit scores, and test scores are examples of what type of variable?
Ordinal variable
34
What is a nominal variable? (2)
1) It is not a type of continuous variable
2) It is used to describe data with multiple categorical outcomes that have no inherent order
35
Passing and failing classes is an example of what type of variable?
Nominal variable
36
What kind of variable is the color spectrum?
Some variables, like the color spectrum, can be quantified and treated as a ratio variable (in terms of the wavelength of each color) or as an ordinal variable (in terms of the normal order of the colors)
37
What does the point prevalence tell us?
It refers to the proportion of participants with a risk factor or disease at a particular point in time
38
What is the equation for point prevalence?
PP = number of people with the disease / number of people examined (usually at baseline)
39
What does relative risk or risk ratio (RR) tell us?
It is a useful measure to compare the prevalence or incidence of disease between two groups
40
What is the risk ratio essentially?
Just a ratio of the point prevalences between a group and a reference group
41
What is the equation for relative risk (risk ratio, RR)?
Relative risk (risk ratio, RR) = Point prevalence of the exposed or experimental arm / Point prevalence of the unexposed or control arm (AKA the reference group)
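A worked Python sketch of the relative risk calculation. All counts are hypothetical, invented only to show the arithmetic.

```python
import math

# Hypothetical counts
exposed_with_disease, exposed_total = 30, 100
unexposed_with_disease, unexposed_total = 10, 100

# Point prevalence in each group
pp_exposed = exposed_with_disease / exposed_total
pp_unexposed = unexposed_with_disease / unexposed_total

# Relative risk = PP of exposed group / PP of unexposed (reference) group
relative_risk = pp_exposed / pp_unexposed

# About 3: the disease is roughly three times as prevalent in the exposed group
print(round(relative_risk, 2))
```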
42
What does it mean if you get a relative risk or risk ratio of 1?
A relative risk of 1 means exposure to the risk factor is UNRELATED to the risk of the disease. Therefore, if a risk factor is related to the risk of disease, the relative risk will not equal 1.
43
What are the 2 equations to find the odds?
1) Odds = number of events / number of nonevents
2) Odds = point prevalence / (1 - point prevalence), computed separately for each group (exposed or experimental arm; unexposed or control)
44
How are the odds in statistics different from regular probability?
Odds in statistics = number of events / number of NONEVENTS, whereas a regular probability = number of events / TOTAL number of observations (events + nonevents)
45
What is the equation for the odds ratio?
Odds ratio = Odds of the exposed or experimental arm / Odds of the unexposed or control arm
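A worked Python sketch of odds and the odds ratio. The counts are hypothetical, and both ways of computing the odds (events over nonevents, and prevalence over one minus prevalence) are shown to agree.

```python
import math

# Hypothetical counts: (events, nonevents) in each group
exposed_events, exposed_nonevents = 30, 70
unexposed_events, unexposed_nonevents = 10, 90

# Odds = number of events / number of nonevents
odds_exposed = exposed_events / exposed_nonevents
odds_unexposed = unexposed_events / unexposed_nonevents

# Same odds from the point prevalence: PP / (1 - PP)
pp_exposed = exposed_events / (exposed_events + exposed_nonevents)
odds_exposed_from_pp = pp_exposed / (1 - pp_exposed)

# Odds ratio = odds of exposed group / odds of unexposed group
odds_ratio = odds_exposed / odds_unexposed

print(round(odds_exposed, 3), round(odds_unexposed, 3), round(odds_ratio, 3))
```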
46
What does it mean if the odds ratio is 1?
It means that exposure to the risk factor is unrelated to the risk of disease. Therefore, if a risk factor is related to the risk of disease, the odds ratio will not equal 1.
47
What units does standard deviation have?
The same ones as the data
48
What is the rule of thumb for interpreting standard deviation?
For roughly bell-shaped (normal) data, the mean plus or minus 1 SD contains about 2/3 of the data (68.3%) and the mean plus or minus 2 SD contains about 95-95.4% of the data
49
What are the 6 main steps you use to calculate the standard deviation of a sample?
1) Calculate the arithmetic mean
2) Calculate the difference between each value and the mean -> Called the deviation
3) Square those differences
4) Add up those squared differences
5) Divide that sum by (n-1), where n is the number of values -> Called the variance
6) Take the square root of the variance you calculated
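The six steps can be sketched in Python as follows. The data are made up for illustration, and `statistics.stdev` (which also divides by n-1) is used only as a cross-check.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
n = len(values)

# 1) Calculate the arithmetic mean
mean = sum(values) / n

# 2) Calculate the deviation of each value from the mean
deviations = [v - mean for v in values]

# 3) Square those differences
squared = [d ** 2 for d in deviations]

# 4) Add up the squared differences
sum_of_squares = sum(squared)

# 5) Divide by (n - 1) to get the variance
variance = sum_of_squares / (n - 1)

# 6) Take the square root of the variance
sd = math.sqrt(variance)

print(round(sd, 3), round(statistics.stdev(values), 3))
```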
50
When calculating standard deviation, why do we use n-1 instead of n?
n-1 is called the degrees of freedom, and it's used because the sample variance (the square of the SD) computed using (n-1) is an unbiased estimate of the population variance
51
Is the SD computed with n-1 as the denominator, the most accurate estimate of the population SD?
No, it is a biased estimate of the population SD, which means that, on average, it does not equal the population SD. It's used because the sample variance (the square of the SD) computed using (n-1) is an unbiased estimate of the population variance
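A toy Python check of this claim, under stated assumptions: a tiny made-up population of three values, and every possible sample of size 2 drawn with replacement. Averaged over all samples, the (n-1) variance matches the population variance, while the SD comes out too small on average.

```python
import itertools
import math
import statistics

population = [1, 2, 3]  # hypothetical population

pop_variance = statistics.pvariance(population)  # divides by N
pop_sd = math.sqrt(pop_variance)

# Every possible sample of size 2, drawn with replacement (9 samples)
samples = list(itertools.product(population, repeat=2))

# statistics.variance and statistics.stdev both divide by (n - 1)
avg_sample_variance = sum(statistics.variance(s) for s in samples) / len(samples)
avg_sample_sd = sum(statistics.stdev(s) for s in samples) / len(samples)

print("variance:", round(pop_variance, 4), "vs average estimate", round(avg_sample_variance, 4))
print("SD:      ", round(pop_sd, 4), "vs average estimate", round(avg_sample_sd, 4))
```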
52
Can the mean or median be zero or negative?
YES
53
Can the standard deviation be zero?
Yes, when all the data values are the same
54
Can the standard deviation be negative?
No
55
Can the mean and median be computed when n=1? When n=2?
The mean and median can be computed when n=2. When n=1 the calculation is trivial and not informative, because the mean/median would just be equal to that one data value
56
Can SD be calculated when n=1? When n=2?
SD can be calculated when n=2, but not when n=1, because the denominator (n-1) would be zero
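A quick Python check, assuming the standard-library `statistics.stdev` (which uses the n-1 denominator): it raises `StatisticsError` for a single value and works for two. The data values are made up.

```python
import math
import statistics

# n = 1: the sample SD is undefined
try:
    statistics.stdev([5])
    sd_n1 = "computed"
except statistics.StatisticsError:
    sd_n1 = "undefined"

# n = 2: the sample SD can be computed
sd_n2 = statistics.stdev([4, 6])

print(sd_n1, round(sd_n2, 4))
```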
57
What kind of variable is the coefficient of variation (CV) for?
Ratio variables
58
When would you prefer to use CV to display variabilities?
When you're trying to compare the variability of 2 or more data sets, especially ones with different units or very different means
59
Does the coefficient of variation (CV) have units?
NO
60
What is the equation for coefficient of variation (CV)?
CV= SD/mean
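A short Python sketch of the CV, with made-up heights. Expressing the same data in centimeters and in meters gives the same CV, which illustrates that it has no units. The helper name `coefficient_of_variation` is hypothetical.

```python
import math
import statistics

def coefficient_of_variation(values):
    """CV = SD / mean (sample SD with the n-1 denominator)."""
    return statistics.stdev(values) / statistics.mean(values)

heights_cm = [150, 160, 170, 180, 190]     # hypothetical data
heights_m = [h / 100 for h in heights_cm]  # same data in meters

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(round(cv_cm, 4), round(cv_m, 4))
```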
61
What does a larger CV indicate?
If the CV of one data set is larger than the CV of another data set, then the data set with a larger CV has greater variability, meaning the data points are relatively more different from each other
62
What is the equation for variance?
It is the square of SD
63
What are the units for variance?
The same units as the data but squared
64
What are quantiles?
The values (cuts) that divide the range of a data set into q continuous intervals/subsets of equal size
65
If there are q subsets, how many quantiles are there?
q-1
66
What is a good way to think of quantiles?
Think of them as the cuts made to divide the data into subsets with equal numbers of observations
67
What are quantiles used for?
To quantify the scatter (spread) of the data
68
In order to make 3 subsets, how many quantiles or cuts do we have to make?
2
69
In order to make 4 subsets, how many quantiles do we have to make?
3
70
What is a quartile?
A quartile is just a special type of quantile that cuts the data 3 times so that we have 4 subsets (q=4), creating the 25th, 50th (where the median occurs), and 75th percentiles
71
What is a percentile?
A percentile is just a special type of quantile that divides the data up into 100 equal subsets (q=100), creating 99 percentiles
72
Is there such a thing as a 100th percentile?
NO
73
At what percentile does the median occur?
The 50th
74
How do you find the interquartile range?
You subtract the first quartile (25th percentile) from the third quartile (75th percentile)
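A Python sketch of quartiles and the interquartile range, assuming `statistics.quantiles` (Python 3.8 or later) with its default "exclusive" method. Quartile conventions differ between methods and software, so this may not match the method expected in the course. The data are made up.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]  # hypothetical data

# q = 4 subsets -> q - 1 = 3 cuts (the quartiles)
q1, q2, q3 = statistics.quantiles(data, n=4)

# Interquartile range = third quartile - first quartile
iqr = q3 - q1

print(q1, q2, q3, iqr)
```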
75
What units does the interquartile range have?
The same units as the data
76
What does Xth percentile tell us?
It is the value below which X% of the values in a data set lie
77
How many different ways are there to calculate the interquartile range and which one should you use to calculate it on the exams?
3 / The 1st