Segars Biostatistics Flashcards
define mode
number occurring most often
median
number in the middle
mean
average
range
max minus min
what do outliers affect the most?
mean
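A minimal sketch of the four definitions above, using Python's standard `statistics` module. The car prices are made-up numbers for illustration, and the last value is a deliberate outlier so you can see which summary moves the most.

```python
import statistics

# Hypothetical car prices (in thousands); 250 is a deliberate outlier.
prices = [18, 20, 20, 22, 25]
with_outlier = prices + [250]

def summarize(values):
    """Return mode, median, mean, and range for a list of numbers."""
    return {
        "mode": statistics.mode(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
    }

before = summarize(prices)
after = summarize(with_outlier)
print(before)
print(after)

# How far each summary moved once the outlier was added.
mean_shift = after["mean"] - before["mean"]
median_shift = after["median"] - before["median"]
print("mean moved by", mean_shift, "and median moved by", median_shift)
```

In this toy data the mode does not move at all and the median barely moves, while the mean is dragged toward the outlier.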
one standard deviation above and below the mean includes what % of the population?
68%
two standard deviations above and below the mean includes what % of the population?
95%
three standard deviations above and below the mean includes what % of the population?
99.7%
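The three percentages above can be checked with a short sketch. It assumes the population is normally distributed (the bell curve), which the cards imply but do not state, and it uses `statistics.NormalDist` from the standard library.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")
```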
the curve’s tail points to the right. what kind of skew is it?
positive skew
the curve’s tail points to the left. what kind of skew is it?
negative skew
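A rough sketch of how the sign of the skew lines up with the direction of the tail. The skewness formula here (the average cubed z-score) is one common definition, picked for brevity and not taken from the cards; the data sets are invented.

```python
import statistics

def skewness(values):
    """Population skewness: the average cubed z-score of the values."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)

# Invented data: most values are small, one is far out to the right.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 15]
# Mirror image of the same data, so the long tail is on the left.
left_tailed = [-x for x in right_tailed]

print("right tail:", skewness(right_tailed))  # sign is positive
print("left tail:", skewness(left_tailed))    # sign is negative
```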
what are the three families of data?
1- nominal
2- ordinal
3- interval
what is the first question you ask yourself before you check for consistency?
does it have magnitude?
how much everyone paid for their car. does this have magnitude?
yes
the model of everyone’s car. does this have magnitude?
no
what question do you ask yourself after you check for magnitude?
Is there consistency of scale?
ie difference between 120 and 121 = 1 which has the same difference as 110 and 111
what are nominal data?
- nonranked named categories
- no magnitude
- no consistency of scale
- nominal variables are simply labeled variables without quantitative characteristics
measuring height of the class. above 5ft and below 5ft. is this nominal?
yes. anything measured put into two categories is nominal. the term for this is dichotomous.
What is ordinal data?
- nominal data PLUS magnitude. no consistency of scale
- has three or more categories (nominal has 2)
ie. choosing a number on the pay scale
What is interval data?
- yes magnitude
- yes consistency of scale
- numerical scales with true units. counts, frequency
ie. living siblings (number), and personal age (in years)
In summary: does each family have magnitude? consistency of scale?
nominal
ordinal
interval
nominal = no, no
ordinal = yes, no
interval = yes, yes
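The two-question check above can be written as a tiny function. This is only a sketch of the deck's rule; the function name and arguments are made up for illustration.

```python
def data_family(has_magnitude, has_consistent_scale):
    """Classify a variable using the two questions from the cards."""
    if not has_magnitude:
        # No magnitude: the cards never ask about scale, so stop here.
        return "nominal"
    if not has_consistent_scale:
        return "ordinal"
    return "interval"

print(data_family(False, False))  # model of everyone's car
print(data_family(True, False))   # low / medium / high labels
print(data_family(True, True))    # price paid for a car
```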
can you take an interval data set and make it nominal or ordinal?
yes. measuring blood pressure.
ie. BP above or below 100 = nominal
ie. BP labels low, medium, high = ordinal
ie. BP recorded as measured per person = interval
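A small sketch of stepping the same blood pressure readings down the families. The readings and the low/medium/high cutoffs are invented for illustration; only the 100 cutoff comes from the card.

```python
# Hypothetical readings: recorded as measured, so this is interval data.
readings = [92, 118, 135, 104]

def to_nominal(bp, cutoff=100):
    """Two named groups only: the size of the reading is thrown away."""
    return "above cutoff" if bp > cutoff else "at or below cutoff"

def to_ordinal(bp, low_below=100, high_from=130):
    """Ranked labels; the 100 and 130 boundaries are assumptions."""
    if bp < low_below:
        return "low"
    if bp >= high_from:
        return "high"
    return "medium"

nominal = [to_nominal(bp) for bp in readings]
ordinal = [to_ordinal(bp) for bp in readings]
print(nominal)
print(ordinal)
```

Once the readings are reduced to labels, the original numbers cannot be recovered from them.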
Can you make nominal and ordinal data into interval data?
no. you cut that shit out. you can’t go up the steps.
If a question is a YES NO question, what family is it?
nominal
What is the Levene’s test?
It is the test used to check whether the variance (spread) of the groups in a data set is equal
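For the curious, a bare-bones sketch of the statistic behind Levene's test, using absolute deviations from each group's mean. It stops at the statistic W and does not compute a p-value, and the groups are invented; treat it as an illustration, not a replacement for a statistics package.

```python
import statistics

def levene_statistic(*groups):
    """Levene's W: a one-way ANOVA F ratio run on distances from the group mean.

    W near 0 suggests the groups have similar spread. Breaks down (division
    by zero) if every distance inside every group is identical.
    """
    # Step 1: replace each value with its distance from its own group mean.
    deviations = [
        [abs(x - statistics.fmean(group)) for x in group] for group in groups
    ]
    n_total = sum(len(d) for d in deviations)
    k = len(deviations)
    grand_mean = sum(sum(d) for d in deviations) / n_total
    group_means = [statistics.fmean(d) for d in deviations]

    # Step 2: compare spread between groups with spread inside groups.
    between = sum(
        len(d) * (m - grand_mean) ** 2 for d, m in zip(deviations, group_means)
    )
    within = sum(
        (z - m) ** 2 for d, m in zip(deviations, group_means) for z in d
    )
    return (n_total - k) / (k - 1) * between / within

same_spread = levene_statistic([1, 2, 3], [11, 12, 13])
different_spread = levene_statistic([1, 2, 3], [0, 10, 20])
print(same_spread, different_spread)
```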
What happens when handling interval data not normally distributed?
you can’t choose a test from the interval sheet.
you choose a test from the ordinal sheet instead
4 key questions to selecting the correct statistical test: question #1
1-What DATA LEVEL is being recorded?
- 1a) does it have magnitude?
- 1b) does it have consistency (even spacing) of scale?
4 key questions to selecting the correct statistical test: question #2
2-What type of comparison/assessment is desired?
2a-correlation (will see the word CORRELATION)
2b-regression (will see the words ASSOCIATION/RELATIONSHIP or PREDICT outcomes and events)
2c-survival comparison (will see the phrase change over TIME)
2d-group comparison (will see groups and their data being compared)
What is a correlation?
quantitative measure of the strength and direction of a relationship between variables
what is a partial correlation?
a correlation that controls for confounding variables
hint hint nudge nudge: If he were to draw a perfectly 1:1 negative correlation on the test what would the line look like?
45 degree downward slope line pointing down to the right
What is the name of the correlation test in each family?
can all correlations be run as a partial correlation? ….yes they can
- nominal = Contingency Coefficient
- ordinal = Spearman Correlation
- interval = Pearson Correlation
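A hand-rolled sketch of the interval and ordinal correlations named above, so you can see the -1 and +1 extremes. The data are invented, and the ranking helper assumes there are no tied values to keep the code short.

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation: strength and direction of a straight-line relationship."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def ranks(values):
    """Rank values from 1 to n. Assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))

x = [1, 2, 3, 4, 5]
y_down = [10, 8, 6, 4, 2]        # perfect straight line sloping down
y_curve = [1, 8, 27, 64, 125]    # always rising, but not in a straight line

print(pearson(x, y_down))
print(pearson(x, y_curve), spearman(x, y_curve))
```

The downward line gives a Pearson value of -1 (up to rounding). The curved data rise every step, so the rank-based Spearman value reaches +1 while Pearson falls short of it.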
On the exam, if you see data over time on the axis of a graph, what do you choose?
survival tests
what sort of data family is # of “months”?
interval
what sort of data family is “what month were you born in”?
ordinal
What is the name of the survival test in each family?
can all survival tests be shown by a Kaplan Meier curve?…..yes they can
nominal = Log-Rank test
ordinal = Cox Proportional Hazards test
interval = Kaplan-Meier Test
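As a study aid, the correlation and survival cards can be restated as a lookup table. This only mirrors the deck's own mapping; the dictionary and function names are made up.

```python
# Test names exactly as this deck assigns them to each data family.
TESTS_BY_FAMILY = {
    "correlation": {
        "nominal": "Contingency Coefficient",
        "ordinal": "Spearman Correlation",
        "interval": "Pearson Correlation",
    },
    "survival": {
        "nominal": "Log-Rank test",
        "ordinal": "Cox Proportional Hazards test",
        "interval": "Kaplan-Meier test",
    },
}

def pick_test(comparison, family):
    """Look up the deck's test name for a comparison type and data family."""
    return TESTS_BY_FAMILY[comparison][family]

print(pick_test("correlation", "ordinal"))
print(pick_test("survival", "nominal"))
```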
4 key questions to selecting the correct statistical test: Question #3
How many groups are being compared? 2 or 3 or more groups
4 key questions to selecting the correct statistical test: Question #4
Is the data INDEPENDENT or RELATED (PAIRED)? data from the same (paired) or different groups (independent)
buzzwords you will see for paired data: before/after or beginning/end
Name the Nominal tests for:
- 2 groups of independent data
- 3+ groups of independent data
- 2+ groups with EXPECTED cell count of <5
- Pearson’s Chi-square test
ie comparing board scores of 2020 and 2019
-Chi-Square test of Independence
ie comparing board scores of 2019 2020 2021
- Fisher’s Exact test
ie used when your sample is itty bitty small
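A sketch of where the "expected cell count of <5" rule comes from: compute the expected counts for a table, then the chi-square statistic. The pass/fail counts are invented, and the code stops at the statistic without computing a p-value.

```python
def expected_counts(table):
    """Expected count per cell if rows and columns were unrelated."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def needs_fisher(table):
    """True when any expected cell count is below 5."""
    return any(e < 5 for row in expected_counts(table) for e in row)

# Hypothetical pass/fail counts for two class years (rows = years).
big_sample = [[40, 10], [30, 20]]
small_sample = [[3, 1], [1, 3]]

print(expected_counts(big_sample), chi_square(big_sample))
print(needs_fisher(big_sample), needs_fisher(small_sample))
```

The check is on the expected counts, not the observed ones, which is why the word EXPECTED is capitalized on the card.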
What does a kappa test do?
what are +1, 0, and -1?
-a correlation test showing relationship or agreement between evaluators
ie +1 = two observers of a study “classifying” the subject exactly the same way
0 = no relationship between observers’ “classifications”
-1 = observers “classify” everyone exactly the opposite
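A sketch of the +1, 0, and -1 cases, assuming the card means Cohen's kappa for two evaluators (the card just says "kappa"). The ratings are invented.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance.

    Divides by zero if both raters put every subject in one shared category.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same label at random.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (observed - expected) / (1 - expected)

first = ["sick", "sick", "well", "well"]
identical = ["sick", "sick", "well", "well"]
chance_level = ["sick", "well", "sick", "well"]
opposite = ["well", "well", "sick", "sick"]

print(cohen_kappa(first, identical))
print(cohen_kappa(first, chance_level))
print(cohen_kappa(first, opposite))
```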
What is type 1 error?
-rejecting the Null hypothesis when it is actually TRUE and you should not have rejected it
in other words you claim that there is a difference between groups when THERE IS NOT
What is type 2 error?
-Not rejecting Null Hypothesis when it is actually FALSE and you should have rejected it
in other words claiming there is no difference when in fact there is
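The two error types fit in a four-way table of "what is true" against "what you decided". A minimal sketch, with a made-up function name:

```python
def error_type(null_is_true, rejected_null):
    """Name the outcome for one combination of truth and decision."""
    if null_is_true and rejected_null:
        return "type 1 error"      # claimed a difference that is not there
    if not null_is_true and not rejected_null:
        return "type 2 error"      # missed a difference that is there
    return "correct decision"

for truth in (True, False):
    for decision in (True, False):
        print(truth, decision, error_type(truth, decision))
```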
What does the Null Hypothesis mean?
-There “aint” no difference between data sets
p value >0.05 (5%). Reject or don’t reject the Null hypothesis saying there is no difference between data sets?
-Do not reject the null hypothesis
A larger sample size increases the power of a study. What is power?
-ability of a study design to detect a true difference if one truly exists
p-value measures what?
error rate
Any p-value below 0.05 (5%) is statistically significant and is worthy of rejecting the null thus claiming there is a difference. True/False
True
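The decision rule from the p-value cards, as a short sketch. The cards do not say what to do when p is exactly 0.05; treating that case as "do not reject" is an assumption here.

```python
def decide(p_value, alpha=0.05):
    """Apply the cards' rule: below alpha means reject the null hypothesis."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "do not reject the null hypothesis"

print(decide(0.03))
print(decide(0.20))
```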
a p-value is the ___________ if you claim the data show a difference
chance of being wrong
If the confidence interval (CI) crosses the reference number, meaning you have to hop over that number to get from one end of the interval to the other, the result is NEVER statistically significant.
what’s an example?
a CI from 1 to 3 when the reference number is 2: I jumped over the number 2. NOT SIGNIFICANT
a CI from 1.2 to 1.5. SIGNIFICANT because both numbers are on the same side of the number 2.
a CI from 0.4 to 0.5. SIGNIFICANT because both are on the same side of the number 1.
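The "did I hop over the number" check can be sketched in one line of logic. The function name is made up, and the three calls reuse the examples from the card.

```python
def ci_is_significant(lower, upper, reference):
    """Significant only if the reference number lies outside the interval."""
    return not (lower <= reference <= upper)

print(ci_is_significant(1, 3, reference=2))      # interval hops over 2
print(ci_is_significant(1.2, 1.5, reference=2))  # both ends below 2
print(ci_is_significant(0.4, 0.5, reference=1))  # both ends below 1
```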