Intro to Biostats in Epi Flashcards

1
Q
Cite and describe the 3 attributes of study variables (data).
A

Order/magnitude

Consistency of scale / equal distances

Rational Absolute Zero

2
Q
Cite and describe the 3 levels/categories of data measurement.
A

Nominal: dichotomous and non-ranked named categories

Ordinal: ordered, ranked categories

Interval: equal-distance numerical scales / units

3
Q

Describe the difference between ratio and interval measurements of data. Give an example of each type of data measurement level.

A

Ratio and interval data are nearly identical; however, interval data do NOT have a value that represents an “absolute zero”, while ratio data do

Ratio data example: “How much money do you earn per hour?” (zero dollars = absolute zero)

Interval data example: “What is the temperature outside each day in December?” (zero degrees Celsius or Fahrenheit is NOT absolute zero because it does not represent the absence of temperature)

4
Q

All statistical tests are based on the ____ _ ____ of the data being compared

A

level of measurement

5
Q

Compare and contrast the terms “discrete” and “continuous” as they relate to data measurement levels

A

Discrete refers to the nominal and ordinal levels of data measurement (named or ranked categories)

Continuous refers to the “equal distance in numerical scales” between values in both the interval and ratio levels of data measurement

6
Q

You ____ go down in the specificity/detail of data measurement level, however you ____ go back up in specificity/detail.

A

Can

Cannot

7
Q

Researchers accept or do not accept the null hypothesis based on ____ _____.

A

statistical analysis

8
Q

What data measurement level does a classic pain scale, commonly used in healthcare settings, fall under?

A

Ordinal

9
Q

The _______ level of data measurement is NEVER given in ranges; there will always be concrete numerical values

A

Interval/Ratio

10
Q

True or False: Nominal data will not be given in categories because nominal data is always given in a dichotomous manner. Explain your answer

A

False

Nominal data can have any number of categories; the categories simply cannot have any order or magnitude in relation to one another (hair color is a good example because there is no magnitude among the categories)

Dichotomously recorded data is an indicator of nominal data; however, that does not mean all nominal data MUST have only 2 categories

11
Q

Give the definitions of mean, median, and mode. Explain the effect that an outlier would have on each value.

A

Mean: the average of the data (outliers affect the mean)

Median: the middle value of the ordered data (outliers have little or no effect on the median)

Mode: the most repeated number in the data range (outliers DO NOT affect this)
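
As a rough illustration of the card above, the sketch below computes all three statistics with Python's standard `statistics` module, once without and once with an extreme value. The wage figures are made-up example data, not taken from any study.

```python
from statistics import mean, median, mode

# Hypothetical hourly wages; the extra value in `with_outlier` is an extreme outlier.
wages = [10, 12, 12, 13, 15]
with_outlier = wages + [200]

for label, data in [("without outlier", wages), ("with outlier", with_outlier)]:
    # Compare how far each measure of central tendency moves.
    print(label, "| mean:", round(mean(data), 2),
          "| median:", median(data), "| mode:", mode(data))
```

Comparing the two printed lines shows how far each statistic moves when a single extreme value is added.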

12
Q

Describe what the IQR of a data set is

A

IQR = Interquartile Range

The range spanned by the middle 50% of the data values (third quartile minus first quartile)

25% of the values on either side of the median
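
A minimal sketch of the IQR calculation, assuming made-up data and the "inclusive" quartile method of Python's `statistics.quantiles`. Other quartile conventions exist and can give slightly different cut points on small data sets.

```python
from statistics import quantiles

# Hypothetical data set, already sorted for readability.
values = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 18]

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = quantiles(values, n=4, method="inclusive")

iqr = q3 - q1  # width of the middle 50% of the data
print("Q1:", q1, "median:", q2, "Q3:", q3, "IQR:", iqr)
```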

13
Q

What 2 calculated values describe the dispersion/spread of a data set?

A

Variance and Standard Deviation

14
Q

Define Variance and Standard Deviation

A

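Variance: the average of the squared differences between each individual measurement and the group’s mean

Standard Deviation (SD): the square root of the variance, expressed in the same units as the data; in a normal distribution, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 SDs of the mean, respectively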
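
The definition above can be worked through by hand. This sketch uses made-up values and shows both the population form (divide by n) and the sample form (divide by n - 1); which one a course or software package reports is an assumption to check.

```python
from math import sqrt

# Hypothetical measurements.
values = [4, 8, 6, 5, 7]
n = len(values)

mean_value = sum(values) / n
# Squared difference between each measurement and the group mean.
squared_diffs = [(x - mean_value) ** 2 for x in values]

population_variance = sum(squared_diffs) / n      # divide by n
sample_variance = sum(squared_diffs) / (n - 1)    # divide by n - 1
sample_sd = sqrt(sample_variance)                 # SD is the square root of the variance

print(population_variance, sample_variance, round(sample_sd, 3))
```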

15
Q

State how to determine a positively or negatively skewed data set using graphs OR given values

A

Positively skewed: When the mean is greater than the median OR a graph with a tail pointing to the right

Negatively skewed: When the mean is less than the median OR a graph with a tail pointing to the left

16
Q

True or False: a positive skewness value means that the data set is positively skewed. Explain your answer

A

True

A positive skewness value indicates a longer right tail (positive skew), and a negative value indicates a longer left tail (negative skew); a value near 0 indicates a roughly symmetric distribution. Graphing the data or comparing its mean and median is still a useful cross-check on the direction of the skew

17
Q

Compare and contrast skewness value and kurtosis

A

Skewness: a measure of the asymmetry of a distribution

A perfectly normal distribution (ideal bell curve) would have a skewness value of 0

Kurtosis: a measure of the extent to which observations cluster around the mean

A perfectly normal distribution (ideal bell curve) would have a kurtosis value of 0 (this is “excess” kurtosis, the form many statistical packages report)
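
A minimal sketch of both measures, assuming the simple moment-based formulas (skewness = m3 / m2^1.5, excess kurtosis = m4 / m2^2 - 3). Statistical packages often apply small-sample corrections, so their output may differ slightly; the function name and data are hypothetical.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (no small-sample correction).

    Raises ZeroDivisionError if every value is identical (zero variance).
    """
    n = len(values)
    mean_value = sum(values) / n
    # Central moments: averages of powers of the deviations from the mean.
    m2 = sum((x - mean_value) ** 2 for x in values) / n
    m3 = sum((x - mean_value) ** 3 for x in values) / n
    m4 = sum((x - mean_value) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # subtract 3 so a normal curve scores 0
    return skewness, excess_kurtosis

symmetric = [1, 2, 3, 4, 5]      # hypothetical symmetric data
right_tailed = [1, 2, 2, 3, 10]  # hypothetical data with a long right tail

print(skewness_and_excess_kurtosis(symmetric))
print(skewness_and_excess_kurtosis(right_tailed))
```

With this formula the symmetric data set gives a skewness of 0, and the data set with the long right tail gives a skewness greater than 0.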

18
Q

Describe what positive and negative kurtosis values indicate

A

Positive Kurtosis: the data cluster more tightly around the mean than in a normal distribution (a more peaked curve)

Negative Kurtosis: the data cluster less tightly around the mean than in a normal distribution (a flatter curve)

19
Q
List the percentages of a population’s data contained within 1, 2, and 3 standard deviations (SDs) of the mean of a normally distributed data set.
A

1 SD = 68%

2 SDs = 95%

3 SDs = 99.7%
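
These percentages can be checked numerically. The sketch below assumes a perfectly normal distribution and uses the standard relationship that the share of values within k SDs of the mean is erf(k / sqrt(2)); the function name is made up for illustration.

```python
from math import erf, sqrt

def share_within_k_sds(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    # Prints roughly 68.3%, 95.4%, and 99.7%.
    print(k, "SD:", round(100 * share_within_k_sds(k), 1), "%")
```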

20
Q

Statistical tests useful for normally distributed data are called _____ tests.

A

parametric

21
Q

Most studies are set up to achieve what percentage of power?

A

80%

22
Q

The following are all keywords for what type of data?

Pre vs post ; before vs after ; baseline vs end

A

paired or related data

23
Q

What is the function of “survival” tests? List the tests that are considered survival tests

A

Survival tests compare the proportion of events over time, or the time to event, between groups

(includes the log-rank, Cox proportional hazards, and Kaplan-Meier tests)

24
Q

What is the function of regression tests? List the tests that are considered to be regression tests

A

Regressions provide a measure of the relationship between variables by allowing prediction of the dependent variable (DV) from the known values of the independent variables

You can also calculate an odds ratio (OR) from a logistic regression

(includes logistic regression, multinomial logistic regression, and linear regression)

25
Q

Compare null and alternative hypotheses

A

Null hypothesis: a research perspective that states that there will be NO true difference between the groups being compared

Alternative hypothesis: a research perspective that states that there WILL be a true difference between the groups being compared

26
Q

Define the p value and explain how you decide whether it is statistically significant

A

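The p value is the probability of observing, due to chance alone, a result as extreme or more extreme than the one actually observed if the null hypothesis were true (that is, if there were no true difference between the groups)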

If a p value is lower than the preselected alpha value (almost always 0.05) then it is considered to be statistically significant
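
As a hedged illustration of the decision rule, the sketch below converts a test statistic into a two-sided p value and compares it with alpha. It assumes the statistic is a z-score that follows a standard normal distribution under the null hypothesis; the function names and example z values are hypothetical.

```python
from math import erfc, sqrt

def two_sided_p_value(z):
    """Probability of a |z| at least this large under a standard normal null."""
    return erfc(abs(z) / sqrt(2))

def is_significant(z, alpha=0.05):
    """Apply the rule: significant when the p value is below the preselected alpha."""
    return two_sided_p_value(z) < alpha

for z in (1.0, 1.96, 2.5):
    print(z, round(two_sided_p_value(z), 4), is_significant(z))
```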

27
Q

Define type 1 errors and give an alternative name for it.

A

Type 1 error: rejecting the null hypothesis when you should have accepted it (there really is no true difference between the groups, but you incorrectly state that there is a difference by rejecting the null hypothesis)

alpha error ; false positive

28
Q

Define type 2 errors and give an alternative name for it.

A

Type 2 error: accepting the null hypothesis when you should have rejected it (there really IS a true difference between the groups, but you incorrectly state that there is not a true difference between the groups by accepting the null hypothesis)

beta error ; false negative

29
Q
Delineate the common elements utilized in determining the sample size of a study.
A

The minimum difference between groups deemed significant (the smaller the difference deemed significant, the greater the sample size needed)

Expected variation of measurement (known/estimated from past studies)

Type 1 and Type 2 error rates and confidence level (usually 90% to 95%)

Anticipated drop-outs or loss to follow-up (added on top of the calculated sample size)
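
One way these elements combine is the common normal-approximation formula for comparing two group means, n per group = 2 (z_alpha/2 + z_beta)^2 SD^2 / difference^2. The sketch below is an illustration under that assumption, not the method of any particular study; the function name and inputs are made up.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(min_difference, sd, alpha=0.05, power=0.80, dropout=0.0):
    """Approximate n per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # Type 1 error rate
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - Type 2 error rate
    n_raw = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / min_difference ** 2
    # Inflate for anticipated drop-outs, then round up to whole participants.
    return ceil(n_raw / (1 - dropout))

print(sample_size_per_group(min_difference=5, sd=10))
print(sample_size_per_group(min_difference=5, sd=10, dropout=0.10))
print(sample_size_per_group(min_difference=2.5, sd=10))
```

Halving the minimum difference roughly quadruples the required sample size, which matches the first element on the card.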

30
Q
Describe how sample size affects power and the ability to detect a difference between populations, if a difference truly exists.
A

The power of a study increases with its sample size (the larger the sample size, the greater the study’s ability to detect a difference IF one is in fact present)
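
To make the relationship concrete, the sketch below estimates power for a two-group comparison of means with a normal approximation. The true difference, SD, and alpha are assumed inputs chosen for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approximate_power(n_per_group, true_difference, sd, alpha=0.05):
    """Normal-approximation power for a two-sided comparison of two means."""
    standard_error = sd * sqrt(2 / n_per_group)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Chance that the test statistic clears the significance threshold.
    return NormalDist().cdf(true_difference / standard_error - z_alpha)

for n in (20, 63, 200):
    print(n, round(approximate_power(n, true_difference=5, sd=10), 3))
```

Under these assumptions, about 63 participants per group gives roughly 80% power, and power rises toward 100% as the groups grow.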

31
Q
Define the differences between parametric and non-parametric statistical tests, listed below, and cite which tests fit under each of these two categories.
A

Parametric statistical tests are applied to data sets that are “normally distributed” and have “equal variances”
(INTERVAL)

Non-parametric statistical tests do not require that the data be normally distributed
Alternatively, the data can be transformed to a standardized value (z-score or log transformation) in an attempt to present the data in a more normally distributed manner
(NOMINAL AND ORDINAL)

32
Q
Define the difference between independent and paired (repeated measures) data measurements and comparisons.
A

Independent data is collected from different groups; no group has the same data measurement conducted more than once

Paired (related) data is collected from the same group more than once; the same group has the same data measurement taken more than once

The same data measurement may be conducted more than once; however, if it is measured on a different group each time, it is still considered independent data.

33
Q
Cite the statistic performed to assess consistency and agreement, within and between individual investigators/evaluators. Interpret the values of +1, 0, and -1 for this statistic.
A

The Kappa statistic measures the agreement between evaluators (or within one evaluator over repeated assessments), corrected for the agreement expected by chance

Kappa Interpretation
+1 means the observers PERFECTLY classify everyone the exact same way

0 means there is no relationship at all between the observers’ classifications; any agreement between observers is no better than would be expected by chance

-1 means that the observers classify everyone exactly the OPPOSITE of each other
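
The three reference values can be reproduced with Cohen's kappa for two raters, kappa = (observed agreement - chance agreement) / (1 - chance agreement). This is a minimal sketch with made-up yes/no ratings; the function name is hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters classifying the same subjects.

    Undefined (ZeroDivisionError) when both raters use a single identical category.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's category frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in set(rater_a) | set(rater_b)) / n ** 2
    return (observed - expected) / (1 - expected)

rater_a = ["yes", "yes", "no", "no"]
print(cohens_kappa(rater_a, ["yes", "yes", "no", "no"]))  # identical ratings
print(cohens_kappa(rater_a, ["yes", "no", "yes", "no"]))  # chance-level agreement
print(cohens_kappa(rater_a, ["no", "no", "yes", "yes"]))  # opposite ratings
```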

34
Q

When you see the phrase “mean length of time” what level of data measurement should you be thinking?

A

interval

35
Q

When the Levene’s test yields a p value of _____, what do you need to do? (fill in the blank and answer the question)

A

less than 0.05

go down a level of data measurement

36
Q

When you see the phrase “time-to-event” what level of data measurement should you be thinking?

A

Interval (because it is a numerical value that measures time, e.g., days, hours, minutes, etc.)

37
Q

If you are working with interval data, and the data is ____ evenly distributed, what must you do? (fill in the blank and answer the question)

A

NOT

you must “step down” a level of data measurement to ordinal

38
Q

The buzzword “between” points to _____ and the buzzword “within” points to _____.

A

independent data

related data

39
Q

Define power

A

the statistical ability of a test to detect a difference between groups IF there is one present

40
Q

What is the purpose of the Levene’s test?

A

to decide whether the groups being compared have equal variances (or not)

41
Q

If you are trying to calculate the exact range for 1 standard deviation, how exactly would you calculate it if you are provided the mean, median, and SD for the data set?

A

for 1 SD: add and subtract the SD value from the mean in order to find the upper and lower ranges

(for 2 or 3 SDs, simply subtract and add the SD the appropriate number of times)
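
The arithmetic above, written out as a small sketch. The mean and SD are made-up values, and the function name is hypothetical; note that the median is not needed for this calculation.

```python
def sd_range(mean_value, sd, k=1):
    """Lower and upper bounds of the range within k SDs of the mean."""
    return mean_value - k * sd, mean_value + k * sd

# Hypothetical data set summary: mean 120, SD 10.
for k in (1, 2, 3):
    print(k, "SD:", sd_range(120, 10, k))
```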

42
Q

A p value is determined based on the probability of observing, ___ _ ____ ____, a test statistic value as extreme or more extreme than actually observed if groups were similar.

A

Due to chance alone

43
Q

Confidence Intervals are calculated based on what 2 factors?

A

Variance in the sample (V/SD)

Sample Size (N)
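
A minimal sketch of how the two factors enter the calculation, assuming a confidence interval for a single mean built with the normal approximation (mean ± z × SD / sqrt(N)). The function name and inputs are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(mean_value, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    margin = z * sd / sqrt(n)                  # more spread widens it, larger N narrows it
    return mean_value - margin, mean_value + margin

print(mean_confidence_interval(100, sd=15, n=25))
print(mean_confidence_interval(100, sd=15, n=100))  # larger N, narrower interval
print(mean_confidence_interval(100, sd=30, n=25))   # larger SD, wider interval
```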

44
Q

State the numbers that a CI must “cross” in order to be deemed non-significant for the following data types

Ratios (OR,RR,HR):

Absolute differences:

A

Ratios (OR,RR,HR): 1.0

Absolute differences: 0.0
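
The rule can be expressed as a short check. The sketch below is illustrative only; the function name, the `kind` labels, and the example intervals are hypothetical.

```python
def ci_is_significant(lower, upper, kind):
    """Return True when the CI does NOT cross the 'no effect' value.

    kind: "ratio" for OR/RR/HR (no effect = 1.0),
          "difference" for absolute differences (no effect = 0.0).
    """
    if kind not in ("ratio", "difference"):
        raise ValueError("kind must be 'ratio' or 'difference'")
    null_value = 1.0 if kind == "ratio" else 0.0
    return not (lower <= null_value <= upper)

print(ci_is_significant(0.62, 0.91, "ratio"))       # does not cross 1.0
print(ci_is_significant(0.85, 1.20, "ratio"))       # crosses 1.0
print(ci_is_significant(-2.0, 3.5, "difference"))   # crosses 0.0
```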

45
Q

State the 3 crucial pieces of information that need to be included when correctly interpreting a CI value.

A

Which group is being compared to which

Direction of the ratio (greater or less than the comparison group) and its Magnitude

State whether or not it is statistically significant