Biostats_4_Measures of Association Flashcards
Precision takes into account a measurement’s (or set of measurements’) … ?
Reliability
vvvvvvvvvvvvvv
The consistency and reproducibility of a test.
The absence of random variation in a test.
vvvvvvvvvvvvvv
Reliability refers to how similar the data points are to each other: when reliability is low, the data points are more widely dispersed. When reliability is high, the data points are closer together.
How does precision affect the standard deviation (SD)?
SD decreases when the measurements are more precise
Accuracy takes into account a measurement’s (or set of measurements’) … ?
Validity
vvvvvvvvvvvvvv
The closeness of test results to the true values.
The absence of systematic error or bias in a test.
vvvvvvvvvvvvvv
Validity refers to how close the data points are to the true value: when validity is low, the data points do not approximate the true value.
An analysis that renders values such as these would have (high/low) precision/accuracy?
Low reliability and High validity
An analysis that renders values such as these would have (high/low) precision/accuracy?
High reliability and Low validity
An analysis that renders values such as these would have (high/low) precision/accuracy?
Low reliability and Low validity
An analysis that renders values such as these would have (high/low) precision/accuracy?
High reliability and High validity
Random error will impact the … ?
precision in a test.
Systematic error will impact the … ?
accuracy in a test.
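As a rough illustration of the two cards above, the sketch below separates the two kinds of error for a set of repeated measurements: the bias (mean minus true value) stands in for systematic error and accuracy, and the sample SD stands in for random error and precision. The true value and both measurement sets are hypothetical numbers chosen for illustration.

```python
from statistics import mean, stdev

def bias_and_spread(measurements, true_value):
    """Return (bias, sample SD).

    Bias reflects systematic error (accuracy);
    SD reflects random error (precision).
    """
    return mean(measurements) - true_value, stdev(measurements)

TRUE_VALUE = 100  # hypothetical true value

# Hypothetical measurement sets
precise_but_biased = [109, 110, 111]     # tightly clustered, but off target
unbiased_but_imprecise = [80, 100, 120]  # centered on target, widely spread

print(bias_and_spread(precise_but_biased, TRUE_VALUE))
print(bias_and_spread(unbiased_but_imprecise, TRUE_VALUE))
```

The first set has a large bias and a small SD (high reliability, low validity); the second has no bias and a large SD (low reliability, high validity).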
Specificity and sensitivity would relate to precision or accuracy?
Accuracy. Both of these measures (using standardized values) are tests of validity: specificity is the ability of a test to correctly identify those who do not have a certain disease, and sensitivity is the ability of a test to correctly identify those who have the disease.
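A minimal sketch of how these two measures are commonly computed from a 2x2 table of test results against true disease status. The card does not give the formulas, so the standard definitions are assumed here (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)), and the counts are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of people WITH the disease that the test correctly identifies."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of people WITHOUT the disease that the test correctly identifies."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts: 100 diseased and 100 non-diseased people tested
tp, fn = 90, 10   # diseased: test positive / test negative
tn, fp = 80, 20   # non-diseased: test negative / test positive

print(sensitivity(tp, fn))  # 0.9
print(specificity(tn, fp))  # 0.8
```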
What is Attributable Risk (AR)?
The excess incidence of a disease due to a particular factor (exposure). This measure of association is used in cohort studies. AR is also known as the ‘risk difference’ and is the absolute difference in risk between the exposed and unexposed groups.
Formula: AR = Incidence in Exposed - Incidence in Unexposed.
What is another term for attributable risk?
Absolute risk increase
What does Attributable Risk (AR) measure?
Excess risk due to exposure in the exposed group.
vvvvvvvvvvvvvv
The absolute risk attributable to exposure in the exposed group.
vvvvvvvvvvvvvv
Calculated as: incidence rate in the exposed group - incidence rate in unexposed group
How is Attributable Risk (AR) calculated?
AR = | (Incidence in Exposed) - (Incidence in Unexposed) |
For example, 100 people are analyzed.
60 were exposed and 40 were not exposed.
In the exposed group, 50% of the members experienced disease (30 out of 60).
In the unexposed group, 25% of the members experienced disease (10 out of 40).
The AR = (30/60) - (10/40) = 0.5 - 0.25 = 0.25 (or 25%).
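The worked example above can be sketched in a few lines. The function name and parameter names are illustrative choices, not part of the card; the numbers are the ones from the example.

```python
def attributable_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """AR = |incidence in exposed - incidence in unexposed|."""
    incidence_exposed = cases_exposed / n_exposed
    incidence_unexposed = cases_unexposed / n_unexposed
    return abs(incidence_exposed - incidence_unexposed)

# 30 of 60 exposed and 10 of 40 unexposed developed disease
ar = attributable_risk(30, 60, 10, 40)
print(ar)  # 0.25
```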
If a study needed to determine how much an exposure or risk factor has contributed to the incidence of a disease, and the relative risk was provided, what measure of association would be appropriate and how would this be calculated?
This would require the Attributable Risk Percent (AR%), which is the proportion of disease incidence in the exposed group attributable to the exposure.
Formula: AR% = [(RR - 1) / RR] × 100 = (Attributable Risk / Incidence in Exposed) × 100
How is Attributable Risk Percent (AR%) calculated?
AR% = ( [Attributable Risk] / [Incidence in Exposed] ) × 100.
AR% = [(RR - 1) / RR] × 100.
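A short sketch showing that the two forms of the AR% formula agree. It reuses the incidences from the attributable risk example (0.5 in the exposed, 0.25 in the unexposed); relative risk is taken as the ratio of those two incidences, and the function names are illustrative.

```python
def ar_percent_from_incidence(incidence_exposed, incidence_unexposed):
    """AR% = (attributable risk / incidence in exposed) x 100."""
    attributable_risk = incidence_exposed - incidence_unexposed
    return attributable_risk / incidence_exposed * 100

def ar_percent_from_rr(relative_risk):
    """AR% = ((RR - 1) / RR) x 100."""
    return (relative_risk - 1) / relative_risk * 100

incidence_exposed, incidence_unexposed = 0.5, 0.25
rr = incidence_exposed / incidence_unexposed  # 2.0

print(ar_percent_from_incidence(incidence_exposed, incidence_unexposed))  # 50.0
print(ar_percent_from_rr(rr))  # 50.0
```

Under these numbers, half of the disease incidence among the exposed is attributable to the exposure.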
What is Population Attributable Risk Percent (PAR%)?
The proportion of disease incidence in the total population attributable to the exposure.
To determine this, the incidence of the disease in the unexposed group (taken as the baseline risk of developing the disease without the exposure) is subtracted from the incidence of the disease in the entire population (irrespective of whether they were exposed to the risk factor). That difference is then divided by the incidence in the entire population: the numerator is the value obtained after subtracting the baseline risk, and the denominator is the incidence in the entire population.
How is Population Attributable Risk Percent (PAR%) calculated?
To determine population attributable risk percent:
1) First calculate the incidence of the disease in the study population as a whole. For example, if a study population of 100 people (where 60 were smokers and 40 were non-smokers) had 30 individuals from the smoker group and 10 individuals from the non-smoker group who developed respiratory disease or symptoms, then the overall incidence of developing respiratory disease or symptoms in this study population would be 40/100.
2) Next, calculate the difference in risk of developing respiratory disease between the study population as a whole and non-smokers (40/100 - 10/40 = 0.4 - 0.25 = 0.15). To explain this further, 40/100 is the incidence of developing disease or symptoms in the entire population, while 10/40 is the baseline risk in the unexposed group (the risk attributed to random chance).
3) Divide the difference in risk between the two groups by the incidence of respiratory disease in the population as a whole (0.15/0.4 = 0.375). Based on this calculation, 37.5% of the yearly respiratory disease in the study population is attributable to smoking.
PAR% = [(Incidence in Total Population - Incidence in Unexposed) / Incidence in Total Population] × 100.
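The three steps above can be sketched as a single function. The function and parameter names are illustrative; the counts are those of the smoking example (30 of 60 smokers and 10 of 40 non-smokers with disease).

```python
import math

def par_percent(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """PAR% = (incidence in total - incidence in unexposed) / incidence in total x 100."""
    # Step 1: incidence in the study population as a whole
    incidence_total = (cases_exposed + cases_unexposed) / (n_exposed + n_unexposed)
    # Step 2: subtract the incidence in the unexposed group
    incidence_unexposed = cases_unexposed / n_unexposed
    difference = incidence_total - incidence_unexposed
    # Step 3: express the difference as a percentage of the total incidence
    return difference / incidence_total * 100

par = par_percent(30, 60, 10, 40)
print(f"{par:.1f}%")  # 37.5%
```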
What is Number Needed to Treat (NNT)?
The number of patients that need to be treated to prevent one additional adverse outcome.
Formula: NNT = 1 / Absolute Risk Reduction (ARR).
How is Number Needed to Treat (NNT) calculated?
1) First determine the absolute risk reduction, ARR = (Mortality Rate in Control - Mortality Rate in Treatment).
2) Then take the reciprocal of this value.
For example, if patients on a new treatment regimen have a mortality rate of 25/50 = 0.5 over 5 years, whereas patients kept on the conventional regimen have a mortality rate of 75/100 = 0.75, then the absolute risk reduction between the two groups is 0.75 - 0.5 = 0.25. Taking the reciprocal of the absolute risk reduction (1/0.25 = 4) gives the NNT.
Example: ARR = 0.75 - 0.5 = 0.25; NNT = 1 / 0.25 = 4.
Based on this result, we can conclude that we need to treat 4 patients with the new regimen as opposed to the conventional regimen in order for one more patient to survive 5 years.
NNT = 1 / Absolute Risk Reduction (ARR)
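A minimal sketch of the NNT calculation using the mortality rates from the example above. The function name and the check for a non-positive ARR are illustrative additions, not part of the card.

```python
def number_needed_to_treat(rate_control, rate_treatment):
    """NNT = 1 / ARR, where ARR = control event rate - treatment event rate."""
    arr = rate_control - rate_treatment
    if arr <= 0:
        # Assumption for this sketch: NNT is only meaningful when treatment lowers the risk
        raise ValueError("treatment does not reduce risk; NNT is undefined here")
    return 1 / arr

# Conventional regimen: 75/100 deaths; new regimen: 25/50 deaths
nnt = number_needed_to_treat(75 / 100, 25 / 50)
print(nnt)  # 4.0
```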
What insight does Number Needed to Treat (NNT) provide?
Practical insight into the effectiveness of a treatment.
What does Attributable Risk Percent (AR%) show?
The proportion of disease incidence in exposed individuals that is due to the exposure.
What does Population Attributable Risk Percent (PAR%) demonstrate?
The impact of exposure on disease incidence in the entire population.
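How are values distributed in a normal distribution?
Within ±1 SD of the mean = about 68% of all values (34% on each side of the mean)
Within ±2 SD of the mean = about 95% of all values (the band between 1 and 2 SD adds about 14% on each side)
Within ±3 SD of the mean = about 99.7% of all values (the band between 2 and 3 SD adds about 2% on each side)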
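These percentages can be checked with a short sketch. It relies on the standard result that the fraction of a normal distribution within k standard deviations of the mean is erf(k / √2); the function name is an illustrative choice.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k) * 100:.1f}%")
```

The results come out to roughly 68.3%, 95.4%, and 99.7%, which is where the rounded figures on the card come from.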
What is used to determine the accuracy of the mean?
The standard error of the mean (SEM), which indicates how accurately the estimated (sample) mean reflects the population mean.
vvvvvvvvvvvvvvv
The standard error of the mean is a specific kind of standard deviation: while SD describes the dispersion of sample data in relation to its mean, SEM describes the dispersion of the means of different samples around the population mean. As the SD increases or the sample size decreases, SEM will increase.
Would increasing the number of measurements alter the standard deviation?
No, the standard deviation measures the dispersion or spread in data and is an intrinsic property of the population from which the sample is drawn. Increasing the sample size may increase the accuracy of estimating the standard deviation, but it will not change the standard deviation itself.
Would increasing the number of measurements alter the standard error of the mean?
Yes, the standard error of the mean (SEM) is a measure of the dispersion of a random set of sample means around the true population mean. It is dependent on the variability (i.e., standard deviation) of the measured values and the sample size (SEM = SD/√n). By increasing the sample size, the sample means approach the true population mean, resulting in a smaller SEM.
How would a larger standard deviation alter the standard error of the mean?
A greater standard deviation will increase the SEM, resulting in a less accurate estimate.
How would a smaller sample size affect the standard error of the mean?
A smaller sample size will increase the SEM, resulting in a less accurate estimate.
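A small sketch of SEM = SD/√n showing both effects described in the cards above: a larger SD raises the SEM, and a larger sample size lowers it. The SD and sample-size values are hypothetical.

```python
import math

def standard_error(sd, n):
    """SEM = SD / sqrt(n)."""
    return sd / math.sqrt(n)

# Hypothetical values
print(standard_error(sd=10, n=25))   # 2.0
print(standard_error(sd=10, n=100))  # 1.0  (larger sample, smaller SEM)
print(standard_error(sd=20, n=25))   # 4.0  (larger SD, larger SEM)
```

Because of the square root, quadrupling the sample size only halves the SEM.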
When the population mean is subtracted from a sample measurement and the result is divided by the standard deviation, what is the resulting value called, assuming that all measurements follow a normal distribution?
Z-score
vvvvvvvvvvvvvvv
This value expresses data in units of standard deviation: it represents how many standard deviations a particular value lies from the mean. With a z-score, a researcher can compare values between populations with different means and standard deviations.
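A minimal sketch of the z-score calculation, including the comparison across populations mentioned above. The two populations and their means and SDs are hypothetical.

```python
def z_score(value, population_mean, population_sd):
    """z = (value - mean) / SD: distance from the mean in units of SD."""
    return (value - population_mean) / population_sd

# Hypothetical populations with different means and SDs
z_a = z_score(130, population_mean=100, population_sd=15)
z_b = z_score(70, population_mean=50, population_sd=20)

print(z_a)  # 2.0
print(z_b)  # 1.0
```

Although the second raw value sits 20 units above its mean and the first 30 units above its mean, the z-scores put both on a common scale: the first value is more extreme relative to its own population.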
How are confidence intervals (CIs) calculated?
CIs are defined as the mean ± a Z-score multiplied by the standard error of the mean, where the standard error is the standard deviation (SD) divided by the square root of the sample size: CI = Mean ± Z-score × (SD/√sample size). For 95% confidence intervals the Z-score is 1.96, often rounded to 2. A larger sample size or a decreased SD (based on data precision) will decrease the standard error and narrow the CI. Including more disparate data or reducing the sample size will increase the standard error and thus widen the CI.
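A short sketch of the confidence interval calculation, Mean ± Z × (SD/√n). It assumes the conventional Z-score of 1.96 for a 95% interval (commonly rounded to 2); the mean, SD, and sample sizes are hypothetical.

```python
import math

def confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) bounds of mean +/- z * (sd / sqrt(n))."""
    sem = sd / math.sqrt(n)
    margin = z * sem
    return mean - margin, mean + margin

# Hypothetical sample: mean 100, SD 15
low, high = confidence_interval(mean=100, sd=15, n=36)
print(f"n=36:  {low:.1f} to {high:.1f}")

# Same mean and SD with a larger sample gives a narrower interval
low_big, high_big = confidence_interval(mean=100, sd=15, n=144)
print(f"n=144: {low_big:.1f} to {high_big:.1f}")
```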
The average of the squared differences of values in a data set from the mean value is … ?
Variance
vvvvvvvvvvv
The variance allows interpretation of how far a set of data is spread out. A variance of zero means that there is no variability in the values. Largely different numbers in a set of data lead to a large variance.
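The definition on this card (the average of the squared differences from the mean) can be written out directly. This sketch divides by the number of values, as the card describes; the data set is hypothetical.

```python
def variance(values):
    """Average of the squared differences between each value and the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

print(variance(data))       # 4.0
print(variance([3, 3, 3]))  # 0.0 (no variability)
```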
What measure is obtained by dividing the standard deviation by the mean, and what is it used for?
The coefficient of variation (CV)
vvvvvvvvvvvvvvvv
Used to measure and compare the dispersion around the mean of multiple data sets.
vvvvvvvvvvvvvvvv
CV, which is a statistical relative measure of dispersion, allows SD to be interpreted relative to the mean, allowing for the comparison of multiple data sets that may have means of different magnitudes or units of measurement, as CV is dimensionless. A high CV indicates that values are widely spread around the mean.
How does the CV differ from the SD?
SD, which is an absolute measure of dispersion, describes the variability of data in relation to the mean within a single data set. CV is a relative measure of dispersion.
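A small sketch contrasting SD and CV on two hypothetical data sets that have the same SD but very different means. The sample standard deviation (`statistics.stdev`) is used here as an assumption, since the cards do not specify sample versus population SD.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV = SD / mean (dimensionless relative dispersion)."""
    return stdev(values) / mean(values)

# Hypothetical data sets: identical spread, different magnitude
small_scale = [10, 20, 30]
large_scale = [1000, 1010, 1020]

print(stdev(small_scale), stdev(large_scale))  # same absolute dispersion
print(coefficient_of_variation(small_scale))
print(coefficient_of_variation(large_scale))
```

Both sets have an SD of 10, but relative to the mean the first set is far more dispersed, which the CV makes visible.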
In both positively and negatively skewed distributions, what is the new “peak?”
The mode becomes the apex of the curve.
vvvvvvvvvvv
For the positively skewed distributions, the mode (peak) is displaced to the left.
vvvvvvvvvvv
For the negatively skewed distributions, the mode (peak) is displaced to the right.
With negatively skewed distributions, the mean is displaced to the ______
left
vvvvvvvvvvv
The peak is to the right and the tail is to the left.
vvvvvvvvvvv
From left to right: mean (toward the negative tail), median, mode (apex).
In a negatively skewed distribution, which (mean, median, mode) is greatest?
The mode: mean < median < mode
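The ordering can be checked on a small, hypothetical negatively skewed data set (one low outlier forming the left tail, most values clustered at the high end).

```python
from statistics import mean, median, mode

# Hypothetical negatively skewed data: long tail to the left
data = [1, 6, 7, 8, 8, 9, 9, 9]

print(mean(data))    # 7.125
print(median(data))  # 8.0
print(mode(data))    # 9
```

The low outlier drags the mean furthest to the left, the median moves less, and the mode stays at the peak.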