Lecture 6 - Biostatistics Part 2-2 Flashcards
Look at the table on slide 3… VERY IMPORTANT TO UNDERSTAND
Did you do it… seriously did you do it?
Measures of Central Tendency
Mean - the “average” – sum of the set divided by the number in the set
Median – the middle point (arrange the data smallest to largest, then find the middle point)
Mode – the score that occurs most frequently in a set of data
—–May have two most common values = “bimodal distribution”
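A quick way to check these three by hand is a minimal Python sketch like the one below. The scores are made-up illustration values, not data from the lecture.

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # illustrative values only

mean = statistics.mean(scores)        # sum of the set divided by the number in the set
median = statistics.median(scores)    # middle point of the sorted data
modes = statistics.multimode(scores)  # most frequent value(s), returned as a list

# Two values tied for most common -> a "bimodal" set
bimodal = statistics.multimode([1, 1, 2, 3, 3])

print(mean, median, modes, bimodal)
```

With an even number of values, the median is the average of the two middle values.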
quantifies the amount of variability, or spread, around the mean of the measurements.
To calculate: take each difference from the mean, square it, and then average the result
Variance (σ²):
a measure of variation of scores about the mean
Standard deviation (σ):
To calculate: take the square root (√) of the variance
the “average distance” to the mean
Standard deviation (σ):
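The two "To calculate" cards above can be sketched step by step in Python. This follows the cards literally (averaging over all n values, i.e. the population variance); a sample variance would divide by n - 1 instead. The data values are made up.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only

mean = sum(data) / len(data)

# Take each difference from the mean, square it, then average the result
squared_diffs = [(x - mean) ** 2 for x in data]
variance = sum(squared_diffs) / len(data)

# Standard deviation is the square root of the variance
sd = math.sqrt(variance)

# Cross-check against the standard library's population variance
assert statistics.pvariance(data) == variance

print(variance, sd)
```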
In practice, the standard deviation is used more frequently than the variance.
T or F?
True
Heterogeneous group vs homogeneous group?
When comparing two groups, the group with the larger standard deviation exhibits a greater amount of variability (heterogeneous), while the group with the smaller standard deviation has less variability (homogeneous).
Evaluates the strength of linear relationships or associations between variables
X increases and Y increases = positive correlation
X increases and Y decreases = negative correlation
0 is no correlation
Scatterplots
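To see positive, negative, and zero correlation as numbers, here is a minimal sketch of the Pearson correlation coefficient. Using Pearson's r is an assumption here (the cards do not name a specific coefficient), and `pearson_r` plus all data values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # X up, Y up
negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # X up, Y down
none = pearson_r([1, 2, 3, 4, 5], [2, 1, 0, 1, 2])  # no linear trend

print(positive, negative, none)
```

Note that the last data set has a clear V shape but r is about 0: the coefficient only measures linear relationships.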
The statement that establishes a relationship between variables being assessed
Example: In a clinical trial the hypothesis states the new drug is better than placebo
Alternative hypothesis (Ha or H1)
The statement of no difference or no relationship between the variables
Example: In a clinical drug trial the null hypothesis states that the new drug is no better than placebo
Null hypothesis (H0)
More important than p value – a better determination of significance
Confidence interval (CI)
Any statistic is simply an estimate of the true value of that statistic
Confidence interval (CI)
_____ produces a range within which the true value most likely lies
Confidence interval (CI)
95% CI states that we can be 95% certain that the “true” value is within the CI range
Meaning what is better?
Narrower CI is better
If the CI includes 1 (the null value for a ratio such as relative risk or odds ratio) then the results are…
not statistically significant
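A rough sketch of how a CI for a ratio might be checked against the null value of 1. The log-method standard error for relative risk is an assumption here (the cards give no formula), `relative_risk_ci` is a hypothetical helper, and the counts are made up.

```python
import math

def relative_risk_ci(a, n1, c, n2, z=1.96):
    """Relative risk with an approximate 95% CI (log method).

    a of n1 treated subjects had the event; c of n2 controls did.
    """
    rr = (a / n1) / (c / n2)
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Made-up counts: 30/100 events in one group, 20/100 in the other
rr, lower, upper = relative_risk_ci(a=30, n1=100, c=20, n2=100)

# If the interval contains 1, "no difference" cannot be ruled out
includes_null = lower <= 1 <= upper
print(round(rr, 2), round(lower, 2), round(upper, 2), includes_null)
```

A larger sample would shrink the standard error and give a narrower interval around the same point estimate.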
______ is used to separate from a large group of apparently well persons those who have a high probability of having the disease, so that they may be given a diagnostic work up, and if diseased can be treated.
A screening test
In general, screening is performed only when the following conditions are met:
The target disease is an important cause of mortality and morbidity.
A proven and acceptable test exists to detect individuals at an early stage of disease.
There is a treatment available to prevent mortality and morbidity once positives have been identified.
The proportion of people with the disease who have a positive test for the disease.
Sensitivity
The ability of the test to identify correctly those who have the disease.
Sensitivity
– The proportion of people without the disease who have a negative test.
Specificity
The ability of the test to identify correctly those who do not have the disease
Specificity
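Both definitions come straight from a 2x2 table. A minimal sketch, with hypothetical function names and made-up counts:

```python
def sensitivity(true_pos, false_neg):
    """Proportion of people WITH the disease who test positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of people WITHOUT the disease who test negative."""
    return true_neg / (true_neg + false_pos)

# Made-up counts: 100 diseased (80 test positive), 100 healthy (90 test negative)
sens = sensitivity(true_pos=80, false_neg=20)
spec = specificity(true_neg=90, false_pos=10)
print(sens, spec)
```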
A test with high _____ will not miss many patients who have the disease
sensitivity
A highly useful test when NEGATIVE
sensitivity
Screening test’s ability to identify presence of disease
Sensitivity
Tends to rule OUT the disease
Sensitivity
High ____ means low probability of false negative
Sensitivity
Screening test’s ability to truly identify absence of disease
That is, how likely is a person without the disease to test negative?
Specificity
A highly useful test when it is POSITIVE
Specificity
Tends to rule IN the disease
Specificity
High _____ means low probability of false positive
Specificity
In practice, a compromise is reached and the cutoff point is set, leading to false-positives and false-negative results.
In a single test sensitivity may be increased but only at the expense of specificity, and similarly specificity may be increased at the expense of sensitivity.
In a single test sensitivity may be increased but only at the expense of ______
specificity, and similarly specificity may be increased at the expense of sensitivity.
A highly sensitive test is most useful to the clinician when it is
NEGATIVE
A highly specific test is most useful to the clinician when it is
POSITIVE
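The cutoff trade-off can be seen with a toy example. The sketch below assumes a test where values at or above the cutoff count as positive; the test values and `sens_spec` are hypothetical.

```python
# Made-up test values for people with and without the disease
diseased = [110, 125, 130, 140, 150]
healthy = [90, 100, 105, 115, 128]

def sens_spec(cutoff):
    """Sensitivity and specificity when values >= cutoff are called positive."""
    true_pos = sum(v >= cutoff for v in diseased)
    true_neg = sum(v < cutoff for v in healthy)
    return true_pos / len(diseased), true_neg / len(healthy)

# Raising the cutoff trades sensitivity for specificity
for cutoff in (105, 120, 135):
    print(cutoff, sens_spec(cutoff))
```

The lowest cutoff catches every diseased person but flags most healthy people; the highest does the reverse.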
read slides 19-23 and make sure you can interpret the data correctly
it’s only your grade dummy
Sequential (Two-Stage) Testing
Allows us to calculate the net sensitivity and net specificity of using both tests in sequence. After completing both tests there is a loss in…..
net sensitivity and net gain in specificity.
in ________, a less expensive, less invasive, or less uncomfortable test is generally performed first, and those who screen positive are recalled for further testing with a more expensive, more invasive, or more uncomfortable test, which may have greater sensitivity and specificity. It is hoped that bringing back for further testing only those who screen positive will reduce the problem of false positives.
Sequential (Two-Stage) Testing
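A sketch of the net calculation, assuming only people positive on test 1 go on to test 2 and that the two tests err independently. These formulas are the usual textbook ones, not taken from the cards, and the input values are made up.

```python
def net_sensitivity(sens1, sens2):
    """A diseased person must test positive on BOTH tests to be caught."""
    return sens1 * sens2

def net_specificity(spec1, spec2):
    """A healthy person is cleared by test 1, or by test 2 after a false positive."""
    return spec1 + (1 - spec1) * spec2

net_sens = net_sensitivity(0.70, 0.90)
net_spec = net_specificity(0.80, 0.90)
print(round(net_sens, 2), round(net_spec, 2))
```

Net sensitivity ends up lower than either test alone, and net specificity higher, which matches the card.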
= proportion of patients with a positive test who actually HAVE the disease
Positive Predictive Value (PPV)
= proportion of patients with a negative test who actually DO NOT HAVE the disease
Negative Predictive Value (NPV)
Percent of patients with positive test who actually have the disease
Positive Predictive Value (PPV)
Assesses reliability of positive test
i.e., PPV 90% = a positive test is correct 90% of the time
Positive Predictive Value (PPV)
With low prevalence (% of population) of disease:
Lower PPV
False positives increase
Less reliable positive test result
Percentage of patients with a negative test who actually do NOT have the disease
Negative Predictive Value (NPV)
Assesses reliability of a negative test
i.e., NPV 90% = a negative test is correct 90% of the time
Negative Predictive Value (NPV)
With low prevalence (% of population) of disease:
Higher NPV
False negatives decrease
A negative test result is more reliable
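The effect of prevalence on PPV and NPV can be worked out from sensitivity, specificity, and prevalence using Bayes' rule. A minimal sketch with hypothetical function names and made-up test characteristics:

```python
def ppv(sens, spec, prevalence):
    """Probability of disease given a positive test."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def npv(sens, spec, prevalence):
    """Probability of no disease given a negative test."""
    true_neg = spec * (1 - prevalence)
    false_neg = (1 - sens) * prevalence
    return true_neg / (true_neg + false_neg)

# Same test (90% sensitive, 90% specific) at two prevalences
for prev in (0.50, 0.01):
    print(prev, round(ppv(0.9, 0.9, prev), 3), round(npv(0.9, 0.9, prev), 3))
```

The test itself does not change between the two runs; only the population does.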
________ - the occurrence, rate, or frequency of new cases of a disease
Obtained from cohort studies
Must follow a cohort through time
Incidence
Incidence is obtained from?
cohort studies in which a cohort was followed throughout time
______ - the number of occurrences (existing cases) at one particular time
Obtained from cross-sectional studies
No time line, only a snap shot
Prevalence
_____ is obtained from cross-sectional studies (no timeline)
prevalence
slide 34…. Relationship between Incidence and Prevalence
understand it
How does a treatment that prevents death affect incidence or prevalence?
Incidence does not increase or decrease, but prevalence increases because people with the disease are no longer dying
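One way to make this concrete is the common steady-state approximation prevalence ≈ incidence × average duration of disease. That approximation is an assumption here (it may or may not be what slide 34 shows), and the numbers are made up.

```python
def approx_prevalence(incidence_per_1000_per_year, avg_duration_years):
    """Steady-state approximation: prevalence ~ incidence x average duration."""
    return incidence_per_1000_per_year * avg_duration_years

incidence = 10  # new cases per 1000 people per year (made-up)

# Before treatment: patients live 2 years with the disease on average
before = approx_prevalence(incidence, avg_duration_years=2)

# After a treatment that prevents death: same incidence, longer duration
after = approx_prevalence(incidence, avg_duration_years=5)

print(before, after)  # existing cases per 1000 people
```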
A method of predicting change in the dependent variable by changing one or more independent variables
Regression analysis
Allows the researcher to explore the relationship between two continuous variables
Regression analysis
What % of variation in the dependent variable can be explained by a change in the independent variable
Regression analysis
Example: when you GAIN 8 pounds, your SBP can be expected to increase by ~6 mmHg
Or, if you LOSE 8 pounds, your SBP can be expected to DROP by ~6 mmHg
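The weight and SBP example corresponds to a slope of about 0.75 mmHg per pound. A minimal least-squares sketch follows; the data points are made up to match that slope exactly, and `fit_line` and `r_squared` are hypothetical helpers.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one independent variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys, slope, intercept):
    """Share of variation in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

weight_change_lb = [0, 4, 8, 12]  # independent variable (made-up)
sbp_change_mmhg = [0, 3, 6, 9]    # dependent variable (made-up)

slope, intercept = fit_line(weight_change_lb, sbp_change_mmhg)
predicted_for_8_lb = slope * 8 + intercept
r2 = r_squared(weight_change_lb, sbp_change_mmhg, slope, intercept)
print(slope, predicted_for_8_lb, r2)
```

Real data would scatter around the line, so the explained share of variation would fall below 1.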
Four types of Data:
Categorical (Nominal, Ordinal) and Continuous (Interval, Ratio)
Categorical Data
_____ named categories with no implied order
—-Gender, race, ABO blood type, group
Nominal
Categorical Data
_____ sequenced or ranked data
—–Smallest to largest, lightest to heaviest, easiest to most difficult
Ordinal
Continuous Data
_____ intervals along the scale are equal to one another (i.e. integers)
Set on an underlying continuum that allows you to talk about how much higher one value is than another
0 on the scale does not mean the absence of the item (e.g., degrees Fahrenheit)
interval
Continuous Data
______ characterized by the presence of absolute zero on the scale
An absence of any of the trait being measured (e.g. weight)
Most precise
Ratio
what is the most precise form of data?
Ratio data (a type of continuous data)