Katie EBM Midterm Flashcards
Descriptive Statistics
The presentation, organization and summarization of data
Frequently used graphical displays
- study design flow chart
- Kaplan-Meier estimators
- Forest Plot (line down the center divides 2 treatment arms)
- Line graph
- Histogram
Inferential Statistics
Allows researchers to generalize from a sample of data to a larger group or population
Key variables of Inferential Statistics
- Sample size (larger is better)
- Standard deviation (smaller is better)
Dependent Variable
The outcome of interest (changes in response to intervention)
Independent Variable
The intervention (what is being manipulated by the researcher)
Discrete variable
- Variables that can only take on a finite number of values.
- All qualitative variables are discrete
- Some quantitative variables are discrete, such as performance rated as 1,2,3,4, or 5, or temperature rounded to the nearest degree
continuous variable
may take any value, within a defined range
Nominal Data
used for labeling variables, without any quantitative value. “Nominal” scales could simply be called “labels.”
“nominal” sounds a lot like “name” and nominal scales are kind of like “names” or labels
ex: male vs female or hair color
Ordinal Data
The order of the values is what’s important and significant, but the differences between values are not known. Typically measures non-numeric concepts like satisfaction, happiness, discomfort
ex: very unsatisfied, mildly unsatisfied, neutral, mildly satisfied, very satisfied
Interval Data
- Numeric scales in which we know the order and also the exact differences between the values.
- The classic example of an interval scale is Celsius temperature because the difference between each value is the same
- No “true zero.” For example, there is no such thing as “no temperature.” Without a true zero, it is impossible to compute ratios. With interval data, we can add and subtract, but cannot multiply or divide
Ratio Data
Tell us about the order, the exact value between units, AND they also have an absolute zero–which allows for a wide range of both descriptive and inferential statistics to be applied
Example: weight or height
Proportion
type of fraction in which the numerator is a subset of the denominator
Rate
fraction that contains a time component
Percentage
a form of proportion where the denominator is artificially set to 100
Central Tendency
a central or typical value for a probability distribution
Mean
Measure of central tendency for interval and ratio data
Median
Value such that half of the data points are above and half are below
Mode
Most frequently occurring category
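A minimal sketch of the three measures of central tendency, using Python's standard `statistics` module. The pain scores are made-up illustration data, not from any study.

```python
from statistics import mean, median, mode

# Hypothetical sample: pain scores (0-10) from nine patients.
scores = [2, 3, 3, 4, 5, 5, 5, 7, 9]

print(mean(scores))    # arithmetic average (43 / 9)
print(median(scores))  # middle value of the sorted data
print(mode(scores))    # most frequently occurring value
```

Here the median and mode are both 5, while the mean is pulled slightly lower by the small values.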
Steps in Appraising The Evidence About Therapy
- Validity (can I trust the information)
- Important (Will the information, if true, make an important difference?)
- Applicability (Can I use this information?)
Validity
- Are the groups balanced?
- Were the groups randomized?
- Was randomization concealed?
- Did experimental and control groups begin with similar prognosis?
- To what extent was the study blinded?
- Was follow-up complete?
Importance
How large was the treatment effect?
How precise was the estimate of the treatment effect?
Applicability
Patients like yours?
Benefits worth the harms and costs?
Confounding variable, Confounder
a factor that distorts the true relationship of the study variable of interest by virtue of also being related to the outcome of interest
Selection Bias
systematic differences between comparison groups attributable to the manner in which subjects were allocated to experimental and control groups
Contamination
subjects in either the experimental or control group receive part or all of the intervention intended for the other arm of the study
Expectation Bias
awareness of or information about the intervention influences participants' expectations regarding results and outcomes
Key Concepts about appraising the evidence:
- the size and precision of the treatment effects determines the importance of the results
- who was enrolled and what was measured are the most important determinants of applicability
Why is normal distribution important?
- Many statistical tests assume normal distribution
- the mean and variance are independent
- It’s held that many natural phenomena are normally distributed
- Central Limit Theorem
Central Limit Theorem
if we draw equally sized samples from a non-normal distribution, the distribution of the means of these samples will still be approximately normal as long as the samples are large enough
How large is large enough for central limit theorem sample size?
About 30 (a common rule of thumb)
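A small simulation can illustrate the central limit theorem. This is only a sketch: the exponential population, the seed, the number of draws, and the helper names `sample_means` and `share_within_one_sd` are all assumptions chosen for illustration.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_means(n, draws=2000):
    """Means of `draws` samples, each of size n, from a skewed (exponential) population."""
    return [mean([random.expovariate(1.0) for _ in range(n)]) for _ in range(draws)]

def share_within_one_sd(values):
    """Fraction of values within one SD of their mean (about 0.68 for a normal curve)."""
    m, s = mean(values), stdev(values)
    return sum(abs(v - m) <= s for v in values) / len(values)

small = sample_means(2)   # tiny samples: means are still skewed
large = sample_means(30)  # n = 30: means look much closer to normal

for label, means in (("n=2", small), ("n=30", large)):
    print(label, round(mean(means), 2), round(stdev(means), 2),
          round(share_within_one_sd(means), 2))
```

Both sets of sample means center near the population mean of 1.0, but the n = 30 means are far less spread out, and their share within one SD should sit close to the 0.68 expected of a normal curve.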
Standard Score (more commonly referred to as Z-score)
Very useful statistic because it:
(a) allows us to calculate the probability of a score occurring within our normal distribution and
(b) enables us to compare two scores that are from different normal distributions.
How to calculate Z-score
(raw score - mean) / standard deviation
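A short sketch of the Z-score formula and its two uses. The exam means and standard deviations are hypothetical; `NormalDist` is from Python's standard `statistics` module (Python 3.8+).

```python
from statistics import NormalDist

def z_score(raw, mean, sd):
    """Standard score: how many standard deviations `raw` lies from the mean."""
    return (raw - mean) / sd

# Hypothetical scores from two exams with different scales.
z_a = z_score(85, mean=70, sd=10)     # exam A
z_b = z_score(620, mean=500, sd=100)  # exam B

# (b) Compare scores from different normal distributions: the larger z is
#     the better result relative to its own group.
print(z_a, z_b)

# (a) Probability of scoring at or below the exam A score.
print(round(NormalDist().cdf(z_a), 4))
```

The exam A score is 1.5 SD above its mean and the exam B score is 1.2 SD above its mean, so the exam A result is relatively stronger even though the raw number is smaller.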
Properties of Normal Curve
- The mean, median and mode all have the same value
- The curve is symmetric around the mean
- The tails of the curve approach but never cross x-axis
- Theoretical, not realistic
Confidence Intervals
The range of numerical values in which we can be confident (to a computed probability, such as 90 or 95%) that the population value being estimated will be found. Confidence intervals indicate the strength of evidence; where confidence intervals are wide, they indicate less precise estimates of effect
Precision
The range in which the best estimates of a true value approximate the true value
Larger group effects on CI and precision
larger groups= smaller CI, higher precision
Smaller group effects on CI and precision
smaller groups= larger CI, less precision
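The effect of group size on the confidence interval can be sketched with the usual normal-approximation formula for a mean, mean ± 1.96 × SD / √n. The formula choice, the helper name `ci95`, and the blood pressure numbers are assumptions for illustration only.

```python
from math import sqrt

def ci95(sample_mean, sd, n):
    """Approximate 95% CI for a mean: mean +/- 1.96 * sd / sqrt(n)."""
    margin = 1.96 * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical systolic blood pressure data: mean 120, SD 15.
for n in (25, 100, 400):
    low, high = ci95(120, 15, n)
    print(n, round(low, 1), round(high, 1), "width:", round(high - low, 2))
```

Each fourfold increase in group size halves the width of the interval, which is the "larger groups = smaller CI, higher precision" relationship.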
Probability
deals with the relative LIKELIHOOD that a certain event will or will not occur, relative to some other events
Empirical Probability
- based on past performance, holds true now and in future only under similar circumstances
- if the circumstances have changed, then the probabilities no longer hold
- accounts for most probabilities in medicine
Mutually Exclusive Events
Events that cannot occur at the same time; the occurrence of one rules out the other
ex: coin toss heads/tails
Conditionally Probable Events
the probability of an event ( A ), given that another ( B ) has already occurred
-calculated using multiplicative law
Independent Events
The probability that one event occurs in no way affects the probability of the other event occurring.
example: roll a die and flip a coin
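The probability rules on these cards can be worked through with exact fractions. The die-and-coin case follows the card's example; the two-aces example is a hypothetical added only for illustration.

```python
from fractions import Fraction

# Independent events: P(A and B) = P(A) * P(B)
p_six = Fraction(1, 6)    # roll a six on a fair die
p_heads = Fraction(1, 2)  # flip heads on a fair coin
p_six_and_heads = p_six * p_heads

# Conditional events (multiplicative law): P(A and B) = P(B) * P(A | B)
# Hypothetical: draw two aces from a 52-card deck without replacement.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)
p_two_aces = p_first_ace * p_second_ace_given_first

# Mutually exclusive events: P(A or B) = P(A) + P(B)
p_heads_or_tails = Fraction(1, 2) + Fraction(1, 2)

print(p_six_and_heads, p_two_aces, p_heads_or_tails)
```

The results are 1/12 for a six together with heads, 1/221 for two aces in a row, and 1 for heads or tails on a single toss.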
Intention to Treat
A method for data analysis in a randomized clinical trial in which individual outcomes are analyzed according to the group to which they have been randomized, even if they never received the treatment they were assigned.
By simulating practical experience it provides a better measure of effectiveness
ITT: what does a high rate of noncompliance lead to?
underestimate of effectiveness
ITT Pros
- preserves sample size
- reflects the practical clinical scenario
- gives an unbiased estimate of TX effect
- Limits analysis based on arbitrary subgroups
- Minimizes risk of Type 1 error (false positive)
ITT cons
- estimate of tx effect is generally conservative because of dilution from noncompliance
- heterogeneity may be introduced into the RCT if compliant, noncompliant and dropouts are analyzed together
- susceptible to Type II error (false negative)
Null Hypothesis
States that there is no difference
Researcher tries to disprove the null hypothesis
Type 1 Error
Incorrect rejection of a true null hypothesis
“False Positive”
-this is less likely using ITT analysis (because ITT is a conservative approach)
Type 2 Error
Failure to reject a false null hypothesis
“False negative”
-occurs when analysis is too cautious
ITT methodology
- “once randomized, always randomized”
- ignores noncompliance, withdrawal, anything that happens AFTER randomization
Alpha Level
The probability of a type I error
Beta Level
Probability of a Type 2 error
P-Value
- helps you determine the significance of your results
- small p-value (typically ≤ 0.05) indicates strong evidence against the null hypothesis, so you reject the null hypothesis
- A large p-value (> 0.05) indicates weak evidence against the null hypothesis, so you fail to reject the null hypothesis
Absolute Risk (AR)
-The observed or calculated probability of an event in the population under study
How to calculate: the number of events in treated or control groups, divided by the number of people in that group (a/b, c/d)
Absolute Risk Reduction (ARR)
the difference in the rates of adverse events between study and control populations
How to calculate:
(AR of control group) - (AR of treatment group)
Relative Risk
The ratio of risk in the treated group to the risk in the control group
How to calculate:
(AR of treatment group) / (AR of control group)
Relative Risk Reduction
the percent reduction in events in treated compared to controls
How to calculate:
((AR control group) - (AR treatment group)) / (AR control group)
Number Needed to Treat
1/absolute risk reduction
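A worked sketch of the risk measures, under stated assumptions: the trial counts are hypothetical, `risk_measures` is an illustrative helper name, and relative risk is computed as treated over control, following the definition on the Relative Risk card.

```python
def risk_measures(events_tx, n_tx, events_ctrl, n_ctrl):
    """Risk measures from event counts in the treatment and control groups."""
    ar_tx = events_tx / n_tx        # absolute risk, treatment group
    ar_ctrl = events_ctrl / n_ctrl  # absolute risk, control group
    arr = ar_ctrl - ar_tx           # absolute risk reduction
    return {
        "AR_tx": ar_tx,
        "AR_ctrl": ar_ctrl,
        "ARR": arr,
        "RR": ar_tx / ar_ctrl,      # relative risk (treated / control)
        "RRR": arr / ar_ctrl,       # relative risk reduction
        "NNT": 1 / arr,             # number needed to treat
    }

# Hypothetical trial: 10 of 100 treated patients and 20 of 100 controls have the event.
for name, value in risk_measures(10, 100, 20, 100).items():
    print(name, round(value, 3))
```

With these numbers, AR is 0.10 vs 0.20, ARR is 0.10, RR is 0.5, RRR is 0.5 (a 50% reduction), and NNT is 10. Note that RRR equals 1 - RR.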
Specificity
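among patients without disease, the probability of a negative test
Specific test when Positive rules In disease: if the result of a highly specific test is positive, you can be nearly certain that the patient actually has the disease (Ex: gallop murmur= CHF)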
100% specific=
positive= has disease!
Sensitivity
among patients with disease, the probability of a positive test
Sensitive test when Negative rules Out disease (ex: neg D-dimer= No PE)
100% sensitive=
Negative= doesn’t have disease!
Positive Predictive Value
the probability that subjects with a positive screening test truly have the disease
Negative Predictive Value
the probability that subjects with a negative screening test truly don’t have the disease
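The four test characteristics can all be read off a 2x2 table of test result vs. disease status. A minimal sketch with hypothetical counts; `diagnostic_stats` and its argument names are illustrative, not from the source.

```python
def diagnostic_stats(tp, fp, fn, tn):
    """Test characteristics from true/false positive and negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positive test among those with disease
        "specificity": tn / (tn + fp),  # negative test among those without disease
        "ppv": tp / (tp + fp),          # disease among those testing positive
        "npv": tn / (tn + fn),          # no disease among those testing negative
    }

# Hypothetical screening of 1000 people, 100 of whom have the disease.
stats = diagnostic_stats(tp=90, fp=90, fn=10, tn=810)
for name, value in stats.items():
    print(name, round(value, 3))
```

In this made-up population the test is 90% sensitive and 90% specific, yet only half of the positive results are true positives, because the disease is uncommon (10% prevalence).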
The Law Of At Least One
used to determine the probability of at least one event occurring
Lessons to Remember in regards to “law of at least one”
1) There are no perfect tests
2) the more tests you run, the higher the rate of at least one erroneous result
3) limit the number of tests that you order
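The law of at least one can be sketched as 1 - (probability of no events)^n. This assumes the tests are independent and share the same error rate; the 5% false-positive rate and the helper name `p_at_least_one` are hypothetical.

```python
def p_at_least_one(p_single, n):
    """P(at least one event in n independent trials) = 1 - (1 - p)^n."""
    return 1 - (1 - p_single) ** n

# Hypothetical: each test has a 5% chance of a false-positive result.
for n in (1, 5, 10, 20):
    print(n, round(p_at_least_one(0.05, n), 3))
```

Under these assumptions the chance of at least one erroneous result climbs from 5% with a single test to roughly 40% with 10 tests and roughly 64% with 20, which is the reason to limit the number of tests ordered.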
Binomial Distribution Definition
the probabilities for dichotomous variables
(ex: coin toss, mortality)
Unlike normal distribution because normal distribution is based on a continuous variable (such as blood pressure)
Size of Binomial Distribution
The larger the sample size (“n”) the more the binomial distribution shifts to the right. The more it shifts to the right, the more closely it resembles the normal distribution.
-larger sample sizes, even though dichotomous, can use the Z-score for calculations and normal curve for probabilities
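A sketch comparing exact binomial probabilities with the normal-curve approximation as n grows. The 10% event rate, the helper names, and the use of a continuity correction (adding 0.5) are assumptions for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(k, n, p):
    """Exact P(X <= k) for a binomial variable with n trials and event probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with mean n*p and SD sqrt(n*p*(1-p))."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k + 0.5)  # +0.5 is a continuity correction

# Hypothetical 10% mortality; k is set to the expected number of deaths.
for n in (10, 1000):
    k = n // 10
    exact, approx = binom_cdf(k, n, 0.1), normal_approx_cdf(k, n, 0.1)
    print(n, round(exact, 4), round(approx, 4), "error:", round(abs(exact - approx), 4))
```

The gap between the exact and approximate values shrinks as n increases, which is why the Z-score and normal curve become usable for large dichotomous samples.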