drug lit exam 1 - bio stats Flashcards

1
Q

variables:

Determine if a variable is nominal, ordinal, interval, or ratio

Recognize dichotomous endpoints

A
2
Q

Variables

A

Variable:
- Anything that can be observed or measured in a clinical experiment

Dependent Variable:
- The outcome of interest
- What should change as a result of the researcher’s intervention

Independent Variable
- The researcher’s intervention
- What is being manipulated

3
Q

Types of Variables

A

Discrete Data
- Can only be whole numbers
- Example: you can’t have 2.13 children

Continuous Data
- Can take any value, within a defined range
- Example: you can divide BP mmHg into tenths of a mmHg, hundredths, even thousandths!

4
Q

Another way to think about variable types…

A

Nominal
- Different categories, in no particular order

Ordinal
- Ordered categories, where the distance between categories cannot be considered equal

Interval
- Equal distances between values, but the zero point is arbitrary (not the same for each variable)

Ratio
- Equal distances between values, with a meaningful zero point

5
Q

Variable Examples

A

Nominal
- No category is “higher” or “better” than others
- Every study participant in a sample will be placed into one of the categories
- Also referred to as dichotomous when there are 2 options

Examples
- Medical diagnoses (“Diabetes”; “No diabetes”)
- Race or Nationality (“Asian”; “African”; “European”)
- Age groups (“< 18 years”; “18-44 years”; “> 44 years”)

6
Q

Variable Examples

A

Ordinal
- There is ordering of these values, but the distance between values is not equal

Examples:
- Excellent/Satisfactory/Unsatisfactory
- Likert Scales (strongly agree, agree, neutral, disagree, strongly disagree)
- Cancer Stages I – IV
- The order of finishing a race

7
Q

Variable Examples

A

Interval
- Ordering of values and equal distance between values
- The zero point isn’t meaningful, and therefore can be changed

Example
- Temperature in °C or °F (0° does not mean "no temperature")

8
Q

Variable Examples

A

Ratio
- Ordering of values and equal distance between values
- The zero is meaningful

Examples
- Weight (kg/lbs)
- Height (cm/inches)
- Blood pressure (mm Hg)

9
Q

Variable Type & Assumptions

A

Nominal
Named categories

Ordinal
Same as nominal plus ordered categories

Interval
Same as ordinal plus equal intervals

Ratio
Same as interval plus meaningful zero

10
Q

Learning objective for descriptive stats

A

Given a mean and standard deviation of a normally distributed sample, calculate the range that 95% of the data points fall between

11
Q

Two categories of statistics

A

1) Descriptive Statistics

  • Used for presenting, organizing, summarizing data
  • Can summarize your data set with just a few key numbers

What you need to know about:
Mean
Median
Mode
Interquartile range
Standard deviation

12
Q

Two categories of statistics

A

2) Inferential Statistics

Used to generalize data from a sample to a larger population
Used to identify “statistically significant” differences

Examples
Student’s t-test
Chi Squared
ANOVA

Understanding when and how to use these statistics won’t be a focus for the biostatistics lectures in this course

13
Q

Measures of Central Tendency

A

Mean:
The average value
Sum all values and divide by number of values (N)
AKA the “typical value”
Only okay to use to describe interval and ratio data!
Affected by outliers

If a study reports the mean for ordinal data, critique that as bad statistics!

14
Q

median

A

The middle value

The 50th percentile

Arrange all values from smallest to largest and pick the middle number

Used to describe ordinal data (interval and ratio are okay too)

15
Q

mode

A

The most frequently occurring value or category

Used to describe nominal data (interval, ordinal, and ratio are ok too)
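
The pairing of measurement scale and summary measure from the mean, median, and mode cards can be sketched as a small lookup. This is only an illustration; the names `ALLOWED_MEASURES` and `can_report` are made up for this example and do not come from the lecture.

```python
# Hypothetical lookup: which measures of central tendency suit each scale,
# following the mean, median, and mode cards.
ALLOWED_MEASURES = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean"},
    "ratio": {"mode", "median", "mean"},
}


def can_report(scale: str, measure: str) -> bool:
    """Return True if `measure` is appropriate for data on `scale`."""
    return measure in ALLOWED_MEASURES[scale]


print(can_report("ordinal", "mean"))  # False: a mean of ordinal data deserves critique
print(can_report("ratio", "mean"))    # True
```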

16
Q

Quick Quiz! Calculate the mean, median, and mode for this dataset

7, 4, 2, 4, 8

A

Mean = (7 + 4 + 2 + 4 + 8) / 5 (n) = 25 / 5 = 5
Median = middle value: 2, 4, 4, 7, 8 = 4
Mode = 4 (most frequently occurring)
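
The same quiz can be checked with Python's standard `statistics` module. This is just a quick sketch for self-checking, not part of the original card.

```python
import statistics

data = [7, 4, 2, 4, 8]

mean = statistics.mean(data)      # (7 + 4 + 2 + 4 + 8) / 5
median = statistics.median(data)  # middle of the sorted values 2, 4, 4, 7, 8
mode = statistics.mode(data)      # most frequently occurring value

print(mean, median, mode)  # 5 4 4
```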

17
Q

Which data set has the largest mean?

A

none of the above

18
Q

Measures of Dispersion

A

How closely the data cluster around the measure of central tendency

Range:
The difference between the highest and lowest value
Measures the variability of the data
Advantage: simple to calculate and understand
Disadvantage: affected by outliers

19
Q

Interquartile Range

A

The interval between the 25th and 75th percentiles

The middle 50% of values

A measure of variability

Directly related to the median

Advantage: Not affected by outliers

20
Q

Standard Deviation (SD)

A

Very common estimate of data variability

Estimates the scatter of data points about the sample mean

Often necessary when running inferential statistics

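For normally distributed data:
About 68% of values fall within 1 SD of the mean
About 95% of values fall within 2 SD of the mean
About 99.7% of values fall within 3 SD of the mean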
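
These percentages can be reproduced from the normal distribution itself. The sketch below assumes perfectly normal data and uses `statistics.NormalDist` from the standard library; the helper name `coverage_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
standard_normal = NormalDist(mu=0, sigma=1)


def coverage_within(k_sd: float) -> float:
    """Fraction of a normal distribution lying within k_sd SDs of the mean."""
    return standard_normal.cdf(k_sd) - standard_normal.cdf(-k_sd)


for k in (1, 2, 3):
    print(f"within {k} SD: {coverage_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```

The "95% in 2 SD" rule is a rounding of these values; exactly 95% corresponds to about 1.96 SD.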

21
Q

statistically significant

A

p value less than alpha (usually 0.05)

22
Q

Learning objectives for p value, alpha, and type I error

A

Interpret a p value, with respect to the alpha of the study and risk of type I error

Given clinical study results, identify any statistically significant differences between groups

23
Q

Clinical studies

A

Assume that an intervention (e.g., medication) has an effect

Clinical study is an experiment to see what the effect of the intervention is:
Based on the results of that one experiment, can we apply those results to the overall patient population??

Hypothesis testing:
Null hypothesis
Alternative hypothesis

24
Q

Hypothesis Testing

A

Research Hypothesis (H1)
- The treatment (or intervention) has an effect on the experimental group
- Effect seen by comparing the experimental group to the control group

Null Hypothesis (H0)
- Any difference seen between the experimental and control groups is due to chance alone
- The intervention does not have an effect
- The hypothesis we are actually doing statistics on

25
Q

The role of chance

A

Imagine you want to see if the true chance of getting heads on any single toss = 0.5 (50%)
- Toss the coin 10 times
- Would you expect to see exactly 5 heads and 5 tails?
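
To put a number on the coin question, the binomial formula gives the chance of each possible count of heads. This is a minimal sketch assuming a fair coin and independent tosses; the function name `prob_heads` is made up for this example.

```python
from math import comb


def prob_heads(k: int, n: int = 10, p: float = 0.5) -> float:
    """Binomial probability of exactly k heads in n independent tosses."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Exactly 5 heads out of 10 happens only about a quarter of the time.
print(round(prob_heads(5), 3))  # 0.246

# Getting 4, 5, or 6 heads is far more likely than exactly 5.
print(round(sum(prob_heads(k) for k in (4, 5, 6)), 3))  # 0.656
```

So even with a truly fair coin, most runs of 10 tosses will not show exactly 5 heads; chance alone produces differences.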
26
Q

Type I error

A

- Inappropriately concluding that there is a true difference between 2 study groups when the difference is due to chance alone
- A “false positive”

Components of a study that are prone to type I error
- Subgroup analysis (especially post hoc)
- Secondary endpoints
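
One way to see why subgroup analyses and secondary endpoints are prone to type I error: each extra comparison is another chance for a false positive. The sketch below assumes the tests are independent and each uses the same alpha, which is a simplification; `familywise_error` is a name made up for this example.

```python
def familywise_error(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 5, 10):
    print(n_tests, round(familywise_error(0.05, n_tests), 3))
# 1 0.05
# 5 0.226
# 10 0.401
```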
27
Q

Alpha (α)

A

- The acceptable probability that the difference between study groups is due to chance alone
- Accepted by researchers at the start of the study (before the experiment happens)
- Usually 0.05 (5%) in clinical research
- The difference between study groups has less than a 5% probability of occurring due to chance alone
28
Q

P value

A

- The actual probability that the difference between 2 study groups is due to chance alone at the end of the study (after the experiment happens)
- When P < alpha, the difference seen is statistically significant (the probability that the difference seen is completely random is < 5%)
- Sometimes referred to as a positive study result
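
The decision rule "P < alpha" can be written as a one-line check. The endpoint names and p values below are hypothetical, chosen only to illustrate the comparison.

```python
def is_statistically_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Apply the card's rule: significant when the p value is below alpha."""
    return p_value < alpha


# Hypothetical results, not taken from any real study.
hypothetical_results = {"endpoint A": 0.03, "endpoint B": 0.79}

for endpoint, p_value in hypothetical_results.items():
    verdict = "significant" if is_statistically_significant(p_value) else "not significant"
    print(f"{endpoint}: p = {p_value} -> {verdict}")
```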
29
Q

Learning objectives for beta and type II error

A

Explain how various factors can affect the power of a study

Demonstrate understanding of the relationship of beta to power and type II error
30
Q

Preclass online lecture Part 3: P values, alpha, and type I error

A

Focused on the importance of making sure that a study doesn’t inappropriately conclude that there is a true difference between study groups when the difference seen is due to chance alone

Remember the type of error?
- Type I error

What about an error when there IS a true difference between study groups but the study fails to detect that difference?
- Type II error
31
Q

Type II error

A

- The researchers aren’t able to find a difference between the experimental and control groups, but a difference really exists
- Failing to show statistical significance, when there really is a true difference between study groups
- Falsely concluding that any difference seen between study groups is due to chance alone
- A “false negative” result
32
Q

Beta (β)

A

- The probability of making a type II error
- Determined during study design phase
- Can be anything, often between 5-20% (β = ___ - ___)
33
Q

Study “power”

A

- The chance that if a true difference exists, it will be successfully detected
- The probability of not making a type II error
- Power = 1 – β (memorize this equation!)
- Determined during study design phase by doing a power calculation
34
Q

What things can affect the power of a study?

A

Sample size (n)
- As sample size increases, power increases

The effect size/event rate
- As effect size or event rate becomes larger, power increases

The duration of the study
- As study duration increases, power increases

A type II error may occur
- If a study fails to enroll enough patients
- If the researchers overestimate the size of the treatment effect
- If the study is terminated early

Any of these things can result in a study being underpowered
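
The effect of sample size and effect size on power can be sketched with a normal approximation. This is a rough illustration, not how any particular study did its power calculation. It assumes a two-sided, two-sample z-test with equal group sizes and a standardized effect size (difference in means divided by SD), and the function name `approx_power` is made up for this example. Study duration is not modeled.

```python
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test (equal group sizes)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)                # critical value for alpha
    shift = effect_size * (n_per_group / 2) ** 0.5   # expected test statistic
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Power rises with sample size (effect size held at 0.5).
for n in (20, 64, 150):
    print(n, round(approx_power(0.5, n), 2))

# Power rises with effect size (64 patients per group).
for d in (0.2, 0.5, 0.8):
    print(d, round(approx_power(d, 64), 2))

# Beta is whatever power leaves over: beta = 1 - power.
print(round(1 - approx_power(0.5, 64), 2))
```

If the researchers overestimate the effect size, the true power is lower than planned, which is one way a study ends up underpowered.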
35
Q

Age (years)
A. Dichotomous – means two options
B. Continuous – can be a whole or decimal number

A

B. Continuous – can be a whole or decimal number
36
Q

Experiencing one or more hospitalization(s)
A. Dichotomous – two options
B. Continuous – decimal or whole #

A

A. Dichotomous – two options
The patient either experienced one or more hospitalizations or did not, so the answer is yes or no, which makes it dichotomous.
37
Q

Time (hours)
A. Dichotomous
B. Continuous

A

B. Continuous
38
Q

Type of variable: Stage of cancer (0, 1, 2, 3, 4)
A. Nominal
B. Ordinal
C. Interval
D. Ratio

A

B. Ordinal
The stages are still categories, but they fall in a particular order. There is no set amount of spacing between the stages, and the stages are not on a number line.
Other examples: class of heart failure, the order of finishing a race.
39
Q

Type of variable: duration of diabetes (mean # years)
A. Nominal
B. Ordinal
C. Interval
D. Ratio

A

D. Ratio
Nominal and ordinal data are categories (for example, the patient has diabetes or does not). This variable is a mean number of years, so A and B can be ruled out.
Interval data have an arbitrary zero; ratio data have a true zero. Here the zero is meaningful: a duration of 0 years means the patient never had diabetes.
Tip: data reported as a mean or SD are interval or ratio; data reported as an n are usually nominal or ordinal.
40
Q

Which data set has the largest mean?
A. A
B. B
C. C
D. A = B = C

A

D. A = B = C
41
Q

Which data set has the largest standard deviation?
A. A
B. B
C. C
D. A = B = C

A

C. C
42
Q

In the data set below, which is the most appropriate measure of dispersion? In other words, which gives a better picture of the spread for the people in the data?
(Example: days of hospitalization in a sample of 8 patients)
1, 2, 2, 3, 3, 4, 5, 90
A. Range
B. Interquartile range
C. Both are appropriate

A

B. Interquartile range
The data contain an outlier (90). The range is affected by outliers and the IQR is not, so we use the IQR.
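
The effect of the outlier can be checked directly on the card's data. This is a minimal sketch using the standard `statistics` module. Quartile conventions differ between textbooks and software, so the exact IQR value may not match a hand calculation; the point is the contrast with the range.

```python
import statistics

days = [1, 2, 2, 3, 3, 4, 5, 90]

# Range: driven entirely by the single 90-day stay.
data_range = max(days) - min(days)

# Quartiles (statistics.quantiles defaults to the "exclusive" method).
q1, q2, q3 = statistics.quantiles(days, n=4)
iqr = q3 - q1

print(data_range)  # 89
print(q1, q3, iqr)
```

Dropping the 90 from the list changes the range from 89 to 4, while the IQR barely moves.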
43
Q

Given the following data, what are the lower and upper fasting glucose values that 95% of the patient sample would fall between?
A. 80, 120
B. 90, 110
C. 97, 103

A

A. 80, 120
Need mean ± 2 SD for 95%
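
The mean ± 2 SD rule is easy to apply in code. The card's data table did not survive in this text, so the mean of 100 and SD of 10 below are assumptions, picked only because they reproduce answer A. The function name `range_95` is made up for this example.

```python
def range_95(mean: float, sd: float) -> tuple[float, float]:
    """Approximate range holding 95% of normally distributed values."""
    return mean - 2 * sd, mean + 2 * sd


# Assumed values (not from the original card): mean 100, SD 10.
lower, upper = range_95(100, 10)
print(lower, upper)  # 80 120
```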
44
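Q

A study finding that is statistically significant will always be clinically significant (clinically significant: means something to the clinician that they can use)
True
False

A

False
Statistical significance only tells us that the difference is unlikely to be due to chance alone. A statistically significant difference can still be too small to matter to the clinician.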
45
Q

Which of the following endpoints were statistically significant (α = 0.05)?
A. Death
B. Cardiac arrest
C. Stroke
D. Hospitalization
E. Both 2 and 4 (B and D)

A

E. Both 2 and 4 (B and D), because both of their p values are less than 0.05
- Death: not statistically significant because the p value is greater than 0.05
- Cardiac arrest: the rhythm control group had a higher chance of having cardiac arrest; statistically significant because the p value is less than 0.05
- Stroke: the rhythm control group had a higher chance of having a stroke; not statistically significant because the p value is greater than 0.05. There is a 79% probability that the difference we are seeing is due to chance alone
- Hospitalization: the rhythm control group had a higher chance of having a hospitalization; statistically significant because the p value is less than 0.05. There is a 0.001% probability that the difference we are seeing is due to chance alone
46
Q

What is the “alpha” of a study?
A. Acceptable limit to the probability of making a type I error
B. Acceptable limit to the probability of making a type II error
C. Risk of false positive
D. Risk of false negative
E. Both 1 and 3 (A and C)
F. Both 2 and 3 (B and C)

A

E. Both 1 and 3 (A and C)
The usual alpha is 0.05. Alpha relates to type I error, which is why it is also the risk of a false positive.
47
Q

Which of the following describes a clinical situation consistent with type II error (false negative)?
A. A patient is diagnosed with cancer but does not really have cancer
B. A patient is diagnosed as cancer-free but really does have cancer
C. A patient is diagnosed with cancer and really does have cancer
D. A patient is diagnosed as cancer-free and does not really have cancer

A

B. A patient is diagnosed as cancer-free but really does have cancer
Type II error is a false negative: not finding something that is actually there.
A would be a type I error, which is a false positive.
48
Q

If a study has a power of 85% to detect a difference, what is the probability of type II error?
A. 85%
B. 0.85%
C. 15%
D. Unable to calculate

A

C. 15%
Power = 1 – β, so β = 1 – 0.85 = 0.15
49
Q

What assumptions did the researchers make when doing their power calculation?
The study was designed to have a power of 89% to detect a 15% reduction in the rate of primary endpoint (death/MI/stroke) for patients in the intensive-therapy group, as compared to the standard-therapy group, assuming a rate of 2.9% per year in the standard-therapy group and a planned follow-up of 5.6 years.
A. 10,000 patients would be enrolled
B. Rate of death of 2.9%/yr in standard-therapy group
C. Patients would be followed for 5.6 years
D. Both 2 and 3 (B and C)

A

D. Both 2 and 3 (B and C)
Pay attention to the last question :)
A is not the answer because they said nothing about the number of people enrolled.
power of 89% what is the
50
Q

What can you conclude about the results for the outcomes of death, MI, and stroke? (α = 0.05)
A. Rate of death was significantly greater in the intensive-therapy group
B. Rate of MI was significantly greater in the intensive-therapy group
C. Rate of stroke was significantly greater in the intensive-therapy group
D. Both 1 and 3 (A and C)

A

A. Rate of death was significantly greater in the intensive-therapy group
The rate of death was higher in the intensive-therapy group than in the standard-therapy group, but the opposite is true for MI.
51
Q

If everything else remains constant, what will happen as the sample size increases?
A. The power of the study increases
B. The study’s alpha increases
C. The difference in the primary endpoint between the two groups increases
D. The standard deviation increases

A

A. The power of the study increases
52
Q

“We calculated a sample size (n = 200) sufficient to detect a 20% difference between the two groups’ cure rates with 80% power and α=0.05.”
The study enrolled 189 patients and detected an 18% difference between the two groups’ cure rates. Which one of the following is true?
A. If p < 0.05, the results ARE statistically significant
B. If p < 0.05, the results are NOT statistically significant because the study was underpowered