drug lit exam 1 - bio stats Flashcards

1
Q

variables:

Determine if a variable is nominal, ordinal, interval, or ratio

Recognize dichotomous endpoints

A
2
Q

Variables

A

Variable:
- Anything that can be observed or measured in a clinical experiment

Dependent Variable:
- The outcome of interest
- What should change as a result of the researcher’s intervention

Independent Variable
- The researcher’s intervention
- What is being manipulated

3
Q

Types of Variables

A

Discrete Data
- Can only be whole numbers
- Example: you can’t have 2.13 children

Continuous Data
- Can take any value, within a defined range
- Example: you can divide BP mmHg into tenths of a mmHg, hundredths, even thousandths!

4
Q

Another way to think about variable types…

A

Nominal
- Different categories, in no particular order

Ordinal
- Ordered categories, where the distance between categories cannot be considered equal

Interval
- Equal distances between values, but the zero point is arbitrary (not the same for each variable)

Ratio
- Equal distances between values, with a meaningful zero point

5
Q

Variable Examples

A

Nominal
- No category is “higher” or “better” than others
- Every study participant in a sample will be placed into one of the categories
- Also referred to as dichotomous when there are 2 options

Examples
- Medical diagnoses (“Diabetes”; “No diabetes”)
- Race or Nationality (“Asian”; “African”; “European”)
- Age groups (“< 18 years”; “18-44 years”; “> 44 years”)

6
Q

Variable Examples

A

Ordinal
- There is ordering of these values, but the distance between values is not equal

Examples:
- Excellent/Satisfactory/Unsatisfactory
- Likert Scales (strongly agree, agree, neutral, disagree, strongly disagree)
- Cancer Stages I – IV
- The order of finishing a race

7
Q

Variable Examples

A

Interval
- Ordering of values and equal distance between values
- The zero point isn’t meaningful, and therefore can be changed

Example
- Temperature in °C or °F (0° does not mean “no temperature”)

8
Q

Variable Examples

A

Ratio
- Ordering of values and equal distance between values
- The zero is meaningful

Examples
- Weight (kg/lbs)
- Height (cm/inches)
- Blood pressure (mm Hg)

9
Q

Variable Type & Assumptions

A

Nominal
Named categories

Ordinal
Same as nominal plus ordered categories

Interval
Same as ordinal plus equal intervals

Ratio
Same as interval plus meaningful zero

10
Q

Learning objective for descriptive stats

A

Given a mean and standard deviation of a normally distributed sample, calculate the range that 95% of the data points fall between

11
Q

Two categories of statistics

A

1) Descriptive Statistics

  • Used for presenting, organizing, summarizing data
  • Can summarize your data set with just a few key numbers

What you need to know about:
Mean
Median
Mode
Interquartile range
Standard deviation

12
Q

Two categories of statistics

A

2) Inferential Statistics

Used to generalize data from a sample to a larger population
Used to identify “statistically significant” differences

Examples
Student’s t-test
Chi Squared
ANOVA

Understanding when and how to use these statistics won’t be a focus for the biostatistics lectures in this course

13
Q

Measures of Central Tendency

A

Mean:
The average value
Sum all values and divide by number of values (N)
AKA the “typical value”
Only okay to use to describe interval and ratio data!
Affected by outliers

If a study reports the mean for ordinal data, critique that as bad statistics!

14
Q

median

A

The middle value

The 50th percentile

Arrange all values from smallest to largest and pick the middle number

Used to describe ordinal data (interval and ratio are okay too)

15
Q

mode

A

The most frequently occurring value or category

Used to describe nominal data (interval, ordinal, and ratio are ok too)

16
Q

Quick Quiz! Calculate the mean, median, and mode for this dataset

7, 4, 2, 4, 8

A

Mean = (7+4+2+4+8)/5 (n) = 25/5 = 5
Median = middle value of 2, 4, 4, 7, 8 = 4
Mode = 4 (most frequently occurring)
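
To check the quiz answer by computer, here is a minimal sketch using Python's standard `statistics` module (the variable names are only illustrative):

```python
import statistics

data = [7, 4, 2, 4, 8]

mean = statistics.mean(data)      # sum of all values divided by the number of values
median = statistics.median(data)  # middle value once the data are sorted
mode = statistics.mode(data)      # most frequently occurring value

print(mean, median, mode)
```

This reproduces the hand calculation on the card: mean 5, median 4, mode 4.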

17
Q

Which data set has the largest mean?

A

none of the above

18
Q

Measures of Dispersion

A

How closely the data cluster around the measure of central tendency

Range:
The difference between the highest and lowest value
Measures the variability of the data
Advantage: simple to calculate and understand
Disadvantage: affected by outliers

19
Q

Interquartile Range

A

The interval between the 25th and 75th percentiles

The middle 50% of values

A measure of variability

Directly related to the median

Advantage: Not affected by outliers

20
Q

Standard Deviation (SD)

A

Very common estimate of data variability

Estimates the scatter of data points about the sample mean

Often necessary when running inferential statistics

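For normally distributed data:
About 68% of values fall within 1 SD of the mean
About 95% of values fall within 2 SD of the mean
About 99.7% of values fall within 3 SD of the mean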
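
The "95% within 2 SD" rule can be turned into a quick calculation. This is a minimal sketch, assuming the sample is normally distributed; the function name `range_95` and the example mean and SD are hypothetical, not taken from any study in these cards:

```python
def range_95(mean, sd):
    """Return the (lower, upper) bounds of mean ± 2 SD.

    Assumes a normally distributed sample, where roughly 95% of
    values fall within 2 SD of the mean.
    """
    return (mean - 2 * sd, mean + 2 * sd)


# Hypothetical sample: mean = 100, SD = 10
lower, upper = range_95(mean=100, sd=10)
print(lower, upper)
```

With these assumed numbers, about 95% of the data points would fall between 80 and 120.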

21
Q

statistically significant

A

p value less than alpha (usually 0.05)

22
Q

Learning objectives for p value, alpha, and type I error

A

Interpret a p value, with respect to the alpha of the study and risk of type I error

Given clinical study results, identify any statistically significant differences between groups

23
Q

Clinical studies

A

Assume that an intervention (e.g., medication) has an effect

Clinical study is an experiment to see what the effect of the intervention is:
Based on the results of that one experiment, can we apply those results to the overall patient population??

Hypothesis testing:
Null hypothesis
Alternative hypothesis

24
Q

Hypothesis Testing

A

Research Hypothesis (H1)
- The treatment (or intervention) has an effect on the experimental group
- Effect seen by comparing the experimental group to the control group

Null Hypothesis (H0)
- Any difference seen between the experimental and control groups is due to chance alone
- The intervention does not have an effect
- The hypothesis we are actually doing statistics on

25
Q

The role of chance

A

Imagine you want to see if the true chance of getting heads on any single toss = 0.5 (50%)

Toss coin 10 times

Would you expect to see exactly 5 heads and 5 tails?
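
To put a number on the coin question, here is a small sketch that computes the chance of exactly 5 heads in 10 tosses of a fair coin with the binomial formula (the function name `prob_exact_heads` is only illustrative):

```python
from math import comb


def prob_exact_heads(k, n, p=0.5):
    """Probability of exactly k heads in n independent tosses,
    where p is the chance of heads on any single toss."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_five = prob_exact_heads(5, 10)
print(round(p_five, 3))  # about 0.246
```

So even with a perfectly fair coin, exactly 5 heads and 5 tails shows up only about a quarter of the time. Chance alone produces differences from the "expected" result.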

26
Q

Type I error

A

Inappropriately concluding that there is a true difference between 2 study groups when the difference is due to chance alone
A “false positive”

Components of a study that are prone to type I error
Subgroup analysis (especially post hoc)
Secondary endpoints

27
Q

Alpha (α)

A

The acceptable probability that the difference between study groups is due to chance alone

Accepted by researchers at the start of the study (before the experiment happens)

Usually 0.05 (5%) in clinical research
- The difference between study groups has less than a 5% probability of occurring due to chance alone

28
Q

P value

A

The probability, calculated at the end of the study (after the experiment happens), of seeing a difference at least as large as the one observed between 2 study groups if chance alone were at work (that is, if the null hypothesis were true)

When P < alpha, the difference seen is statistically significant
- (With alpha = 0.05, a difference this large would occur by chance alone less than 5% of the time)

Sometimes referred to as a positive study result
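
The P < alpha rule is a simple comparison. A minimal sketch, assuming the usual alpha of 0.05; the endpoint names and p values below are hypothetical and not from any study in these cards:

```python
def is_significant(p_value, alpha=0.05):
    """A result is statistically significant when p is strictly less than alpha."""
    return p_value < alpha


# Hypothetical p values for illustration only
results = {"endpoint 1": 0.03, "endpoint 2": 0.40, "endpoint 3": 0.05}

for name, p in results.items():
    label = "significant" if is_significant(p) else "not significant"
    print(name, p, label)
```

Note that the comparison is strict: in this sketch a p value of exactly 0.05 is not counted as less than alpha.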

29
Q

Learning objectives for beta type II error

A

Explain how various factors can affect the power of a study

Demonstrate understanding of the relationship of beta to power and type II error

30
Q

Preclass online lecture Part 3: P values, alpha, and type I error

A

Focused on the importance of making sure that a study doesn’t inappropriately conclude that there is a true difference between study groups when the difference seen is due to chance alone
Remember the type of error?
- Type I error

What about an error when there IS a true difference between study groups but the study fails to detect that difference?
- Type II error

31
Q

Type II error

A

The researchers aren’t able to find a difference between the experimental and control groups, but a difference really exists

Failing to show statistical significance, when there really is a true difference between study groups

Falsely concluding that any difference seen between study groups is due to chance alone

A “false negative” result

32
Q

Beta (β)

A

The probability of making a type II error

Determined during study design phase

Can be anything, often between 5-20% (β = 0.05 - 0.20)

33
Q

Study “power”

A

The chance that if a true difference exists, it will be successfully detected

The probability of not making a type II error

Power = 1 – β (memorize this equation!)

Determined during study design phase by doing a power calculation
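
The equation works in both directions. A tiny sketch with illustrative function names, expressing power and β as proportions between 0 and 1:

```python
def power_from_beta(beta):
    """Power = 1 - beta."""
    return 1 - beta


def beta_from_power(power):
    """Rearranged: beta = 1 - power."""
    return 1 - power


print(round(power_from_beta(0.20), 2))  # beta of 20% gives power of 0.8 (80%)
print(round(beta_from_power(0.85), 2))  # power of 85% gives beta of 0.15 (15%)
```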

34
Q

What things can affect the power of a study?

A

Sample size (n)
- As sample size increases, power increases

The effect size/event rate
- As effect size or event rate becomes larger, power increases

The duration of the study
- As study duration increases, power increases

A type II error may occur
- If a study fails to enroll enough patients
- If the researchers overestimate the size of the treatment effect
- If the study is terminated early

Any of these things can result in a study being underpowered
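
To see the first two relationships in numbers, here is a rough sketch that approximates power for comparing two group means with a two-sided z-test. This is a simplified normal approximation chosen for illustration, not the power calculation any particular study used; the function name `approx_power` and all input values are hypothetical, and study duration is not modeled:

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, effect, sd, alpha=0.05):
    """Approximate power to detect a true difference in means (`effect`)
    between two equal-sized groups with common standard deviation `sd`.

    Normal approximation; ignores the negligible chance of a significant
    result in the wrong direction.
    """
    z = NormalDist()                      # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)     # cutoff for a two-sided test
    se = sd * sqrt(2 / n_per_group)       # standard error of the difference
    return 1 - z.cdf(z_crit - effect / se)


# Hypothetical inputs: power rises as sample size grows
for n in (25, 50, 100, 200):
    print(n, round(approx_power(n, effect=5, sd=15), 2))

# Hypothetical inputs: power rises as the true effect size grows
for effect in (2, 5, 10):
    print(effect, round(approx_power(50, effect=effect, sd=15), 2))
```

If researchers overestimate the effect size at the design stage, the sample size they pick will be too small for the true (smaller) effect, which is one way a study ends up underpowered.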

35
Q

Age (years)

A. Dichotomous – means two options

B. Continuous – can be a whole or decimal number

A

B. Continuous – can be a whole or decimal number

36
Q

Experiencing one or more hospitalization(s)

A. Dichotomous – two options

B. Continuous – decimal or whole #

A

A. Dichotomous – two options

because the patient either did or did not experience one or more hospitalizations, the answer is yes or no, which makes it dichotomous

37
Q

Time (hours)

A. Dichotomous
B. Continuous

A

B. Continuous

38
Q

Type of variable: Stage of cancer (0, 1, 2, 3, 4)

A. Nominal
B. Ordinal
C. Interval
D. Ratio

A

B. Ordinal - still categories, but in a particular order, with no set amount of spacing between the different stages (the stages are not on a number line)

other examples: class of heart failure or order of finishing a race

39
Q

Type of variable: duration of diabetes (mean # years)

A. Nominal
B. Ordinal
C. Interval
D. Ratio

A

nominal and ordinal data are categories (e.g., has diabetes / does not have diabetes)
but this is reported as the mean # of years, so we can rule out A and B

interval data have an arbitrary zero, while ratio data have a true zero

here zero is a true zero: a duration of 0 years means the patient has never had diabetes

answer: D. Ratio

data reported as a mean or SD are usually interval or ratio; data reported as an n are usually nominal or ordinal

40
Q

Which data set has the largest mean?

A. A
B. B
C. C
D. A = B = C

A

D. A = B = C

41
Q

Which data set has the largest standard deviation?
A. A
B. B
C. C
D. A = B = C

A

C. C

42
Q

In the data set below, which is the most appropriate measure of dispersion, that is, which better describes the spread of the patients in the data? (Example: days of hospitalization in a sample of 8 patients)

1,2,2,3,3,4,5,90

A. Range
B. Interquartile range
C. Both are appropriate

A

B. Interquartile range

there is an outlier (90) that inflates the range, so we use the IQR, which is not affected by outliers
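
A quick sketch comparing the two measures on this data set, using Python's standard `statistics` module. Note that software packages compute quartiles with slightly different conventions, so the exact IQR may differ a little from a hand calculation; the point is the size of the gap:

```python
import statistics

days = [1, 2, 2, 3, 3, 4, 5, 90]

# Range: highest value minus lowest value (pulled up by the outlier, 90)
data_range = max(days) - min(days)

# IQR: 75th percentile minus 25th percentile (middle 50% of values)
q1, _median, q3 = statistics.quantiles(days, n=4)
iqr = q3 - q1

print(data_range, iqr)
```

The range comes out to 89 days, driven entirely by one patient, while the IQR stays at only a few days and better reflects the typical spread.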

43
Q

Given the following data, what are the lower and upper fasting glucose values that 95% of the patient sample would fall between?

A. 80, 120
B. 90, 110
C. 97, 103

A

need mean ± 2 SD for 95%

A. 80, 120

44
Q

A study finding that is statistically significant will always be clinically significant (clinically significant: means something to the clinician that they can use)

True
False

A

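False

statistical significance only means that p < alpha; the difference may still be too small to matter to the clinician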

45
Q

Which of the following endpoints were statistically significant (α = 0.05)?

A. Death
B. Cardiac arrest
C. Stroke
D. Hospitalization
E. Both B and D

A

E. Both B and D
because both of their p-values are less than 0.05

for death: it is not statistically significant because the p-value is greater than 0.05

for cardiac arrest: the rhythm control group had a higher chance of having cardiac arrest; it is statistically significant because the p-value is less than 0.05

for stroke: the rhythm control group had a higher chance of having a stroke, but it is not statistically significant because the p-value is greater than 0.05. if chance alone were at work, a difference at least this large would be seen about 79% of the time

for hospitalization: the rhythm control group had a higher chance of having a hospitalization; it is statistically significant because the p-value is less than 0.05. if chance alone were at work, a difference at least this large would be seen only about 0.001% of the time

46
Q

What is the “alpha” of a study?

A. Acceptable limit to the probability of making a type I error
B. Acceptable limit to the probability of making a type II error
C. Risk of false positive
D. Risk of false negative
E. Both A and C
F. Both B and C

A

E. Both A and C

usual alpha: 0.05
alpha relates to type I error, which is a false positive

47
Q

Which of the following describes a clinical situation consistent with type II error (false negative)?

A. A patient is diagnosed with cancer but does not really have cancer
B. A patient is diagnosed as cancer-free, but really does have cancer
C. A patient is diagnosed with cancer and really does have cancer
D. A patient is diagnosed as cancer-free and does not really have cancer

A

B. A patient is diagnosed as cancer-free, but really does have cancer

type II error: false negative, so not finding something that is actually there

A. would be a type I error, which is a false positive

48
Q

If a study has the power of 85% to detect a difference, what is the probability of type II error?

A. 85%
B. 0.85%
C. 15%
D. Unable to calculate

A

C. 15%

power = 1 – β, so β = 1 – 0.85 = 0.15 (15%)

49
Q

What assumptions did the researchers make when doing their power calculation?

The study was designed to have a power of 89% to detect a 15% reduction in the rate of primary endpoint (death/MI/stroke) for patients in the intensive-therapy group, as compared to the standard-therapy group, assuming a rate of 2.9% per year in the standard-therapy group and a planned follow-up of 5.6 years.

A. 10,000 patients would be enrolled
B. Rate of death of 2.9%/yr in standard-therapy group
C. Patients would be followed for 5.6 years
D. Both B and C

A

D. Both B and C

pay attention to the last question :)

A. is not the answer because the passage says nothing about the number of patients enrolled
power of 89% what is the

50
Q

What can you conclude about the results for the outcomes of death, MI, and stroke? (α = 0.05)

A. Rate of death was significantly greater in intensive-therapy group
B. Rate of MI was significantly greater in the intensive-therapy group
C. Rate of stroke was significantly greater in the intensive-therapy group
D. Both A and C

A

rate of death was higher in the intensive-therapy group than in the standard-therapy group, but the opposite is true for MI

answer: A. Rate of death was significantly greater in intensive-therapy group

51
Q

If everything else remains constant, what will happen as the sample size increases?

A. The power of the study increases
B. The study’s alpha increases
C. The difference in the primary endpoint between the two groups increases
D. The standard deviation increases

A

A. The power of the study increases

52
Q

“We calculated a sample size (n = 200) sufficient to detect a 20% difference between the two groups’ cure rates with 80% power and α=0.05.” The study enrolled 189 patients and detected an 18% difference between the two groups’ cure rates. Which one of the following is true?

A

A. If p < 0.05, the results ARE statistically significant
B. If p < 0.05, the results are NOT statistically significant because the study was underpowered