Biostatistics Flashcards

1
Q

What is biostatistics, and what is its purpose?

A

The collection and analysis of data (that is, statistics), applied specifically to understanding the effects of a drug or medical procedure on people and animals.

It is used to interpret medical and pharmacy journals and helps us answer clinical questions from patients and providers. Ex: on an exam question, we should be able to determine whether a drug is appropriate for a patient based on the inclusion and exclusion criteria of a study, or to weigh a reported relative risk.

2
Q

What is a study manuscript

A

A description of the research completed with the results

3
Q

What is peer review

A

When a researcher submits a manuscript to a journal, the editor sends it to experts in the field, who assess the research design, the methods, the value of the results, the conclusion, the quality of the writing, and whether it fits the journal. Reviewers recommend whether to accept it (usually with revisions) or to reject it.

4
Q

List the steps to publication

A

Research Question
Design the Study
Enroll the Subjects
Collect the Data
Analyze the Data
Publish the Data

5
Q

What is continuous data + two types

A

Data (usually numerical) that has a logical order, with values that increase or decrease continuously by the same measurable amount

Two types are ratio data and interval data

6
Q

What is ratio data

A

Continuous data with equal differences between values and a MEANINGFUL ZERO (zero means none of the quantity). Ex: age, height, weight, BP. A blood pressure of zero is meaningful because the pt would be dead.

7
Q

What is interval data

A

Continuous data with equal differences between values but NO MEANINGFUL ZERO.
Ex: the Celsius and Fahrenheit scales. Zero degrees does not mean an absence of temperature; it is just a cold point on the scale.

8
Q

What is discrete data and the two types

A

Categorical data

Two types: nominal and ordinal data

9
Q

What is nominal data

A

Data sorted into arbitrary categories (names) with no inherent order, such as male vs female, ethnicity, marital status, or mortality. It is often yes/no data.

10
Q

What is ordinal data

A

Data that is ranked in a logical order, such as a pain scale or NYHA functional class, but whose categories do not increase by the same amount. (A pain score of 4 is higher than 2, but that does not mean it is twice the amount of pain.)

11
Q

What are the measures of central tendency, and which is preferred for each data type

A

Mean (preferred for continuous data that is normally distributed)
Median (preferred for ordinal data or continuous data that is skewed)
Mode (nominal)
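
The pairing of each measure with a data type can be tried out with Python's standard `statistics` module. This is only a small sketch; the three data sets below are hypothetical values made up for illustration.

```python
import statistics

# Hypothetical values, for illustration only.
systolic_bp = [118, 120, 122, 124, 126]        # continuous, roughly symmetric
pain_scores = [1, 2, 2, 3, 9]                  # ordinal ranks
blood_types = ["A", "O", "O", "B", "O", "AB"]  # nominal categories

mean_bp = statistics.mean(systolic_bp)        # mean for normally distributed continuous data
median_pain = statistics.median(pain_scores)  # median for ordinal (or skewed) data
mode_type = statistics.mode(blood_types)      # mode for nominal data

print(mean_bp, median_pain, mode_type)
```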

12
Q

Describe standard deviation

A

How spread out the data are around the mean, reported as the mean +/- the SD.

For normally distributed data:
68% of the data will fall within 1 SD of the mean
95% of the data will fall within 2 SD of the mean
99.7% of the data will fall within 3 SD of the mean
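
These percentages can be checked numerically. The sketch below assumes a perfectly normal distribution and uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, SD 1 (an idealized assumption).
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Fraction of a normal distribution lying within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")
```

The printed shares round to the 68%, 95%, and 99.7% figures used on the card.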

13
Q

What is the range

A

The highest value - the lowest value

14
Q

What is the mode

A

The value that occurs most frequently

15
Q

What is a gaussian or “normal” distribution vs skewed data

A

It is a symmetrical bell curve, usually seen with continuous data and large sample sizes.
68% of the values fall within 1 SD of the mean and 95% of the values fall within 2 SD of the mean. The mean, median, and mode are equal, so any of them describes the middle; the MEAN is preferred.

Data lack a normal distribution, or are “skewed,” when the sample size is small or there are outliers. With a small number of values, an outlier has a large impact on the mean, so the MEDIAN is a better indicator of central tendency. The graph skews in the direction of the outliers. The median is also used to describe the middle for ordinal data.

Negative skew = left skew
Positive skew = right skew
(skew refers to the tail of the data not the hump)

Distortion of central tendency can be reduced by collecting more values
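
A quick way to see why the median is preferred for skewed data is to compare it with the mean on a small sample containing one outlier. The lengths of stay below are hypothetical.

```python
import statistics

# Hypothetical lengths of stay in days; the last value is a high outlier.
stay_days = [2, 3, 3, 4, 5, 30]

mean_stay = statistics.mean(stay_days)      # pulled toward the outlier
median_stay = statistics.median(stay_days)  # stays near the bulk of the data

print(f"mean = {mean_stay:.1f} days, median = {median_stay} days")
```

Here the mean sits well above the median, which is the pattern expected with a positive (right) skew.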

16
Q

independent variable

A

Changed/manipulated by the researcher

17
Q

dependent variable

A

The outcome, which is affected by the independent variable

18
Q

Null hypothesis

A

States that there is no difference between groups. It is written as H0 and read “H-naught.”

  • The null hypothesis is what the researcher tries to disprove; the alternative hypothesis (Ha) is what the researcher has proposed, is testing, and is trying to show is acceptable as true.
19
Q

What is an alpha level, and how does it relate to the p-value

A

The maximum permitted margin of error, represented by the tails of a normally distributed bell curve. The alpha level is chosen by the investigator and is usually set at 5% (0.05). It can be set lower (which is more stringent), but that requires more subjects, a larger treatment effect, and more data, aka more money.
If alpha is set at 5% and the p-value is < 0.05, we reject the null hypothesis: the result is deemed statistically significant and the alternative hypothesis can be accepted.

If the p-value is greater than or equal to alpha (p ≥ 0.05), the study failed to reject the null hypothesis and the result is not statistically significant.
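
The decision rule on this card fits in a single comparison. The function name below is hypothetical and the p-values are invented; it is a sketch of the rule, not a statistical test.

```python
def is_statistically_significant(p_value, alpha=0.05):
    """Return True when the null hypothesis is rejected, i.e. p < alpha."""
    return p_value < alpha

print(is_statistically_significant(0.03))              # p < 0.05: reject the null
print(is_statistically_significant(0.05))              # p >= 0.05: fail to reject
print(is_statistically_significant(0.03, alpha=0.01))  # stricter alpha: fail to reject
```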

20
Q

If alternative hypothesis is accepted, phrasing that can be used:

A

This means you reject the null hypothesis or fail to accept it

21
Q

If null hypothesis is accepted, phrasing that can be used:

A

This means you accept the null hypothesis or fail to reject it

22
Q

The alpha level correlates with the values in the tails of a normally distributed graph

A

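If 95% of values are within 2 SD of the mean, this corresponds to an alpha of 5% (the remaining 5% sits in the two tails)

If 99% of values are within about 2.6 SD of the mean, this corresponds to an alpha of 1%. (99.7% of values within 3 SD leaves only 0.3% in the tails.)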
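
The link between alpha and the tails can be checked numerically. This sketch assumes a two-sided test on a standard normal curve and uses `NormalDist.inv_cdf` from the standard library; `two_sided_cutoff` is a name invented for the example.

```python
from statistics import NormalDist

def two_sided_cutoff(alpha):
    """Number of SDs from the mean that leaves alpha split across both tails."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: cutoff at about {two_sided_cutoff(alpha):.2f} SD")
```

The cutoffs come out near 1.96 SD for an alpha of 0.05 (commonly rounded to 2 SD) and near 2.58 SD for an alpha of 0.01.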

23
Q

Formula for confidence interval (CI)

A

CI = 1 - alpha (ex: an alpha of 0.05 gives a 95% CI)

CI can be expressed as a range of values (ex: 95% CI 6%-34%)

This means that you are 95% confident that the true value of ____ for the general population lies somewhere between 6% and 34%. The narrower the range, the more precise the estimate; the wider the range, the less precise.

24
Q

If alpha is 0.05 and the p-value is greater than or equal to 0.05, what does this mean about the significance and the CI

A

The result is not statistically significant: we cannot be 95% confident in the conclusion, and the study fails to reject the null hypothesis.

25
Q

If alpha is 0.05 and the p-value is < 0.05, what does this mean about the significance and the CI

A

The result is statistically significant and the corresponding CI is 95%: we can be 95% confident in the conclusion, with less than a 5% chance that the difference is due to chance alone.

26
Q

If alpha is 0.01 and the p-value is < 0.01, what does this mean about the significance and the CI

A

The result is statistically significant and the corresponding CI is 99%: we can be 99% confident in the conclusion, with less than a 1% chance that the difference is due to chance alone.

27
Q

If alpha is 0.001 and the p-value is < 0.001, what does this mean about the significance and the CI

A

The result is statistically significant and the corresponding CI is 99.9%: we can be 99.9% confident in the conclusion, with less than a 0.1% chance that the difference is due to chance alone.

28
Q

If the exam does not provide a p-value but you need to determine whether something is stat significant, how would you do it? **

A

With difference data (ex: difference in FEV1 between roflumilast and placebo), the result is stat significant if the 95% CI range does not include zero.

Key: look for subtraction of values or the word “difference”

Ex: Difference (95% CI) = 38 (18-58) is stat significant because the range excludes zero, vs 0.313 (-0.26-0.89), which is not stat significant because the range includes zero.

29
Q

When comparing ratio data, how do we determine if the value is stat significant if we don’t have a p-value

A

We need to check whether the 95% CI range includes 1. If it does, the result is not stat sig because the groups are too similar. If it does not, the result is stat sig.

This is true whether the CI is for a relative risk, odds ratio, or hazard ratio. For difference data, the value to check for is zero instead of 1.
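
Both confidence-interval checks (zero for difference data, 1 for ratio data) reduce to asking whether the interval contains the "no effect" value. A minimal sketch, with a hypothetical function name; the two difference intervals are the ones from the previous card, while the two ratio intervals are invented for illustration.

```python
def ci_is_significant(lower, upper, no_effect_value):
    """True when the CI excludes the no-effect value (0 for differences, 1 for ratios)."""
    return not (lower <= no_effect_value <= upper)

# Difference data: the no-effect value is 0.
print(ci_is_significant(18, 58, no_effect_value=0))       # excludes 0
print(ci_is_significant(-0.26, 0.89, no_effect_value=0))  # includes 0

# Ratio data (RR, OR, HR): the no-effect value is 1. Hypothetical intervals.
print(ci_is_significant(0.45, 0.82, no_effect_value=1))   # excludes 1
print(ci_is_significant(0.90, 1.20, no_effect_value=1))   # includes 1
```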

30
Q

Type 1 error meaning and equation to determine type 1 risk

A

When the null hypothesis is rejected in error (and the alt. hypothesis accepted) when it should have been accepted (and the alt. hypothesis rejected). The results said there was a difference between the groups when there actually was not.

The probability of making a type 1 error is the same as the alpha value.
Ex: alpha is 0.05 and the study result is reported with p < 0.05, so it is stat. sig. and the probability of a type 1 error is < 5%

31
Q

Type two error and how do we know the probability of making a type two error

A

When the null hypothesis is accepted in error when it should have been rejected. There is a difference between the two groups, but the statistics made it seem like there was not.
Beta is the probability of making a type 2 error.

Beta is set by the investigators when designing the study, just like alpha. It is usually set at 0.1 or 0.2 (meaning the risk of a type 2 error is 10% or 20%).

The risk of type two errors increases if the sample size is too small, but we can decrease the risk by using a power analysis to determine the sample size needed to detect a true difference between groups. Power is the probability of avoiding a type two error.

32
Q

What is study power

A

The probability of avoiding a type two error, aka the probability that a test will correctly reject the null hypothesis

Power = 1 - Beta (where beta is the probability of a type two error)
The higher the power, the lower the risk of a type 2 error.

Power is determined by the number of outcome values collected, the difference in outcome rates between the groups, and the significance (alpha level)

Ex: if beta is 0.2, there is a 20% chance of missing a true difference (making a type 2 error), and the study power shows there is an 80% chance of avoiding a type 2 error. Similarly, if beta is 0.1, there is a 10% chance of a type 2 error (accepting the null in error) and a 90% chance of avoiding that mistake. We can always decrease beta and increase the study power by increasing the sample size.
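
To see how sample size, the difference in outcome rates, and alpha feed into power, the sketch below uses a textbook normal approximation for comparing two proportions. It is an assumption made for illustration, not a method from this deck; the function name and the 28% vs 16% event rates are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approximate_power(control_risk, treatment_risk, n_per_group, alpha=0.05):
    """Approximate power of a two-sided comparison of two proportions.

    Normal approximation only; it ignores the negligible chance of
    rejecting the null in the wrong direction.
    """
    z = NormalDist()
    standard_error = sqrt(
        control_risk * (1 - control_risk) / n_per_group
        + treatment_risk * (1 - treatment_risk) / n_per_group
    )
    z_effect = abs(control_risk - treatment_risk) / standard_error
    z_critical = z.inv_cdf(1 - alpha / 2)
    return z.cdf(z_effect - z_critical)

for n in (100, 200, 400):
    power = approximate_power(0.28, 0.16, n)
    print(f"n per group = {n}: power ~ {power:.2f}, beta ~ {1 - power:.2f}")
```

Running it shows power rising (and beta falling) as the number of subjects per group grows, and a stricter alpha lowers power for the same sample size.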

33
Q

What is risk

A

the probability of an event occurring when a drug is given (can also determine risk of no intervention/placebo)

Risk = number of subjects with unfavorable event/ total number of subjects in group

34
Q

What is the relative risk equation, and how do you interpret the value

A

The ratio of the risk in the exposed (treatment) group to the risk in the control group:

RR = risk in the treatment group / risk in the control group (expressed as a percentage or a decimal)

The answer means that patients treated with the drug (the independent variable) are ___% AS LIKELY to have the outcome (ex: progression of disease) as placebo-treated patients

RR of 1 (100%) means there is no difference in risk of the outcome between the placebo and the intervention group, so the intervention had no effect

RR > 1 (above 100%) means the intervention group has a higher risk of the outcome than the placebo group, so increased risk of the endpoint

RR < 1 (below 100%) means the intervention group has a lower risk of the outcome than the placebo group, so decreased risk of the endpoint

Ex: an RR of 0.5 (50%) means the intervention group has a 50% lower risk of the outcome than the control group. An RR of 1.5 (150%) means the intervention group has a 50% higher risk of the outcome than the control group.
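
A worked example may help with the arithmetic. The counts below are hypothetical (16 of 100 treated patients and 28 of 100 placebo patients have the event), and the function names are invented for this sketch.

```python
def risk(events, total):
    """Risk = subjects with the unfavorable event / total subjects in the group."""
    return events / total

def relative_risk(treatment_risk, control_risk):
    """RR = risk in the treatment group / risk in the control group."""
    return treatment_risk / control_risk

treatment_risk = risk(16, 100)  # hypothetical treatment group
control_risk = risk(28, 100)    # hypothetical placebo group
rr = relative_risk(treatment_risk, control_risk)

print(f"RR = {rr:.2f}")  # below 1, so the treated group has the lower risk
```

With these numbers the treated patients are about 57% as likely to have the event as the placebo patients.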

35
Q

What is RRR (relative risk reduction) and how is it used

A

We calculate relative risk reduction (after calculating risk and relative risk) to understand how much the risk of the outcome is reduced in the treatment group compared to the control group

RRR = (% risk in control group - % risk in treatment group) / (% risk in control group); can use percent or decimal

OR RRR = 1 - RR (decimal form only)

The answer tells us that the treatment group is ___% LESS LIKELY to have the unwanted outcome
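
Both RRR formulas give the same answer, which a short calculation can confirm. The risks used here (28% in control, 16% in treatment) are hypothetical and the function name is made up for the example.

```python
def relative_risk_reduction(control_risk, treatment_risk):
    """RRR = (risk in control - risk in treatment) / risk in control."""
    return (control_risk - treatment_risk) / control_risk

control_risk = 0.28    # hypothetical
treatment_risk = 0.16  # hypothetical

rrr = relative_risk_reduction(control_risk, treatment_risk)
rr = treatment_risk / control_risk

print(f"RRR = {rrr:.1%}")        # from the subtraction formula
print(f"1 - RR = {1 - rr:.1%}")  # from the shortcut, same value
```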

36
Q

Relationship between relative risk and relative risk reduction

A

RR = treatment “as likely” to cause the unwanted outcome as control
RRR = treatment “less likely” to cause unwanted outcome
RR+RRR = 100%

37
Q

What are two key reportable points we want to make sure we see in a study as clinicians

A

RRR and ARR, because together they give a clearer picture of both the risk reduction and the incidence rate. If the relative risk reduction is high but the absolute risk reduction is low, the true value of the drug in real-life patients is minimal.

38
Q

ARR

A

Absolute risk reduction:
ARR = (% risk in control group) - (% risk in treatment group)
Because it is expressed as a percentage, it can be read as “out of 100.”

Means that “for every 100 patients, _____ fewer patients wi

39
Q

What clinical question can be answered by using number needed to treat vs. number needed to harm and what is the formula for both

A

“How many patients need to receive the drug for one patient to benefit (NNT) or for one patient to be harmed (NNH)?” This helps us understand an individual patient’s chance of benefit or harm from taking the drug.

NNT = number of patients who need to be treated over a certain period of time (e.g. length of the study) in order for one patient to benefit (e.g. avoid HF progression)
NNT = 1 / (risk in control - risk in treatment), aka 1/ARR, with ARR as a decimal. Always round up for NNT, whether the result is 9.1 or 9.9, to avoid overestimating the benefit.

NNH = number of patients who need to be treated over a certain period of time (e.g. length of study) in order for one patient to be harmed. It uses the same formula as NNT, with the absolute difference in risk of the harmful event between the groups.
Always round down for NNH, whether the result is 9.9 or 9.1, so that we don’t underestimate the risk of using the drug.
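
The rounding rules are easy to get backwards, so here is a small sketch using `math.ceil` and `math.floor`. The function names and all risk values are hypothetical; the intermediate `round(..., 9)` is only there to absorb floating-point noise before rounding up or down.

```python
import math

def number_needed_to_treat(control_risk, treatment_risk):
    """NNT = 1 / ARR (risks as decimals), always rounded UP."""
    arr = control_risk - treatment_risk
    return math.ceil(round(1 / arr, 9))

def number_needed_to_harm(treatment_risk, control_risk):
    """NNH = 1 / absolute risk increase (risks as decimals), always rounded DOWN."""
    ari = treatment_risk - control_risk
    return math.floor(round(1 / ari, 9))

# Hypothetical benefit: event risk 28% with placebo vs 16% with drug (ARR = 0.12).
print(number_needed_to_treat(0.28, 0.16))  # 1 / 0.12 = 8.33, rounded up

# Hypothetical harm: adverse-event risk 8% with drug vs 5% with placebo.
print(number_needed_to_harm(0.08, 0.05))   # 1 / 0.03 = 33.33, rounded down
```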

40
Q

What does an odds ratio tell us

A

The odds of an unfavorable event associated with a treatment or exposure. It is used mostly in case-control studies (where relative risk cannot be calculated), but also in cohort and cross-sectional studies.

It compares the odds of an outcome occurring with an exposure to the odds of it occurring without the exposure.

Ex: OR = 1.23 means that the drug/exposure is associated with 23% higher odds of the unfavorable event than no drug/no exposure.

41
Q

what is a case control study

A

A study that enrolls patients who already have a clinical outcome or disease. The patients’ medical charts are reviewed retrospectively to look for possible exposures that increased their risk of getting that disease or outcome.

42
Q

Odds ratio formula

A

OR = [(exposure present & outcome present) * (exposure absent & outcome absent)]
/ [(exposure absent & outcome present) * (exposure present & outcome absent)]

Think in terms of exposure and outcome: the numerator is present-present times absent-absent, and the denominator is absent-present times present-absent.
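
The cross-product formula can be written as a small function over the four cells of a 2x2 table. The function and parameter names are hypothetical, and the counts are invented for illustration.

```python
def odds_ratio(exposed_with_outcome, unexposed_with_outcome,
               exposed_without_outcome, unexposed_without_outcome):
    """OR = (present-present * absent-absent) / (absent-present * present-absent)."""
    numerator = exposed_with_outcome * unexposed_without_outcome
    denominator = unexposed_with_outcome * exposed_without_outcome
    return numerator / denominator

# Hypothetical 2x2 table:
#               outcome   no outcome
# exposed          30         70
# not exposed      20         80
example_or = odds_ratio(30, 20, 70, 80)
print(f"OR = {example_or:.2f}")  # above 1: higher odds of the outcome with exposure
```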

43
Q

what is the hazard rate and hazard ratio and how are they used

A

In a survival analysis (an analysis of death or disease progression), the term “hazard rate” is used instead of “risk.”

The hazard rate is the rate at which an unfavorable event occurs within a short period of time. The hazard ratio is calculated the same way as relative risk (RR).

Hazard rate = number of unfavorable events in group/total members of group

Hazard ratio = hazard rate in treatment group/hazard rate in control group

44
Q

how do we interpret the OR and the HR (odds ratio and hazard ratio)

A

OR or HR = 1: there is no difference in the rate of the unfavorable outcome/primary endpoint between the treatment and control groups. Ex: an OR or HR of 1 means deaths occur equally often in the treatment and control groups.

OR or HR < 1: there is a lower rate of the unfavorable event in the treatment group than in the control group. Ex: an OR or HR of 0.5 means there are 50% fewer deaths in the treatment group than in the control group.

OR or HR > 1: there is a higher rate of the unfavorable event (outcome/primary endpoint) in the treatment group than in the control group. Ex: an OR or HR of 2 means there are 2x as many deaths with the treatment as with the control.