objectives - michael Flashcards

Question 1

Q

what are the three basic types of data?

Answer

A

nominal, ordinal, numerical

Question 2

Q

what is a nominal scale? what is it used to measure?

Answer

A

assigns individuals one (and only one) category without any specific ordering (e.g. positive/negative, alive/dead, A/B/AB/O, etc.). Reported as the percentage of a population that falls into a particular category.

Question 3

Q

what is an ordinal scale? what is it used to measure?

Answer

A

assigns individuals one (and only one) category with some specific order (e.g. stage I, II, III, IV Hodgkin’s lymphoma). Since the categories are still often qualitative in nature, it does not make sense to summarize the data in terms of an “average.” Reported as the percentage of a population that falls into a particular category.

Question 4

Q

what is a numerical scale? what is it used to measure?

Answer

A

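assigns each individual a number that measures a quantity; can be discrete (counts, e.g. number of hospital admissions) or continuous (measurements, e.g. blood pressure); can be summarized as an average.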

Question 5

Q

what is a distribution of data? how would you graph it?

Answer

A

a set of values of a variable over some population (e.g. pulse rates of smokers over the age of 60). Can be graphed as a histogram with the values of the variable on the x-axis and the frequency of occurrence of that value on the y-axis.

Question 6

Q

what’s a normal distribution?

Answer

A

a symmetric, bell-shaped distribution with some nice features: about 68% of the population falls within 1 standard deviation of the mean, and about 95% of the population falls within 2 standard deviations of the mean.
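
The 68% and 95% figures can be checked numerically. The sketch below assumes a perfectly normal distribution and uses the error function from Python's standard library; the helper name `fraction_within` is made up for illustration.

```python
import math

def fraction_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2):
    print(k, round(fraction_within(k), 4))
```

The values come out slightly above 68% and 95%, which is why the card's figures are best read as approximations.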

Question 7

Q

what is the mean?

Answer

A

average of the values of a variable. Take the sum, divide by n, the number of values. If I have 3 stethoscopes, and you have 5 stethoscopes, the average number of stethoscopes between the two of us is (3 + 5)/2 = 4.

Question 8

Q

what is the median?

Answer

A

the midpoint, the 50th percentile. Out of 3 values, the 2nd value (sorted highest to lowest or lowest to highest). Out of 5 values, the 3rd value. And so on. If there are an even number of values, take the mean of the middle two values.

Question 9

Q

what is the range?

Answer

A

the highest value in your dataset minus the lowest value in your dataset

Question 10

Q

what is the variance?

Answer

A

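take every value from your dataset and subtract the mean from that value. Then square those numbers, add them all up, and divide by the number of values. (When the variance of a population is estimated from a sample, it is common to divide by n - 1 instead of n.)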

Question 11

Q

what is the standard deviation?

Answer

A

Square root of the variance.
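
The summary statistics on the last five cards can be sketched in a few lines of Python. The pulse rates are made-up numbers and the function name `describe` is hypothetical; the variance divides by the number of values, as described on the variance card.

```python
import math

def describe(values):
    """Return the mean, median, range, variance and standard deviation."""
    n = len(values)
    ordered = sorted(values)
    mean = sum(values) / n
    mid = n // 2
    if n % 2 == 1:
        median = ordered[mid]                           # odd n: the middle value
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2  # even n: mean of the middle two
    value_range = ordered[-1] - ordered[0]              # highest minus lowest
    variance = sum((x - mean) ** 2 for x in values) / n # divide by n, as on the card
    return {
        "mean": mean,
        "median": median,
        "range": value_range,
        "variance": variance,
        "sd": math.sqrt(variance),
    }

pulse_rates = [60, 72, 75, 80, 98]  # hypothetical data
print(describe(pulse_rates))
```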

Question 12

Q

understand the concept of sampling and estimation of population parameters

Answer

A

a) Flip a coin 10 times. It comes up 6 heads and 4 tails. This is your sample.
b) From this sample, you estimate that the coin comes up heads 60% of the time it is flipped. This is an estimate of a “population parameter.” If the coin is actually fair, the estimate is off by 10 percentage points purely because of sampling variability. Flip the coin more times and the estimate tends to move closer to the true value as the number of flips increases.
c) You can do this to estimate the number of people in a population with a certain disease as well: consider a smaller population (sample) and extrapolate from that sample to your entire population. Make sure your samples are representative of the population, however!
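
The coin-flip idea can be tried out with a small simulation. This is only a sketch: it assumes a fair coin (`p_heads=0.5`), and the function name and the fixed random seed are choices made here for illustration.

```python
import random

def estimate_heads(n_flips, p_heads=0.5, seed=0):
    """Flip a simulated coin n_flips times and return the observed fraction of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(1 for _ in range(n_flips) if rng.random() < p_heads)
    return heads / n_flips

# Larger samples tend to give estimates closer to the true value of 0.5.
for n in (10, 100, 10_000, 100_000):
    print(n, estimate_heads(n))
```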

Question 13

Q

define and contrast qualitative versus quantitative assessments of clinical uncertainty

Answer

A

a) Qualitative: likely/unlikely, probable/possible, suspicious/can’t rule out
b) Quantitative: probabilities on a scale from 0 (impossible) to 1 (certain)

Question 14

Q

define prior probability

Answer

A

the probability that a patient has a disease, based on the clinical data available before an additional test is conducted

Question 15

Q

define posterior probability

Answer

A

the updated probability after the results of the test come back.

Question 16

Q

what is sensitivity?

Answer

A

probability of a positive test in a population of only persons who have the disease (true-positive rate; what fraction of the people who have the disease will test positive)

Question 17

Q

what is specificity?

Answer

A

probability of a negative test in a population of only persons who do not have the disease (true-negative rate; what fraction of the people who do not have the disease will test negative)
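
Sensitivity and specificity can be computed from counts of test results in people whose disease status is known. The counts below are invented for illustration, and the function and argument names are assumptions made here.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased persons who test positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of disease-free persons who test negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical study: 100 persons with the disease, 400 without it.
print(sensitivity(true_pos=75, false_neg=25))   # 75 of the 100 diseased test positive
print(specificity(true_neg=320, false_pos=80))  # 320 of the 400 healthy test negative
```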

Question 18

Q

what is positive predictive value? what is the formula for it?

Answer

A

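probability of disease in persons with a positive test

PPV = (sens * p) / (sens * p + (1 - spec) * (1 - p))

where sens is the sensitivity, spec is the specificity, and p is the prior probability (prevalence) of disease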

Question 19

Q

what is the negative predictive value? how do you calculate it?

Answer

A

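probability of no disease in persons with a negative test

NPV = (spec * (1 - p)) / (spec * (1 - p) + (1 - sens) * p)

where sens is the sensitivity, spec is the specificity, and p is the prior probability (prevalence) of disease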

Question 20

Q

find ppv and npv: You have a patient with a cough and a history of TB exposure. You know that a test for TB has a sensitivity of 0.75 and a specificity of 0.80, and the test is to be used in a population having a TB-prevalence of 20%
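
One way to work this card is to plug the numbers into the predictive value formulas. The sketch below treats the 20% prevalence as the prior probability p; the function names are chosen here for illustration.

```python
def ppv(sens, spec, p):
    """Probability of disease given a positive test."""
    return sens * p / (sens * p + (1 - spec) * (1 - p))

def npv(sens, spec, p):
    """Probability of no disease given a negative test."""
    return spec * (1 - p) / (spec * (1 - p) + (1 - sens) * p)

# Values from the card: sensitivity 0.75, specificity 0.80, prevalence 0.20.
print(round(ppv(0.75, 0.80, 0.20), 3))
print(round(npv(0.75, 0.80, 0.20), 3))
```

By hand: PPV = 0.15 / (0.15 + 0.16), which is about 0.48, and NPV = 0.64 / (0.64 + 0.05), which is about 0.93.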

Question 21

Q

List considerations in choosing the right diagnostic test for a given clinical situation.

Answer

A

a) Typically, there is a trade-off between sensitivity and specificity
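b) “Spin”: a positive result on a highly specific test rules in disease
c) “Snout”: a negative result on a highly sensitive test rules out disease
d) For continuous measurements, apply a “cut-off point” (the trade-off below assumes that higher values indicate disease)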
i) Low cut-off: low specificity and high sensitivity
ii) High cut-off: high specificity and low sensitivity
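
The cut-off trade-off can be seen on a toy dataset. Everything below is hypothetical: the measurements are invented, higher values are assumed to indicate disease, and a result counts as positive when it is at or above the cut-off.

```python
diseased = [5.1, 6.3, 7.0, 7.8, 8.4]  # hypothetical measurements, persons with disease
healthy = [3.2, 4.0, 4.9, 5.5, 6.1]   # hypothetical measurements, persons without disease

def sens_spec(cutoff):
    """Return (sensitivity, specificity) when values >= cutoff count as positive."""
    sens = sum(v >= cutoff for v in diseased) / len(diseased)
    spec = sum(v < cutoff for v in healthy) / len(healthy)
    return sens, spec

# Raising the cut-off trades sensitivity for specificity.
for cutoff in (4.5, 6.0, 7.5):
    print(cutoff, sens_spec(cutoff))
```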

Question 22

Q

what is decision analysis? what are the steps?

Answer

A

a) A quantitative approach to making trade-offs in clinical decisions (e.g. quality of life versus years lived or short-term vs. long-term risks)
b) Step 1: all potential outcomes of each strategy under consideration are represented in a decision tree
c) Step 2: Probabilities are assigned to each clinical outcome
d) Step 3: Each outcome is assigned some quantitative value
e) Step 4: Expected value of each strategy is calculated
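
Step 4 can be sketched as a probability-weighted sum over the outcomes of each strategy. The two strategies, their probabilities and their values (here, quality-adjusted years) are all invented for illustration.

```python
def expected_value(outcomes):
    """outcomes: list of (probability, value) pairs for one strategy."""
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities for a strategy must sum to 1")
    return sum(p * v for p, v in outcomes)

# Hypothetical decision tree with two strategies.
strategies = {
    "surgery": [(0.90, 20.0), (0.10, 0.0)],     # cure vs. operative death
    "medication": [(0.60, 20.0), (0.40, 8.0)],  # good vs. partial response
}

for name, outcomes in strategies.items():
    print(name, expected_value(outcomes))

best = max(strategies, key=lambda name: expected_value(strategies[name]))
print("highest expected value:", best)
```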

Question 23

Q

what is the standard error? how do you calculate it?

Answer

A

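Standard error quantifies the variation of the sample mean from one sample to the next. It is calculated as the sample standard deviation divided by the square root of the sample size: SE = s / sqrt(n)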

Question 24

Q

what is the confidence interval? how do you calculate it?

Answer

A

quantifies the precision of the sample mean by providing an interval based on the sampling distribution. A 95% confidence interval is approximately the sample mean plus or minus 2 standard errors
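
A minimal sketch of the standard error and an approximate 95% confidence interval follows. It assumes the usual formula SE = s / sqrt(n) and a multiplier of 2, in line with the "2 standard deviations" rule of thumb used in this deck; for small samples a larger multiplier from the t distribution would normally be used. The data and the function name are hypothetical.

```python
import math
import statistics

def mean_se_ci(sample, multiplier=2.0):
    """Return the sample mean, its standard error, and an approximate 95% CI."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    se = sd / math.sqrt(n)
    return mean, se, (mean - multiplier * se, mean + multiplier * se)

pulse_rates = [60, 72, 75, 80, 98]  # hypothetical data
mean, se, ci = mean_se_ci(pulse_rates)
print(mean, round(se, 2), tuple(round(x, 1) for x in ci))
```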

Question 25

Q

what are the three steps to hypothesis testing?

Answer

A

i) Define a null hypothesis: paradoxically, the null hypothesis is usually the opposite of what you are trying to support (e.g. “the treatment has no effect”)
ii) Compute a test statistic: a number that describes the difference between your observed data and what the null hypothesis predicts
iii) Draw a conclusion: as a rule of thumb, one rejects the null hypothesis if the test statistic t is less than -2 or greater than +2 (the observed difference is more than about 2 standard errors away from the null value)

Question 26

Q

what is type I error?

Answer

A

rejecting the null hypothesis when it is true; akin to a false positive

Question 27

Q

what is type II error?

Answer

A

failing to reject the null hypothesis when it is false; akin to a false negative

Question 28

Q

how do you interpret a p value for effect?

Answer

A

convenient way to present the results of a statistical test; number between 0 and 1 which represents the probability of seeing an effect at least as large as the one observed if chance alone were at work (i.e. if the null hypothesis were true)
Usually 0.05 is the cut-off: a p-value below 0.05 is reported as statistically significant

Question 29

Q

what is the relationship between confidence intervals and hypothesis testing?

Answer

A

If a 95% confidence interval for an effect does not include 0 (no effect), then the p-value is less than 0.05.

Question 30

Q

what is statistical power? how can you increase it?

Answer

A

The power is the probability that the study will reject the null hypothesis if the alternative hypothesis is indeed true (1 minus the type II error rate). One can increase the power of a study by increasing the sample size.

Question 31

Q

what are the advantages and disadvantages of larger studies

Answer

A

cost more but have less variable results

Question 32

Q

what is a contingency table?

Answer

A

relationship between two categorical variables in the form of a table

Question 33

Q

what are the steps for hypothesis testing for a 2x2 contingency table?

Answer

A

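Step 1: define the null hypothesis (the two variables are unrelated)
Step 2: test the null hypothesis using the chi-squared test: for each cell of the table, take the square of the difference between the observed and expected value and divide by the expected value, then add up the results for all cells. The expected value of a cell is its row total times its column total, divided by the grand total.
Step 3: compare your chi-squared value to a standard value (gives you your p value)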
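
The chi-squared calculation for a 2x2 table can be sketched with the standard library alone. The counts are invented, the function name is hypothetical, and no continuity correction is applied. The p-value line uses the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal variable, so it is valid for 2x2 tables only.

```python
import math

def chi_squared_2x2(table):
    """table: [[a, b], [c, d]] of observed counts. Returns (chi2, p_value)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))  # 1 degree of freedom only
    return chi2, p_value

# Hypothetical table: rows = treatment / control, columns = improved / not improved.
observed = [[30, 20], [10, 40]]
chi2, p_value = chi_squared_2x2(observed)
print(round(chi2, 2), p_value < 0.05)
```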

Question 34

Q

what is the equation for degrees of freedom? what does it tell you?

Answer

A

(r - 1) x (c - 1), where r is the number of rows and c is the number of columns in the contingency table

If you know degrees of freedom and the chi-squared statistic, you can find a p-value.

Question 35

Q

what is required about your data for the independent samples t-test to be used?

Answer

A

the two samples must be independent, i.e. you must be comparing the means of two separate groups (not the same patients at different time intervals, etc.)

Question 36

Q

how do you calculate standard error of the mean difference?

Question 37

Q

how do you calculate the test statistic (t) for two independent samples?
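
The deck gives no answer for this card or the previous one, so the sketch below is an assumption rather than the deck's own formula. It uses the unpooled form, in which the standard error of the mean difference is the square root of (variance of group A / size of A + variance of group B / size of B), and t is the difference in means divided by that standard error. A pooled-variance version also exists. The blood pressure values are invented.

```python
import math
import statistics

def independent_t(sample_a, sample_b):
    """Return (standard error of the mean difference, t) for two independent samples."""
    var_a = statistics.variance(sample_a)  # sample variance (divides by n - 1)
    var_b = statistics.variance(sample_b)
    se_diff = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / se_diff
    return se_diff, t

# Hypothetical systolic blood pressures in two separate groups of patients.
group_a = [120, 125, 130, 135, 140]
group_b = [110, 115, 120, 125, 130]
se_diff, t = independent_t(group_a, group_b)
print(se_diff, t)
```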

Question 38

Q

for what values is a t test significant?

Answer

A

as a rule of thumb, if t > 2 or t < -2 (this corresponds to roughly p < 0.05 when the samples are reasonably large)

Question 39

Q

when would you use a paired samples t test?

Answer

A

when you have two data sets that aren’t independent - for example, patients before and after treatment

Question 40

Q

what are the steps for a paired samples t test?

Answer

A

Step 1: compute a difference score for each pair of observations

Step 2: compute a mean of the difference scores and a standard deviation of the difference scores

Question 41

Q

what is the equation for the test statistic (t) in a paired samples t test?
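
The deck leaves this card unanswered. The sketch below follows the commonly used form, stated here as an assumption: t is the mean of the difference scores divided by their standard error, where the standard error is the standard deviation of the differences divided by the square root of the number of pairs. The before and after values are invented.

```python
import math
import statistics

def paired_t(before, after):
    """Return t for paired observations (same subjects measured twice)."""
    diffs = [a - b for b, a in zip(before, after)]  # difference score for each pair
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)               # sample SD of the differences
    se_diff = sd_diff / math.sqrt(len(diffs))
    return mean_diff / se_diff

# Hypothetical blood pressures for the same four patients before and after treatment.
before = [140, 150, 135, 160]
after = [132, 145, 130, 150]
print(round(paired_t(before, after), 2))
```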

Question 42

Q

what does survival analysis measure?

Answer

A

time until occurrence of some event (infection, disease relapse, death) after some initial observation period (initial therapy or treatment)

Question 43

Q

what are the potential problems with survival analysis methods? (2)

Answer

A

(1) some patients may drop out of the study, or may not have experienced the event by the time it ends (these patients are said to be censored), and (2) the distribution of the data is likely to be skewed (cannot use normal distribution methods)

Question 44

Q

how are survival functions displayed?

Answer

A

as a graph called a survival curve, which represents the probability that an individual will survive beyond time t

Question 45

Q

what is the median survival time?

Answer

A

the time such that 50% of the subjects will experience the event before the time and 50% of the subjects will experience the event after the time
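
A survival curve and its median survival time can be sketched as follows. The deck does not name an estimation method; the Kaplan-Meier estimator is used here as an assumption because it handles censored patients. The follow-up times, event flags and function names are all invented for illustration.

```python
def kaplan_meier(times, events):
    """times: follow-up time per subject; events: True if the event was observed,
    False if the subject was censored. Returns a list of (time, survival probability)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = sum(1 for time, ev in data if time == t and ev)
        n_leaving = sum(1 for time, _ in data if time == t)
        if n_events:
            survival *= 1 - n_events / at_risk  # the curve steps down at event times only
            curve.append((t, survival))
        at_risk -= n_leaving                    # events and censored subjects leave the risk set
        i += n_leaving
    return curve

def median_survival(curve):
    """First time at which the estimated survival probability is 50% or lower."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # the curve never drops to 50%

# Hypothetical follow-up in months; False marks a censored patient.
times = [2, 3, 4, 5, 6, 8]
events = [True, True, False, True, False, True]
curve = kaplan_meier(times, events)
print(curve)
print(median_survival(curve))
```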

Question 46

Q

what is log-rank test?

Answer

A

a statistical test to determine if two survival curves are different