Biostats - Week 1 Flashcards

Question

Why does increasing n make t and z scores get closer to the same value? Around what n value are t and z about the same?

Answer 1

T scores depend on the degrees of freedom (n - 1), which means that t scores change based on the sample size (n). As n gets higher and higher, the d.f. goes up and the t distribution gets closer to the normal (z) distribution. For n > 100, t and z scores are about the same

Answer 2

the measure (of central tendency) with the greatest frequency. Is the high point on the graph and is NOT influenced by extreme values (unlike mean)

Answer 3

normal (gaussian) distribution

Answer 4

First, negatively skewed means the tail of the data is to the left (heading towards the negative x axis) and the bulk is on the right. Mode = peak, Mean = closest to the skewed tail, and Median is in between the two
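Quick check of that ordering on a small left-skewed data set. The numbers are made up purely for illustration; this is a sketch using Python's standard `statistics` module.

```python
import statistics

# Hypothetical left-skewed data: bulk on the right, one low value forms the tail
data = [2, 6, 7, 8, 8, 9, 9, 9, 10]

mean = statistics.mean(data)      # pulled toward the low tail
median = statistics.median(data)  # middle value
mode = statistics.mode(data)      # most frequent value (the peak)

# For negative skew we expect: mean < median < mode
print(mean, median, mode)
```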

Answer 5

A disease is ENDEMIC when it is constantly present in a population or area. An endemic disease has a usual incidence/prevalence. Ex. Rhinovirus (common cold). EPIDEMIC means more cases of that disease than expected in a population/location within a time frame. Diseases that start as epidemics may drift into endemicity.

Answer 6

study of the distribution and determinants of disease frequency. Disease does NOT occur randomly; there are causes and/or preventative factors for disease. Epidemiology is the study of those things

Answer 7

Preclinical begins with the onset of the disease and ends once signs/sx of the disease manifest. Clinical phase begins with signs/sx and ends (ideally) with treatment/resolution

Answer 8

time from colonization to the point where sx appear. Falls in the preclinical phase

Answer 9

Experimental and Observational: - Experimental studies are important in testing drugs - Observational studies are really important for learning about causes. ex. figured out that Reye's syndrome was caused by kids with viral infections taking ASA for fever

Answer 10

Rate IS proportion per a specific time period. Proportion = (# of cases)/(population at risk) Rate = (# of cases)/(population at risk) IN A TIME

Answer 11

[# of people who ACQUIRE the disease] divided by [# of people at risk] IN A TIME ("associate in your mind the word 'acquire' with incidence")

Answer 12

(# of people that HAVE the disease)/(# of people at risk) ...at a given point in time

Answer 13

latent/undiagnosed diseases

Answer 14

Incidence rate = probability that healthy people will develop a particular disease DURING a specific period of time. Prevalence rate = proportion of people in a population who HAVE the disease AT a given time (point prevalence or period prevalence)
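A minimal sketch of the two calculations side by side. The town and its counts are hypothetical, and the function names are just for illustration.

```python
def incidence_rate(new_cases, at_risk):
    """People who ACQUIRE the disease during the period / people at risk."""
    return new_cases / at_risk


def point_prevalence(existing_cases, population):
    """People who HAVE the disease at one point in time / population."""
    return existing_cases / population


# Hypothetical town of 10,000: 300 people already have the disease on Jan 1,
# and 50 of the remaining 9,700 disease-free people acquire it during the year.
prevalence = point_prevalence(300, 10_000)   # existing cases
incidence = incidence_rate(50, 10_000 - 300)  # new cases among those at risk

print(f"prevalence = {prevalence:.3f}, incidence = {incidence:.4f} per year")
```

Note the incidence denominator leaves out people who already have the disease, since they are no longer at risk of acquiring it.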

Answer 15

prevalence is existing cup of liquid. Incidence is new cup pouring into prevalence. Coming out at bottom of prevalence cup are mortalities and cures

Answer 16

(# deaths)/(population) | Population is standardized to 10^n for a specific time interval. e.g. 10^3 = 1,000 or 10^5 = 100,000
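Sketch of the standardization step, assuming a made-up population. `per` is the 10^n multiplier; the function name is hypothetical.

```python
def crude_death_rate(deaths, population, per=100_000):
    """Deaths per `per` people over the stated time interval."""
    # Multiply before dividing to keep the arithmetic exact for whole numbers
    return deaths * per / population


# Hypothetical: 50 deaths in one year in a population of 25,000
rate_per_100k = crude_death_rate(50, 25_000)            # per 10^5
rate_per_1k = crude_death_rate(50, 25_000, per=1_000)   # per 10^3

print(rate_per_100k, rate_per_1k)
```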

Answer 17

Neonatal: (# deaths

Answer 18

is simply the # of deaths/population (10^n) in specific time period v. cause-specific death rate, which is (# of deaths due to certain cause)/population (10^n) in specific time period

Answer 19

(# of deaths attributed to a disease) / (# new cases identified) ex. total of 300 cases of disease with 50 new cases, 20 of whom have died. Death to Case ratio = 20:50

Answer 20

(# cause specific deaths among the incident cases) / (# of incident cases). Can ONLY calculate the proportion of fatal cases once the epidemic ends. ex. epidemic of a disease ends with 500 total cases, 250 of whom died Case fatality rate = 250/500 = 50%
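The two worked examples from these cards (death-to-case ratio and case fatality rate) as a small sketch. Function names are made up for illustration.

```python
def death_to_case_ratio(deaths, new_cases):
    """Deaths attributed to the disease / new cases identified in the period."""
    return deaths / new_cases


def case_fatality_rate(deaths_among_cases, incident_cases):
    """Cause-specific deaths among the incident cases / incident cases."""
    return deaths_among_cases / incident_cases


# Numbers taken from the card examples
dtc = death_to_case_ratio(20, 50)    # 20:50
cfr = case_fatality_rate(250, 500)   # epidemic over: 250 of 500 cases died

print(f"death-to-case = {dtc:.2f}, case fatality = {cfr:.0%}")
```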

Answer 21

crude birth = (# live births)/(population, 10^n) crude fertility = (# live births)/(women aged 15-44 yrs)

Answer 22

variance = standard deviation squared standard deviation = square root of the variance
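Quick check of the relationship on made-up data, using Python's `statistics` module. This sketch uses the population versions (`pvariance`, `pstdev`); the sample versions (`variance`, `stdev`) divide by n - 1 instead of n but keep the same square/square-root relationship.

```python
import math
import statistics

# Hypothetical data with mean 5
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # average squared deviation from the mean
sd = statistics.pstdev(data)           # square root of the variance

print(variance, sd)
print(math.isclose(sd ** 2, variance))  # SD squared gives back the variance
```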

Answer 23

LESS variance (which means lower SD) means MORE accurate/reliable data because less variation means your data is more clustered and more accurate around the mean. Overall idea = results for sample more closely represent the true result in the population

Answer 24

in a normal distribution, the proportion of data elements is CONSTANT for a given number of standard deviations above or below the mean

Answer 25

x percentile is the value below which x% of the data lie. e.g. 90% of the data lies below the 90th percentile

Answer 26

+1 SD = 84th percentile; +2 SD = 98th percentile. Even though 95% of the data LIES in between +/-2 SDs, this does not mean that +2 SDs is the 95th percentile! It's the 98th b/c the little tail of data below -2 SD (about 2.5%) also lies below +2 SD
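These percentiles can be checked against the standard normal curve. A sketch using `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

below_plus_1sd = z.cdf(1)             # fraction of data below +1 SD
below_plus_2sd = z.cdf(2)             # fraction of data below +2 SD
within_2sd = z.cdf(2) - z.cdf(-2)     # fraction between -2 SD and +2 SD

print(f"+1 SD: {below_plus_1sd:.1%}")
print(f"+2 SD: {below_plus_2sd:.1%}")
print(f"within +/-2 SD: {within_2sd:.1%}")
```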

Answer 27

skewed (positive or negative) J-shaped (high frequency at R-most) bimodal: two peaks of highest frequency U-shaped: high frequency at both extremes

Answer 28

increase the sample size

Answer 29

Probability of getting heads = 50% The ODDS of getting heads = 50%/50% = 1 This is because Odds of an event happening = the probability it does happen / probability it doesn't
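The probability-to-odds conversion as a one-line function (a sketch; the function name is made up). The second call uses a hypothetical probability of 0.75.

```python
def odds(p):
    """Odds = probability the event happens / probability it doesn't."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return p / (1 - p)


coin_odds = odds(0.5)    # fair coin: 50%/50%
other_odds = odds(0.75)  # hypothetical event: 75%/25%

print(coin_odds, other_odds)
```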

Answer 30

False! Cross-sectional study collects data ONLY at one point in time; it is not retrospective or prospective

Answer 31

cohort study

Answer 32

retrospective studies. Case-control studies, by definition, look backward in time

Answer 33

Controlled clinical trials are the only way to establish causation between exposure and illness. Cohort and case-control studies are only able to establish a statistical association, not actual causation

Answer 34

case-control studies (b/c they identify the cases at the start of the study)

Answer 35

cohort studies

Answer 36

occurs when there is a systematic difference between either: - those participating in the study and those who do not, or - those in the tx arm of the study and those in the control group. ex. if a study is conducted at a hospital where pts with that disease are more likely to be referred, then that sample of pts probably doesn't accurately represent the population

Answer 37

False positive (alpha error): incorrectly reject a TRUE null hypothesis. We think there's an effect, when really there is NOT

Answer 38

False negative (beta error): incorrectly accept (fail to reject) a FALSE null hypothesis. We think there is not an effect, when really there IS

Answer 39

Correlation is a measure of the variables' statistical ASSOCIATION, not of their causal relationship. Correlation does not equal causation.

Answer 40

1. no difference between the two groups | 2. any observed differences are due to chance

Answer 41

Two-tailed: There is a difference between the two groups One-tailed: the mean of the trial group is greater than the mean of the control group

Answer 42

the probability level at which it's decided that the null hypothesis is INCORRECT is the significance level (alpha)

Answer 43

If you plot the frequency distribution of the MEANS of an infinite # of random samples, then: 1. it will be a normal distribution, and 2. the distribution mean - i.e. the mean of the sample means (mu x-bar) - will be the same as the population mean (mu)
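A small simulation sketch of this idea. Everything here is an illustrative assumption: the skewed (exponential) population, the sample size of 30, and the 2,000 repeats stand in for "infinite" samples.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical, clearly non-normal population (right-skewed)
population = [rng.expovariate(1.0) for _ in range(10_000)]

n = 30  # size of each random sample
sample_means = [
    statistics.fmean(rng.sample(population, n)) for _ in range(2_000)
]

pop_mean = statistics.fmean(population)
mean_of_means = statistics.fmean(sample_means)

# The mean of the sample means should sit very close to the population mean,
# and the sample means should be much less spread out than the raw data.
print(pop_mean, mean_of_means)
print(statistics.pstdev(population), statistics.stdev(sample_means))
```

Plotting a histogram of `sample_means` would show the roughly bell-shaped curve, even though the population itself is skewed.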

Answer 44

the +/- limits of the area of acceptance range (accept the null). Outside the critical value range = area of rejection (reject the null). * Must find critical values by looking at T score table. Based on degree of freedom (n - 1) and then look for value under (.05 for two-tailed). ex. +/-2.262 for df = 9

Answer 45

measures how much the sample mean deviates from the population mean. standard error = SD (x-bar) = SD/(sqrt of n)

Answer 46

the number of Estimated Standard Errors that the sample mean lies above or below the hypothesized population mean. t calc = [sample mean - hypothesized population mean] / est standard error of the sample
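Worked sketch of the calculation: estimated standard error = SD/sqrt(n), then t calc. The sample values and the hypothesized mean of 5.0 are made up; the critical value 2.262 (two-tailed, .05, df = 9) is the t-table value for n = 10.

```python
import math
import statistics


def t_calc(sample, hypothesized_mean):
    """(sample mean - hypothesized population mean) / est. standard error."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)  # SD / sqrt(n)
    return (statistics.mean(sample) - hypothesized_mean) / standard_error


# Hypothetical sample of n = 10, so df = 9
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.4, 5.9, 5.3]
t = t_calc(sample, 5.0)

critical_value = 2.262  # from a t table: two-tailed, .05, df = 9
reject_null = abs(t) > critical_value

print(f"t calc = {t:.2f}, reject null: {reject_null}")
```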

Answer 47

t-test compares the means between 2 groups ANOVA: compares means of 3 or more different populations

Answer 48

Nominal data, since chi square is a test of proportions between groups (categorical)

Answer 49

Spearman's rank CORRELATION test, Wilcoxon rank SUM test, Mann-Whitney test

Answer 50

aims to prevent type 2 error by ensuring adequate study size, involves: - fixing customary power to 80% and - fixing level of significance to 5%
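Rough sketch of what "ensuring adequate study size" looks like numerically. The formula is an assumption not taken from the card: the common normal-approximation sample size for comparing two means, n per group = 2 * (z_alpha/2 + z_beta)^2 / d^2, where d is the standardized effect size (difference in means / SD). The effect size of 0.5 is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate patients needed per arm (two-tailed, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 5%
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = 80%
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Hypothetical medium effect size with the customary 80% power and 5% alpha
n = n_per_group(0.5)
print(n)
```

Smaller expected effects need far more patients, because the effect size is squared in the denominator.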

Answer 51

only wants to study (ensure) that the intervention is not worse than current standard of care. Is a 1-tailed analysis that doesn't need as many patients

Answer 52

case-control studies are ALWAYS retrospective

Answer 53

1. easy and inexpensive 2. can study multiple risk factors 3. since you identify patient cases at the beginning of the study, it's the best way to study rare diseases

Answer 54

1. highly prone to bias & confounding (especially recall and selection biases) 2. hard to identify a truly matched population (e.g. similar in severity of illness, age-matched, etc)

Answer 55

Surgery and pregnancy. ex. Can't randomize people to get surgery or not. For these, rely on observational studies

Answer 56

systematic error in the study design that produces results "systematically" different from the truth

Answer 57

1. Selection (sampling) Bias: selection of pts doesn't represent the population it's supposed to represent (e.g. too old or too well-educated) 2. Recall Bias: exists ANY time historical self-report info is collected from the respondents 3. Measurement Bias: just means something is wrong w/ the way it's being measured (instrument or observer)

Answer 58

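RRR = Relative Risk Reduction = (risk in control group - risk in treatment group) / (risk in control group), i.e. 1 - RR, where RR (relative risk) is the RATIO of the risk in the treatment group / risk in the control group. ex. RR = 12%/20% = 0.60, so RRR = 1 - 0.60 = 0.40 (40%). ARR = Absolute Risk Reduction = (% risk in control group) - (% risk in treatment group). Since you just subtract, the ARR is always LESS than the RRR. Ex. 20% - 12% = ARR of 8%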

Answer 59

NNT = 100/ARR (if ARR is %) or 1/ARR (if ARR is in decimals). ARR = % risk in control group - % risk in treatment group
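Sketch tying the risk measures together with the 20% vs 12% example. Here RR is the plain ratio of the two risks and RRR = 1 - RR = ARR / control risk. Rounding NNT up to a whole patient is a common convention, assumed here rather than taken from the card.

```python
import math


def risk_measures(control_risk, treatment_risk):
    """Return (RR, RRR, ARR, NNT) with risks given as decimals."""
    rr = treatment_risk / control_risk    # relative risk (ratio)
    arr = control_risk - treatment_risk   # absolute risk reduction
    rrr = arr / control_risk              # relative risk reduction = 1 - RR
    # round() guards against float noise before rounding up to whole patients
    nnt = math.ceil(round(1 / arr, 9))
    return rr, rrr, arr, nnt


rr, rrr, arr, nnt = risk_measures(0.20, 0.12)
print(f"RR={rr:.2f} RRR={rrr:.2f} ARR={arr:.2f} NNT={nnt}")
```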

Answer 60

these are the 2 major types of clinical studies. Experimental mostly refers to randomized controlled trials Observational (just watching) includes cohorts, case-control, cross-sectional, and case reports

Answer 61

RR = 1: no difference RR > 1: Increased risk RR < 1: Decreased risk

Answer 62

Odds = (probability of the event) / (probability of the NOT event) or (probability of the event) / (1 - probability of the event)

Answer 63

the denominator. Odds denominator = probability of the NOT event (or 1 - probability of the event), whereas Risk denominator = the whole group (those with the event + those without it)
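The denominator difference on raw counts, as a sketch. The counts (20 people with the event, 80 without) are hypothetical, and the function names are made up.

```python
def risk(events, non_events):
    """Risk: events over EVERYONE (events + non-events)."""
    return events / (events + non_events)


def odds_from_counts(events, non_events):
    """Odds: events over NON-events only."""
    return events / non_events


# Hypothetical group: 20 people with the event, 80 without
r = risk(20, 80)
o = odds_from_counts(20, 80)

print(r, o)  # odds are always larger than risk when the event occurs at all
```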

Answer 64

_ _ _ . _ _ _ _ Begins and ends with letters. First letter is the disease group (e.g. M = musculo sys, N = GU system). First 3 characters total represent category. Next 3 = etiology, anatomic site, and severity respectively and the last one is an extension.

Answer 65

To establish a patient-physician relationship

Answer 66

notes contain more inconsistent and more outdated information

Answer 67

provide education for physicians regarding the copy/paste function use

Answer 68

the number of alerts received did not correlate with physicians' override rate

Answer 69

greater than 95% !!

Answer 70

summary study of previous trials to give us an overall result

Answer 71

False positives

Answer 72

That the screening test IS beneficial and will do more good than harm

Answer 73

Period between possible detection and occurrence of symptoms

Answer 74

C, should be an individual decision

Answer 75

B: men 55-69 yo Under 40 men = C 40-54 YO at average risk = C

Answer 76

D (recommend against)

Answer 77

40-49 = C; 50-74 = B; >75 = I (Insufficient evidence)

Answer 78

40-49 = C; 50-74 = B; >75 = I (Insufficient evidence)

Answer 79

Ovarian cancer Pancreatic cancer Prostate cancer Testicular cancer

Answer 80

Bladder cancer Oral cancer Skin cancer prevention

(107 cards)