Biostatistics Flashcards

Question

When interpreting OR, begin with the _____

Answer 1

tells us how much of the disease that occurs can be attributed to a certain exposure; calculated among exposed individuals or for an entire population

Answer 2

the risk of non-exposed people is not zero. Ex: some people who get lung cancer do not smoke.

Answer 3

(incidence in exposed) - (incidence in unexposed)
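A minimal sketch of the difference above, assuming incidence is computed as cases divided by group size. The function name and the cohort counts are hypothetical, chosen only for illustration.

```python
def attributable_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk difference: incidence in exposed minus incidence in unexposed."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return incidence_exposed - incidence_unexposed

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the disease.
ar = attributable_risk(30, 100, 10, 100)
print(round(ar, 3))
```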

Answer 4

enumerate all members of the population (N), then select n individuals at random (each has the same probability of being selected)

Answer 5

1. Start with the sampling frame.
2. Determine the sampling interval (N/n).
3. Select the first person at random from the first (N/n) individuals, then every (N/n)th person thereafter.

Answer 6

organize population into mutually exclusive strata, select individuals at random within each stratum
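The three preceding answers describe what are usually called simple random, systematic, and stratified sampling. The sketch below shows one way each could look in Python; the function names, the seeded generator, and the toy sampling frame are assumptions made for illustration.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every member of the frame has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start within the first interval, then every k-th member."""
    k = len(frame) // n           # sampling interval N/n (integer part)
    start = rng.randrange(k)      # random position inside the first interval
    return frame[start::k][:n]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample separately inside each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(0)                     # seeded only to make the demo repeatable
frame = list(range(1, 101))                # hypothetical population of N = 100 IDs
srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
strata = {"under_50": frame[:49], "50_and_over": frame[49:]}
stratified = stratified_sample(strata, 5, rng)
print(srs, systematic, stratified, sep="\n")
```

Equal allocation per stratum is only one design choice; proportional allocation would draw more individuals from larger strata.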

Answer 7

- models # of events out of n observations
- 2 possible outcomes: success or failure
- replications of process are independent
- P(success) is constant for each replication
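Under the conditions listed above, the probability of exactly k successes in n independent replications follows the binomial formula. A small sketch, with a hypothetical function name and example values:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with P(success) = p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: 10 replications with P(success) = 0.5.
p_two = binomial_pmf(2, 10, 0.5)
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(p_two, total)
```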

Answer 8

```
m = mean
s = standard deviation
```
mean = median = mode, all located at the center of the distribution (not skewed); area under curve = probability of observation

Answer 9

1. Estimation | 2. Hypothesis Testing

Answer 10

sample statistics are used to generate estimates of the population parameter

Answer 11

Sample statistics are analyzed to either support or reject the hypothesis about the parameter.

Answer 12

No, the sample mean of the second sample is likely to be different from the first sample mean.

Answer 13

consists of multiple sample means

Answer 14

the "best" single estimate of that parameter

Answer 15

range of plausible values for the population parameter; carries a level of confidence

Answer 16

reflects the likelihood that the confidence interval contains the true, unknown parameter; common levels are 90%, 95%, and 99%. If we repeatedly generate 95% confidence intervals in the same way for the same population, 95% of those intervals will cover the true parameter.

Answer 17

As Confidence Level increases, Confidence Interval widens.

Answer 18

reflects the variability of the sampling distribution of the sample statistic

Answer 19

s / square root of n
```
s = sample std. dev.
n = sample size
```

Answer 20

As sample size increases, standard error decreases. Small samples have a large standard error.

Answer 21

Z * s / square root of n
```
s = sample std. dev.
n = sample size
```

Answer 22

confidence level

Answer 23

Sample mean +/- Z * s/square root of n
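The interval above can be sketched with the standard library alone. The function name and the sample values are hypothetical, and the Z multiplier is taken from the standard normal distribution as in the card; with small samples a t multiplier is often used instead, which this sketch does not cover.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_confidence_interval(sample, confidence=0.95):
    """Sample mean +/- Z * s / sqrt(n) for the requested confidence level."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)                       # sample standard deviation
    standard_error = s / sqrt(n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error             # margin of error
    return x_bar - margin, x_bar + margin

# Hypothetical systolic blood pressure readings.
sample = [120, 125, 130, 135, 140, 128, 132, 126, 138, 129]
ci95 = z_confidence_interval(sample, 0.95)
ci99 = z_confidence_interval(sample, 0.99)
print(ci95, ci99)
```

Comparing the two results shows the relationship stated in an earlier card: the 99% interval is wider than the 95% interval.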

Answer 24

assumes nothing is going on, usually carries equality

Answer 25

the "research hypothesis" reflects the researcher's belief

Answer 26

1. Reject the null hypothesis | 2. Fail to reject the null hypothesis

Answer 27

1. null hypothesis | 2. Alternative hypothesis

Answer 28

1. Set up a null and research hypothesis
2. Determine significance level (alpha): acceptable rate at which a Type I error can occur
3. Select test
4. Compute test statistic
5. Compute p-value
6. Compare p-value to alpha
7. Draw conclusion + summarize significance
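As one illustration of the steps above, here is a sketch of a two-sided one-sample z test. The choice of test, the function name, and all numbers are assumptions made for the example; a different test would be selected for other data types.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, s, n, alpha=0.05):
    """Two-sided test of H0: mu = mu0 against HA: mu != mu0."""
    z = (sample_mean - mu0) / (s / sqrt(n))           # step 4: test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # step 5: two-sided p-value
    reject = p_value < alpha                          # step 6: compare to alpha
    return z, p_value, reject

# Hypothetical study: n = 36, sample mean 130, s = 15, hypothesized mean 125.
z, p_value, reject = one_sample_z_test(130, 125, 15, 36)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```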

Answer 29

1. Non-Directional (key word = difference); not equal
2. Directional (key word = greater, more, positive direction); greater than
3. Directional (key word = less, smaller, negative direction); less than

Answer 30

H0 : μ = x | HA : μ ≠ x

Answer 31

H0 : μ = x | HA : μ > x

Answer 32

H0 : μ = x | HA : μ < x

Answer 33

If the test statistic > critical value, reject the null hypothesis

Answer 34

the probability of observing the obtained data (or more extreme values) given that the null hypothesis is true; used to measure the significance of the test (is there enough evidence to reject H0?)

Answer 35

Reject; lower

Answer 36

(Alpha) Reject a true null hypothesis. Most dangerous type of error.

Answer 37

(Beta) Fail to reject a false null hypothesis

Answer 38

probability of making a Type I error (error rate)

Answer 39

probability of making a Type II error (error rate)

Answer 40

1 - beta; rate at which a test correctly rejects a false null hypothesis

Answer 41

effect size; larger effect size; we can detect that more readily than a small effect size

Answer 42

determines whether 2+ categorical variables are independent or share an association

Answer 43

X^2 = the sum of (observed - expected)^2/expected

Answer 44

(column total * row total) / total

Answer 45

Df = (# of rows - 1) * (# of columns - 1)
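The statistic, expected counts, and degrees of freedom from the last three answers fit together as in this sketch. The function name and the 2x2 table of observed counts are hypothetical.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected = (column total * row total) / total
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical observed counts: rows = exposure (yes/no), columns = disease (yes/no).
observed = [[20, 30], [30, 20]]
statistic, df = chi_square_independence(observed)
print(statistic, df)
```

The sketch stops at the statistic; it would then be compared with a chi-square critical value for the computed degrees of freedom (about 3.84 for df = 1 at alpha = 0.05).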

Answer 46

measures the difference between 2 unrelated population means of continuous outcomes; population variance is unknown

Answer 47

determines whether or not the means of more than 2 populations are statistically different

Answer 48

population parameters

Answer 49

measures the strength of the linear relationship between 2 continuous variables; equivalent to simple linear regression

Answer 50

estimates the value of one continuous variable corresponding to a given value of another variable

Answer 51

r; measures the strength of the linear relationship between x & y

Answer 52

indicates the nature of the relationship: positive = direct; negative = inverse

Answer 53

percent of variation attributed to predictor variables; ranges from 0 (explains little variation) to 1 (explains a lot of variation). Want it to be high ;)

Answer 54

Y = β0 + β1X + error
```
Y = dependent/outcome variable
X = independent/predictor variable
β0 = intercept
β1 = slope
```
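A sketch of how the intercept and slope in the model above might be estimated by ordinary least squares, together with r and r^2. The function names and the five data points are hypothetical; least squares is assumed here as the fitting method since the card does not name one.

```python
from math import sqrt
from statistics import mean

def fit_simple_linear_regression(x, y):
    """Least-squares estimates (b0, b1) for Y = b0 + b1 * X + error."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx                 # slope: change in Y per 1-unit change in X
    b0 = y_bar - b1 * x_bar        # intercept: predicted Y when X = 0
    return b0, b1

def pearson_r(x, y):
    """Correlation coefficient r between x and y."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_simple_linear_regression(x, y)
r = pearson_r(x, y)
predicted_at_6 = b0 + b1 * 6       # predicted Y for a new X value
print(b0, b1, r, r**2, predicted_at_6)
```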

Answer 55

What is the expected Systolic BP for a male with BMI=20? Y = SBP; X = BMI

Answer 56

helps to visualize relationships in bivariate data

Answer 57

r^2 = 0.4^2 = 0.16; 0.16 x 100 = 16%

Answer 58

for categorical data

Answer 59

for continuous and ordinal data

Answer 60

for continuous data possibly with outliers or skewed data

Answer 61

fixed # of outcomes (nominal scale); 2 possible outcomes = dichotomous variable

Answer 62

fixed number of outcomes with an inherent order (ordinal scale)

Answer 63

outcome (interval or ratio) may be any numerical value between a defined minimum and maximum. E.g. GPA is any # between 0.0 and 4.0

Answer 64

1. Use frequencies (counts of categories)
2. Use relative frequencies (percentages of categories)
3. Present in table format
4. Graph in a bar chart

Answer 65

1. Central tendency: sample mean (X bar), median (2nd quartile), mode
2. Variability: sample std dev, variance, range, or interquartile range (3rd - 1st quartile)

Answer 66

(s) spread from mean in original units

Answer 67

(s^2) spread from mean in squared units

Answer 68

3rd - 1st Quartiles
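The summaries named in the preceding answers can be computed with Python's `statistics` module, as in this sketch. The data set is hypothetical, and quartile conventions differ between textbooks and software, so the quartiles here reflect the module's default method rather than any particular course's rule.

```python
from statistics import mean, median, mode, stdev, variance, quantiles

# Hypothetical continuous measurements.
data = [2, 4, 4, 5, 7, 9, 11]

center = {"mean": mean(data), "median": median(data), "mode": mode(data)}
s = stdev(data)                     # spread from the mean in original units
s_squared = variance(data)          # spread from the mean in squared units
q1, q2, q3 = quantiles(data, n=4)   # 1st, 2nd, 3rd quartiles (default method)
iqr = q3 - q1                       # interquartile range
value_range = max(data) - min(data)

print(center, s, s_squared, (q1, q2, q3), iqr, value_range)
```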

Answer 69

how spread out are values in the population?

Answer 70

graphical representation of the distribution of (continuous or ordinal) data; the shape reflects the distribution type, which determines which numerical summary to use

Answer 71

more observations in the middle | mean = median = mode | symmetric about the mean; area to the left/right = 0.5

Answer 72

more observations to the left, tail to the right | mean > median

Answer 73

more observations to the right, tail to the left | mean < median

Answer 74

use a box (and whisker) plot; shows sample minimum (left whisker) and maximum (right whisker), 1st quartile (left edge of box), 2nd quartile (middle of box = median), and 3rd quartile (right edge of box)

Answer 75

the kth percentile is the value below which k% of all other values fall. E.g. scoring in the 90th percentile = scoring better than 90% of people who took the exam

Answer 76

- 68% of population within 1 standard deviation of mean
- 95% of population within 2 standard deviations of mean
- 99.7% of population within 3 standard deviations of mean

Answer 77

Z = (X - mean) / std dev; transforms any normal value into a standard value
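A short sketch of the standardization above, plus a check of the 68/95/99.7 rule of thumb against the standard normal distribution. The function names and the example values (mean 100, standard deviation 15) are hypothetical.

```python
from statistics import NormalDist

def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

def prob_within(k):
    """P(-k < Z < k) for a standard normal variable Z."""
    standard_normal = NormalDist()          # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

z = z_score(130, 100, 15)                   # hypothetical value, mean, std dev
coverage = [prob_within(k) for k in (1, 2, 3)]
print(z, [round(p, 4) for p in coverage])
```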

Answer 78

want to know whether there is a difference in population means between two groups; population variance is known

Answer 79

Does the sample come from a hypothesized distribution? For continuous data: divide data into intervals, then apply the test

Answer 80

correlation

Answer 81

relative risk -or- odds ratio

Answer 82

risk of getting the disease with the risk factor compared to the risk of getting the disease without the risk factor: (a/(a+b))/(c/(c+d))

Answer 83

ratio of the odds of having the disease with the risk factor compared to the odds of having the disease without the risk factor: (a/c)/(b/d) -or- ad/bc
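Both ratios can be computed from the four cells of a 2x2 table. The sketch below assumes the cell layout implied by the formulas in these cards (a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease); the function names and counts are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Risk in exposed divided by risk in unexposed: (a/(a+b)) / (c/(c+d))."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Cross-product ratio ad/bc."""
    return (a * d) / (b * c)

# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed have the disease.
a, b, c, d = 30, 70, 10, 90
rr = relative_risk(a, b, c, d)
odds = odds_ratio(a, b, c, d)
print(round(rr, 2), round(odds, 2))
```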

Answer 84

not significant; significant

Answer 85

Models the relationship between independent (X) and dependent (Y) variables; Dependent (Y) variable must be continuous

Answer 86

1 unit; β1 (slope)

Answer 87

directly; positive

Answer 88

inversely; negative

Answer 89

not related; not related

Answer 90

used when the dependent (Y) variable is dichotomous. Ex: someone has the disease or not

Answer 91

odds ratio when X increases by 1 unit

Answer 92

models the relationship between dependent (Y) and independent (X) variables while also considering other variables that may affect the relationship (e.g. confounders); more than 1 independent (X) variable

Answer 93

collection of statistical procedures used when the outcome is time until an event: from the time we start to observe, when does the event occur? Goal: analyze the survival experience of a population of interest

Answer 94

measure of time from the beginning of follow-up until the event for an individual, e.g. days, weeks, months, years

Answer 95

occurrence of interest, e.g. death, disease incidence, relapse, recovery

Answer 96

exact survival time is unknown; three reasons:
1. study ends before an individual experiences event
2. individual is lost to follow-up during the study
3. individual is withdrawn from the study (e.g. death before event of interest occurs)

Answer 97

1. right censored 2. left censored 3. interval censored

Answer 98

we know when survival time starts, but not when or if event occurs

Answer 99

start of survival period is unknown. E.g. survival time of an HIV patient begins at infection, but the patient may not enter the study until testing positive

Answer 100

the exact time of the event is unknown within the interval; occurs in studies where subjects are not monitored continuously

Answer 101

in theory, are continuous and smooth. Common application is to compare survival functions of two groups

Answer 102

method used to practically visualize survival curves for a study; estimated as a step function (1 step down = 1 event occurred); does not usually decrease to 0, since not everyone will experience the event during the study
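Assuming the answer above refers to the Kaplan-Meier (product-limit) estimator, the step function can be sketched as follows. The function name and the follow-up data are hypothetical; an event flag of 1 means the event was observed and 0 means the observation was censored.

```python
def kaplan_meier(times, events):
    """Return [(time, estimated survival probability)] at each observed event time."""
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Individuals still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        # Events (not censorings) that happen exactly at time t.
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e)
        if observed:
            survival *= 1 - observed / at_risk   # one step down in the curve
            curve.append((t, survival))
    return curve

# Hypothetical follow-up times (months) and event indicators.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

In this toy data the last individual is censored, so the estimated curve ends above 0, which matches the remark that the curve does not usually decrease to 0.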

Answer 103

if the test rejects, the survival curves are significantly different; works for 2+ groups; does not tell you which is better (visually compare or compare means)

Answer 104

- Consistency of measures
- Are similar results produced under similar conditions
- Uses Cronbach's alpha
- High reliability does not mean high validity (accuracy)

Answer 105

an indicator of internal consistency; ranges from 0 to 1; higher values = higher internal consistency
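The card does not give a formula, so the sketch below assumes the usual definition of Cronbach's alpha: k/(k-1) times (1 minus the sum of item variances divided by the variance of the total score), where k is the number of items. The function name and the questionnaire scores are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per questionnaire item, aligned by respondent."""
    k = len(items)
    sum_item_variances = sum(variance(item) for item in items)
    total_scores = [sum(scores) for scores in zip(*items)]   # per-respondent totals
    return k / (k - 1) * (1 - sum_item_variances / variance(total_scores))

# Hypothetical 3-item scale answered by 4 respondents with identical item scores.
items = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
]
alpha = cronbach_alpha(items)
print(alpha)
```

Items that move together perfectly, as in this toy data, give the highest internal consistency; less consistent items would give a lower value.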

Answer 106

- Accuracy of a measure
- Does the result actually reflect the true measure
- Often difficult to know if a measure is valid

Answer 107

extraneous variable that distorts the true effect of the independent variable (exposure) on the dependent variable (outcome)

Answer 108

1. Stratification (single confounder) | 2. Regression (multiple confounders)

Answer 109

conduct separate analysis for each level of a confounding variable

Answer 110

the effect of an independent variable (X) on the dependent variable (Y) differs depending on the level of the third variable

Answer 111

models # of events out of an infinite (in theory, not in practice) number of observations; use when the event is rare or when modeling # of events over space or time
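Assuming the answer above describes the Poisson distribution, the probability of observing exactly k events when the average count is lam can be sketched as below. The function name and the rate of 2 events per interval are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(exactly k events) when events occur at an average of lam per interval."""
    return lam**k * exp(-lam) / factorial(k)

# Hypothetical rare event averaging 2 occurrences per interval.
p_zero = poisson_pmf(0, 2)
p_up_to_50 = sum(poisson_pmf(k, 2) for k in range(51))
print(p_zero, p_up_to_50)
```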

(142 cards)