Biostatistics Flashcards by Adam Strohschein

Sampling

The process to determine who we are going to study/examine.
Purpose: To find out information without talking to everyone.

Two types of sampling
Nonprobability

Probability
Used most frequently in quantitative research
Systematic technique is used to select respondents – goal is to create a sample as representative of the population as possible

Nonprobability Sampling

Less generalizability; problem with representativeness.

Lower confidence in findings.

Useful when probability sampling can’t be used.

Four common methods…
Purposive, convenience, snowball, quota

Probability Sampling

Used to generalize to the population at large

Works toward representativeness

Used in all large-scale surveys/observational studies

Avoids sampling bias – selecting atypical folks.
Numerous ways to introduce bias into your sample.

Representative

Your sample is like the population
Random selection!  
All members have an equal chance of being selected…
EPSEM
Equal Probability of Selection Method
Probability samples are never perfect
More representative than non-probability
Probability theory allows us to estimate accuracy

Element

Individual members of the population

Population

The entire set of elements

Sampling frame

List of all the elements in a population

Parameter

Summary of a given variable in a population

Statistic

Summary of a given variable in a sample

Sampling distribution

The distribution of a statistic (e.g., the mean) across all the possible random samples that could be selected
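
As a small illustration of the idea, the sketch below enumerates every possible sample of size 2 from a tiny made-up population and collects the sample means. The population values and the sample size are assumptions chosen only to keep the enumeration short.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of five element values (illustrative only).
population = [2, 4, 6, 8, 10]
sample_size = 2

# Every possible sample of the given size, drawn without replacement.
all_samples = list(combinations(population, sample_size))

# The sampling distribution of the mean: one mean per possible sample.
sample_means = [mean(s) for s in all_samples]

population_mean = mean(population)
mean_of_sample_means = mean(sample_means)

print(len(all_samples), "possible samples")
print("population mean:", population_mean)
print("mean of the sample means:", mean_of_sample_means)
```

Individual sample means vary from sample to sample, but in this example they average out to the population mean.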

Simple Random Sample

Base of sampling
Need a list (sampling frame)
Assign a number
Select by a random number
Random number list
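
The steps above can be sketched in a few lines of Python. The sampling frame, the `simple_random_sample` helper, and the seed are all hypothetical; a random number generator stands in for the random number list.

```python
import random

# Hypothetical sampling frame: a numbered list of every element in the population.
sampling_frame = [f"person_{i:03d}" for i in range(1, 101)]

def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement; every element has an equal chance."""
    rng = random.Random(seed)  # the seed only makes the illustration repeatable
    return rng.sample(frame, n)

sample = simple_random_sample(sampling_frame, n=10, seed=42)
print(sample)
```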

Systematic Sampling

Determine number needed
Divide the population size by the desired sample size (we call this our sampling interval, denoted here by ‘k’)
List and number our elements
Randomly select a start point
Select every k-th element from the start point onward
Caution: avoid periodicity!
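
A minimal sketch of this procedure, assuming a frame of 100 numbered elements and a desired sample of 10 (so k = 10). The `systematic_sample` helper is hypothetical, and it picks the random start point within the first interval.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first interval, then take every k-th element."""
    k = len(frame) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed the size of the frame")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start point among the first k elements
    return frame[start::k][:n]

# Hypothetical frame: elements numbered 1 to 100.
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=3)
print(sample)
```

If the list itself has a repeating pattern whose cycle matches k, every selected element lands at the same point in the cycle, which is the periodicity problem the card warns about.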

Stratified Sampling

Possible modification of previous techniques
Random sample from subpopulations (strata)
Improves representativeness
Decreases some sampling error
Homogeneous subsets: select a certain number of elements from within each subset
Allows oversampling
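
One way to picture this is the sketch below, which groups a made-up population into two strata and draws a simple random sample from each. The population, the stratum labels, and the per-stratum counts are assumptions; the smaller "rural" stratum is deliberately oversampled.

```python
import random
from collections import defaultdict

def stratified_sample(elements, stratum_of, n_per_stratum, seed=None):
    """Group elements into strata, then draw a random sample within each one."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in elements:
        strata[stratum_of(element)].append(element)
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], n_per_stratum[name]))
    return sample

# Hypothetical population: 80 urban and 20 rural members as (id, stratum) pairs.
population = [(i, "urban" if i < 80 else "rural") for i in range(100)]

# Oversampling: rural members are 20% of the population but 6 of the 14 sampled.
sample = stratified_sample(
    population, lambda e: e[1], {"urban": 8, "rural": 6}, seed=1
)
print(sample)
```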

Cluster Sampling

More complex methodologically (not conceptually, I hope)
Cluster = Groups of elements
Multi-stage
Basic stages/steps: listing and sampling
Helps with cost and dispersed populations
Increases sampling error potential
Two samples – double the opportunity for error
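
A two-stage version can be sketched as follows: list the clusters and sample some of them, then list the elements in each chosen cluster and sample from those. The schools, students, and counts are hypothetical.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample elements within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: sample the clusters
    sample = []
    for name in chosen:
        # Stage 2: sample elements from the chosen cluster's own list.
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return chosen, sample

# Hypothetical population: 10 schools (clusters) with 30 students each.
clusters = {
    f"school_{s}": [f"school_{s}_student_{i}" for i in range(30)]
    for s in range(10)
}

chosen, sample = two_stage_cluster_sample(clusters, n_clusters=3, n_per_cluster=5, seed=7)
print(chosen)
print(sample)
```

Only the chosen clusters need a full element list, which is where the cost saving comes from. Each stage is its own random draw, so each adds its own sampling error.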

Comparability (of control & exp groups)

Randomization
Recruited folks (who may have been selected using nonprobability sampling techniques) are randomly placed into control and exp. groups.
Matching
Assign people to groups based on their characteristics so that the groups match.
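
Randomization can be sketched as a shuffle followed by a split. The participant names and the `randomize` helper are hypothetical, and an even split into two groups is an assumption.

```python
import random

def randomize(participants, seed=None):
    """Randomly split recruited participants into control and experimental groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical recruits (possibly obtained through nonprobability sampling).
recruits = [f"participant_{i}" for i in range(20)]
control, experimental = randomize(recruits, seed=11)
print(control)
print(experimental)
```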

Sampling Error

Variation in values of your sample mean compared to the population mean
Because of sampling error, our estimates will rarely be completely accurate
Deviation between sample results and population
Reduce by:
Increase sample size
Increase homogeneity
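
The effect of sample size can be seen in a small simulation. This is a sketch under stated assumptions: the population is made up (10,000 scores with a mean near 50 and a standard deviation near 10), and the number of repeated draws is arbitrary.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Hypothetical population of 10,000 scores.
population = [rng.gauss(50, 10) for _ in range(10_000)]

def spread_of_sample_means(n, draws=500):
    """Standard deviation of the means of many random samples of size n."""
    means = [mean(rng.sample(population, n)) for _ in range(draws)]
    return stdev(means)

spread_small = spread_of_sample_means(10)
spread_large = spread_of_sample_means(100)

print("spread of sample means, n = 10: ", round(spread_small, 2))
print("spread of sample means, n = 100:", round(spread_large, 2))
```

Sample means from the larger samples cluster more tightly around the population mean, which is what reducing sampling error looks like in practice.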

THE NORMAL CURVE

Characteristics (from central limit theorem):
Theoretical distribution of scores
Perfectly symmetrical
Bell-shaped
Unimodal
Tails extend infinitely in both directions
Mean, median, and mode are equal

NOTE: CENTRAL TENDENCY AND DISPERSION OR VARIABILITY.

Assumption of normality of a given empirical distribution makes it possible to describe this “real-world” distribution based on what we know about the (theoretical) normal curve
We use this assumption to generalize sample findings to a population

About .68 of the area under the curve (.34 on each side of the mean) falls within 1 standard deviation (s) of the mean
In other words, about 68% of cases fall within +/- 1 s
About 95% of cases/values fall within +/- 2 s
About 99.7% of cases fall within +/- 3 s
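
These areas can be checked against the standard normal curve using Python's standard library. The `area_within` helper name is an assumption made for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the normal curve within +/- k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/- {k} s: {area_within(k):.4f}")
```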

The z-distribution

Just a special case of the normal dist.
Idealized mean of 0 and s.d. of 1
Allows us to use a corresponding z-table to look up critical values

Common critical z-scores (set by conf. level – see next slide):

1.65 SE = 90% CL
1.96 SE = 95% CL
2.58 SE = 99% CL
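
Instead of a printed z-table, the critical values can be looked up with the inverse of the standard normal CDF. This sketch assumes a two-sided interval, so α is split evenly between the two tails; `critical_z` is a hypothetical helper name.

```python
from statistics import NormalDist

def critical_z(confidence_level):
    """Two-sided critical z-score for a given confidence level (e.g., 0.95)."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CL: z = {critical_z(level):.3f}")
```

The computed values round to the familiar 1.65, 1.96, and 2.58.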

Confidence level

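(the complement of the significance level, α)
The proportion of confidence intervals, constructed the same way from repeated samples, that we expect to contain the population parameter.
We set this ahead of time. The significance level is denoted alpha (α), and the confidence level is 1 − α. Most frequently, α = .05 (95% confidence).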

Confidence interval

Range within which the ‘true’ parameter should lie: a range of values around the point estimate
Upper and lower limit for the confidence level
Many of the biomedical books use CI = mean +/- 1.96(standard errors), but this assumes a 95% confidence level (that’s where they are getting the z-score of +/-1.96).

Setting up a CI

1. Set a level of confidence (α, alpha). Often, as we said, α = .05, or 95% confidence.
2. Calculate the mean (or proportion). On exams, this will likely be provided.
3. Calculate the standard error, which is also usually provided. (If not, a standard deviation will be given; SE = s.d./√N.)
4. Based on #1, we will know how many standard errors to use (the precise value comes from a z-table):
1.65 SE = 90% CL
1.96 SE = 95% CL
2.58 SE = 99% CL
5. Calculate CI = mean +/- z-score x SE (the z-score is usually 1.96, often rounded to 2)
More clearly, at 95% confidence: CI = mean +/- 1.96 (SE)
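
A worked example of these steps, using made-up numbers (mean = 100, s.d. = 15, N = 36) and the 95% z-score of 1.96. The `confidence_interval` helper is hypothetical.

```python
from math import sqrt

def confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) limits: mean +/- z * SE, with SE = sd / sqrt(n)."""
    se = sd / sqrt(n)   # standard error
    margin = z * se     # how far the limits sit from the mean
    return mean - margin, mean + margin

# Hypothetical exam-style values.
lower, upper = confidence_interval(mean=100, sd=15, n=36)
print(f"SE = {15 / sqrt(36)}")
print(f"95% CI: {lower:.1f} to {upper:.1f}")
```

Here SE = 15/√36 = 2.5, so the interval is 100 +/- 1.96 x 2.5, or roughly 95.1 to 104.9.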

Calculation for SE

SE = s.d./√N

Calculate CI

mean +/- 1.96 * (SE)

What influences confidence intervals

The width of a confidence interval depends on three things
α / confidence level: The confidence level can be raised (e.g., to 99%), which widens the interval, or lowered (e.g., to 90%), which narrows it.

N: We have more confidence in larger sample sizes, so as N increases, the interval narrows

Variation: more variation = more error
For proportions, % agree closer to 50%
For means, higher standard deviations
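
The three influences can be compared side by side by computing interval widths. The numbers are hypothetical. For proportions, the sketch assumes the usual standard error √(p(1 − p)/N), which does not appear on these cards.

```python
from math import sqrt

def ci_width_mean(sd, n, z):
    """Full width of a CI for a mean: 2 * z * sd / sqrt(n)."""
    return 2 * z * sd / sqrt(n)

def ci_width_proportion(p, n, z):
    """Full width of a CI for a proportion, assuming SE = sqrt(p * (1 - p) / n)."""
    return 2 * z * sqrt(p * (1 - p) / n)

print("baseline (sd=15, N=100, 95%):", round(ci_width_mean(15, 100, 1.96), 2))
print("raise confidence to 99%:     ", round(ci_width_mean(15, 100, 2.58), 2))
print("raise N to 400:              ", round(ci_width_mean(15, 400, 1.96), 2))
print("raise sd to 30:              ", round(ci_width_mean(30, 100, 1.96), 2))
print("proportion, 50% agree:       ", round(ci_width_proportion(0.5, 100, 1.96), 3))
print("proportion, 90% agree:       ", round(ci_width_proportion(0.9, 100, 1.96), 3))
```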

Hypothesis

A prediction about the relationship between 2 variables that asserts that changes in the measure of an independent variable will correspond to changes in the measure of a dependent variable

Research vs. Null hypotheses

Research hypothesis (H1)
Typically predicts relationships or “differences”
Null hypothesis (H0)
Predicts “no relationship” or “no difference”
Can usually be created by inserting “not” into a correctly worded research hypothesis
In science, we test the null hypothesis!
Assuming there really is “no difference” in the population, what are the odds of obtaining our particular sample finding?

DIRECTIONAL VS. NONDIRECTIONAL HYPOTHESES

Non-directional research hypothesis
“There was an effect”
“There is a difference”
Directional research hypothesis
Specifies the direction of the difference (greater or smaller) from the H0
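
To make the distinction concrete, the sketch below tests a null hypothesis with a one-sample z-test and reports both a directional (one-tailed, "greater than") and a non-directional (two-tailed) probability. The choice of a z-test, the sample values, and the `z_test` helper are assumptions for illustration; the cards do not name a specific test.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, null_mean, sd, n):
    """Return z, the upper-tail probability, and the two-tailed probability."""
    se = sd / sqrt(n)
    z = (sample_mean - null_mean) / se
    standard_normal = NormalDist()
    p_directional = 1 - standard_normal.cdf(z)                # H1: mean is greater
    p_nondirectional = 2 * (1 - standard_normal.cdf(abs(z)))  # H1: mean is different
    return z, p_directional, p_nondirectional

# Hypothetical finding: sample mean 105 versus a null value of 100 (s.d. 15, N 36).
z, p_directional, p_nondirectional = z_test(105, 100, 15, 36)
print(f"z = {z:.2f}")
print(f"directional p = {p_directional:.4f}")
print(f"non-directional p = {p_nondirectional:.4f}")
```

Each probability answers the question on the card: if the null hypothesis were true, how likely is a sample result at least this extreme? The non-directional version counts both tails, so here it is twice the directional one.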

Biostatistics Flashcards

(27 cards)