Biostatistics Flashcards

1
Q

Sampling

A

The process to determine who we are going to study/examine.
Purpose: To find out information without talking to everyone.

Two types of sampling
Nonprobability

Probability
Used most frequently in quantitative research
Systematic technique is used to select respondents – goal is to create a sample as representative of the population as possible

2
Q

Nonprobability Sampling

A

Less generalizability; problem with representativeness.

Lower confidence in findings.

Useful when probability sampling can’t be used.

Four common methods…
Purposive, convenience, snowball, quota

3
Q

Probability Sampling

A

Use to generalize to population at large

Works toward representativeness

Used in all large-scale surveys/observational studies

Avoids sampling bias – selecting atypical folks.
Numerous ways to introduce bias into your sample.

4
Q

Representative

A
Your sample is like the population
Random selection!  
All members have an equal chance of being selected…
EPSEM
Equal Probability of Selection Method
Probability samples are never perfect
More representative than non-probability
Probability theory allows us to estimate accuracy
5
Q

Element

A

Individual members of the population

6
Q

Population

A

The entire set of elements

7
Q

Sampling frame

A

List of all the elements in a population

8
Q

Parameter

A

Summary of a given variable in a population

9
Q

Statistic

A

Summary of a given variable in a sample

10
Q

Sampling distribution

A

The distribution of a statistic (e.g., the mean) across all the possible random samples of a given size that could be selected

11
Q

Simple Random Sample

A
The basis for the other probability sampling methods
Need a list (sampling frame)
Assign a number to each element
Select elements by random number
Random numbers can come from a random number list (table)
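
The steps above can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the sampling frame `frame`, the helper name `simple_random_sample`, and the seed are all hypothetical.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n elements without replacement; each element has an equal chance."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(list(sampling_frame), n)

# Hypothetical sampling frame: 100 numbered elements.
frame = [f"person_{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(frame, n=10, seed=42)
print(sample)
```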
12
Q

Systematic Sampling

A
Determine number needed
Divide the population size by the desired sample size (we call this our sampling interval, denoted here by ‘k’)
List and number our elements
Randomly select start point
Select every k-th element from the start point
Caution: avoid periodicity!
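
A short Python sketch of these steps, assuming the frame is already listed in an order with no periodic pattern. The function name `systematic_sample` and the example frame are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first interval, then every k-th element."""
    k = len(frame) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed the frame size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start point: index 0 .. k-1
    return frame[start::k][:n]

# Hypothetical frame of 100 numbered elements; n = 10 gives k = 10.
frame = list(range(1, 101))
picked = systematic_sample(frame, n=10, seed=1)
print(picked)
```

If the list had a cycle that lined up with k (say, every 10th element is a corner apartment), the sample would over-represent that pattern, which is the periodicity problem.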
13
Q

Stratified Sampling

A

Possible modification of previous techniques
Random sample from sub populations
Improves representativeness
Decreases some sampling error
Homogeneous subsets: select a certain number of elements within each subset
Allows oversampling
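
As an illustration, stratified sampling with oversampling might look like the sketch below. The strata ("urban"/"rural"), their sizes, and the helper `stratified_sample` are made-up assumptions for the example.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Group elements into strata, then draw a random sample within each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)
    return {name: rng.sample(members, n_per_stratum[name])
            for name, members in strata.items()}

# Hypothetical frame: 80 urban and 20 rural residents.
frame = [(i, "urban") for i in range(80)] + [(i, "rural") for i in range(80, 100)]
# Equal draws per stratum oversample the smaller (rural) group.
sample = stratified_sample(frame, stratum_of=lambda e: e[1],
                           n_per_stratum={"urban": 8, "rural": 8}, seed=7)
print({name: len(members) for name, members in sample.items()})
```

Because the rural stratum is oversampled here, estimates for the whole population would need weighting to restore the true proportions.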

14
Q

Cluster Sampling

A

More complex methodologically (not conceptually, I hope)
Cluster = Groups of elements
Multi-stage
Basic stages/steps: listing and sampling
Helps with cost and dispersed populations
Increases sampling error potential
Two samples (one per stage): double the opportunity for error
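
A two-stage version of the listing-and-sampling steps can be sketched as follows. The clusters (schools), their sizes, and the function name `two_stage_cluster_sample` are hypothetical.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample elements within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: list and sample clusters
    return {c: rng.sample(clusters[c], n_per_cluster)  # stage 2: list and sample elements
            for c in chosen}

# Hypothetical population: 20 schools (clusters) of 30 students each.
clusters = {f"school_{i:02d}": [f"school_{i:02d}_student_{j:02d}" for j in range(30)]
            for i in range(20)}
result = two_stage_cluster_sample(clusters, n_clusters=4, n_per_cluster=5, seed=3)
for school, students in result.items():
    print(school, students)
```

Only the chosen clusters need a full element list, which is where the cost saving for dispersed populations comes from.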

15
Q

Comparability (of control & exp groups)

A
Randomization
Recruited folks (who may have been selected using nonprobability sampling techniques) are randomly placed into control and exp. groups.
Matching
 Assign people to group based on characteristics so groups match.
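
Randomization can be illustrated with a short sketch: shuffle the recruited participants, then split the list. The participant labels and the helper `randomize` are hypothetical.

```python
import random

def randomize(participants, seed=None):
    """Randomly split recruited participants into control and experimental groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical recruits (possibly obtained by nonprobability sampling).
recruits = [f"participant_{i}" for i in range(1, 21)]
control, experimental = randomize(recruits, seed=5)
print("control:", control)
print("experimental:", experimental)
```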
16
Q

Sampling Error

A

Variation in values of your sample mean compared to the population mean
Because of sampling error, we probably won’t always have completely accurate estimates
Deviation between sample results and population
Reduce by:
Increase sample size
Increase homogeneity
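
The effect of sample size can be seen in a small simulation. This is only a sketch: the population (10,000 values with mean 50 and s.d. 10), the number of draws, and the function name are made-up assumptions.

```python
import random
import statistics

def spread_of_sample_means(population, n, draws=200, seed=0):
    """Standard deviation of the sample mean across repeated random samples of size n."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.stdev(means)

# Hypothetical population of 10,000 scores.
pop_rng = random.Random(1)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]

small = spread_of_sample_means(population, n=25)
large = spread_of_sample_means(population, n=400)
print(f"spread of sample means, n=25:  {small:.2f}")
print(f"spread of sample means, n=400: {large:.2f}")
```

Sample means from the larger samples cluster more tightly around the population mean, which is what "reduce by increasing sample size" refers to.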

17
Q

THE NORMAL CURVE

A

Characteristics (from central limit theorem):
Theoretical distribution of scores
Perfectly symmetrical
Bell-shaped
Unimodal
Tails extend infinitely in both directions
Mean, median, and mode are equal

NOTE: CENTRAL TENDENCY AND DISPERSION OR VARIABILITY.

Assumption of normality of a given empirical distribution makes it possible to describe this “real-world” distribution based on what we know about the (theoretical) normal curve
We use this assumption to generalize sample findings to a population

.68 of area under the curve (.34 on each side of mean) falls within 1 standard deviation (s) of the mean
In other words, 68% of cases fall within +/- 1 s
About 95% of cases/values fall within 2 s’s
About 99.7% of cases fall within 3 s’s
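
These areas can be checked against the standard normal curve using Python's standard library. The helper name `area_within` is just an illustrative choice.

```python
from statistics import NormalDist

def area_within(k):
    """Area under the standard normal curve within +/- k standard deviations."""
    z = NormalDist()  # mean 0, s.d. 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/- {k} s: {area_within(k):.4f}")
```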

18
Q

The z-distribution

A

Just a special case of the normal dist.
Idealized mean of 0 and s.d. of 1
Allows us to use a corresponding z-table to look up critical values

Common critical z-scores (set by conf. level – see next slide):

  1.65 SE = 90% CL
  1.96 SE = 95% CL
  2.58 SE = 99% CL
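
Instead of a printed z-table, the critical values can be looked up with the inverse of the standard normal CDF. A minimal sketch, assuming two-tailed intervals; the function name `critical_z` is hypothetical.

```python
from statistics import NormalDist

def critical_z(confidence_level):
    """Two-tailed critical z-score for a given confidence level (e.g., 0.95)."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for cl in (0.90, 0.95, 0.99):
    print(f"{cl:.0%} CL -> z = {critical_z(cl):.3f}")
```

The 90% value comes out near 1.645, which is commonly rounded to 1.65 (or 1.64) in tables.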
19
Q

Confidence level

A

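(the complement of the significance level)
Probability that a confidence interval built from a sample captures the true population parameter.
We set this ahead of time. The significance level is denoted alpha (α), and the confidence level is 1 − α. Most frequently, it’s α = .05 (95% confidence).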

20
Q

Confidence interval

A

Range within which the ‘true’ parameter should lie; a range of values around the point estimate
Upper and lower limit for the confidence level
Many of the biomedical books use CI = mean +/- 1.96(standard errors), but this assumes a 95% confidence level (that’s where they are getting the z-score of +/-1.96).

21
Q

Setting up a CI

A
  1. You need to set a level of confidence (α = alpha). Often, as we said, α = .05, or 95% confidence.
  2. Calculate the mean (or proportion). On exams, this will likely be provided.
  3. Calculate the standard error, also usually provided. (If not, they will give a standard deviation. Calculation: SE = s.d./√N)
  4. Based on #1, we will know how many standard errors to use (precision on this from a z-scores table).
    1.65 SE = 90% CL
    1.96 SE = 95% CL
    2.58 SE = 99% CL
  5. Calculate CI = mean score +/- z-score (which is usually 1.96, or rounded to 2) x SE
    More clearly: CI = mean +/- 1.96 (SE)
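
The steps can be put together in a few lines of Python. The numbers below (mean 100, s.d. 15, N = 225) are invented for illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def standard_error(sd, n):
    """SE = s.d. / sqrt(N)."""
    return sd / math.sqrt(n)

def confidence_interval(mean, sd, n, confidence_level=0.95):
    """CI = mean +/- z * SE, with z taken from the standard normal curve."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * standard_error(sd, n)
    return mean - margin, mean + margin

# Hypothetical sample: mean = 100, s.d. = 15, N = 225, so SE = 15 / 15 = 1.
low, high = confidence_interval(mean=100, sd=15, n=225)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

With SE = 1, the 95% interval is simply 100 +/- 1.96.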
22
Q

Calculation for SE

A

SE = s.d./√N

23
Q

Calculate CI

A

mean +/- 1.96 * (SE)

24
Q

What influences confidence intervals

A

The width of a confidence interval depends on three things
α / confidence level: The confidence level can be raised (e.g., to 99%), which widens the interval, or lowered (e.g., to 90%), which narrows it.

N: We have more confidence in larger sample sizes, so as N increases, the interval narrows

Variation: more variation = more error
For proportions, % agree closer to 50%
For means, higher standard deviations
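
Each of the three influences can be checked numerically. The sketch below uses the usual large-sample formulas (SE = s.d./√N for means, √(p(1−p)/N) for proportions); all input values and function names are hypothetical.

```python
import math
from statistics import NormalDist

def _z(confidence_level):
    """Two-tailed critical z-score for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

def ci_width_mean(sd, n, confidence_level=0.95):
    """Full width of a CI around a mean."""
    return 2 * _z(confidence_level) * sd / math.sqrt(n)

def ci_width_proportion(p, n, confidence_level=0.95):
    """Full width of a CI around a proportion."""
    return 2 * _z(confidence_level) * math.sqrt(p * (1 - p) / n)

print("99% vs 95% CL:   ", ci_width_mean(10, 100, 0.99), ci_width_mean(10, 100, 0.95))
print("N=100 vs N=400:  ", ci_width_mean(10, 100), ci_width_mean(10, 400))
print("s.d. 20 vs 10:   ", ci_width_mean(20, 100), ci_width_mean(10, 100))
print("p=0.5 vs p=0.9:  ", ci_width_proportion(0.5, 100), ci_width_proportion(0.9, 100))
```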

25
Q

Hypothesis

A

A prediction about the relationship between 2 variables that asserts that changes in the measure of an independent variable will correspond to changes in the measure of a dependent variable

26
Q

Research vs. Null hypotheses

A

Research hypothesis
H1
Typically predicts relationships or “differences”
Null hypothesis
H0
Predicts “no relationship” or “no difference”
Can usually create by inserting “not” into a correctly worded research hypothesis
In Science, we test the null hypothesis!
Assuming there really is “no difference” in the population, what are the odds of obtaining our particular sample finding?
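
That question is what a p-value answers. As a hedged sketch, here is a one-sample z-test, assuming a known standard deviation and a large sample; the numbers and the function name `two_sided_p_value` are invented for illustration.

```python
import math
from statistics import NormalDist

def two_sided_p_value(sample_mean, null_mean, sd, n):
    """Probability of a sample mean at least this far from the null value, if H0 is true."""
    se = sd / math.sqrt(n)
    z = (sample_mean - null_mean) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical data: sample mean 103 vs. a null value of 100, s.d. 15, N = 100.
z, p = two_sided_p_value(sample_mean=103, null_mean=100, sd=15, n=100)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A p-value below the chosen α (e.g., .05) is taken as grounds to reject the null hypothesis.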

27
Q

DIRECTIONAL VS. NONDIRECTIONAL HYPOTHESES

A

Non-directional research hypothesis
“There was an effect”
“There is a difference”

Directional research hypothesis
Specifies the direction of the difference (greater or smaller) relative to H0
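
In a z-test, the practical difference shows up as a two-tailed versus a one-tailed p-value. A minimal sketch, assuming a z statistic has already been computed; the value z = 1.8 and the function name are hypothetical.

```python
from statistics import NormalDist

def p_values(z):
    """p-values for a z statistic under non-directional and directional hypotheses."""
    upper = 1 - NormalDist().cdf(z)  # directional: H1 predicts "greater"
    lower = NormalDist().cdf(z)      # directional: H1 predicts "smaller"
    return {
        "two_sided": 2 * min(upper, lower),  # non-directional: "there is a difference"
        "greater": upper,
        "smaller": lower,
    }

result = p_values(1.8)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With z = 1.8 and α = .05, the directional ("greater") test falls below α while the non-directional test does not, so the choice of hypothesis should be made before looking at the data.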