Stat 354 Flashcards

1
Q

sampling theory vs. classical statistical theory

A
  • concerned w/ finite populations
  • different goals and restrictions
  • no density function, limited use of models
2
Q

If N = n

A

complete enumeration

census

3
Q

why survey? (survey vs census)

A
time
cost
speed
scope
accuracy
4
Q

Principal steps for surveying

A
Objectives
Resources
Population
Units of observation
Data to collect
Method of measurement
organization of field work
summary and analysis
5
Q

steps for surveying, Objectives

A

precise statement of objectives

6
Q

steps for surveying, resources

A

quantity of information “purchased”, cost of information for whole survey

7
Q

resources (quantity) depend on

A

number of observations made (items sampled)

design of survey

8
Q

Determining/setting resources

A

determine sample design to obtain:

  • most information (lowest SE) for a given budget
  • lowest cost (fewest observations) for a given level of precision (SE)
9
Q

If resources cannot meet the objective

A

do not survey

10
Q

Target population

A

population of interest

collection of elements about which we wish to make inference

11
Q

Element

A

object from which we take a measurement

12
Q

Target population example

A

collection of voters in a community

13
Q

Element example

A

a registered voter

14
Q

Sample population

A

population sampled from

15
Q

Discussing the target population

A

be aware of assumptions made to make the leap from sample population to target population

16
Q

Example of sample population

A

collection of registered* voters in a community

17
Q

observational unit

A

element

18
Q

sampling unit

A

unit selected for a sample

  • may contain 1+ observational units
  • non-overlapping collection of elements from the population
19
Q

sampling unit example

A

a classroom

20
Q

observational unit example

A

a student in a classroom

21
Q

sampling frame

A

list of all sampling units in the population

22
Q

sampling frame example

A

list of all students in the school

list of all registered voters in the community

23
Q

reduced data quality

A

if you ask too many questions

-focus questions, be concise

24
Q

measurement methods

A

self-administered questionnaires

telephone, email, door-to-door, internet

25
Q

very important step in methods

A

test questionnaire on a small scale (pilot study, pre-test)

improve and re-assess

26
Q

steps for surveying, organization of field work

A
  • train people in goals and methods
  • early quality checking
  • plan for non-response
27
Q

steps for surveying, summary and analysis

A
  • edit questionnaire, record errors
  • methods for handling non-response
  • different estimation methods
  • estimation of precision
28
Q

Non-response

A

some elements of sample fail to provide responses to survey

29
Q

Non-response bias

A

if non-responders have different opinions/measurements from responders, bias occurs

30
Q

non-response bias especially important when

A

non-response rate is high

31
Q

selection bias

A

some units are more likely to be included in the sample than others

-cannot be overcome by increased n

32
Q

sample

A

collection of sampling units drawn from sampling frame (single or multiple frames)

33
Q

Literary Digest poll, 1936

A

predicted 57% for Landon
highest response in history, 2.4 million
Roosevelt won 62%

34
Q

why did Literary Digest fail

A

sample drawn from phone books and club membership lists: selection bias (only the richer 1/4 of the pop. had phones)

35
Q

what to learn from Literary Digest poll

A

when selection procedure is biased, no size of n will help
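
A minimal sketch of why a larger n cannot fix a biased frame. The population below is entirely hypothetical (the counts are made up for illustration and are not the 1936 figures): only the "listed" group appears on the frame, and its opinions differ from everyone else's.

```python
# Hypothetical population: 1 = supports candidate A, 0 = does not.
# Only the 250 "listed" people are on the sampling frame.
listed = [1] * 75 + [0] * 175      # 30% support among listed people
unlisted = [1] * 525 + [0] * 225   # 70% support among unlisted people

population = listed + unlisted
true_support = sum(population) / len(population)

# Even a complete enumeration of the frame (the largest possible n)
# only describes the listed group, not the target population.
frame_support = sum(listed) / len(listed)

print(true_support)   # 0.6
print(frame_support)  # 0.3
```

The gap between the two numbers is bias from the frame, so it stays the same no matter how many listed people are sampled.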

36
Q

personal vs mailed surveys

A

personal ca. 65%

mailed ca. 25%

37
Q

how to find out if a sample is any good

A

ask how it was taken

38
Q

Gallup poll, 1936

A

George Gallup
n = 50,000 people
predicted Roosevelt victory (56% vs truth 62%)
predicted Digest results (44% vs truth 43%)

39
Q

Quota sampling

A
  • interviewer assigned fixed number (quota) of subjects to interview
  • numbers within categories are fixed
40
Q

example of quota categories

A

residence
age
sex
economic status

41
Q

goal of quota sampling

A

aims to be representative based on census data

ex. design sampling based on % men vs women in population

42
Q

problems with quota sampling

A
  • while the sample controls for certain variables, it does not control for the one of interest (ex. can’t control for Republican vs. Democrat)
  • interviewers are free to choose who they want within quota
43
Q

sources of error in surveys

A

Errors of non-observation

Errors of observation

44
Q

Error of non-observation

A

sampling error
coverage error
non-response

45
Q

sampling error

A

deviation between sample estimate and true population value

46
Q

coverage error

A

sampling frame does not match perfectly w/ target population

47
Q

errors of observation

A

interviewers

respondents

48
Q

Interviewer error

A

affects the response of the respondent in some way

49
Q

example of interviewer error

A

body language

50
Q

how to reduce sampling error

A
  • sampling design
  • sample size
  • investigator
51
Q

coverage error example

A

people who are unlisted in telephone book

52
Q

Respondent error

A

differ in their ability and motivation to answer correctly

-response error

53
Q

Response errors

A

recall bias
prestige bias
intentional deception
incorrect measurement

54
Q

Recall bias

A

different responders recall differently

55
Q

prestige bias

A

exaggerate to appear more prestigious

56
Q

example of prestige bias

A

exaggerate income

57
Q

Intentional deception example

A

don’t want to admit to breaking the law

58
Q

incorrect measurement

A

respondent doesn’t understand measurement units

ex. report in cm vs m; cups of coffee vs travel mugs

59
Q

how to reduce non-response in data collection

A

reward for responding
inform ahead of time
shortened, concise, focused questionnaire
callback, persistence
marketing - train interviewers to ‘sell it’
data cleaning - check for errors

60
Q

sampling distribution of ȳ

A

distribution of values of ȳ over repeated samples of same size

61
Q

characteristics of ȳ sampling distribution

A
  • mean = µ
  • standard deviation σ/√n
  • approximately bell-shaped
  • assumes population is infinite
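
A small sketch of the sampling distribution of ȳ for a finite population, found by listing every possible SRSWOR sample. The population values are hypothetical. It assumes the standard finite-population result Var(ȳ) = (S²/n)(1 − n/N), where S² uses divisor N − 1.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10, 12]  # hypothetical values, N = 6
N, n = len(population), 3

# Every subset of size n is equally likely under SRSWOR, so the exact
# sampling distribution of the sample mean is the list of all subset means.
sample_means = [Fraction(sum(s), n) for s in combinations(population, n)]

mu = Fraction(sum(population), N)
S2 = sum((y - mu) ** 2 for y in population) / (N - 1)

mean_of_means = sum(sample_means) / len(sample_means)
var_of_means = sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)
formula = (S2 / n) * (1 - Fraction(n, N))

print(mean_of_means, mu)      # both equal the population mean
print(var_of_means, formula)  # both equal the finite-population variance
```

Exact fractions are used so the two comparisons hold without floating-point rounding.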
62
Q

sampling distribution if n is too big

A

shorter tails than normal
truncated
non-normal

63
Q

covariance

A

large | Cov(y1, y2) | = greater dependence btw y1, y2
depends on scale of measurement (units)
standardize by correlation
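
A quick sketch of the scale dependence, using made-up height and weight data. Changing heights from metres to centimetres multiplies the covariance by 100 but leaves the correlation unchanged. The helper functions are illustrative, not from any course material.

```python
def covariance(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Covariance standardized by the two standard deviations."""
    return covariance(x, y) / (covariance(x, x) * covariance(y, y)) ** 0.5

heights_m = [1.5, 1.6, 1.7, 1.8]       # hypothetical data
weights_kg = [50.0, 58.0, 69.0, 80.0]  # hypothetical data
heights_cm = [100 * h for h in heights_m]

print(covariance(heights_m, weights_kg), covariance(heights_cm, weights_kg))
print(correlation(heights_m, weights_kg), correlation(heights_cm, weights_kg))
```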

64
Q

SRSWR

A

n independent samples of size 1

may include duplicates

65
Q

SRSWOR

A

every possible subset of n from N equally likely to be chosen
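
One way to draw both kinds of sample with Python's standard `random` module, as a sketch. The frame of 20 unit labels and the seed are arbitrary choices for illustration.

```python
import random

rng = random.Random(354)     # arbitrary seed so the draw is repeatable
frame = list(range(1, 21))   # hypothetical sampling frame, N = 20
n = 5

# SRSWR: n independent draws of size 1, so duplicates are possible.
srswr = rng.choices(frame, k=n)

# SRSWOR: one subset of n distinct units, all subsets equally likely.
srswor = rng.sample(frame, n)

print(srswr)
print(srswor)
```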

66
Q

what is the probability of selecting an individual sample in SRSWOR

A

1/ (N choose n)

67
Q

N choose n

A

N! / (n! (N-n)!)

68
Q

n!

A

product of all positive integers less than or equal to n

ex. 5 ! = 5 × 4 × 3 × 2 × 1 = 120

69
Q

what is the probability that the ith unit is in the sample (πi)?

A

n/N

P(ith unit in sample) = n/N = πi

70
Q

πi =

A

samples that contain i / total number of possible samples
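
The counting definition can be checked by brute force on a tiny case. This sketch uses a hypothetical N = 6, n = 3 and counts the samples containing unit 1.

```python
from itertools import combinations
from math import comb

N, n = 6, 3                       # hypothetical small population
units = range(1, N + 1)
samples = list(combinations(units, n))

i = 1
containing_i = sum(1 for s in samples if i in s)
pi_i = containing_i / len(samples)

print(len(samples), comb(N, n))   # 20 20
print(containing_i)               # 10
print(pi_i, n / N)                # 0.5 0.5
```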

71
Q

ways to draw a SRS

A
  • haphazard sampling
  • list all (N choose n) subsets, choose at random
  • random number generator
  • blind sampling
  • draw elements at random, include if not duplicates
72
Q

haphazard sampling

A

using own judgement to draw a sample

≠ random sample

73
Q

fpc

A

finite population correction

1 - (n/N)

74
Q

when N is large, fpc is

A

ca. 1

1 - (n/N) = 1 - (ca. 0)
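
A short sketch showing the fpc approaching 1 as N grows with n held fixed. The sample size and population sizes are arbitrary illustrations.

```python
def fpc(n, N):
    """Finite population correction, 1 - n/N."""
    return 1 - n / N

n = 100
population_sizes = (200, 1_000, 100_000, 10_000_000)
corrections = [fpc(n, N) for N in population_sizes]

for N, value in zip(population_sizes, corrections):
    print(N, value)
```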

75
Q

CLT for SRSWOR

A

n → ∞ and N → ∞
n/N → c < 1
n, N, N-n must be ‘sufficiently large’
n ≥ 50 usually ok

76
Q

in experimental design, what is used to reduce variability

A

blocking (analogous to stratification)

77
Q

strata

A

division of population into a number of non-overlapping groups

78
Q

stratified random sample

A

SRS drawn from each stratum

79
Q

advantages of stratification

A
  • may be more precise if sub pop.’s have different means
  • administrative advantages
  • can obtain separate estimates of each parameter for each stratum
80
Q

ai

A

proportion of the total sample allocated to stratum i (ni = n × ai)

81
Q

how do we decide ai

A

small variance

lowest cost

82
Q

Best allocation is affected by

A

Ni (# of elements in each stratum)
Si^2 (variability in each stratum)
Cost of obtaining an observation in each stratum

83
Q

How do factors that affect allocation impact sample size

A

larger sample sizes to strata w/ larger pop.’s
larger sample sizes to strata w/ larger variability
smaller sample sizes if costs are high

84
Q

Types of allocation models

A

Optimal allocation
Neyman allocation
Proportional allocation

85
Q

Optimal allocation

A

most information for least cost
choose ni to minimize V(ȳst) for a fixed C, or minimize C for a fixed V(ȳst)
C = c0 + Σ ci ni
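
A sketch of the allocation fractions under the linear cost model above, assuming the standard textbook result that ni is proportional to Ni Si / √ci. The stratum sizes, standard deviations, and costs are hypothetical, and `optimal_fractions` is an illustrative helper name.

```python
from math import sqrt

def optimal_fractions(N_sizes, S, costs):
    """Fraction of the sample for each stratum, proportional to Ni*Si/sqrt(ci)."""
    weights = [Ni * Si / sqrt(ci) for Ni, Si, ci in zip(N_sizes, S, costs)]
    total = sum(weights)
    return [w / total for w in weights]

N_sizes = [400, 300, 300]   # hypothetical stratum sizes Ni
S = [5, 15, 10]             # hypothetical stratum standard deviations Si
costs = [9, 9, 16]          # hypothetical cost per observation ci

fractions = optimal_fractions(N_sizes, S, costs)
print(fractions)
```

Stratum 2 gets the largest share here because it is highly variable and cheap to sample, while stratum 3 is penalized for its higher cost.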

86
Q

Neyman allocation

A

special case of optimal allocation

used when costs are equal in all strata
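
With equal costs the cost terms cancel, leaving ni proportional to Ni Si. The sketch below uses hypothetical strata and a hypothetical total n = 100, and rounds each ni up. `neyman_allocation` is an illustrative helper name.

```python
from math import ceil

def neyman_allocation(n, N_sizes, S):
    """Unrounded stratum sample sizes, proportional to Ni * Si."""
    weights = [Ni * Si for Ni, Si in zip(N_sizes, S)]
    total = sum(weights)
    return [n * w / total for w in weights]

N_sizes = [400, 300, 300]   # hypothetical stratum sizes Ni
S = [5, 15, 10]             # hypothetical stratum standard deviations Si

raw = neyman_allocation(100, N_sizes, S)
rounded = [ceil(x) for x in raw]   # round each ni up

print(raw)
print(rounded)  # [22, 48, 32]
```

Rounding up can push the total slightly above the planned n (102 here instead of 100).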

87
Q

Proportional allocation

A

split sample into strata w/ same proportion as population
ni/n = Ni/N
the stratified estimator (ȳst) equals the simple average of all sampled observations
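
A sketch checking that claim numerically, assuming the usual estimator ȳst = Σ (Ni/N) ȳi. The stratum sizes and sample values are hypothetical, chosen so that ni/n = Ni/N holds exactly.

```python
N_sizes = [400, 300, 300]   # hypothetical stratum sizes Ni
samples = [
    [10, 12, 14, 16],       # n1 = 4, so n1/n = 0.4 = N1/N
    [20, 22, 24],           # n2 = 3
    [30, 33, 36],           # n3 = 3
]

N = sum(N_sizes)

# Stratified estimator: stratum means weighted by Ni / N.
y_st = sum(Ni / N * (sum(s) / len(s)) for Ni, s in zip(N_sizes, samples))

# Plain average of every sampled observation, ignoring strata.
pooled = [y for s in samples for y in s]
overall_mean = sum(pooled) / len(pooled)

print(y_st, overall_mean)   # both approximately 21.7
```

The two values agree only because the allocation is proportional; with other allocations the weighted and unweighted means generally differ.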

88
Q

rounding rules

A

always round up for n, except for optimal allocation (don’t cross budget)