UNIT 3 - SAMPLES and EXPERIMENTS Flashcards
A 4-year high school of 2000 students, sampling 40 students: Describe a simple random sample
Number the students 1-2000. Use a random number generator to get 40 unique integers from 1 to 2000.
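The card above can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample` and the `seed` argument are made up for this example, and students are assumed to be labeled 1 to 2000.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Return `sample_size` unique labels drawn from 1..population_size."""
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    # random.sample draws without replacement, so there are no repeats
    return rng.sample(range(1, population_size + 1), sample_size)

srs = simple_random_sample(2000, 40, seed=1)
print(sorted(srs))
```

Because the draw is made without replacement from the whole list, every group of 40 students is possible and equally likely.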
A 4-year high school of 2000 students, sampling 40 students: Describe a stratified sample
Stratify by year. Randomly choose 10 freshmen, 10 sophomores, 10 juniors and 10 seniors.
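A minimal sketch of the stratified sample above, assuming (this is not stated on the card) that each class year has exactly 500 students and that ID numbers are grouped by year. The names `stratified_sample` and `roster` are hypothetical.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw a separate simple random sample of `per_stratum` from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}

# Hypothetical roster: 500 student IDs per class year
roster = {
    "freshmen": list(range(1, 501)),
    "sophomores": list(range(501, 1001)),
    "juniors": list(range(1001, 1501)),
    "seniors": list(range(1501, 2001)),
}
sample = stratified_sample(roster, per_stratum=10, seed=1)
for year, students in sample.items():
    print(year, sorted(students))
```

Every year is guaranteed exactly 10 students, which is why a sample of "all freshmen" cannot happen here.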
A 4-year high school of 2000 students, sampling 40 students: Describe a convenience sample
Ask the first 40 students coming to the locker rooms after school. This is problematic because athletes may not have the same preferences as non-athletes.
A 4-year high school of 2000 students, sampling 40 students: Describe a systematic sample
Get an alphabetical list of all of the students. Since 2000/40 = 50, randomly choose one of the first 50 students and then take every 50th student after that.
A 4-year high school of 2000 students, sampling 40 students: Describe a cluster sample
Imagine that all of the art classes have 10 students each and that each class is a mix of freshmen, sophomores, juniors and seniors. You would randomly choose 4 classes and survey everyone in each of the 4 classes.
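A rough sketch of the cluster sample above. It assumes, purely for illustration, that every student is in exactly one of 200 art classes of 10; the names `cluster_sample` and `art_classes` are made up.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [student for name in chosen for student in clusters[name]]
    return chosen, members

# Hypothetical school: 200 art classes of 10 students each (IDs 1-2000)
art_classes = {
    f"class_{c:03d}": list(range(c * 10 + 1, c * 10 + 11)) for c in range(200)
}
chosen_classes, surveyed = cluster_sample(art_classes, n_clusters=4, seed=1)
print(chosen_classes)
print(surveyed)
```

The randomness is in which classes get picked. Once a class is picked, nobody in it is skipped.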
What is a flaw of SRS that is not a flaw of others?
You could get any sample group with an SRS. You could sample a high school and just randomly get a sample of just male juniors. While it is not likely, it could happen. All groups are possible, and equally likely. We stratify to prevent this from happening.
A 4-year high school of 2000 students, sampling 40 students: Since ALL GROUPS (samples) are possible and equally likely, show some groups that you could get randomly from an SRS that would not be representative of the entire school.
All female, all freshmen, all seniors, all athletes. These could happen in an SRS (but they are not likely to).
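To get a feel for how unlikely such a sample is, here is a small worked calculation. It assumes (not stated on the card) that 500 of the 2000 students are freshmen; the function name `prob_all_from_group` is made up for this sketch.

```python
from math import comb

def prob_all_from_group(group_size, population_size, sample_size):
    """Chance that an SRS of `sample_size` comes entirely from one group.

    Counts the samples that fit inside the group and divides by the
    total number of equally likely samples.
    """
    return comb(group_size, sample_size) / comb(population_size, sample_size)

# Assumed: 500 freshmen in a school of 2000, SRS of 40
p_all_freshmen = prob_all_from_group(500, 2000, 40)
print(p_all_freshmen)
```

The result is a tiny positive number: possible, but extremely unlikely.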
A 4-year high school of 2000 students, sampling 40 students: Explain how stratifying has “impossible groups”
You couldn’t get all freshmen in your sample, because you take a set number of students from each year.
A 4-year high school of 2000 students, sampling 40 students: Explain how clustering has “impossible groups”
You couldn’t get 2 people from each classroom, because you would be randomly choosing classrooms and asking everyone in those classes.
A 4-year high school of 2000 students, sampling 40 students: Explain how systematic has “impossible groups”
You couldn’t get the first 40 people alphabetically (because you are taking every nth)
What is the standard sampling method?
A Simple Random Sample (SRS) is our standard. Every possible group of n individuals has an equal chance of being our sample. That’s what makes it simple. Put the names in a hat.
Give an example of a multistage sample
Suppose you want to poll urban, suburban and rural citizens. You can divide a map into those strata, then randomly choose neighborhoods or streets in each and ask everyone on those streets. Here you stratified by community type and then clustered by street.
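The two stages above (stratify by community type, then cluster by street) might be sketched like this. The map is entirely made up: 6 streets per community type with 5 residents each, and `multistage_sample` and `region` are hypothetical names.

```python
import random

def multistage_sample(strata, streets_per_stratum, seed=None):
    """Within each stratum, randomly pick whole streets and survey everyone on them."""
    rng = random.Random(seed)
    surveyed = {}
    for community_type, streets in strata.items():
        chosen_streets = rng.sample(sorted(streets), streets_per_stratum)
        surveyed[community_type] = [
            person for street in chosen_streets for person in streets[street]
        ]
    return surveyed

# Hypothetical map: 3 community types, 6 streets each, 5 residents per street
region = {
    kind: {
        f"{kind}_street_{s}": [f"{kind}_{s}_{r}" for r in range(5)]
        for s in range(6)
    }
    for kind in ("urban", "suburban", "rural")
}
poll = multistage_sample(region, streets_per_stratum=2, seed=1)
for kind, people in poll.items():
    print(kind, people)
```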
What is a multistage sample?
A sample that combines several sampling methods
What are the two types of observational studies?
Retrospective, and Prospective
What is a simple random sample?
Put all of the names in a hat and pull out the names for your sample. Every group of that size is possible and equally likely.
What is cluster sampling?
Divide the population into clusters, where each cluster should be representative of (look like) the population. Randomly choose a few clusters and survey everyone in them.
What is a retrospective study?
A retrospective study looks backwards in time (or at the present moment).
What is systematic sampling?
Collecting data from every nth subject, after a randomly chosen starting point.
What is a prospective study?
A prospective study follows the subjects from the present into the future, recording data along the way.
What is a representative sample?
A sample that looks like the population. It has similar characteristics.
What is stratified sampling?
When you break the population into groups with similar attributes (strata) and randomly select from each stratum.
What are the “good” sampling methods?
SRS (simple random sample), stratified, clustered, systematic, multistage
What are the “bad” sampling methods?
Convenience samples and voluntary response samples
When your sampling frame is different from the population, then you risk ____
undercoverage
What is wording bias?
A type of response bias: when the wording of the question impacts the response to it.
Systematic: how do you find the n for every nth subject, and then how do you proceed?
TOTAL POP/SAMPLE SIZE = your n (round down). Then use RANDINT(1, n) to randomly choose the first subject, and then take every nth subject after that.
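The recipe above, sketched in Python with `random.randint` standing in for the calculator's RANDINT. The function name `systematic_sample` is made up, and subjects are assumed to be numbered 1 to the population size in list order.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Pick a random start in 1..n, then take every nth subject."""
    rng = random.Random(seed)
    step = population_size // sample_size  # n, rounded down
    start = rng.randint(1, step)           # randint includes both endpoints
    return [start + i * step for i in range(sample_size)]

picks = systematic_sample(2000, 40, seed=1)
print(picks)
```

For 2000 students and a sample of 40, the step is 50, so the picks are the start position and then every 50th position after it.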
What is BIAS in sampling?
A systematic FLAW in your method. Types include undercoverage, wording, voluntary response, convenience, comfort (psychological), response, and nonresponse bias. Even with a larger sample, you will still have bias.
What’s the difference between a prospective and a retrospective study?
A retrospective study takes a group and looks back at its history while a prospective study watches a group for a period of time and records the data along the way into the future.
What is a weakness of a SRS?
Suppose you want a sample of 50 high school students. With an SRS you could, although it is unlikely, get “all freshmen,” which wouldn’t be representative.
Is it always better to do a census or to sample?
It depends on the availability of the data. If you want to look at SAT vs GPA, you may easily be able to get all of the school’s data and do that study (a census). If you have to go out and get the info, you may want to take a sample to save time and energy.
What is a sampling frame?
It is the list of individuals from which you actually draw your sample. For instance, if you call people, the frame would be “people with phones”; if Fox News takes a poll, the sampling frame is “Fox News watchers.”
When sampling, what kind of sample are we striving to get?
A representative sample. We want our sample to have characteristics similar to those of the population.
To find out whether a restaurant is good, would you survey the people coming out of the restaurant?
People at the restaurant are probably there because they already like it. A better method would be to first ask “Is this your first time dining here?” and survey only those who say “yes.” But even then, people wouldn’t go into an Italian restaurant if they didn’t like that type of food.
What is undercoverage?
Undercoverage is when a group of the population is not represented (or is underrepresented) in the sample. It happens when the sampling frame leaves out part of the population.
What is response bias? How do you avoid it?
Response bias is any influence that may sway the respondent, e.g. the wording of the question or the interviewer’s behavior/background. Therefore, in a survey, ask questions that allow respondents to answer comfortably and honestly. Keep the wording “indifferent” or neutral so that it does not unduly favor one response over another.
What are some things that cause nonresponse bias?
(Remember, nonresponse means that the people you ask, or try to ask, don’t respond.) A lazy researcher, shy survey takers, who the questioner is, and the environment.
What is a quality of SRS that is not a quality of Systematic, Stratified or Clustering?
In an SRS, all groups are possible, and ALL POSSIBLE GROUPS have the same chance of being picked (even a group like all senior male students). The other methods have lots of impossible groups. SRS has no impossible groups. Examples: Stratified (by gender): an impossible group would be all girls (you’re taking some boys and some girls). Clustered: an impossible group would be all girls (if each cluster has boys and girls). Systematic: an impossible group would be the first 10 people that are right next to each other (you are taking every nth person, so you will skip).
What is statistically significant?
When an observed difference is too large for us to believe that it occurred just by chance (randomly). Basically, a result is statistically significant when we don’t think it happened randomly: when you think “something’s up” or “something’s fishy.”
How can you use random numbers to sample?
Number the subjects 00-99 (for up to 100 subjects), 000-999 (for up to 1000), 0000-9999, etc. Then read a random number table two, three or four digits at a time to match the label length. Throw out repeats and any numbers that don’t belong to a subject.
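Reading a random number table can be mimicked as below. This is a sketch only: the digit string is invented for the example rather than taken from a real table, `sample_from_digits` is a made-up name, and subjects are assumed to be labeled starting from 0.

```python
def sample_from_digits(digits, population_size, sample_size):
    """Read fixed-width labels left to right, skipping repeats and unused labels."""
    width = len(str(population_size - 1))  # 2 digits for 00-99, 3 for 000-999, ...
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        # keep the label only if it belongs to a subject and is not a repeat
        if label < population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

# Invented row of digits; 50 subjects labeled 00-49, sample of 5
table_row = "1922395034057562871396"
picked = sample_from_digits(table_row, population_size=50, sample_size=5)
print(picked)  # [19, 22, 39, 34, 5]
```

Reading two digits at a time gives 19, 22, 39, 50, 34, 05. The label 50 is thrown out because only 00-49 are in use, so the sample is subjects 19, 22, 39, 34 and 05.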
In which sampling methods do the subjects have equal chances of being selected?
SRS, stratified, clustered, systematic, and multistage. In all of these, the subjects can have an equal chance of being selected (for stratified, when each stratum is sampled in proportion to its size), but groups have different chances.
Can we prove causation in experiments and observational studies?
Observational studies cannot prove causation, only association or correlation. ONLY IN EXPERIMENTS DO YOU TALK ABOUT CAUSALITY.
Name types of bias
undercoverage, non response, response, voluntary
What is the placebo effect?
When those who get the placebo show improvements, or show the effects of the treatment. This often happens to up to 20% of participants!
What is the best way to reduce bias?
Randomness and good sampling methods.
What is sampling error?
How far your statistic is from the parameter (how far the value calculated from your sample is from the population parameter)
Suppose you want to see the relationship between gender and candy preference in squirrels. How might you do a stratified vs a cluster sample?
STRATIFIED: You can split the list of all of the squirrels in your neighborhood by gender (the strata) and randomly select 20 males from the list of all of the males, and then 20 females from the list of all of the females. CLUSTER: You can randomly choose 5 different trees and survey all of the squirrels in those trees, assuming that there are 4 squirrels living in each tree (the trees are the clusters, and each tree has both M and F).
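A small illustration of sampling error, using made-up data: a hypothetical population of 2000 GPAs, with the population mean as the parameter and the mean of an SRS of 40 as the statistic. All names and numbers here are assumptions for the sketch.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population: 2000 GPAs centered near 3.0
population = [rng.gauss(3.0, 0.5) for _ in range(2000)]
parameter = statistics.mean(population)   # true population mean (a census)

sample = rng.sample(population, 40)       # SRS of 40 students
statistic = statistics.mean(sample)       # sample mean

sampling_error = statistic - parameter
print(parameter, statistic, sampling_error)
```

Drawing a different random sample would give a different statistic and so a different sampling error, while the parameter stays fixed.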