UNIT 3 SAMPLING AND EXPERIMENTS AND STUFF Flashcards
How can you estimate the probability of an event occurring?
Run a simulation, then find the percentage of trials in which the event occurred.
How many trials should you run to have an accurate simulation?
At least 20-30.
Is it always better to do a census or a sample?
It depends. Generally, a sample is better: a census is expensive to carry out, and because populations are always changing, it is hardly more accurate than a sample. BUT for small populations, a census is fine.
To survey whether a restaurant is good, would you ask the people coming out of the restaurant?
People at the restaurant are probably there because they already like it. A better method would be to ask “Is this your first time dining here?” and survey only those who say “yes.” But then again, people wouldn’t go into an Italian restaurant if they didn’t like that type of food.
What are humans bad at?
Humans are bad at generating random numbers.
How can you “randomly ask 5 people” in the hall a question?
You could roll a die: if it shows 5, interview the fifth person to walk by, then roll again. Repeating this is random, but remember that the sampling frame is just the people who walk down that particular hallway.
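The dice procedure above can be sketched in Python, with a seeded generator standing in for the physical die. The function name `pick_passersby` and the use of one six-sided die are assumptions made for illustration; the card does not say how many dice to roll.

```python
import random

def pick_passersby(n_people, rng):
    """Return the positions (1st, 2nd, ... person to walk by) of those interviewed."""
    position = 0
    picks = []
    for _ in range(n_people):
        position += rng.randint(1, 6)  # roll the die, then count ahead that many people
        picks.append(position)
    return picks

rng = random.Random(5)  # seeded only so the sketch is repeatable
print(pick_passersby(5, rng))
```

Every pick depends on a roll, not on who looks approachable, which is the point of the card.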
Sampling frame.
The group from which you may sample. If you do a phone survey, your sampling frame is only people with phones. If you are interested in everyone, a phone survey could suffer from undercoverage.
What are the 3 ways we used random numbers?
1. To simulate the likelihood of an event occurring (Ch 11). 2. To choose a sample that is representative of the population and avoid bias (Ch 12). 3. To assign subjects (experimental units) to treatments to evenly distribute variability and help reduce possible confounding variables (Ch 13).
What is a simulation?
An imitation of a real situation: a sequence of random outcomes that models it.
When does a trial of a simulation end?
Generally there are two cases: 1. You want to know the probability of getting x successes in n attempts (getting 3 smokers in a group of 5 students). Trials end when you get to n (5 students). Record the number of smokers in each trial. 2. You want to know how many attempts it takes to get f successes. Trials end when you get f successes. Record the number of attempts.
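A minimal Python sketch of the two stopping rules, for comparison. The function names and the 20% success rate are hypothetical; the card gives no smoking rate.

```python
import random

def count_successes(n_attempts, p_success, rng):
    """Case 1: the trial ends after a fixed number of attempts; record the successes."""
    return sum(rng.random() < p_success for _ in range(n_attempts))

def attempts_until(f_successes, p_success, rng):
    """Case 2: the trial ends once f successes have occurred; record the attempts."""
    attempts = 0
    successes = 0
    while successes < f_successes:
        attempts += 1
        if rng.random() < p_success:
            successes += 1
    return attempts

rng = random.Random(1)  # seeded only so the sketch is repeatable
# 0.2 is a made-up success rate, not a figure from the cards
case1 = [count_successes(5, 0.2, rng) for _ in range(20)]
case2 = [attempts_until(3, 0.2, rng) for _ in range(20)]
print(case1)  # number of successes in each of 20 trials of 5 attempts
print(case2)  # number of attempts needed in each of 20 trials
```

In case 1 every trial has the same length and the recorded value is a count of successes; in case 2 the trial length itself is the recorded value.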
You want to simulate the likelihood of more than 4 psychology majors being on a full bus that seats 30. 1 in 9 students are psych majors.
Use single digits from a random number table. Each digit represents a student on the bus. Ignore the zeros. Let 1 be a psych major, and 2 through 9 be other students. A trial ends when you have reached 30 students. Count the number of psych majors (ones) in the trial and record it. Do this 20 times. Find the percent of trials with more than 4 psych majors on the “bus.” If this occurred in 5 trials, then the estimated likelihood is 5 in 20, or 25%.
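The bus simulation can be sketched in Python by letting a random number generator stand in for the digit table. The names `bus_trial` and `estimate` are hypothetical, and the sketch counts trials with more than 4 psych majors, as the question asks.

```python
import random

def bus_trial(rng, seats=30):
    """One trial: read random digits until 30 students are seated; count the 1s."""
    psych = 0
    seated = 0
    while seated < seats:
        digit = rng.randint(0, 9)  # stands in for the next digit in the table
        if digit == 0:
            continue               # zeros are ignored, as in the key
        seated += 1
        if digit == 1:
            psych += 1             # 1 = psych major, 2-9 = other students
    return psych

def estimate(trials, rng):
    """Fraction of trials in which more than 4 psych majors were on the bus."""
    hits = sum(bus_trial(rng) > 4 for _ in range(trials))
    return hits / trials

rng = random.Random(2024)  # seeded only so the sketch is repeatable
print(estimate(20, rng))      # 20 trials, as on the card
print(estimate(10_000, rng))  # many more trials give a steadier estimate
```

Running 20 trials by hand is practical with a table; the second call shows how much the estimate settles down once trials are cheap.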
How do you use a table of random digits?
FIRST, make a key showing what the digits represent, whether you will use single, double or triple digits, and which digits, if any, will be ignored. SECOND, decide when a trial will end (after 12 events, or after 12 successes). THIRD, make sure to clearly label the successes and where the trials end.
When would you use two digits instead of a single on a random number table?
When the percent is not a multiple of ten, like “18% of dogs eat underwear.” You’ll have to assign 01-18, or 00-17, as undie-eating dogs. All other digit pairs will be non-underwear-eating dogs.
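A small Python sketch of the two-digit key, using the 00-17 assignment. The helper names are hypothetical, and the generator stands in for reading pairs off a printed table.

```python
import random

def read_pairs(n_pairs, rng):
    """Stand-in for reading two-digit pairs (00-99) off a random number table."""
    return [rng.randint(0, 99) for _ in range(n_pairs)]

def is_success(pair, percent=18):
    """Key: pairs 00 through percent-1 count as a success (00-17 for 18%)."""
    return pair < percent

rng = random.Random(7)  # seeded only so the sketch is repeatable
pairs = read_pairs(10, rng)
labels = ["U" if is_success(p) else "-" for p in pairs]  # U = underwear eater
print([f"{p:02d}" for p in pairs])
print(labels)
```

Exactly 18 of the 100 possible pairs map to a success, which is what makes the key match the 18% rate.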
When can you use single digits for simulations?
When the percent is a multiple of ten, like “30% of teachers secretly twerk”; then you would assign 1-3 or 0-2 as twerking teachers. Also to simulate rolling a die (faces 1-6; ignore 0, 7, 8, 9) or flipping a coin (odds H, evens T).
Use the following words in one run-on sentence: inference, sample, statistic, parameter, population, census, data.
I was curious about a population parameter, but a census was too costly, so I collected data from a sample, calculated a statistic and used that to make an inference about the parameter of interest.
What is random sampling?
When we use chance to select a sample, using an actual randomizing mechanism, not your “random” guess! When you “randomly” do something in your head, it is not random. Roll dice, shuffle cards, pick from a hat, or use a random number table or calculator.
How can you simulate a coin flip with a random number table?
Assign heads to odd numbers and tails to even numbers.
How can you simulate rolling 1 die with a random number table?
Use only the digits 1-6; ignore 0, 7, 8, 9.
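Both single-digit keys (odd/even for a coin, 1-6 for a die) can be sketched in Python. The function names are hypothetical, and `next_digit` stands in for reading the next digit from the table.

```python
import random

def next_digit(rng):
    """Stand-in for the next single digit (0-9) in a random number table."""
    return rng.randint(0, 9)

def coin_flip(rng):
    """Odd digit = heads, even digit = tails (five digits each, so 50/50)."""
    return "H" if next_digit(rng) % 2 == 1 else "T"

def die_roll(rng):
    """Keep reading digits until one falls in 1-6; skip 0, 7, 8 and 9."""
    while True:
        digit = next_digit(rng)
        if 1 <= digit <= 6:
            return digit

rng = random.Random(12)  # seeded only so the sketch is repeatable
print([coin_flip(rng) for _ in range(10)])
print([die_roll(rng) for _ in range(10)])
```

Skipping the unused digits, instead of remapping them, keeps all six faces equally likely.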
How can you simulate on your calculator?
RANDINT(lowest, highest, how many you want to grab)
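For readers without the calculator, here is a rough Python counterpart. It assumes the calculator command includes both endpoints, as Python's `random.randint` does; the wrapper name `rand_int` is hypothetical.

```python
import random

def rand_int(lowest, highest, how_many=1, rng=random):
    """Return how_many random integers, each between lowest and highest inclusive."""
    return [rng.randint(lowest, highest) for _ in range(how_many)]

print(rand_int(1, 6, 5))  # five simulated die rolls
```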
Sampling method types?
SRS, stratified, clustered, systematic, multistage, convenience, voluntary
What are the two types of observational studies?
Retrospective and prospective.
What do observational studies and experiments have in common?
In both, you are making OBSERVATIONS: recording data and doing statistical analysis.
What is a multistage sample?
A sample that combines several sampling methods.
What is a quality of SRS that is not a quality of Systematic, Stratified or Clustering?
In an SRS, all groups are possible, and ALL POSSIBLE GROUPS have the same chance of being picked. The other methods have lots of “impossible groups”; an SRS has none. Stratified: an impossible group would be all girls (you’re taking some boys and some girls). Clustered: an impossible group would be all girls (each cluster has boys and girls). Systematic: an impossible group would be 4 people who are right next to each other (you are taking every nth person).
What is a simple random sample?
A sample chosen so that every possible group of n individuals has the same chance of being the sample.
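The "every possible group" property can be checked with a short Python sketch. The five names are hypothetical, and `random.sample` is used as the randomizing mechanism.

```python
import random
from collections import Counter
from itertools import combinations

population = ["Ana", "Ben", "Cal", "Dee", "Eli"]  # hypothetical names

def srs(pop, n, rng):
    """Draw a simple random sample of n individuals without replacement."""
    return rng.sample(pop, n)

rng = random.Random(3)  # seeded only so the sketch is repeatable
# Draw many samples of size 2 and tally which pair came up each time.
counts = Counter(frozenset(srs(population, 2, rng)) for _ in range(10_000))
all_groups = [frozenset(g) for g in combinations(population, 2)]

for group in all_groups:
    print(sorted(group), counts[group])
```

With 5 people there are 10 possible pairs, and the tallies should all be roughly equal: no pair is impossible and none is favored.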
What is the difference between a subject and an experimental unit?
Humans who are experimented on are commonly called subjects. Non-human things that are experimented on, like dogs, days, or plants, are called experimental units.
What is a prospective study?
A prospective study follows the experimental units forward in time, recording their present and future values of the response variable.
What is response bias? How do you avoid it?
Response bias is any influence that may sway the respondent to give a more favorable answer, e.g., the wording of the question or the interviewer’s behavior or background. Therefore, in a survey, ask questions that allow respondents to answer comfortably and honestly. Keep the wording “indifferent” or neutral so as not to unduly favor one response over another.
What is a retrospective study?
A retrospective study looks backward in time. Such studies focus on estimating differences between groups or associations between variables because they are not based on random samples.
What is sampling error?
IT IS NOT A MISTAKE!!! Because the data differ from sample to sample, the statistics calculated from them vary and are generally not equal to the parameter. This variability of the STATISTICS (not of the data) is called sampling error.
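Sampling error is easy to see in a quick Python sketch: draw several samples from one population and compare each sample mean with the population mean. The population of test scores is invented purely for illustration.

```python
import random
import statistics

rng = random.Random(11)  # seeded only so the sketch is repeatable
population = [rng.randint(50, 100) for _ in range(5000)]  # invented test scores
parameter = statistics.mean(population)  # the true population mean

# Ten separate samples of 25, each producing its own statistic.
sample_means = [statistics.mean(rng.sample(population, 25)) for _ in range(10)]

print(round(parameter, 2))
print([round(m, 2) for m in sample_means])
```

Nobody made a mistake in any of the ten samples, yet the ten means differ from each other and from the parameter.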
What is sample size and how does it compare with the fraction of a population?
Sample size is the number of individuals in a sample. The sample size, not the fraction of the population sampled, determines how well the sample represents the population. The fraction of the population that you’ve sampled doesn’t matter. It’s the sample size itself that’s most important.
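A rough Python sketch of this claim, under made-up assumptions: 40% of each population says "yes," and the function name `spread_of_proportions` is hypothetical. It measures how much the sample proportion varies from sample to sample.

```python
import random
import statistics

def spread_of_proportions(pop_size, n, rng, p=0.4, reps=500):
    """Standard deviation of the sample proportion over many samples of size n."""
    cutoff = int(p * pop_size)  # individuals numbered below cutoff say "yes"
    props = []
    for _ in range(reps):
        picked = rng.sample(range(pop_size), n)
        props.append(sum(i < cutoff for i in picked) / n)
    return statistics.stdev(props)

rng = random.Random(21)  # seeded only so the sketch is repeatable
small = spread_of_proportions(10_000, 100, rng)       # n = 100 is 1% of the population
big = spread_of_proportions(1_000_000, 100, rng)      # n = 100 is 0.01% of the population
big_n = spread_of_proportions(1_000_000, 1000, rng)   # n = 1000 is 0.1% of the population
print(round(small, 4), round(big, 4), round(big_n, 4))
```

The first two spreads should come out close to each other even though the sampled fractions differ by a factor of 100, while the third, with the larger n, should be noticeably smaller.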
What is statistically significant?
When an observed difference is too large for us to believe that it occurred naturally (or just randomly). Basically, a result is statistically significant when we don’t think it happened by chance.
What is systematic sampling?
Systematic sampling is one of four different ways to make a survey sample random. It means picking every Nth member of what you are sampling (for example, people). You must still start on a random person and from then on take every Nth person. So you can take every 10th person in a line, as long as you also start on a random individual.
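A minimal Python sketch of systematic sampling with a random start. The line of 100 numbered people and the function name are hypothetical.

```python
import random

def systematic_sample(items, step, rng):
    """Pick a random starting point among the first `step` items, then take every step-th item."""
    start = rng.randrange(step)  # random start: index 0 through step-1
    return items[start::step]

line = list(range(1, 101))  # 100 people standing in a line, numbered 1-100
rng = random.Random(8)      # seeded only so the sketch is repeatable
print(systematic_sample(line, 10, rng))
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed, which is why neighbors in line can never both be chosen.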
What is the difference between a cluster sample and random sample?
In a cluster sample, the population is first divided into sections, or clusters, that each resemble the population (the clusters are heterogeneous and have all types within them). Then we randomly select an entire cluster or clusters and include all of their members in the sample. In a random sample, each member of the population, and each possible group, is equally likely to be included.
What is the difference between response bias and nonresponse bias?
Response bias is when a person’s response is influenced by the question or the questioning method (if a parent asks whether you use drugs, as opposed to a friend, there is only one true answer, but you might respond differently to each). Nonresponse bias is when the people who don’t respond might have different opinions or views than the people who did.
What is the problem with convenience sampling?
The sample may not be representative, as it is not randomized to include every type of person. E.g., friends and family are convenient, but they likely share similar opinions, and thus the sample is not representative of a population.
What is the standard sampling method?
A Simple Random Sample (SRS) is our standard. Every possible group of n individuals has an equal chance of being our sample. That’s what makes it simple.
What is undercoverage?
Undercoverage is when either one part of the population is not included in a survey or is underrepresented in the survey
What is wrong with using volunteers in a survey?
Those who volunteer may not be like the rest of the population. For example, suppose you’re trying to find out how often people volunteer for things, so you ask for volunteers to take the survey. A question may be “When was the last time you volunteered for something?” Well, they all just volunteered for the survey!
What is wrong with using volunteers in an experiment?
Not much. In an experiment, we are not looking for a sample that is like the population… We just want to see the effectiveness of a treatment. It is fine if the subjects are all similar. In fact it is best sometimes when they are!
What type of study would find a relationship between Verbal and Math SAT?
You could take all of the SAT Math and Verbal scores, run a regression, and find the r-squared value and linear model. This would be a retrospective study.
What’s the difference between a prospective and a retrospective study?
A retrospective study takes a group and looks back at its history, while a prospective study watches a group for a period of time and records the data. RETRO: REVERSE. PROspective: PResent and On.
What’s the difference between cluster and stratified?
Stratified: you divide the population into groups according to traits, called strata (groups with similar traits; homogeneous groups), and randomly choose from each stratum.
Cluster: grab whole clusters of the population. Each cluster should be like the population.
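The contrast can be sketched in Python. The school data (grade levels as strata, homerooms as clusters) and both function names are hypothetical.

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Randomly choose per_stratum members from EVERY stratum."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep ALL of their members."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Strata are homogeneous groups: everyone in a stratum shares the trait.
strata = {
    "freshmen": [f"F{i}" for i in range(1, 41)],
    "seniors": [f"S{i}" for i in range(1, 41)],
}
# Clusters are heterogeneous: each homeroom mixes freshmen and seniors.
clusters = {
    "homeroom A": ["F1", "S1", "F2", "S2"],
    "homeroom B": ["F3", "S3", "F4", "S4"],
    "homeroom C": ["F5", "S5", "F6", "S6"],
}

rng = random.Random(4)  # seeded only so the sketch is repeatable
print(stratified_sample(strata, 3, rng))
print(cluster_sample(clusters, 1, rng))
```

In the stratified version the randomness happens inside each group; in the cluster version the randomness decides which whole groups are taken.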
What’s the difference between lurking and confounding?
A lurking variable, on one hand, creates an association between two variables that tempts us to think one causes the other; a confounding variable, on the other hand, makes it unclear which variable has had an impact on which in an experiment.
What’s the difference between response bias and nonresponse bias?
Response bias is anything in a survey design that influences responses (such as the wording of questions). Nonresponse bias is bias introduced to a sample when a large fraction of those sampled fail to respond; those who do respond are likely not to represent the entire sample. Will you please take a survey? NO!
Why do you have to Stratify?
You don’t have to. But you might want to if you feel that a simple random sample might not be representative of the population. You want your sample to be like the population: a representative sample (it represents the population well).
How are voluntary and convenience samples similar?
With voluntary samples, people choose themselves; with convenience samples, people are just chosen by the researcher. Neither uses randomness, and both are prone to BIAS.
How can the WORDING of the question lead to response bias?
Words or phrases that stir your feelings tend to influence responses. Look for “devastating,” “horrific,” “wonderful,” etc. Sometimes there is a loaded background story like “Many Americans lose jobs to illegal aliens every year.”
Can you stratify in an experiment?
NO. Stratification is a sampling method; blocking is the method used in experiments. They are similar ideas.
Explain CONTROL.
One of the principles of experimental design is control: the experimenters hold constant in each trial the factors they believe would affect the outcome of the experiment. Having a group that gets no treatment also helps to control, because it measures the effects of the natural environment.
Explain two types of experimental design.
1.) Randomized Block Design: randomization occurs within the blocks only. 2.) Completely Randomized Design: all of the experimental units have the same chance of receiving each treatment.
How is Blocking in an Experiment Similar to Stratifying in a Sample?
The two are similar because they divide the subjects into homogeneous groups where the subjects are all similar.
What is a common mistake when using the term BLOCKING?
Students often will report that they “blocked according to exercise” or “blocked according to type of fertilizer.” These things are treatments. We don’t block by treatments. We block by things that are ALREADY PRESENT BEFORE WE BEGIN THE EXPERIMENT, like gender, dog type, or how close the plants are to a window.
How are clustering and stratifying different when doing a sample?
Clustering is choosing at random a group from the population that looks like the population; clusters should be heterogeneous. Stratifying is slicing a population into homogeneous groups (strata), then randomly sampling within each stratum before the results are combined.
What four things do you need in an experimental design? (trick)
NEED only 3: control, randomization, replication. BUT use blocking when appropriate.
What is a control group?
A group in an experiment without the treatment that is compared to groups with treatments to make results or conclusions. The control group helps us see what would happen anyway… without any treatment so that we can see the true effect of the treatment.
What is a factor?
A variable in an experiment that the experimenter manipulates. (factors have levels.. )
What is a level in an experiment?
A level is a specific value that the experimenter chose for a factor that is manipulated. Ex.: if the factor is sleep, the levels would be how many hours the subjects were allowed to sleep: 4 hours, 6 hours, 8 hours (3 levels).
What is bias? What are some common errors?
It’s any systematic failure of a sampling method. COMMON ERRORS: Voluntary response, undercoverage of the population, nonresponse bias and response bias. We use randomness and methods like stratifying to reduce these.
What is a placebo used for?
A placebo is used for control in an experiment. Its purpose is to show the difference between the control (placebo) treatment and the other treatments.
What is the best way to reduce bias?
Randomness. Sophisticated answer: make as many things as random as possible.
What is the difference between a study and an experiment?
In a study you are basically just watching; in an experiment you are manipulating factors and (hopefully randomly) assigning treatments.
What is the difference between confounding and lurking?
Confounding is to experiments: we may think a treatment works when it was really the environment (like sunlight on plant growth; we then block by proximity to the window to remove that confounding variable). Lurking is to samples: an association between x and y makes it appear that x may be causing y, like ice cream sales and surfing accidents.
What is the difference between subjects in experiments and subjects in sample surveys?
Samples for surveys try to represent the entire population of interest, whereas experimental units are often all alike (say, all the same type of tomato) because we just want to look at the impact of the treatments.
What is the difference between single-blind and double-blind?
Single-blind is when all individuals in one of the two classes are blinded; double-blind is when everyone in BOTH classes is blinded. The classes are (1) subjects and treatment givers and (2) evaluators.
What are the two blinding groups?
Group one: subjects and the people giving subjects the treatment.
Group two: those people assessing the groups to compare results.
What is the main purpose of a placebo?
To blind the subjects so that knowing which treatment they received cannot influence the response variable. When people think they’re getting help, they often improve anyway.
What is the placebo effect?
When those who get the placebo show improvements, or show the effects of the treatment. This often happens to up to 20% of participants!
What is the purpose of matching?
Matching, like blocking, reduces unwanted variation. In a retrospective or prospective study, subjects who are similar in ways not under study may be matched and then compared with each other on the variables of interest.
What is the sure way to assign treatments correctly?
Throw names in a hat and pick.
What’s a useful alternative when you can’t run an experiment? What are the useful forms of it, and how do you perform each?
An alternative to an experiment could be an observational study. In a prospective observational study, you identify subjects in advance and record data as you go along. In a retrospective observational study, you analyze observations from the past.
Who can be blinded?
Subjects and those delivering treatments. Those assessing the effectiveness of treatments. And three mice.
Why do you have to block?
You don’t have to. But you might want to if you feel that the experimental units (subjects) may respond differently to the treatment.
Why does it make sense to double-blind an experiment?
It reduces bias in an experiment. If subjects don’t know what treatment they’re receiving, they won’t change their habits based on that knowledge. If evaluators don’t know which treatment each subject is receiving, they won’t bias the true results based on the results they expect to see.
Why randomize in an experiment?
To avoid bias. An experimenter might want their treatment to work, so they may choose the subjects that might respond best.
What is completely randomized?
Put all subjects’ names in a hat and pick.
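The names-in-a-hat idea can be sketched in Python, with a shuffle playing the role of the hat. The subject names, the two treatment labels, and the choice to deal treatments out in turn so the groups end up equal in size are all assumptions for illustration.

```python
import random

def completely_randomized(subjects, treatments, rng):
    """Shuffle all subjects together, then deal the treatments out in turn."""
    shuffled = subjects[:]  # copy so the original list is left untouched
    rng.shuffle(shuffled)   # the "hat"
    return {s: treatments[i % len(treatments)] for i, s in enumerate(shuffled)}

subjects = ["Ava", "Bo", "Cy", "Di", "Ed", "Flo"]  # hypothetical subjects
rng = random.Random(9)  # seeded only so the sketch is repeatable
print(completely_randomized(subjects, ["treatment", "placebo"], rng))
```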
What is randomized block?
Separate subjects into blocks (cats here, dogs here, rabbits here), then put the dog names in a dog hat and draw to assign treatments, and do the same with the others, so that each block gets all of the treatments.
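A matching Python sketch of the one-hat-per-block procedure. The animal names, the block sizes, and the treatment labels "A" and "B" are hypothetical.

```python
import random

def randomized_block(blocks, treatments, rng):
    """Randomize separately inside each block so every block gets every treatment."""
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]  # one "hat" per block
        rng.shuffle(shuffled)
        for i, subject in enumerate(shuffled):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

blocks = {
    "cats": ["cat1", "cat2", "cat3", "cat4"],
    "dogs": ["dog1", "dog2", "dog3", "dog4"],
    "rabbits": ["rab1", "rab2", "rab3", "rab4"],
}
rng = random.Random(10)  # seeded only so the sketch is repeatable
plan = randomized_block(blocks, ["A", "B"], rng)
for name, members in blocks.items():
    print(name, {m: plan[m] for m in members})
```

Because the shuffle never crosses block boundaries, a difference between cats and dogs cannot be mistaken for a difference between treatments.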