UNIT 3 and UNIT 4 COMBO Flashcards

1
Q

when does a trial of a simulation end?

A

Generally there are two cases:

  1. You want to know the probability of having x successes in n attempts (getting 3 smokers in a group of 5 students). Trials end when you get to n (get to 5 students). You record the number of smokers for each trial.
  2. You want to know how many attempts it takes to get f successes. Trials end when you get f successes. Record the number of attempts.
2
Q

What type of probability when you are looking for at least one success in twelve attempts?

A

1 - p(NONE)

or “not zero”

1-binopdf(12, p, 0)
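A quick way to check the "1 minus none" idea is a minimal Python sketch. The helper name `p_at_least_one` and the value p = 0.5 are assumptions made up for illustration; they are not from the deck.

```python
def p_at_least_one(n, p):
    """P(at least one success in n independent attempts) = 1 - P(none)."""
    return 1 - (1 - p) ** n

# Assumed example: 12 attempts with p = 0.5 (a fair coin).
print(p_at_least_one(12, 0.5))
```

P(none) is just (1 - p) multiplied by itself n times, which is why the shortcut works.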

3
Q

What is prospective study?

A

A prospective study identifies subjects in advance and follows them forward, recording the response variable from the present into the future.

4
Q

RAND VARIABLE:

X has mean y and standard deviation of z.

A has mean b and standard deviation c.

Find: Mean, SD and VAR of: 5A

A

mean: 5b
sd: 5c
var: 25c^2

5
Q

what’s the difference between response bias and nonresponse bias?

A

Response bias is anything in a survey design that influences responses (for example, the wording of questions).

Nonresponse bias is bias introduced when a large fraction of those sampled fail to respond. Those who do respond are likely not to represent the entire sample.

6
Q

what is the law of large numbers?

A

states that in the long run.. (NOT SHORT RUN)

The relative frequency settles down to true probability.

(you’ll have 50% heads after an infinite number of coin flips with a fair coin).

Don’t make short run predictions.. coins don’t owe you “tails”

7
Q

What are humans bad at ?

A

Humans are bad at generating random numbers.

8
Q

What’s the difference between lurking and confounding?

A

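A lurking variable is a hidden variable, outside the study, that drives both of the variables you are looking at, so it creates an association between them that can look like cause and effect;

a confounding variable, on the other hand, is tangled up with the factor in an experiment, making it unclear which variable actually had the impact on the response.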

9
Q

How many successes can you expect when you know p? (mean of binomial)

A

np.

Makes sense, if 30% like butter, out of 50 people you would expect (50)(.3)= 15 to like butter

np is the mean of the binomial distribution
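To see that np really is the long-run average, here is a small Python sketch that adds up (number of successes) times (its probability) for every possible count. The function name `binomial_mean` is a made-up helper; the numbers are the butter example from this card.

```python
from math import comb

def binomial_mean(n, p):
    """Expected number of successes: sum of k * P(X = k) over k = 0..n."""
    return sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

# The butter example from the card: n = 50 people, p = 0.3.
print(binomial_mean(50, 0.3), 50 * 0.3)
```

The long sum and the np shortcut should agree up to floating-point rounding.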

10
Q

What is the difference between a study and an experiment?

A

In a study you are basically just watching.

In an experiment you are manipulating factors and (hopefully randomly) assigning treatments.

Sometimes people call an experiment a study.

11
Q

what is probability?

A

THE LONG RUN RELATIVE FREQUENCY!!

12
Q

How are voluntary and convenience samples similar?

A

With voluntary samples, people choose themselves;

with convenience samples, people are chosen by the researchers without a random method. Neither uses randomness, and both are prone to BIAS.

13
Q

Why blind the treatment givers?

A

The treatment givers may behave differently as they administer the actual meds vs when they administer the placebo.

14
Q

What is a control group?

A

A group in an experiment that gets no treatment and is compared with the treatment groups in order to draw conclusions.

The control group helps us see what would happen anyway, without any treatment, so that we can see the true effect of the treatment.

15
Q

Give examples of when you would block

A

Looking to see the impact of different leather preservers on chairs in an airport. You might block according to proximity to a window or proximity to the main entrance. The window seats will get more light and the ones closest to the entrance may get more use. They will age and wear differently, so you want to make sure some chairs in each block get each of the different treatments.

OR, pain medicine. You might block by gender as males and females might react differently.

16
Q

What is the problem with convenience sampling?

A

The sample may not be representative as it is not randomized to include every type of person.

Friends and family are convenient but they likely share similar opinions and thus the sample is not representative of a population.

17
Q

Explain two types of experimental design.

A
  1. Randomized Block Design: randomization occurs within the blocks only. MATCHING IS BLOCKING
  2. Completely Randomized Design: all of the experimental units have the same chance of receiving each treatment.
18
Q

what is independent? What are the two equations to test for independence?

A

when P(A)=P(A|B) OR P(A)*P(B)=P(A and B)

When the probability of A is the same even when B is also true.

Knowing B does not affect the probability of A.
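The multiplication test can be tried on a two-way table of counts. This is a minimal Python sketch; the function name `is_independent` and the student counts are hypothetical, invented only to show the check.

```python
from fractions import Fraction

def is_independent(both, a_only, b_only, neither):
    """Check P(A and B) == P(A) * P(B) from the four counts of a two-way table."""
    total = both + a_only + b_only + neither
    p_a = Fraction(both + a_only, total)
    p_b = Fraction(both + b_only, total)
    p_a_and_b = Fraction(both, total)
    return p_a_and_b == p_a * p_b

# Hypothetical counts for 100 students (A = female, B = freshman).
print(is_independent(12, 28, 18, 42))  # P(A)=0.4, P(B)=0.3, P(A and B)=0.12
print(is_independent(20, 20, 10, 50))  # P(A and B)=0.20, not 0.4*0.3
```

Exact fractions are used so the equality test is not thrown off by rounding. With real sample data the two sides are rarely exactly equal, so in practice you look for "close enough."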

19
Q

What do we call it when two things can’t happen at the same time?

A

disjoint OR mutually exclusive

20
Q

When to use general MULT and what is it?

A

AND probability. Use when associated.

P(this)*P(that | this).

P(A)*P(B given A)

IT ALWAYS WORKS FOR ALL SITUATIONS.

When indep, the P(that|this) = P(that). So you end up with the simpler independent version, P(this)*P(that)

21
Q

What is sample size and how does it compare with the fraction of a population?

A

Sample size is the number of individuals in a sample. The sample size, not the fraction of the population sampled, determines how well the sample represents the population. The fraction of the population that you’ve sampled doesn’t matter. It’s the sample size itself that’s most important.

22
Q

Can you stratify in an experiment?

A

NO. Stratification is a sampling method; blocking is the method used in experiments.

They are sort of similar ideas.

23
Q

RAND VARIABLE:

X has mean y and standard deviation of z.

A has mean b and standard deviation c.

Find: Mean, SD and VAR of: X + 12

A

mean: y+12

sd: z

var: z^2

24
Q

can independent events be disjoint?

EXPLAIN

A

NO (as long as both events can actually happen). If they are independent, then knowing one doesn’t change the probability of the other, but if they are disjoint, knowing one makes the other impossible, so it does change the probability of the other, to 0.

25
Q

What is a factor?

A

A variable in an experiment that the experimenter manipulates.

26
Q

What are the two types of observational studies?

A

Retrospective, and Prospective

27
Q

What is area under ANY probability curve?

A

1 (or 100%)

28
Q

How to make TREES with screening tests????

A

SPLIT UP POPULATION FIRST (by % with condition).

then split the groups by outcomes of the test

29
Q

what is a complement?

A

the probability that it doesn’t happen.

1-P(it happens).

(together they add to 100%)

30
Q

What is the main purpose of a placebo ?

A

To blind the subjects, so that knowing whether they got the real treatment can’t influence the response variable.

When people think they’re getting help, they often improve anyway..

31
Q

What is a simple random sample? how is it different from others?

A

A sample where every possible sample has the same chance of being selected.

There are no impossible samples.

32
Q

What is bias?

What are some common errors?

A

It’s any systematic failure of a sampling method.

COMMON ERRORS: Voluntary response, undercoverage of the population, nonresponse bias and response bias. We use randomness and methods like stratifying to reduce these.

33
Q

How can the WORDING of the question lead to response bias

A

Words or phrases that impact your feelings tend to influence responses. Look for “devastating, horrific, wonderful, etc.”

Sometimes there is a background story like “Many Americans lose jobs to illegal aliens every year, how do you feel about the border wall?”

34
Q

What type of probability when you are looking for the first success on or before the fifth attempt?

A

Geocdf(p, 5)
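Here is a rough Python sketch of what a cumulative geometric probability adds up. The names `geopdf` and `geocdf` mirror the calculator commands on these cards, but they are plain functions written for illustration, not the calculator's own implementation, and p = 0.3 is an assumed value.

```python
def geopdf(p, x):
    """P(first success is exactly on attempt x): x - 1 failures, then a success."""
    return (1 - p) ** (x - 1) * p

def geocdf(p, x):
    """P(first success is on or before attempt x): add up geopdf for 1..x."""
    return sum(geopdf(p, k) for k in range(1, x + 1))

p = 0.3  # assumed success probability, for illustration only
print(geocdf(p, 5), 1 - (1 - p) ** 5)  # the two values agree
```

The second printed value is the shortcut: "first success on or before the 5th" is the same as "not five failures in a row."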

35
Q

What four things do you need in an experimental design? (trick)

A

NEED only 3: control , randomization, replication..

MAKE SURE YOU COMPARE

Use blocking when appropriate

36
Q

When to use general add and what is it?

A

OR probability.

Use when not disjoint. (subtract overlap)

P(this OR that) = P(this)+P(that) - P(this and that)

(IT ALWAYS WORKS IN ALL SITUATIONS, when disjoint, P(this and that)= 0, so you end up with the simpler disjoint version)

37
Q

Why is it called “binomial”

A

The counts (n choose k) are the coefficients of expanded binomials: (x+y)^1, (x+y)^2, (x+y)^3, ...

38
Q

What is diff between 3X and X+ X+ X

(mean and st. dev)

A

3X is just tripling one play. Multiply the mean and SD by 3.

X+X+X is playing 3 times. Multiply the mean by 3,

BUT… you must add variances (assuming the plays are independent): square the SD, add it 3 times, then take the sqrt.
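A small Python sketch makes the difference concrete. The function names and the numbers (mean 10, SD 2 for one play) are assumptions for illustration, and the three plays are assumed to be independent.

```python
from math import sqrt

def triple_one_play(mean, sd):
    """3X: one outcome multiplied by 3, so mean and SD both scale by 3."""
    return 3 * mean, 3 * sd

def three_plays(mean, sd):
    """X + X + X: three independent plays; means add, variances add."""
    return 3 * mean, sqrt(3 * sd**2)

# Assumed values for one play: mean 10, SD 2.
print(triple_one_play(10, 2))  # (30, 6)
print(three_plays(10, 2))
```

Both have the same mean, but the three separate plays have the smaller SD because good and bad plays partly cancel out.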

39
Q

What is the difference between a cluster sample and random sample?

A

A cluster sample is when the population is first divided into sections, or clusters, that each have all of the traits that the population has, so the clusters are representative. You grab a cluster (chosen at random) as your sample.

A simple random sample is all names in a hat, so you could get any group.

40
Q

Give three examples of events that are not mutually exclusive (not disjoint)

A
  1. Being a DOG and being SMELLY
  2. Being a FRESHMAN and being FEMALE
  3. Liking ICE CREAM and liking HAMBURGERS (both can be true simultaneously)
41
Q

How can you simulate a coin flip with random number table?

A

Assign heads to odd numbers and tails to even numbers.

42
Q

What is a quality of SRS that is not a quality of Systematic, Stratified or Clustering?

A

In an SRS, all samples are possible and all possible samples have the same chance of being picked.

The other methods have lots of “impossible sample groups”

Stratified: an impossible group would be all girls (you’re taking some boys and some girls).

Clustered: an impossible group would be all girls (each cluster has boys and girls).

Systematic: an impossible group would be 4 people who are right next to each other (you are taking every nth person).

43
Q

what is the best way to reduce bias?

A

randomness.

sophisticated answer: make as many things as random as possible

44
Q

RAND VARIABLE:

X has mean y and standard deviation of z.

A has mean b and standard deviation c.

Find: Mean, SD and VAR of: 3X + 5A

A

mean: 3y+5b
sd: SQRT(9z^2 + 25c^2)
var: 9z^2 + 25c^2 (same as (3z)^2 + (5c)^2)

45
Q

What is easy way to assign treatments with random number generator?

A

Assign everyone a 2 digit number (toss out repeaters),

then simply sort from lowest to highest.

The lowest n get treatment 1, the next n get treatment 2, the next n get treatment 3, etc.
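The same "number, sort, deal out" idea can be sketched in Python. This is only one possible way to do it: the function name `assign_treatments` and the subject labels are made up, and it swaps the 2-digit numbers for random decimals (so repeats are practically impossible and nothing needs tossing out).

```python
import random

def assign_treatments(subjects, n_treatments, seed=None):
    """Give each subject a random number, sort by it, and deal out equal groups.

    Assumes the number of subjects divides evenly by n_treatments.
    """
    rng = random.Random(seed)
    numbered = [(rng.random(), s) for s in subjects]   # attach a random number to each subject
    numbered.sort(key=lambda pair: pair[0])            # sort from lowest to highest
    ordered = [s for _, s in numbered]
    size = len(ordered) // n_treatments
    return [ordered[i * size:(i + 1) * size] for i in range(n_treatments)]

subjects = [f"S{i}" for i in range(12)]  # hypothetical subject labels
for number, group in enumerate(assign_treatments(subjects, 3, seed=1), start=1):
    print("treatment", number, group)
```

Passing a seed just makes the run repeatable; leave it out for a fresh random assignment each time.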

46
Q

What is the law of averages?

A

a misinterpretation of the law of large numbers.

thinking things will even out..

Using this law, if you flipped 4 heads in a row, you’d expect the next one to be tails because it should even out in the long run. Not true: 5 flips is not the long run. Infinity is. The next flip still has a 50% chance of being another head. You may hear someone say “he’s due for a hit” or “it’s bound to rain soon.” Both are bad reasoning.

47
Q

what is n! ?

A

it is “n factorial” example: 5! = 5*4*3*2*1= 120.

tells you how many ways you can arrange n objects.

48
Q

Why do you have to block?

A

You don’t have to,

But you might want to if you feel that the experimental units (subjects) may respond differently to the treatment because of confounding variables.

Like if you were testing out a new deodorant. You might want to block according to activity level so you don’t get all of the active people in one group (they sweat more).

49
Q

What is response bias?

How do you avoid it?

A

Response bias is any influence that may sway the respondent to give a more favorable answer, e.g. the wording of the question or the interviewer’s behavior/background.

Therefore, in a survey, ask questions that allow respondents to answer comfortably and honestly.

Keep the wording “indifferent” or neutral so that it does not unduly favor one response over another.

CONTROL the environment so that it is similar for all subjects.

50
Q

What is the “hot hand?”

A

A misinterpretation of the law of large numbers. Using this “law,” if you flipped 4 tails in a row, you’d expect the next one to be another tails, because tails is “hot.” If a baseball player gets three hits in a row, do you expect another hit? Wrong. Streaks happen randomly. (Actually there is a little evidence for a hot hand in some sports, but more research needs to be done.)

51
Q

Give three examples of independent variables

A
  1. Being tall and having a high GPA
  2. If it is snowing and whether it is a Thursday or not
  3. Whether a person likes pizza and their gender (notice, knowing one bit of information does not impact the likelihood of the other being true also)
52
Q

Do we say things are “dependent?”

A

NO! we say associated

53
Q

How to find likelihood of being pregnant, given the test says you are? (tree)

A

Split the population who take the test by %pregnant and %not, then split each of those by what the test says.

Then look at just the two groups that the test said were pregnant, and find: (% pregnant and test says pregnant)/(total % in both groups).
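The tree arithmetic can be sketched in a few lines of Python. All three percentages below are made up for illustration (the card gives no numbers), and the function name `p_condition_given_positive` is a hypothetical helper.

```python
from fractions import Fraction

def p_condition_given_positive(p_condition, p_pos_if_condition, p_pos_if_not):
    """Tree method: P(has condition | test says positive)."""
    true_pos = p_condition * p_pos_if_condition      # branch: has condition AND tests positive
    false_pos = (1 - p_condition) * p_pos_if_not     # branch: no condition AND tests positive
    return true_pos / (true_pos + false_pos)

# Made-up numbers: 10% pregnant, test catches 98% of pregnancies, 5% false positives.
answer = p_condition_given_positive(Fraction(10, 100), Fraction(98, 100), Fraction(5, 100))
print(answer, float(answer))
```

Only the two "test says positive" branches appear in the final fraction, which is exactly the "look at just those groups" step.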

54
Q

RAND VARIABLE:

X has mean y and standard deviation of z.

A has mean b and standard deviation c.

Find: Mean, SD and VAR of: X + A

A

mean: y+b

sd: SQRT(z^2 + c^2)

var: z^2 + c^2

55
Q

what does binomial model tell us about?

A

exactly x successes in K trials.

What is likelihood of exactly 3 heads out of 13 flips?

binopdf(13, .5, 3)
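For anyone without the calculator handy, here is a minimal Python sketch of the same calculation. The function is named `binopdf` only to mirror the card's calculator command; it is a plain illustration, not the calculator's implementation.

```python
from math import comb

def binopdf(n, p, x):
    """P(exactly x successes in n independent trials, each with success probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(binopdf(13, 0.5, 3))  # exactly 3 heads in 13 fair flips
```

The `comb(n, x)` piece counts how many different orders the x successes can show up in.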

56
Q

What is the standard sampling method?

A

A Simple Random Sample (SRS) is our standard. Every possible group of n individuals has an equal chance of being our sample. That’s what makes it simple.

57
Q

geopdf (inputs)

A

FIRST SUCCESS ON THIS ATTEMPT

geopdf (p,x)

probability of FIRST SUCCESS being ON the Xth trial

58
Q

What type of probability when you are looking for more than 5 successes in twelve attempts?

A

6 or more

same as not 5 or less

1- (5 or less)

1 - binocdf(12, p, 5)
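A short Python sketch can confirm that "1 minus (5 or less)" matches adding up 6 through 12 directly. The card leaves p unspecified, so p = 0.5 is an assumed value, and `binocdf` here is an illustrative stand-in for the calculator command.

```python
from math import comb

def binocdf(n, p, x):
    """P(x or fewer successes in n trials): add P(exactly k) for k = 0..x."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

p = 0.5  # assumed value; the card leaves p unspecified
more_than_5 = 1 - binocdf(12, p, 5)
six_or_more = sum(comb(12, k) * p**k * (1 - p)**(12 - k) for k in range(6, 13))
print(more_than_5, six_or_more)  # same probability, computed two ways
```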

59
Q

What type of probability when you are looking for 5 or more successes in twelve attempts?

A

(more than 4)

not 4 or less

1-(4 or less)

1 - binocdf(12, p, 4)

60
Q

What is retrospective study?

A

A retrospective study is a study that looks backwards in time. Such studies focus on estimating differences between groups or associations between variables because they are not based on random samples.

61
Q

How do you use a table of random digits?

A

FIRST, Make a key to explain what the digits represent, whether you will use single, double or triple digits at a time and which, if any will be ignored.

SECOND.. Decide when a trial will end (after 12 events, or after 12 successes),

THIRD.. Make sure to clearly label the successes and where the trials end.

FOURTH: KNOW YOUR RESPONSE VARIABLE,

like, how many successes in n trials, or how long til n successes…

62
Q

explain CONTROL

A

You want to control as much of the environment as possible so all subjects have a similar experience except for the treatments given.

Control the other factors in the experiment for each trial: keep them constant if you believe they would affect the outcome of the experiment.

Also having a group that is not getting treatment helps to control because it measures the effects of the natural environment.

63
Q

If combining 4 random variables with standard deviations of m, p, q, r…. what is the new combined standard deviation?

A

SQRT(m^2 + p^2 + q^2 + r^2)

64
Q

Give example of confounding variable

A

Activity level could confound results in an antiperspirant deodorant experiment. If only active people used brand X and only sedentary people used brand Y, Y might look like it was effective, but the people didn’t sweat because they were sedentary. You would want to block according to lifestyle so that some active people used each type and some sedentary people used each type.

Sunlight and seat usage could be confounding variables for a leather preserver. If you randomly choose from all chairs in an airport for treatment and brand A randomly gets a lot of chairs near the sun, brand B randomly gets a lot of chairs near the main entrance, and brand C randomly gets the chairs that don’t have a lot of sun or a lot of use, you may think that brand C works the best, when in fact the results were confounded by sunlight and usage.

65
Q

RAND VARIABLE:

X has mean y and standard deviation of z.

A has mean b and standard deviation c.

Find: Mean, SD and VAR of: X + X + X

A

mean: y+y+y

sd: SQRT(z^2+z^2+z^2)

var: z^2+z^2+z^2

66
Q

What is wrong with using volunteers in an experiment?

A

Not much. In an experiment, we are not looking for a sample that is like the population. We just want to see the effectiveness of a treatment. It is fine if the subjects are all similar. In fact it is best sometimes when they are!

67
Q

Give three examples of variables that are not independent (associated)

A
  1. Playing video games and gender (knowing someone is male makes it more likely they play)
  2. Whether it is snowing and the month you are in (some months are snowier than others, so knowing the month changes the likelihood of snow)
  3. If a pet is a dog and if it is a cat (knowing it is a dog makes it certain that it is not a cat)

(Notice, knowing one bit of information changes the likelihood of the other being true also.)

68
Q

How can you simulate rolling 1 die with a random number table

A

use only the digits 1-6, ignore 0, 7, 8, 9

69
Q

What type of probability when you are looking for less than 5 successes in twelve attempts?

A

same as 4 or less

binocdf(12, p, 4)

70
Q

What is wrong with using volunteers in a survey?

A

Those who volunteer may not be like the rest of the population. An example may be if you’re trying to find out how often people volunteer for things. So you ask for volunteers to take the survey…. A question may be “when was the last time you volunteered for something?” Well, they all just volunteered for the survey!

71
Q

How many ways can you choose 3 books to take with you on a trip out of the 7 books on the shelf?

A

7 choose 3.

7!/(3! * 4!)

notice that the two factorials on bottom add to the top.
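The factorial formula can be checked against Python's built-in `math.comb`. This is just a quick sanity-check sketch; the variable name `by_formula` is made up.

```python
from math import comb, factorial

by_formula = factorial(7) // (factorial(3) * factorial(4))  # 7!/(3! * 4!)
print(by_formula, comb(7, 3))  # both give 35
```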

72
Q

What is the “mean of a random variable?”

A

The expected value.

sum of probs times values

You can use calculator to find

1 var stats L1, L2

73
Q

P(THIS and THAT)

when they are not independent? How?

A

probability A times probability B (knowing A is true)

called general multiplication rule

P(A)*P(B given A)

P(this)*P(that given this)

74
Q

How is clustering and stratifying different when doing a sample?

A

In clustering you can grab one or two clusters. Clustering is when you choose at random a group from the population that looks like the population.

In stratifying you must take a few from every stratum to get a representative sample. Stratifying is slicing a population into homogeneous groups (strata), then randomly sampling within each stratum before the results are combined.

75
Q

What is systematic sampling?

A

Systematic sampling means picking every Nth individual of what you are sampling (for example, people). You must still start on a random person and from then on take every Nth person. So you can take every 10th person in a line to take a survey, as long as you also start on a random individual.
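Here is a minimal Python sketch of a random start followed by a fixed step. The function name `systematic_sample` and the line of 100 numbered people are hypothetical, chosen only to illustrate the idea.

```python
import random

def systematic_sample(population, step, seed=None):
    """Pick a random starting point among the first `step` items, then every step-th item."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]

line = list(range(1, 101))  # hypothetical line of 100 people, numbered 1..100
print(systematic_sample(line, 10, seed=3))
```

The only random choice is the starting person; everything after that is fixed by the step, which is why neighbors in line can never both be picked.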

76
Q

How many ways can I arrange 4 letters?

A

4!

4*3*2*1 = 24 ways

77
Q

What is the area under the normal curve?

A

1 or 100%

78
Q

Use the following words in one run on sentence: inference, sample, statistic, parameter, population, census, data

A

I was curious about a population parameter, but a census was too costly, so I collected data for a sample, calculated a statistic and used that to make an inference about the parameter of interest.

79
Q

Why randomize in an experiment?

A

To avoid bias. An experimenter might want their treatment to work, so they may choose the subjects that might respond best to show how great it is, when in fact IT IS NO GOOD.

80
Q

Why blind the subject?

A

When people know they are getting a treatment, they may feel better even if the treatment doesn’t work. Their previous experience with the brand might bias their reporting or something..

81
Q

RAND VARIABLE:

X has mean y and standard deviation of z.

A has mean b and standard deviation c.

Find: Mean, SD and VAR of: 3X

A

mean: 3y

sd: 3z

var: 9z^2, same as (3z)^2

82
Q

How can you simulate on your calculator?

How can you create a random number table?

A

RANDINT( lowest, highest, how many you want to grab)

83
Q

What type of probability when you are looking for the first success after the fifth attempt?

A

FIRST.. GEO

not on the 5th or before

1-(fifth or before)

1 - geocdf(p, 5)

84
Q

You want to simulate the likelihood of more than 4 psychology majors being on a full bus that seats 30. 1 in 9 students are psych majors.

A

Use single digits on a random number table. Each digit represents a student on the bus. Ignore the zeros. Let 1 be a psych major, and 2 through 9 be other students. A trial ends when you have reached 30 students. Count the number of psych majors (ones) in the trial and record it. Do this 20 times. Find the percent of trials in which there were more than 4 (5 or more) psych majors on the “bus.” If this occurred in 5 trials, then the estimated likelihood is 5 in 20, or 25%.
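The same simulation can be run by computer instead of a digit table. This Python sketch follows the question's "more than 4" wording; the function name `simulate_bus`, the 10,000 trials, and the seed are all assumptions for illustration, and the exact binomial value is included only as a comparison.

```python
import random
from math import comb

def simulate_bus(trials, seats=30, seed=None):
    """Estimate P(more than 4 psych majors on a full bus) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each seat draws a digit 1-9; a 1 stands for a psych major.
        majors = sum(1 for seat in range(seats) if rng.randint(1, 9) == 1)
        if majors > 4:
            hits += 1
    return hits / trials

estimate = simulate_bus(10_000, seed=0)
# Exact binomial answer for comparison: 1 - P(4 or fewer), with p = 1/9.
exact = 1 - sum(comb(30, k) * (1 / 9) ** k * (8 / 9) ** (30 - k) for k in range(5))
print(estimate, exact)
```

With only 20 trials by hand the estimate can bounce around a lot; more trials settle it closer to the exact value.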

85
Q

What type of probability when you are looking for 5 or fewer successes in twelve attempts?

A

BINOMIAL P

binocdf(12, p, 5)

86
Q

How is Blocking in an Experiment Similar to Stratifying in a Sample?

A

The two are similar because they divide the subjects into groups that have similar traits.

87
Q

how do you combine probability models?

(play more than one game)

A

add or subtract the means,

and then ADD THE VARIANCES

88
Q

What is the difference between single-blind and double blind?

A

Single-blind is when all individuals in either one of the classes are blinded; double-blind is when everyone in BOTH classes is blinded.

Classes are:

those who could influence the results (subjects and treatment givers)

and those who evaluate the results (evaluators)

**Can’t blind a tomato plant, so blind the fertilizer guy

89
Q

What type of study would find a relationship between Verbal and Math SAT?

A

Retrospective. You could take all of the SAT Math and Verbal scores, run a regression, and find the r-squared value and linear model. This would be a Retrospective Study.

90
Q

What type of probability when you are looking for exactly 5 successes in twelve attempts?

A

binopdf (12,p,5)

91
Q

How is blocking different from stratifying?

A

Blocking is in an experiment, when you want to tease out a possible confounding variable.

stratifying is in sampling when you want to make sure to get units with a specific characteristic so your sample is representative of population.

92
Q

What is the expected value?

A

The mean of the random variable.

What you’d AVERAGE if you played the game A LOT!!!!!!!!!

93
Q

Do we add or subtract st dev when combining models?

A

NEITHER!!

you always just add variances.

Square the st devs, add them, then take sqrt.

94
Q

binocdf (inputs)

A

EXACTLY X OR LESS successes in N tries (cumulative)

n: total number
p: likelihood
x: # of successes

binocdf(n,p,x)

95
Q

what is disjoint?

A

MUTUALLY EXCLUSIVE

They can’t both happen at the same time!

(being over 5 feet and under 4 feet)

96
Q

Why does it make sense to double-blind an experiment?

A

It reduces bias in an experiment. If subjects don’t know what treatment they’re receiving, they won’t change their habits based on that knowledge. If evaluators don’t know which treatment each subject is receiving, they won’t bias the true results based on the results they expect to see

97
Q

How do you find the mean and sd of a discrete random variable from a table or from a game or something?

A

Make a table,

put values in L1 and probabilities in L2,

and run “1-var stats L1,L2” and you get it!
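What the calculator does with the two lists can be sketched in Python. The function name `mean_and_sd` and the little game below are made up for illustration; `L1` and `L2` are named after the calculator lists on the card.

```python
from math import sqrt

def mean_and_sd(values, probs):
    """Mean (expected value) and SD of a discrete random variable."""
    mean = sum(x * p for x, p in zip(values, probs))
    variance = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))
    return mean, sqrt(variance)

# Made-up game: lose $1 half the time, break even 30% of the time, win $5 the other 20%.
L1 = [-1, 0, 5]        # values
L2 = [0.5, 0.3, 0.2]   # probabilities (must add to 1)
print(mean_and_sd(L1, L2))
```

The mean is the "sum of probs times values," and the variance weights each squared distance from the mean by its probability.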

98
Q

What is probability first success is on 7th try?

A

qqqqqq p (q^6*p), where q = 1 - p is the probability of failure.

(this is a GEO prob)

99
Q

What is it called when knowing one event happened does not change the probability of another event occurring?

A

independent events

100
Q

what do we call it when events are not associated?

A

independent

101
Q

How many trials in simulation?

A

At least 20-30.

102
Q

can disjoint events be independent? EXPLAIN

A

NO (as long as both events can actually happen). If they are disjoint, then knowing one tells you that the other couldn’t happen, so it does impact the likelihood of the other, so they are always NOT INDEPENDENT.

DISJOINT EVENTS ARE ALWAYS ASSOCIATED!!

103
Q

when can you expect the first success if there is a 30 percent chance of success?(mean of geo)

A

1/p

or

1/.30

Which is 3.333, so around the 3rd or 4th try.

1/p tells you, on average, when the first success will occur

1/p is the mean of the geometric distribution
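To see where 1/p comes from, this Python sketch adds up (attempt number) times (probability the first success lands there). The function name and the cutoff of 1000 attempts are assumptions; the tail beyond that is far too small to matter for p = 0.30.

```python
def geometric_mean_attempts(p, max_attempts=1000):
    """Approximate the expected attempt number of the first success.

    Adds k * P(first success on attempt k) for k = 1..max_attempts.
    """
    return sum(k * (1 - p) ** (k - 1) * p for k in range(1, max_attempts + 1))

print(geometric_mean_attempts(0.30), 1 / 0.30)  # both are about 3.33
```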

104
Q

binopdf(inputs)

A

EXACTLY X successes in N tries

n: total number of tries
p: prob of success
x: number of successes

binopdf(n,p,x)

Probability of exactly X successes in N trials. (PARTICULAR probability)

105
Q

How to find P(at least 1)?

A

1-P(none)

106
Q

What is difference between subject and experimental unit?

A

Humans who are experimented on are commonly called subjects. Dogs, days, plants, and anything else not human are called experimental units.

107
Q

Probability of THIS AND THAT? (when indep)

A

Multiply

P(this)*P(that)

works when independent only,

when there is an association, then P(that) should be p (that|this), so it looks like this:

P(this) * P(that given this)

108
Q

What’s the difference between cluster and stratified?

A

Stratified- you grab a bit from each stratum… you divide the population up into groups according to traits, called strata (groups with similar traits, homogeneous groups), and randomly choose from each stratum to get a representative sample.

Cluster- grab a cluster or two. Each cluster should be like the population. You don’t need to take a little from each cluster; they are already representative.

109
Q

What is the purpose of matching?

A

Matching (a type of blocking) reduces unwanted variation. In a retrospective or prospective study, subjects who are similar in ways not under study may be matched and then compared with each other on the variables of interest.

110
Q

What is the sure way to assign treatments correctly?

A

assign random number then sort low to high and start with bottom..

or

throw names in hat and pick.

111
Q

What is undercoverage?

A

Undercoverage is when either one part of the population is not included in a survey or is underrepresented in the survey

112
Q

What is random sampling?

A

When we use chance to select a sample.

You MUST use some real randomness

ex: dice, cards, randint, number table

113
Q

What do we call it when events are not independent?

A

associated

114
Q

When would you use two digits instead of a single on a random number table?

A

When the percent is not a multiple of ten, like “18% of dogs eat underwear.” You’ll have to assign 01-18, or 00-17, as undie-eating dogs.

115
Q

Why do you have to Stratify?

A

You don’t have to.. But you might want to if you feel that a simple random sample might not be representative of the population . You want your sample to be like the population. a representative sample (it represents the population well).

116
Q

What is a multistage sample?

A

A sample that combines several sampling methods,

like stratifying then clustering…

117
Q

What is statistically significant?

A

When an observed difference is too large for us to believe that it is likely to have occurred naturally (or just randomly).

Basically it is Statistically Significant when we don’t think it happened randomly

We use 5% as a threshold. If it was less than 5% likely to happen, then that is significantish.

118
Q

probability this AND that . Add or multiply?

A

MULTIPLY

119
Q

Sampling Method Types?

A

SRS, stratified, clustered, systematic, multistage, convenience, voluntary

120
Q

what is that (n over k) thing in the binomial equation?

A

n choose k

it tells you how many ways you can choose k objects from a set of n things. The formula is

n!/(k!(n-k)!)

the two numbers on bottom add to the number up top. These are coefficients in expanded binomials and can also be found in Pascal’s Triangle

121
Q

What’s a useful alternative when you can’t run an experiment?

What are the useful forms of this, and how do you perform each of them?

A

An alternative to an experiment could be an observational study. There are two forms: prospective and retrospective. A prospective observational study is when you identify subjects in advance and record data as you go along. A retrospective observational study is when you analyze observations from the past.

122
Q

What is more important, percent of population or size of sample?

A

Sample size.

A sample of 150 will say as much about a population of 2,000 as it will about a population of 2,000,000.

The percent of the population isn’t what matters.

The sample size determines how wide your interval is at a given level of confidence.

123
Q

What is sampling error?

A

IT IS NOT A MISTAKE!!!…

Because the data in samples are generally different, the statistics calculated from one sample to another vary and are generally not equal to the parameter. This variability of the STATISTICS is called sampling error.

(not the variability of the data).

124
Q

geocdf (inputs)

A

FIRST SUCCESS ON OR BEFORE

p: probability of success
x: xth try

geocdf(p,x).

Probability of the FIRST SUCCESS being ON OR BEFORE the Xth trial.
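
Here is a rough Python sketch of what the calculator is doing. The names `geo_cdf` and `geo_cdf_by_sum` and the numbers p = 0.2, x = 3 are made up for illustration. It uses the shortcut that "first success on or before trial x" is the same as "not all x tries fail."

```python
def geo_cdf(p, x):
    """P(first success on or before trial x) = 1 - (1 - p)**x."""
    q = 1 - p
    return 1 - q ** x

def geo_cdf_by_sum(p, x):
    """Same answer, adding up 'first success exactly on trial k' for k = 1..x."""
    q = 1 - p
    return sum(q ** (k - 1) * p for k in range(1, x + 1))

print(round(geo_cdf(0.2, 3), 3))         # 0.488
print(round(geo_cdf_by_sum(0.2, 3), 3))  # 0.488
```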

125
Q

what is a simulation?

A

Basically a stand-in for a real situation: a sequence of random outcomes that models it.

Like an imitation.

126
Q

What’s the difference between a prospective and a retrospective study?

A

A retrospective study takes a group and looks back at its history while a prospective study watches a group for a period of time and records the data into the future. RETRO-REVERSE, PROspective- PResent and On..

127
Q

RAND VARIABLE:

X has mean y and standard deviation of z.

A has mean b and standard deviation c.

Find: Mean, SD and VAR of: 3X + 5A + 12

A

mean: 3y+5b+12
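sd: sqrt(9z^2 + 25c^2)

var: 9z^2 + 25c^2, same as (3z)^2 + (5c)^2

(This assumes X and A are independent. The +12 shifts the mean but does not change the spread.)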
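
To see the rule with actual numbers, here is a small Python sketch. The function name `combine` and the values (X has mean 10 and SD 2, A has mean 4 and SD 1) are hypothetical, and it assumes X and A are independent.

```python
from math import sqrt

def combine(a, mean_x, sd_x, b, mean_a, sd_a, shift):
    """Mean, variance, SD of a*X + b*A + shift, assuming X and A are independent."""
    mean = a * mean_x + b * mean_a + shift      # constants pass straight through
    var = (a * sd_x) ** 2 + (b * sd_a) ** 2     # variances add; the shift drops out
    return mean, var, sqrt(var)

# 3X + 5A + 12 with the made-up values above
mean, var, sd = combine(3, 10, 2, 5, 4, 1, 12)
print(mean, var, round(sd, 2))  # 62 61 7.81
```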

128
Q

what is representative?

A

It means that the sample statistics will be kind of like the population parameters.. The sample “looks like” the population.

129
Q

What is a big difference between subjects in experiments and members of a sample you got from one of the sampling methods?

A

In experiments you don’t need a representative sample of the population, you can have volunteers, convenient subjects and that is OK. You are looking at impact of treatment, not at getting a representative sample.

When you use one of the sampling methods, you want a sample that looks like the population so you can make an inference about the population.

130
Q

How can you estimate the probability of an event occurring using a random number table or randint?

A

Run a simulation.

Find the percent of trials in which the event occurred. That percent is your estimate of the probability.
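
A simulation like this can be sketched in Python, with `randint` standing in for the random number table. The 18% figure is borrowed from the underwear-eating dogs card. The group of 5 dogs, the "at least one" event, and the function name `simulate` are made up for illustration.

```python
import random

def simulate(num_trials, group_size=5, seed=1):
    """Estimate P(at least one of 5 dogs eats underwear) when 18% of dogs do."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(num_trials):
        # One trial: draw a two-digit number 00-99 for each dog;
        # 00-17 means "eats underwear" (18 of the 100 numbers).
        eaters = sum(1 for dog in range(group_size) if rng.randint(0, 99) <= 17)
        if eaters >= 1:
            hits += 1
    return hits / num_trials  # fraction of trials where the event occurred

estimate = simulate(10_000)
exact = 1 - 0.82 ** 5  # 1 - P(none), for comparison
print(estimate, round(exact, 4))
```

The estimate should land close to the exact value, and it gets closer as you run more trials (law of large numbers).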

131
Q

Probability this OR that

when they are not disjoint?

How?

A

probability A plus probability B minus the double counted (the ones that are both A and B)

called “general addition rule”

P(A)+P(B)-P(A and B)

P(this)+P(that)-P(this and that)
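
A worked check with a standard 52-card deck, sketched in Python: P(heart OR face card). The example and the helper names (`prob`, `is_heart`, `is_face`) are made up for illustration. The rule and a direct count should agree.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event):
    """Fraction of cards in the deck for which event(card) is true."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_heart(card):
    return card[1] == "hearts"

def is_face(card):
    return card[0] in ("J", "Q", "K")

# General addition rule: subtract the double-counted heart face cards.
by_rule = prob(is_heart) + prob(is_face) - prob(lambda c: is_heart(c) and is_face(c))
# Direct count of cards that are a heart or a face card.
by_count = prob(lambda c: is_heart(c) or is_face(c))
print(by_rule, by_count)  # 11/26 11/26
```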

132
Q

What is the difference between response bias and nonresponse bias?

A

Response bias is when the person’s response is influenced by the question or questioning method (like if a parent asks if you use drugs, as opposed to a friend… there is only one true answer, but one might respond differently to each of them).

Nonresponse bias is when the people who don’t respond might have different opinions/views than the people who did.

133
Q

What is Placebo used for?

A

Placebo is used for control in an experiment.

It lets you know how factors other than the treatment impact the subjects.

The purpose of a placebo is to give a baseline: you compare the change in the placebo (control) group with the change in the treatment groups.

134
Q

To survey whether a restaurant is good, would you ask the people coming out of the restaurant?

A

People at the restaurant are probably there because they already like it. If you asked “Is this your first time dining here?” and surveyed only those who say “yes”, that would be a better method. But then again, people wouldn’t go into an Italian restaurant if they didn’t like that type of food.

135
Q

Is it always better to do a census or a sample?

A

It depends. Generally, it is better to do a sample since a census is expensive to execute, and because populations are always changing, a census is hardly more accurate than a sample.

BUT, for small populations, a census is fine. Ordering sandwiches for your family? Do a census.

136
Q

what does geometric model tell us about?

A

it is about FIRST SUCCESS

What is likelihood first success is on 5th trial?

q * q * q * q * p = q^4 * p, where q = 1 - p (four failures, then a success)
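
The same idea as a tiny Python sketch. The function name `geo_pdf` and the value p = 0.25 are made up for illustration.

```python
def geo_pdf(p, x):
    """P(first success is exactly on trial x): (x - 1) failures, then one success."""
    q = 1 - p
    return q ** (x - 1) * p

# First success on the 5th trial with p = 0.25: q q q q p
print(round(geo_pdf(0.25, 5), 4))  # 0.0791
```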

137
Q

What are the 3 ways we used random numbers?

A
  1. To simulate the likelihood of an event occurring. (Ch 11)
  2. To choose a sample that is representative of the population and avoid bias. (Ch 12)
  3. To assign subjects (experimental units) to treatments to evenly distribute variability and help reduce possible confounding variables. (Ch 13)
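
Uses 2 and 3 can be sketched in a few lines of Python. The population of 20 numbered students, the sample size of 6, and the two groups of 3 are all hypothetical. `random.sample` picks without repeats, and shuffling before splitting is one simple way to randomize assignment.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 20 students numbered 1-20.
population = list(range(1, 21))

# Use 2: simple random sample of 6 students (no repeats).
sample = rng.sample(population, 6)

# Use 3: randomly assign those 6 subjects to two treatment groups of 3.
shuffled = sample[:]
rng.shuffle(shuffled)
group_a, group_b = shuffled[:3], shuffled[3:]

print(sample, group_a, group_b)
```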
138
Q

Who can be blinded?

A

Subjects.

Those delivering treatments.

Those assessing effectiveness of treatments.

and three mice.

139
Q

what is pythagorean theorem of stats?

A

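St dev of a combined model (sum or difference of independent random variables) is:

sqrt(st dev squared + st dev squared)

or more terms if you combine more. Variances add; standard deviations don’t.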

140
Q

How are the good sampling methods similar?

A

In all of them, all members of the population have an equal chance of being selected.

So.. individuals have an equal chance in them all, but for some methods certain sample groups are impossible.

141
Q

What is “mutually exclusive?”

A

same as disjoint

142
Q

What is a level in an experiment?

A

A level is a specific value that the experimenter chose for a factor that is manipulated.

ex. Factor is sleep; levels would be how many hours the subjects were allowed to sleep: 4 hours, 6 hours, 8 hours. That is 3 levels.

143
Q

Probability of THIS OR THAT? add or sub?

A

ADD

P(this) + P(that)

Works only when the events are disjoint. When they are not, subtract the overlap.

144
Q

Give three examples of disjoint events

A
  1. A card being a CLUB and a RED card
  2. A student being a SENIOR and a FRESHMAN
  3. An animal being a CAT and a GOLDFISH(both can’t be true)
145
Q

When can you use single digits for simulations on a random number table?

A

When the percent is a multiple of ten, like “30% of teachers secretly twerk”, then you would assign 1-3 or 0-2 as twerking teachers.

146
Q

How can we use Pascal’s Triangle?

A

To find the coefficients (n choose x) needed for the probability of x successes in n trials..

BINOMIAL BABY!!!
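
Here is a rough Python sketch of how that works. The function names `pascal_row` and `binom_prob` are made up for illustration. Each row of the triangle is built by adding neighbors from the row above, and entry k of row n is the n-choose-k coefficient in the binomial formula.

```python
def pascal_row(n):
    """Row n of Pascal's Triangle: [nC0, nC1, ..., nCn]."""
    row = [1]
    for _ in range(n):
        # Each inside entry is the sum of the two entries above it.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row

def binom_prob(n, k, p):
    """P(exactly k successes in n trials) = nCk * p^k * q^(n-k)."""
    q = 1 - p
    return pascal_row(n)[k] * p ** k * q ** (n - k)

print(pascal_row(4))           # [1, 4, 6, 4, 1]
print(binom_prob(4, 2, 0.5))   # 0.375
```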

147
Q

What is the placebo effect?

A

When those who get the placebo show improvements, or show the effects of the treatment.

This often happens to up to 20% of participants!

148
Q

Why blind those doing the analysis?

A

Researchers like to see results; they want to see an effect. If they know which treatment is the actual medicine, then they might be “looking” for it..

We want the data to say it works, not the person.

149
Q

Example of how not blocking would backfire

A

Deodorant.. If you just randomly assign it, maybe the active people get deodorant X and the non-active people get Y. The results would be confounded by lifestyle. Was it deodorant Y or the fact that the people didn’t sweat all day? You want each block (active, non-active) to have people using both deodorants.

Leather preserver.. If you randomly choose from all chairs in an airport for treatment and Brand A randomly gets a lot of chairs near the sun, Brand B randomly gets a lot of chairs near the main entrance, and Brand C randomly gets the chairs that don’t have a lot of sun or a lot of use, you may think that Brand C works the best, when in fact the results were confounded by sunlight and usage..

150
Q

What is the difference between confounding and lurking?

A

Confounding is with experiments: it is the thing that may be causing the different effects instead of the treatment (sunlight instead of leather preserver). Lurking is with regression: it is when something is causing things to go up and down together, like how the weather impacts ice cream sales and beach injuries (both rise and fall when more people are at the beach).