Module 1 Flashcards
What is a population? What does the population contain?
- It’s everything you care about
- Could be a group of existing individuals/objects
- Could be a hypothetical and potentially infinite group of individuals/objects
The population contains the TRUTH
What is a sample?
- It’s a subset of everything you care about
- We make inferences about the population from the sample because it is often impossible to measure everything about a population
What two things should samples be?
- Random
- Representative (i.e., accurately portray the distribution of the population)
We use samples to __________(1) about populations. Assuming you know everything about the population, then ______(2) will help us understand something about the sample.
(1) make inferences
(2) probability
What are the three steps of the scientific method?
- Define prior beliefs: These are competing hypotheses (H0, H1, H2, etc.) about the state of the universe (i.e., the “population”). Before collecting any data, what do I believe about the world?
- Experiment: Design an experiment to generate objective data (i.e., take a SAMPLE from the population) to learn about the state of the universe.
- Use evidence to update prior beliefs: The data will support some hypotheses more than others. Make claims about the population based on the sample. Depending on the strength of the prior belief, you may need more or less compelling data.
What is the likelihood function? What is the notation?
The likelihood function gives the probability of the observed outcome for a particular value of the unknown truth; it is the measure of the quantitative evidence about that truth.
Ex. What is the probability of getting 3 heads if the coin has 0 heads? The answer is 0.
NOTATION:
P(HHH | H1 is true) = ____
What is the probability of 3 heads if H1 is true?
The likelihood function helps us evaluate how strongly the data support each of our hypotheses now that we have collected some data
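As a minimal sketch of the coin example above, assume the competing hypotheses are a two-sided coin with 0, 1, or 2 heads and that tosses are independent. The function name is made up for illustration.

```python
from fractions import Fraction


def likelihood_all_heads(heads_on_coin: int, n_tosses: int = 3) -> Fraction:
    """P(every toss lands heads | the coin has `heads_on_coin` heads on its 2 sides)."""
    # Assumption: each side is equally likely to land face up.
    p_heads = Fraction(heads_on_coin, 2)
    # Assumption: tosses are independent, so the per-toss probabilities multiply.
    return p_heads ** n_tosses


# Likelihood of the observed data HHH under each hypothesized truth.
likelihoods = {heads: likelihood_all_heads(heads) for heads in (0, 1, 2)}

for heads, value in likelihoods.items():
    print(f"P(HHH | coin has {heads} heads) = {float(value):.3f}")
```

Each printed value is one point on the likelihood graph: the hypothesized truth goes on the x axis and the likelihood of the data on the y axis.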
Evidence is always ______(1): It supports one hypothesis ______(1) to another. As such, ______(1) likelihood is more important. Evidence is used to update _______________(2).
(1) Relative
(2) Prior beliefs
When drawing a “likelihood of experimental result” graph, what are the x- and y-axis labels? Do you connect the data points with a line?
x axis: Hypothesized truth
- Label the axis as H0, H1, H2
Ex. Hypothesized truth: # of heads on coin
y axis: Likelihood of data
- Choose ONE outcome and use appropriate notation
- Use decimals from 0.00 to 1.00
Ex. Likelihood of data: P(HHH)
You do not connect the data points with a line
What information can you take away from a “likelihood of experimental result” graph?
Based on the hypothesis that has the highest likelihood (i.e., the hypothesis that maximizes the probability of our data), we can determine the unknown truth of the world that would make the data we observed most likely to occur. It doesn’t mean that this is for sure what happened, but the data would be most likely to occur if this hypothesis is true.
What is the formula for Bayes theorem?
Posterior odds = (prior odds) x (likelihood ratio)
P(H2 | Data) / P(H1 | Data) = [P(H2) / P(H1)] x [P(Data | H2) / P(Data | H1)]
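The formula above can be sketched in a few lines. The prior probabilities used here (9/10 for H1, a fair coin, and 1/10 for H2, a two-headed coin) are hypothetical numbers chosen only for illustration, and the function name is an assumption.

```python
from fractions import Fraction


def posterior_odds(prior_h2, prior_h1, lik_h2, lik_h1) -> Fraction:
    """Posterior odds of H2 versus H1 = (prior odds) x (likelihood ratio)."""
    prior_odds = Fraction(prior_h2) / Fraction(prior_h1)
    likelihood_ratio = Fraction(lik_h2) / Fraction(lik_h1)
    return prior_odds * likelihood_ratio


# Hypothetical priors: H1 = fair coin, H2 = two-headed coin.
prior_h1 = Fraction(9, 10)
prior_h2 = Fraction(1, 10)

# Likelihood of observing HHH under each hypothesis.
lik_h1 = Fraction(1, 8)  # P(HHH | fair coin) = (1/2)^3
lik_h2 = Fraction(1)     # P(HHH | two-headed coin) = 1

odds = posterior_odds(prior_h2, prior_h1, lik_h2, lik_h1)
print("Posterior odds of H2 vs H1:", odds)
```

With these assumed priors, the data favor H2 by a likelihood ratio of 8, but the posterior odds stay below 1 because the prior belief in H1 was strong.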
What are odds? What do odds of greater than or less than one indicate about the events?
- Odds are a way to express the likelihood that an event occurs
- If odds > 1, then the top event (H2) is more likely
- If odds < 1, then the bottom event (H1) is more likely
How can you convert from odds to probability?
Prob = (odds)/(1+odds)
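A small sketch of this conversion, together with its inverse, is given here. Both function names are made up for illustration.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds to a probability using prob = odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1 + odds)


def probability_to_odds(prob: float) -> float:
    """Inverse conversion: odds = prob / (1 - prob). Undefined when prob is 1."""
    if not 0 <= prob < 1:
        raise ValueError("prob must be in [0, 1)")
    return prob / (1 - prob)


print(odds_to_probability(1.0))   # odds of 1 mean both events are equally likely
print(odds_to_probability(3.0))
print(probability_to_odds(0.75))
```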
What are the two types of probability?
- FREQUENTIST: long term frequency
- Consider events over a long period of time and based on the frequency of their occurrence, we can infer something about the probability of each event.
- Repeat same experiment over and over and look at the proportion of times the event happens
- Examples: Coin tosses (probability a coin lands “heads”), disease prevalence (probability a randomly selected person has the disease)
- BAYESIAN (subjectivist): measure of personal belief
- Does not mean that it’s not informed by data
- Start by defining prior belief (established without any information to back it up) and then use the observed data to update our prior belief to form our posterior belief.
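The frequentist idea of repeating the same experiment over and over can be illustrated with a simulation. This is only a sketch: it assumes a fair coin, and the seed and function name are arbitrary choices for illustration.

```python
import random


def proportion_of_heads(n_tosses: int, seed: int = 0) -> float:
    """Simulate n_tosses of a fair coin and return the proportion that land heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses


# As the number of tosses grows, the proportion should settle near 0.5.
for n in (10, 100, 10_000):
    print(n, proportion_of_heads(n))
```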
How do you calculate P(A and B)?
P(A and B) = P(A) x P(B) (assuming the events are independent)
How do you calculate P(A or B)?
P(A) + P(B) (assuming the events are mutually exclusive)
What does it mean if two events are mutually exclusive?
It means that the two events cannot occur at the same time.
What does it mean if two events are independent?
Events are independent if knowing whether one occurred tells you nothing about whether the other one occurred
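The multiplication and addition rules above can be checked by counting outcomes. This sketch assumes one fair six-sided die; the events were picked only for illustration.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # faces of one fair six-sided die (assumed example)


def prob(event) -> Fraction:
    """Probability of an event, counting equally likely faces."""
    favorable = sum(1 for face in OUTCOMES if event(face))
    return Fraction(favorable, len(OUTCOMES))


def is_even(face): return face % 2 == 0
def at_most_two(face): return face <= 2
def is_one(face): return face == 1
def is_two(face): return face == 2


# Independent events: "even" and "at most two" satisfy P(A and B) = P(A) x P(B).
p_even_and_low = prob(lambda f: is_even(f) and at_most_two(f))
product_rule_holds = p_even_and_low == prob(is_even) * prob(at_most_two)

# Mutually exclusive events: a single roll cannot be both 1 and 2,
# so P(A or B) = P(A) + P(B).
p_one_and_two = prob(lambda f: is_one(f) and is_two(f))
p_one_or_two = prob(lambda f: is_one(f) or is_two(f))
addition_rule_holds = p_one_or_two == prob(is_one) + prob(is_two)

print(product_rule_holds, addition_rule_holds)
```

Note that the two mutually exclusive events are not independent: knowing the roll was a 1 tells you it was not a 2.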
Probability assigns ________(1) to any set of possible events(outcomes) of an experiment.
(1) a number in [0,1]
What is P(11) for the sum of the faces in a toss of two fair, standard dice?
Possible outcomes: (1, 2, 3, 4, 5, 6) x (1, 2, 3, 4, 5, 6) = 36 possible outcomes
Based on the physical model of the mechanism, there is a 1/36 chance of getting any particular combination
P(11) = P(5,6) or P(6,5) = (1/36) + (1/36) = 1/18
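The same answer can be checked by listing all 36 outcomes, as in this sketch. It assumes each ordered pair of faces is equally likely.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# All ordered (first die, second die) pairs: 6 x 6 = 36 outcomes.
outcomes = list(product(faces, repeat=2))

# Keep only the pairs whose faces sum to 11.
favorable = [pair for pair in outcomes if sum(pair) == 11]

p_11 = Fraction(len(favorable), len(outcomes))
print(favorable, p_11)
```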
We can use physical models to determine probability. How could this apply to a coin toss?
When we toss a (fair) coin, there are two equally likely possible outcomes: heads (H) or tails (T)
By the physical model of the mechanism (a fair coin):
- Each of the two outcomes is equally likely, so P(H) = P(T)
- One of those outcomes must occur, so P(H) + P(T) = 1
- Therefore, P(H) = P(T) = 0.5
What is a random variable?
A random variable is a numeric function of the outcomes of an experiment
- Ex. Flip a coin 5 times. Let X=number of heads
What is a discrete probability function?
A discrete probability function describes the probabilities associated with each possible value of the discrete random variable.
You make a chart in which the first row lists all the possible values of the discrete random variable. The second row contains the probability that the random variable equals each of those values.
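As a sketch, the chart for X = number of heads in 5 flips can be built by listing every possible sequence. This assumes a fair coin, so all 32 sequences are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every possible sequence of 5 flips: 2^5 = 32 equally likely outcomes.
flips = list(product("HT", repeat=5))

# The random variable X maps each outcome to a number: its count of heads.
counts = Counter(sequence.count("H") for sequence in flips)

# Discrete probability function: each possible value of X with its probability.
pmf = {x: Fraction(counts[x], len(flips)) for x in sorted(counts)}

print("x:     ", list(pmf.keys()))
print("P(X=x):", [str(p) for p in pmf.values()])
```

The two printed rows match the two rows of the chart described above, and the probabilities in the second row add up to 1.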
A random variable is discrete if _______.
it can only assume a countable number of possible values
What are joint probability distributions?
- Joint probability distributions describe how the outcomes of two experiments behave together (considering two experiments at the same time)
- We summarize joint behaviour in a two-way contingency table
What is a joint probability?
The probability of 2 outcomes from 2 different experiments occurring at the same time
- Always divide by the total number of people in the whole experiment
What is a marginal probability?
Consider only one of the outcomes; the other is not known. You look at only one outcome’s total (a row or column total) in the contingency table.
- Always divide by the total number of people in the whole experiment
What is a conditional probability? What is the formula for calculating conditional probability?
We want to know the probability of one outcome in one experiment given the outcome of the other experiment.
P(A | B) = P(A and B) / P(B)
Note: to calculate P(A and B) here, don’t multiply the probabilities; read the value directly from the table
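Joint, marginal, and conditional probabilities can all be read from one two-way table, as in this sketch. The counts, the category labels, and the function names are hypothetical, invented only for illustration.

```python
from fractions import Fraction

# Hypothetical two-way contingency table of counts (not real data).
counts = {
    ("exposed", "disease"): 30,
    ("exposed", "no disease"): 70,
    ("unexposed", "disease"): 10,
    ("unexposed", "no disease"): 90,
}
total = sum(counts.values())  # everyone in the whole experiment


def joint(exposure: str, outcome: str) -> Fraction:
    """P(exposure and outcome): one cell divided by the grand total."""
    return Fraction(counts[(exposure, outcome)], total)


def marginal_outcome(outcome: str) -> Fraction:
    """P(outcome): a column total divided by the grand total."""
    return sum(joint(e, outcome) for e in ("exposed", "unexposed"))


def marginal_exposure(exposure: str) -> Fraction:
    """P(exposure): a row total divided by the grand total."""
    return sum(joint(exposure, o) for o in ("disease", "no disease"))


def conditional(outcome: str, exposure: str) -> Fraction:
    """P(outcome | exposure) = P(outcome and exposure) / P(exposure)."""
    return joint(exposure, outcome) / marginal_exposure(exposure)


print(joint("exposed", "disease"))
print(marginal_outcome("disease"))
print(conditional("disease", "exposed"))
```

The joint probability in `conditional` comes straight from a table cell; it is not computed by multiplying two marginal probabilities.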
What is relative risk/risk ratio? How would you calculate the relative risk of A compared to B?
It allows us to compare conditional probabilities.
To calculate the relative risk of A compared to B: relative risk = (risk of A) / (risk of B)
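A minimal sketch of the calculation follows. The group sizes and case counts are hypothetical, and the function names are made up for illustration.

```python
from fractions import Fraction


def risk(cases: int, group_size: int) -> Fraction:
    """Risk in one group: a conditional probability, P(outcome | group)."""
    return Fraction(cases, group_size)


def relative_risk(risk_a: Fraction, risk_b: Fraction) -> Fraction:
    """Relative risk of A compared to B = (risk of A) / (risk of B)."""
    if risk_b == 0:
        raise ZeroDivisionError("relative risk is undefined when the risk of B is 0")
    return risk_a / risk_b


# Hypothetical numbers: 30 of 100 people in group A and 10 of 100 in group B
# have the outcome.
risk_a = risk(30, 100)
risk_b = risk(10, 100)
rr = relative_risk(risk_a, risk_b)
print("Relative risk of A compared to B:", rr)
```

A relative risk above 1 means the outcome is more likely in group A, and a value below 1 means it is more likely in group B.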