Exam 4 Lecture 2 Flashcards
Law: Innocent until proven guilty (or not guilty)
‘Burden of proof’ in law is on sufficient evidence
You have been charged with robbing a store
- There is some evidence that you did it (If there was no evidence, you wouldn’t be charged)
- There is some evidence that you did not do it (If the evidence against you were strong, you would plead guilty.)
Only you know the truth
What does the jury need to decide?
- Is there ENOUGH evidence that you did it? (proof of guilt)
- If yes, you are found guilty (the evidence in favor of guilt was sufficiently strong)
- If no, you are found not guilty (the evidence of guilt was NOT sufficiently strong, aka it was null)
You may BE innocent, but you can never be FOUND innocent
Null definition
Having no value
It’s a battle of evidence. Not guilty is the ______ hypothesis. Guilty is the ________ hypothesis.
Not guilty is the null hypothesis (H0)
Guilty is the alternative hypothesis (H1)
H0 is reigning champion. H1 is the challenger.
Not enough evidence? You cannot reject the null (H0 wins = not guilty)
Lots of evidence? You reject the null (H0 loses = guilty)
Why ‘not guilty’ vs ‘innocent’? The question is GUILTY: Yes or No
Not rejecting the null hypothesis = insufficient evidence that you robbed the store
Rejecting the null hypothesis = sufficient evidence that you robbed the store
Yeah, you see it, but do you believe it?
“Burden of proof” in stats is on finding a real difference
You do a fancy experiment and you find some fancy results. How do you know that the results are REAL (that is, that if you did the experiment again, you’d get the same results)?
It’s not real is the null hypothesis (H0).
It’s real is the alternative hypothesis (H1).
You assume that something is null until you have evidence it is real.
Sampling is everything
Statistics are estimates that are derived from a sample that is thought to represent a population.
You are asking a question about a population: What is the most popular car color in the world?
So, you get a sample.
Can we calculate the likelihood that the sample is representative?
- Sample size
- Sample characteristics
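One way to see why sample size matters is to simulate it. The sketch below is illustrative only: the color shares in POPULATION_SHARES are invented for the example (they are not real survey data), and draw_sample and average_error are hypothetical helper names. It draws many random samples of two sizes and compares how far each sample's share of white cars lands from the assumed population share.

```python
import random

# Hypothetical population shares, invented for illustration (not real data).
POPULATION_SHARES = {"white": 0.35, "black": 0.25, "gray": 0.20, "other": 0.20}


def draw_sample(n, rng):
    """Draw n car colors at random according to the assumed population shares."""
    colors = list(POPULATION_SHARES)
    weights = list(POPULATION_SHARES.values())
    return rng.choices(colors, weights=weights, k=n)


def average_error(n, rng, repeats=200):
    """Average gap between the sample's white share and the population's,
    taken over many independent samples of size n."""
    true_share = POPULATION_SHARES["white"]
    total_gap = 0.0
    for _ in range(repeats):
        sample = draw_sample(n, rng)
        total_gap += abs(sample.count("white") / n - true_share)
    return total_gap / repeats


rng = random.Random(0)  # fixed seed so the run is repeatable
small_sample_error = average_error(20, rng)
large_sample_error = average_error(2000, rng)
print(f"n=20:   average error {small_sample_error:.3f}")
print(f"n=2000: average error {large_sample_error:.3f}")
```

The larger samples should land much closer to the population share on average. Note that this only covers sample size; a sample drawn from the wrong group (sample characteristics) can stay misleading no matter how large it is.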
OK, so to be ‘significantly different’ or ‘significantly changed’ or ‘significantly related’…
You must be reasonably sure that the results are real.
Statistical Significance
Statistical significance quantifies how likely it is that a result like yours would show up by chance alone, as opposed to reflecting something ‘real’.
- Sometimes a result is just because of sampling error -> wrong sample/wrong sample size, population characteristics
- Sometimes a result is because your experiment worked, yay! -> You tried to change something, and it changed! You tried to predict something, and it was predictive!
- Your big picture ‘result’; usually denoted as a p-value.
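A p-value can be computed in several ways. The sketch below uses a permutation test, which is one option and not necessarily the method used in this course. It asks: if group labels meant nothing (the null), how often would shuffling the labels produce a difference at least as big as the one observed? The fitness scores and the function name permutation_p_value are made up for illustration.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_shuffles=2000, seed=0):
    """Estimate a two-sided p-value for the difference in group means
    by repeatedly shuffling the group labels."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the labels are arbitrary (the null)
        shuffled_diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if shuffled_diff >= observed:
            at_least_as_extreme += 1
    # Add 1 to both counts so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)


# Invented fitness scores, for illustration only.
exercisers = [72, 75, 78, 80, 83, 85, 88, 90]
sedentary = [60, 62, 65, 66, 70, 71, 73, 74]

p_value = permutation_p_value(exercisers, sedentary)
print(f"p-value: {p_value:.4f}")
```

A small p-value means shuffled labels rarely produce a gap this large, so chance alone is a poor explanation for the observed difference.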
What you want to know…
When you perform an experiment, you are asking a specific question:
Does exercise change healthiness of diet?
- The answer you are looking for is YES, exercise changes healthy eating
- The null is NO, exercise does NOT change healthy eating
Are avid exercisers fitter than sedentary people?
- The answer you are looking for is YES, avid exercisers are fitter than non-exercisers.
- The null is NO, avid exercisers are not fitter than non-exercisers.
What you want to know vs what you can actually find out:
Does exercise change healthiness of diet?
The answer you are looking for is YES, exercise changes healthy eating. But the best you can say from a single experiment is: EVIDENCE THAT exercise changes healthy eating was PRESENT in THIS SAMPLE.
The null is NO, exercise does NOT change healthy eating. But the best you can say is:
EVIDENCE THAT exercise changes healthy eating was ABSENT in THIS SAMPLE.
If you repeated the experiment in a different sample, would you find the same thing?
Statistics is an applied math
We must quantify:
- strength of evidence
- level of confidence in results
Statistics uses probability
Statistics are estimates that are derived from a sample that is thought to represent a population.
Since you don’t know everything, you can never be 100% sure!
So, what level of ‘sure’ are you?
And, how sure do you need to be to be confident?
Statistics uses probability to quantify our confidence.
What does the “p” in p-value stand for?
Probability
Probability, p-value
A value that reflects how likely something is to occur.
A calculation that quantifies what you mean by ‘probably’
A mathematical tool; it is a number (0 to 1)
0 = never happens
1 = always happens
.9 = happens 90% of the time
It reflects our level of uncertainty.
- It’ll probably rain today.
- I doubt you can bench press 100 lbs.
- Chances are that you can run faster than me.
The “p” in p-value= probability
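To make ".9 = happens 90% of the time" concrete, here is a minimal simulation sketch. The function name observed_frequency is hypothetical; it repeats a chance event many times and reports how often it actually happened.

```python
import random


def observed_frequency(p, trials, rng):
    """Run `trials` chance events that each happen with probability p,
    and return the fraction of trials in which the event happened."""
    hits = sum(1 for _ in range(trials) if rng.random() < p)
    return hits / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
never = observed_frequency(0.0, 10_000, rng)   # p = 0: never happens
always = observed_frequency(1.0, 10_000, rng)  # p = 1: always happens
often = observed_frequency(0.9, 10_000, rng)   # p = .9: about 90% of the time
print(never, always, often)
```

With many trials the observed fraction settles close to p, which is what a probability of .9 is describing.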
To say something is “significant”
- The fitness groups were significantly different from one another
- The sample was significantly changed by the medication
- The class grades were significantly related to study time
- Nicotine vapes significantly increase your odds of getting lung cancer
The p-value reflects our level of uncertainty. But what, exactly, are we certain/uncertain of?
To call a result significant, you are saying that an observed difference in a sample is not just due to chance.
Probably, it cuts both ways
- I found significant differences, and it’s really there (Exercise improves health!)
- I didn’t find significant difference, and it’s really not there (A Burger King diet does not improve health!)
But what if…
- I found significant differences, and it’s NOT really there (A glass of red wine a day improves health?)
- I didn’t find significant differences, and it really IS there (Taking the stairs, not the elevator, improves health?)
Probably, it cuts both ways.
A probability of .9 means 90% confidence. In what?
- I found significant differences, and it’s really there -> True Positive
- I didn’t find significant differences, and it’s really not there -> True Negative
And what’s the other 10%?
- I found significant differences, and it’s NOT really there -> Type 1 error = False Positive
- I didn’t find significant differences, and it really IS there -> Type 2 error = False Negative
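Both kinds of error can be counted in a simulation, because there we decide whether the effect is "really there". The sketch below rests on several assumptions not stated in these cards: a cutoff of 0.05 (ALPHA), normally distributed scores with a known standard deviation of 1, a simple z-test, and a true difference of 0.5 when an effect exists. All names are hypothetical.

```python
import random
from statistics import NormalDist

ALPHA = 0.05  # assumed significance cutoff (a common convention)


def z_test_p_value(a, b, sd=1.0):
    """Two-sided p-value for a difference in means, assuming a known sd."""
    standard_error = sd * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (sum(a) / len(a) - sum(b) / len(b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def significant_rate(true_difference, n=20, experiments=2000, seed=0):
    """Fraction of simulated experiments that come out 'significant'."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(experiments):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(true_difference, 1.0) for _ in range(n)]
        if z_test_p_value(treated, control) < ALPHA:
            significant += 1
    return significant / experiments


# Nothing is really there, but we 'find' something: Type 1 error.
false_positive_rate = significant_rate(true_difference=0.0)
# Something IS really there, but we miss it: Type 2 error.
false_negative_rate = 1 - significant_rate(true_difference=0.5)
print(f"Type 1 (false positive) rate: {false_positive_rate:.3f}")
print(f"Type 2 (false negative) rate: {false_negative_rate:.3f}")
```

Under these assumptions the false-positive rate should sit near the chosen cutoff, while the false-negative rate depends on the size of the true effect and the sample size.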