lecture 9 Flashcards
ways statistical tools can fool us
framing (equivalent ways of advertising the same product (80% lean vs 20% fat))
Misleading questions (non-equivalent ways of asking the same type of question, e.g. “Do you believe that you should be taxed so other citizens don't have to work?” vs. “Do you believe that the government should help unemployed people?”)
Nonsensical data visualizations (percentages do not sum to 100%, axes are truncated (cherry-picking), scales lack labels saying what they represent)
What is statistics good for despite its misuses
Description (e.g. proportion of surveyed people who are happy)
Estimation (e.g. winner of an election based on polls)
Generalization (e.g. how many people are likely to find a job based on the number of people who have already found one)
Hypothesis testing (e.g. null hypothesis significance testing, Bayesian stats,…)
Hypothesis testing. The NHST way
1) Formulate a null hypothesis (e.g. this treatment is not effective, or there is no correlation between the variables, or this person has no special ability)
2) develop expectations in the form of probability distributions for possible outcomes assuming the truth of the hypothesis
E.g. if this treatment is not effective, then when I run an experiment, such and such differences between control and treatment groups will be observed
3) Collect data/observations and evaluate to what degree the data violate expectations based on the hypotheses
4) draw inference from this comparison
E.g. the data are inconsistent with the hypothesis that the treatment makes no difference. So, there is reason to reject the null hypothesis
The (deductive) logic of NHST
Statistical version of the hypothetico-deductive method
The null hypothesis (p) leads us to expect a certain range of possible outcomes/data (if p, then q)
When the observed data are far outside that range (not-q), we can reason that they would be very unlikely if the null were true. So, the data give us grounds for rejecting the null (not-p)
Example in class null hypothesis testing
Jun and the beer challenge
The null hypothesis leads us to expect that Jun is right 4/8 times
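The "4/8" expectation can be made concrete with a short sketch. It assumes (this is not stated on the card) that the challenge consists of 8 independent tastings and that pure guessing gives a 50% chance of being right on each one; `null_pmf` is a hypothetical helper name.

```python
from math import comb

N_TRIALS = 8    # from the card: Jun is tested 8 times
P_GUESS = 0.5   # assumption: under the null, each answer is a 50/50 guess

def null_pmf(k, n=N_TRIALS, p=P_GUESS):
    """Probability of exactly k correct answers out of n if Jun is only guessing."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Full distribution of outcomes the null hypothesis leads us to expect.
distribution = [null_pmf(k) for k in range(N_TRIALS + 1)]
expected_correct = sum(k * prob for k, prob in enumerate(distribution))

for k, prob in enumerate(distribution):
    print(f"{k} correct: {prob:.4f}")
print("expected number correct:", expected_correct)
```

Under these assumptions 4 correct is the single most likely outcome, but results of 3 or 5 correct are also quite common, which is why the null specifies a range of expected outcomes rather than one value.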
Significance level
Decision about how improbable, given the truth of the null, an observed result must be to warrant rejecting the null
Sets the maximum probability we are willing to accept of rejecting the null when the observed data arose by luck alone (e.g. due to random sampling error) and Jun had no real capacity (i.e. the probability of a Type 1 error)
How surprising/improbable should an outcome be to be considered significant
No unique, uncontroversial answer
largely dependent on convention and background knowledge
E.g. In the social sciences (e.g. signing at the beginning of a compliance form decreases dishonesty) the significance level is typically 0.05
In particle physics (e.g. the Higgs boson produced the particles detected) the significance level can be about 3*10^-7, or roughly 1 in 3.5 million
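The particle-physics figure can be checked with a short calculation. This sketch assumes the number on the card corresponds to the commonly cited one-sided "five sigma" threshold of a standard normal distribution; the card itself does not say so, and `one_sided_tail` is a hypothetical helper name.

```python
from math import erfc, sqrt

def one_sided_tail(z):
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * erfc(z / sqrt(2))

alpha_social = 0.05                   # typical social-science convention (from the card)
alpha_physics = one_sided_tail(5.0)   # assumption: the "5 sigma" convention

print(f"5-sigma tail probability: {alpha_physics:.2e}")
print(f"about 1 in {1 / alpha_physics:,.0f}")
print(f"ratio to the 0.05 convention: {alpha_social / alpha_physics:,.0f}x stricter")
```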
p value
probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is true
Basic idea: an index of how incompatible observed data are with a statistical hypothesis
The smaller the p-value, the more surprising the data are, assuming that the null hypothesis is true
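As a rough illustration of this definition in the beer challenge, the sketch below computes a one-sided p-value. It assumes 8 independent tastings, a 50% chance of a correct guess under the null, and that "at least as extreme" means "at least as many correct answers"; the function name and the example counts are hypothetical.

```python
from math import comb

def p_value_at_least(k_observed, n=8, p=0.5):
    """One-sided p-value: P(X >= k_observed) when X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_observed, n + 1))

# Hypothetical observed results: the more correct answers, the smaller the p-value.
for k in (4, 6, 7, 8):
    print(f"{k}/8 correct -> p = {p_value_at_least(k):.4f}")
```

The p-value shrinks as the observed count moves further from what guessing would produce, which matches the idea that smaller p-values mean more surprising data under the null.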
Whether or not to reject the null is determined by
comparing the p value and the significance level
If the p-value is less than or equal to the significance level, then reject the null hypothesis
If the p-value is greater than the significance level, then you can't rule out the null hypothesis
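The decision rule can be written as a tiny function. This is only a sketch; the name `decide` and the example p-values are hypothetical, and 0.05 is used as the default purely because it is the social-science convention mentioned earlier.

```python
def decide(p_value, alpha=0.05):
    """Apply the NHST decision rule: reject the null when p_value <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be a probability between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return "reject the null" if p_value <= alpha else "fail to reject the null"

print(decide(0.035))         # below the 0.05 threshold
print(decide(0.035, 0.01))   # same data, stricter threshold
print(decide(0.05))          # boundary case: "less than or equal" still rejects
```

Note that the second outcome is worded "fail to reject" rather than "accept": the rule never licenses the conclusion that the null is true.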
Two challenges for NHST
1) Failure to reject the null does not give you reason to believe that the null is true
Recall the logical form of the HD method
If H, then O
Not O
Therefore not H (VALID: modus tollens)
If H, then O
O
Therefore H (INVALID: affirming the consequent)
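The validity of the two argument forms above can be checked mechanically with a truth table. This is a sketch in classical propositional logic; the helper names `implies` and `is_valid` are hypothetical.

```python
from itertools import product

def implies(a, b):
    """Material conditional: 'if a then b'."""
    return (not a) or b

def is_valid(premises, conclusion):
    """Valid = no truth assignment makes every premise true and the conclusion false."""
    return all(conclusion(h, o)
               for h, o in product([True, False], repeat=2)
               if all(premise(h, o) for premise in premises))

# If H then O; not O; therefore not H.
modus_tollens = is_valid(
    premises=[lambda h, o: implies(h, o), lambda h, o: not o],
    conclusion=lambda h, o: not h,
)

# If H then O; O; therefore H.
affirming_consequent = is_valid(
    premises=[lambda h, o: implies(h, o), lambda h, o: o],
    conclusion=lambda h, o: h,
)

print("modus tollens valid:", modus_tollens)
print("affirming the consequent valid:", affirming_consequent)
```

The second form fails on the assignment where H is false and O is true: the expected observation occurs even though the hypothesis does not hold.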
2) choice of significance level determines the degree to which we should be willing to accept different kinds of errors, and it is non-trivial to assess the gravity of making one type of error compared to the other
Type 1 error
false positive
Erroneously rejecting the null
Type 2 error
False negative
Erroneously failing to reject the null hypothesis
Lower significance level
Indicates that you require stronger evidence before you will reject the null
Reduces the chance of Type 1 (false positive) errors but increases the chance of Type 2 (false negative) errors
E.g. reduced chance that an ineffective product is mistakenly accepted as working properly, but increased chance that an effective product is mistakenly rejected as not working
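The trade-off can be illustrated numerically with the beer challenge. This sketch assumes 8 independent tastings, a 50% success rate under the null, and a purely hypothetical "real ability" of 80% correct; none of these numbers except the 8 trials come from the lecture, and `tail` and `error_rates` are made-up helper names.

```python
from math import comb

def tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

def error_rates(alpha, n=8, p_null=0.5, p_real=0.8):
    """Return (critical count, Type 1 rate, Type 2 rate) for a one-sided test."""
    # Smallest number of correct answers whose p-value is <= alpha;
    # n + 1 means no possible outcome is extreme enough to reject.
    critical = next((k for k in range(n + 1) if tail(k, n, p_null) <= alpha), n + 1)
    type1 = tail(critical, n, p_null)       # rejecting although Jun is guessing
    type2 = 1 - tail(critical, n, p_real)   # not rejecting although the ability is real
    return critical, type1, type2

for alpha in (0.20, 0.05, 0.01):
    critical, type1, type2 = error_rates(alpha)
    print(f"alpha={alpha:.2f}: reject at >= {critical} correct, "
          f"Type 1 = {type1:.4f}, Type 2 = {type2:.4f}")
```

Under these assumptions, each step down in the significance level raises the number of correct answers Jun needs, which lowers the Type 1 rate and raises the Type 2 rate.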