Exam 2: Inferential Statistics Flashcards
p value
Probability that your observed result (or a more extreme one) came from the distribution of your null hypothesis
p value = the probability that your observed result was pulled from this unlikely (null) scenario
p = 0.06 or p = .10 are sometimes described as "marginal" effects
how extreme is this value? how likely was it to occur by chance?
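A minimal sketch of the definition (not from the slides; the z value of 1.7 is an invented example): the p value is the probability, under the null distribution, of a result at least this extreme.

```python
from scipy import stats

z = 1.7                                  # invented observed test statistic
p_one_tail = stats.norm.sf(z)            # P(Z >= 1.7) under the null, ~.045
p_two_tail = 2 * stats.norm.sf(abs(z))   # "this extreme" in EITHER tail, ~.089
print(p_one_tail, p_two_tail)
```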
One-Tail or Two-Tail?
Report whether the test is one-tailed or two-tailed in the manuscript.
In a two-tailed test, the criterion (alpha) is split across the two tails.
• Only use one-tail hypothesis testing if obtaining a result in the other direction is IMPOSSIBLE or UNINTERPRETABLE
• Some might say one-tail is OK if you have a directional hypothesis, but why might that be a bad idea?
if you happen to find a result in the opposite direction you can’t interpret it
- Examples where one-tail is required or appropriate:
- Implicitly, Chi-square and F-tests (only positive values; these results tell you that values differ, but not in which direction)
- If you consider the consequences of missing a potentially large effect in the untested direction and conclude that they are negligible and in no way irresponsible or unethical
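A hedged sketch of the one-tail trap described above (the z value is invented): if the effect lands in the untested direction, the pre-committed one-tailed p is near 1 and uninterpretable, even though a two-tailed test would have flagged the result.

```python
from scipy import stats

z = -2.1                                 # effect in the UNtested direction (invented)
p_one_tail_greater = stats.norm.sf(z)    # pre-committed to "greater": p ~ .98
p_two_tail = 2 * stats.norm.sf(abs(z))   # two-tailed: p ~ .036
print(p_one_tail_greater, p_two_tail)
```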
Statistical significance
“Significance” is with respect to your pre-determined alpha. Significance ≠ Importance. DON’T say results are “more/less significant”, just as you wouldn’t say “Bill passed the exam more than Stacy.” For this sort of idea, we often use “effect size”. Importance is more of a theoretical idea. You can say the p value is different or the effect size is different.
Type 1 error
Your research tells you to reject the null (i.e., conclude the alternative is true), but in reality the null is true.
Boy who cried wolf: FIRST error was that he cried that there was a wolf but in reality there was no wolf (people believed him)
can occur because of sampling issues, when participants in the study did not represent the population well
Type 2 error
Your research tells you to accept the null and reject alternative but in reality the alternative hypothesis was true.
Boy who cried wolf: the SECOND error was that people believed he was lying (accepted the null) but in reality there was a wolf.
Might happen if the study lacks sufficient power; to increase power, recruit an appropriate number of participants
Power
When your research tells you that the alternative hypothesis is true and in reality the alternative hypothesis IS TRUE; power = 1 - β, the probability of correctly detecting a real effect.
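A sketch of the power/sample-size link mentioned above ("recruit an appropriate number of participants"), using statsmodels; the effect size, alpha, and power values here are assumptions for illustration.

```python
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.5,           # assumed medium Cohen's d
    alpha=0.05,                # Type 1 error criterion
    power=0.80,                # 1 - beta: chance of detecting a real effect
    alternative='two-sided',
)
print(round(n_per_group))      # ~64 participants per group
```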
Balance between error types
Where do you set your criterion?
- Medical tests: A low criterion ensures you don’t miss abnormalities, but if too low, people may have unnecessary interventions
- Law: If we wait until we’re sure beyond any possible doubt, then we won’t wrongfully convict anyone; but we might let a criminal go free.
This is why we’re obsessed with p < .05
• It makes us feel like there’s one true criterion
(see: Bayesian statistics)
Multiple Comparisons
- If each test has a 5% (independent) chance of being a false positive…
- Actual error rate across tests (technically family-wise error rate) = 1 - (1 - α) ^ (#comparisons)
• In the cartoon (the jelly bean comic), each test has a 5% chance of being a false positive; run twenty tests...
Error rate = 1 - .95^20 = 1 - .36 = .64
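The family-wise error rate formula above, worked out in a short sketch:

```python
alpha = 0.05
for n_tests in (1, 5, 20):
    fwer = 1 - (1 - alpha) ** n_tests      # family-wise error rate
    print(n_tests, round(fwer, 2))         # 1 -> 0.05, 5 -> 0.23, 20 -> 0.64
```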
Bonferroni Correction
• Divide the alpha criterion by the number of tests you perform. In the cartoon’s jelly bean color tests (20 tests): .05/20, so you need p < .0025
- Often overly conservative
- Other approaches differentially balance α and β
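A minimal sketch of applying the Bonferroni criterion to a set of hypothetical p values:

```python
p_values = [0.001, 0.004, 0.03, 0.20]    # invented example p values
alpha = 0.05
criterion = alpha / len(p_values)        # Bonferroni: .05 / 4 = .0125
for p in p_values:
    print(p, "significant" if p < criterion else "not significant")
```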
Measures of Association
- Typically Non-Experimental Designs
- Correlation
- Pearson
- Spearman Rank-Ordered
- Chi-Square
- Regression
Pearson Correlation
• Relationship between two interval/ratio measures
• Ranges from -1 to +1
• r itself is an effect size: in standardized units, how many SDs y increases for each SD increase in x
• 0.25 ~ weak
• 0.50 ~ moderate
• 0.75 ~ strong
• Not everyone agrees on the exact numbers!
+1 or -1 are graphed as perfect straight lines (at 45 degrees when the axes are scaled equally)
+/- 1 is a perfect correlation
closer to +/- 1 = stronger relationship
• r is the strength of a relationship in your sample (APA)
• Does not say whether that relation is meaningful or not
• r does not say whether that relation generalizes to the population; for that you need:
- Statistical significance
- Criterion r for a certain p-value
- Degrees of freedom (df) = # pairs of observations - 2
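A sketch of computing r, its p value, and df with scipy (the data here are made up):

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]     # invented interval/ratio measures
y = [2, 1, 4, 3, 6, 5, 8, 7]
r, p = stats.pearsonr(x, y)
df = len(x) - 2                  # df = # pairs of observations - 2
print(r, p, df)                  # r ~ .90: a strong positive correlation
```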
Coefficient of Determination (R^2)
-R^2 = Coefficient of Determination
• Square of r
• Proportion of variance accounted for
example-
• r = .50
• R^2 = .25
- Weight only accounts for 25% of the variance in height
- Other factors may account for much more
Example Pearson’s Correlation and Coefficient of Determination
A study observed a correlation between children’s working memory and benefit from cochlear implant (CI) use
• Digit span test of working memory
• Word identification performance
• r = 0.41
• Indicates that 17% of the variability in performance was accounted for by differences in working memory
• Demographics (age at test, age at implantation, duration of use) accounted for an additional 30% of the variability
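(Arithmetic check: R^2 = r^2 = 0.41^2 ≈ 0.168, i.e., about 17% of the variance.)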
Degrees of Freedom
how many points are free to vary
• For correlations, df = n pairs - 2
• To preview group differences:
- For one group, df = number of observations - 1
- For two groups, df = n - 1 for each group -> n1 + n2 - 2
• df and alpha determine the statistical test value (r, t, z, etc.) needed for significance.
- If you have 4 participants, and their M = 10
- What is the 1st person’s score? You don’t know– it’s free to vary.
• Assume the 1st score was 5. What’s the 2nd score?
You don’t know that, either – it’s also free to vary.
• Assume the 2nd score was 7. What is the 3rd score?
You don’t know that, either – it’s also free to vary.
• Assume the 3rd score was 15. What is the 4th score?
This one you do know. If the average is 10, the fourth score must be 13. So it is NOT free to vary.
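The same example as a sketch: once the mean and the first n - 1 scores are fixed, the last score is determined.

```python
n, mean = 4, 10
free_scores = [5, 7, 15]             # the first three scores, free to vary
last = n * mean - sum(free_scores)   # 40 - 27 = 13
print(last)                          # 13 -- NOT free to vary
```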
Restricted ranges
When a variable’s range is restricted (e.g., sampling only high scorers), the observed correlation is attenuated; you’re not as likely to find a strong correlation even if one exists in the full population.
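A hedged simulation sketch (not from the slides) of range restriction attenuating r; the effect size and cutoff are invented:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = 0.7 * x + 0.7 * rng.normal(size=2000)   # true relation built in (r ~ .71)

r_full, _ = stats.pearsonr(x, y)
keep = x > 1                                # restrict to high scorers only
r_restricted, _ = stats.pearsonr(x[keep], y[keep])
print(r_full, r_restricted)                 # restricted r is much weaker (~.4)
```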
Spearman Rank-Order Correlation
• Strength/direction of association between TWO ranked (ordinal) variables
• Non-parametric version of Pearson r
- Data not required to fit a normal distribution; less sensitive to outliers
• Symbol: ρ (rho) or r_s
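A sketch of Spearman’s rho with scipy (the rankings are invented):

```python
from scipy import stats

judge_a = [1, 2, 3, 4, 5, 6]     # invented ordinal rankings by two judges
judge_b = [2, 1, 3, 5, 4, 6]
rho, p = stats.spearmanr(judge_a, judge_b)
print(rho, p)
```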
Chi-Square and Contingency Coefficient Tables
• Association between NOMINAL variables
- Statistic: chi-square (χ²)
- df = (rows - 1) × (columns - 1)
won’t be tested on the formulas of these things; just understand what the test is trying to accomplish
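A sketch of a chi-square test of association with scipy (the 2x2 counts are invented):

```python
from scipy.stats import chi2_contingency

table = [[30, 10],                 # invented counts: group A yes/no
         [15, 25]]                 # group B yes/no
chi2, p, df, expected = chi2_contingency(table)
print(chi2, p, df)                 # df = (2 - 1) * (2 - 1) = 1
```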
Contingency Coefficient
- Magnitude of association (contingency coefficient)
- C
- Ranges from 0 to 1
won’t be tested on the formulas of these things; just understand what the test is trying to accomplish
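A sketch of the contingency coefficient computed from the same invented table; the formula C = sqrt(χ² / (χ² + N)) is the standard one, not given on the slides:

```python
import math
from scipy.stats import chi2_contingency

table = [[30, 10], [15, 25]]             # same invented counts
chi2, p, df, expected = chi2_contingency(table)
N = sum(sum(row) for row in table)       # total observations
C = math.sqrt(chi2 / (chi2 + N))         # magnitude of association, 0 to 1
print(C)
```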
Regression
- Predictive value of association
- Simple Regression = R^2
- How much variance is accounted for?
- “best fit” line: y = mx + b
- DV = slope * IV + intercept
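A sketch of fitting a simple regression “best fit” line with scipy (the data are invented):

```python
from scipy import stats

iv = [1, 2, 3, 4, 5]               # invented predictor values
dv = [2.1, 3.9, 6.2, 7.8, 10.1]    # invented outcomes
fit = stats.linregress(iv, dv)
print(fit.slope, fit.intercept)    # DV = slope * IV + intercept
print(fit.rvalue ** 2)             # R^2: variance accounted for
```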