4. Aug. 27th Flashcards
Summary of last class
- How does regression calculate a p value?
- – partitioning the variation in y into two parts (SSR, SSE)
- Statistical equation for a line
- – AKA the general linear model
Yi = B0 + B1*xi + ei, with ei ~ Normal(0, sigma)
B0 = intercept
B1 = slope (how much y changes for each unit change in x)
sigma = standard deviation in error
Assumption: data forms a bell curve around the line
- Size of bell curve is sigma (about 68% of points fall within 1 standard deviation of the line, 95% within 2, 99.7% within 3; this is the 68-95-99.7 rule for a normal distribution)
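A minimal sketch in R of the two summary points above: it simulates data from the line model with a normal error term, then partitions the variation in y into SSR and SSE to get the p-value. The intercept, slope, sigma, and sample size are made-up values for illustration, not numbers from class.

```r
# Simulate data from Yi = B0 + B1*xi + ei, with ei ~ Normal(0, sigma).
# All parameter values below are hypothetical.
set.seed(1)
n     <- 50
b0    <- 2
b1    <- 3
sigma <- 1.5
x <- runif(n, 0, 10)
y <- b0 + b1 * x + rnorm(n, mean = 0, sd = sigma)

fit <- lm(y ~ x)

# Partition the total variation in y into two parts.
sst <- sum((y - mean(y))^2)            # total
ssr <- sum((fitted(fit) - mean(y))^2)  # explained by the line (regression)
sse <- sum(residuals(fit)^2)           # left over (error)

# F statistic and p-value from the partition (1 predictor, n - 2 error df).
f_stat <- (ssr / 1) / (sse / (n - 2))
p_val  <- pf(f_stat, df1 = 1, df2 = n - 2, lower.tail = FALSE)

# R^2 is the explained share of the total variation.
r2 <- ssr / sst
```

The values of `f_stat` and `r2` should match what `summary(fit)` reports, which is a quick way to check the partition.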
R^2 and p-values
R^2 and p value are related
Example: deer feed on your property (amount of feed vs. deer size)
Small slope, high r^2
High slope, small r^2
—Which would you prefer?
— With high R^2, if you increase food by a certain amount, you KNOW what increase in size you’ll get (highly predictable)
— With low R^2, we can’t be sure if implementing the change will ACTUALLY result in increased deer size
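The deer comparison can be sketched with simulated data. This is only an illustration: the feed amounts, slopes, and scatter are invented, and `size_a` / `size_b` are hypothetical names for the two scenarios.

```r
# Hypothetical deer-feed data: same feed amounts, two different outcomes.
set.seed(2)
feed <- runif(100, 0, 10)   # made-up units of feed

# Scenario A: small slope, little scatter -> high R^2
size_a <- 50 + 0.5 * feed + rnorm(100, sd = 0.2)
# Scenario B: large slope, lots of scatter -> low R^2
size_b <- 50 + 5 * feed + rnorm(100, sd = 25)

fit_a <- lm(size_a ~ feed)
fit_b <- lm(size_b ~ feed)

slope_a <- coef(fit_a)[["feed"]]
r2_a    <- summary(fit_a)$r.squared
slope_b <- coef(fit_b)[["feed"]]
r2_b    <- summary(fit_b)$r.squared

# Compare the two scenarios side by side.
round(c(slope_a = slope_a, r2_a = r2_a, slope_b = slope_b, r2_b = r2_b), 2)
```

Scenario A gives a small but dependable gain per unit of feed; scenario B gives a larger average gain that any one property may not actually see.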
A good way to report results
You only need two sentences. Reuse in all your publications.
- 1 for continuous variables
- 1 for categorical variables
Continuous: For each 1 [unit] increase in [x], we observed a [slope (B1)] [unit] [increase/decrease] in [y] (p = [p-value]; R^2 = [R^2]).
Ex: “For each 1 cm increase in rainfall, we observed a 3.03 kg per hectare (+/- 0.22) increase in biomass (R^2 = 0.96, p < 2.2e-16).”
- Rule of thumb: report two decimal places (ex. 3.03)
Confidence in R
confint(results)
- These are actually confidence LIMITS, not interval
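A short sketch of how `confint()` might be used. The data are simulated (not the class data set), and `results` is assumed to be a fitted `lm` object.

```r
# Hypothetical rainfall/biomass data, simulated for illustration.
set.seed(3)
rainfall <- runif(40, 0, 20)
biomass  <- 10 + 3 * rainfall + rnorm(40, sd = 4)

results <- lm(biomass ~ rainfall)

# confint() returns a matrix of lower and upper LIMITS (2.5% and 97.5%).
limits <- confint(results, level = 0.95)
lower  <- limits["rainfall", 1]
upper  <- limits["rainfall", 2]

# Half the distance between the limits is the "+/-" value for the sentence.
estimate   <- coef(results)[["rainfall"]]
half_width <- (upper - lower) / 2
```

Because the limits for a regression slope are symmetric around the estimate, `estimate - half_width` and `estimate + half_width` reproduce the two limits.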
If you repeat an experiment 1k times
- Slightly different set of data each time
- We’d get a histogram (ex. estimates normally distributed around 3)
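The thought experiment can be run as a simulation. This is a sketch under assumptions: the true slope of 3, the sample size, and the noise level are all invented, and `one_experiment` is a hypothetical helper.

```r
set.seed(4)
true_slope <- 3      # hypothetical "truth"
n_reps     <- 1000   # repeat the experiment 1k times

# One experiment: new data, new fit, new estimate and confidence limits.
one_experiment <- function() {
  x   <- runif(30, 0, 20)
  y   <- 10 + true_slope * x + rnorm(30, sd = 4)
  fit <- lm(y ~ x)
  ci  <- confint(fit)["x", ]
  c(estimate = coef(fit)[["x"]], lower = ci[[1]], upper = ci[[2]])
}

# One row per experiment, columns: estimate, lower, upper.
sims <- t(replicate(n_reps, one_experiment()))

# Histogram of the 1000 slope estimates (drawn only in an interactive session).
if (interactive()) {
  hist(sims[, "estimate"], xlab = "Estimated slope",
       main = "Slope estimates from repeated experiments")
}

# Share of the 1000 intervals that contain the true slope.
coverage <- mean(sims[, "lower"] <= true_slope & sims[, "upper"] >= true_slope)
```

The estimates pile up around the true slope, and `coverage` should land close to 0.95.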
Definition of confidence interval
“This says the true mean of ALL men (if we could measure all their heights) is likely to be between 168.8cm and 181.2cm.
But it might not be!
The “95%” says that 95% of experiments like we just did will include the true mean, but 5% won’t.
So there is a 1-in-20 chance (5%) that our Confidence Interval does NOT include the true mean.”
https://www.mathsisfun.com/data/confidence-interval.html
Confidence interval: 95% of all such intervals contain truth (not very satisfying, but technically correct)
- Informal reading: there’s a 95% chance that truth is in the interval (easier to say, but not technically correct)
- Our best estimate of the rainfall effect is 3.02, and we’re 95% sure the true value is between 2.80 and 3.25
Confidence LIMITS: the two boundaries (lower and upper) of the confidence interval
+/- 95% CI
Why do we report +/- the 95% CI instead of both limits? Because journals are strict about saving space.
4 Pieces of Info to Report Anytime Reporting Statistical Results
- Estimated effects (estimate)
- Confidence interval (some measure of precision)
- R^2
- P-value
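One way to pull the four pieces out of a fitted model in R, shown as a sketch on simulated data (the variable names and numbers are assumptions, not the class data set):

```r
# Hypothetical rainfall/biomass data, simulated for illustration.
set.seed(5)
rainfall <- runif(40, 0, 20)
biomass  <- 10 + 3 * rainfall + rnorm(40, sd = 4)

fit <- lm(biomass ~ rainfall)
s   <- summary(fit)

estimate <- coef(fit)[["rainfall"]]                  # 1. estimated effect
ci       <- confint(fit)["rainfall", ]               # 2. confidence limits
r2       <- s$r.squared                              # 3. R^2
p_value  <- s$coefficients["rainfall", "Pr(>|t|)"]   # 4. p-value

# Rounded for reporting; the p-value is formatted separately.
round(c(estimate = estimate, lower = ci[[1]], upper = ci[[2]], r2 = r2), 2)
format.pval(p_value)
```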
P Values and Confidence Intervals
Are related, and you can determine one from the other
- Used very differently
- P-values: used for null hypothesis testing (is it significant or not)
- Confidence intervals: a measure of precision in your estimate
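A sketch of why the two are related: for a regression slope, both come from the same estimate and standard error. The data below are simulated and the numbers are arbitrary.

```r
set.seed(6)
x <- runif(25, 0, 10)
y <- 1 + 0.8 * x + rnorm(25, sd = 3)
fit <- lm(y ~ x)

est <- summary(fit)$coefficients["x", "Estimate"]
se  <- summary(fit)$coefficients["x", "Std. Error"]
df  <- df.residual(fit)

# 95% confidence limits: estimate +/- critical t value * standard error.
t_crit <- qt(0.975, df)
ci     <- c(est - t_crit * se, est + t_crit * se)

# Two-sided p-value for the null hypothesis "slope = 0", same est and se.
p_value <- 2 * pt(abs(est / se), df, lower.tail = FALSE)

# p < 0.05 exactly when the 95% interval excludes zero.
excludes_zero <- ci[1] > 0 || ci[2] < 0
```

The hand-computed `ci` and `p_value` should agree with `confint(fit)` and `summary(fit)`.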
What to say when the relationship is negative
Rainfall went up, biomass went down
- This gives a negative slope
- DON’T ever put the negative (-) sign in the sentence
- Just say “decrease”
What extra sentence to add if p-value >0.05?
You may need to scale your results.
P-value is > 0.05
- If your results are NON significant, STILL use the sentence above.
- I would add, "However, our results are not statistically significant."
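The reporting rules so far (say "decrease" instead of printing a minus sign, add the extra sentence when p > 0.05) could be wrapped in a small helper. `report_sentence` is a hypothetical function written for these notes, not part of R, and the numbers in the example call are invented.

```r
# Hypothetical helper: fills in the results sentence for a continuous x.
report_sentence <- function(x_name, x_unit, y_name, y_unit,
                            slope, half_width, p_value, r2) {
  # Report direction in words, never with a minus sign.
  direction <- if (slope < 0) "decrease" else "increase"
  sentence <- sprintf(
    "For each 1 %s increase in %s, we observed a %.2f %s (+/- %.2f) %s in %s (R^2 = %.2f, p = %s).",
    x_unit, x_name, abs(slope), y_unit, half_width, direction, y_name,
    r2, format.pval(p_value, digits = 2)
  )
  # Non-significant results still use the same sentence, plus one more.
  if (p_value > 0.05) {
    sentence <- paste(sentence,
                      "However, our results are not statistically significant.")
  }
  sentence
}

# Example call with made-up values: negative slope, non-significant result.
report_sentence("rainfall", "cm", "biomass", "kg per hectare",
                slope = -3.03, half_width = 0.22, p_value = 0.21, r2 = 0.04)
```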
How do you determine if something is biologically significant?
Inevitably, it’s a subjective assessment.
- Doesn’t go in results, but DOES go in your conclusion.
Scaling small betas
How much a lotus flower size changes with a meter increase in elevation
- A meter isn’t much
- Beta of 0.00098 cm increase in lotus flower size for each meter
- When you have really small Betas, sometimes you need to scale them
- – Anytime you scale, multiply the X-change, the Beta, and the CI (confidence interval) by the same factor
- – DO NOT change p-value or R^2
Scale from 1 meter to 1 km (1000m)
- For each thousand meter change, we observed a 0.98 cm increase in lotus flower size.
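A quick check of the scaling rule in R, as a sketch on simulated lotus data. Only the 0.00098 cm-per-meter slope comes from the notes; the sample size, elevation range, and noise level are invented.

```r
set.seed(8)
elev_m  <- runif(60, 0, 3000)                            # elevation in meters
size_cm <- 5 + 0.00098 * elev_m + rnorm(60, sd = 1.5)    # simulated flower size
elev_km <- elev_m / 1000                                 # same elevations in km

fit_m  <- lm(size_cm ~ elev_m)
fit_km <- lm(size_cm ~ elev_km)

# Beta and confidence limits are multiplied by 1000 ...
beta_m  <- coef(fit_m)[["elev_m"]]
beta_km <- coef(fit_km)[["elev_km"]]
ci_m    <- confint(fit_m)["elev_m", ]
ci_km   <- confint(fit_km)["elev_km", ]

# ... while R^2 and the p-value stay the same.
r2_m  <- summary(fit_m)$r.squared
r2_km <- summary(fit_km)$r.squared
p_m   <- summary(fit_m)$coefficients["elev_m", "Pr(>|t|)"]
p_km  <- summary(fit_km)$coefficients["elev_km", "Pr(>|t|)"]
```

Refitting with x in km gives the same answer as multiplying the per-meter Beta and limits by 1000 by hand, so either route works.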
Summary: Key to reporting results
Statistical results have real meaning, and you have to communicate that. You can do so simply, in just one sentence per result.