1. Aug 20 Flashcards
First
Things we need to understand well
- Hypothesis testing and why we use it
- What a normal distribution is and why we use it
- What a p value is and how it is used
- Null hypothesis and how it is used
- Regression, what’s it used for, how do we do it
- Sum of squared errors (SSE)
- Sum of squares due to regression (SSR)
- R squared
- Confidence interval
- t test
- ANOVA, what it is, how to do it
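Several items on this list (regression, sum of squared errors, R squared) fit together in one small calculation. Below is a minimal sketch in plain Python, not the course's R code. The data values and the function names `fit_line` and `sums_of_squares` are made up for illustration, and reading the "sum of squares due to" item as the sum of squares due to regression is an assumption.

```python
# Hypothetical data, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sums_of_squares(x, y):
    """Return (SSE, SSR, SST) for the least-squares line through x and y."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    fitted = [intercept + slope * xi for xi in x]
    # SSE: variation the line fails to explain (squared residuals).
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    # SSR: variation explained by the line (assumed meaning, see lead-in).
    ssr = sum((fi - mean_y) ** 2 for fi in fitted)
    # SST: total variation of y around its mean.
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return sse, ssr, sst


sse, ssr, sst = sums_of_squares(x, y)
r_squared = ssr / sst  # share of the total variation explained by the line
print(f"SSE={sse:.4f}  SSR={ssr:.4f}  SST={sst:.4f}  R^2={r_squared:.4f}")
```

For a least-squares line with an intercept, SSE and SSR add up to SST, which is why R squared can be read as the fraction of variation explained.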
Do we need to worry about every bit of theory?
No. This is practical. Computers run data, not us.
Applied analysis
Some theory needed, but more about READING DATA, analyzing data
Lab
A place to practice analyzing data, with my help
Purpose of this class is twofold
- Teach statistical methods
- Teach you how to use R
Why do I use R?
- Have experience in SAS, JMP, sy, SPSS
- I like R because it’s powerful and FREE
- SAS requires a license, at least $1k
- One of the fastest-growing statistics programs
- You have to type out commands
- You have to be explicit about what you want
- We’ll learn 150-200 lines of code
- Buy “The R Book”
- New one is green
- Encyclopedia of coding
Positive way of interacting with grad students
“You are grad students. You have stuff to do. I assume your competence. I assume you will talk to me if you need help.”
The final
“This test will not be easy if you aren’t paying attention. It will require preparation for those with poor study skills. It will be moderate to easy for those with high-quality study and learning skills.”
Johnson 1999 link
All the things you learned about statistics that are WRONG
It is very important, please read it.
Firm believer in…
The best way to learn something is to teach it to somebody else
Definition for science/scientist
- Training to be a scientist, what is that?
- “searches for the truth through the accumulation of facts”
What is the truth?
- Truth: the way things really are (“But there might be one truth for one person, and one for another” - Absolutely not)
- It’s what we want to know
- BUT we can never know it
What are facts?
- Measurements of truth
- This is where measurement/sampling come in
2 Reasons We Can’t Know Truth - Error
Process error (a tad more complicated)
- There are many things influencing truth
- Ex: How much taller boys are than girls
- If it were JUST about gender, ok, but it’s NOT. There are OTHER variables
- Other variables: genetics, nutrition, location
- Other sources of variation make it impossible to ever know truth (ESPECIALLY in ecology, there’s just SO MANY THINGS)
- Ask: what would we need to make the PERFECT measurement of height for people on campus? (What if we left out the basketball players, gymnasts?)
- We cannot get the PERFECT measurement without measuring everybody, which is impossible
Sampling error (we can’t measure anything perfectly)
- There is always variation in measurements
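The campus height example can be turned into a small simulation. This is a rough sketch in plain Python, not course material. The "campus" of 10,000 people, the mean of 170 cm, the spread of 10 cm, and the helper names `sample_mean` and `spread_of_estimates` are all invented assumptions. It only illustrates that each sample of the same population gives a slightly different estimate, and that larger samples vary less.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical "truth": heights (cm) of everyone on an imaginary campus.
# In real life we never get to see this full list.
population = [random.gauss(170.0, 10.0) for _ in range(10_000)]
true_mean = statistics.mean(population)


def sample_mean(pop, n):
    """Measure n randomly chosen people and return their mean height."""
    return statistics.mean(random.sample(pop, n))


def spread_of_estimates(pop, n, repeats=200):
    """Standard deviation of the sample mean across repeated samples of size n."""
    return statistics.stdev([sample_mean(pop, n) for _ in range(repeats)])


# Five separate samples of 30 people: five different "facts" about one truth.
estimates = [sample_mean(population, 30) for _ in range(5)]
print(f"true mean (normally unknown): {true_mean:.2f}")
for i, est in enumerate(estimates, start=1):
    print(f"sample {i}: estimated mean = {est:.2f}")

# Larger samples give estimates that scatter less around the truth.
print(f"spread with n=10:  {spread_of_estimates(population, 10):.2f}")
print(f"spread with n=100: {spread_of_estimates(population, 100):.2f}")
```

Leaving out the basketball players or gymnasts would correspond to sampling from only part of `population`, which shifts the estimates instead of just scattering them.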
What are statistics?
Statistics is the method for estimating truth from facts
- He believes: you should not be a scientist without strong statistics.
- You cannot do science without knowing statistics. It is the MEASURE of truth
- Most statistics classes teach you to calculate p values, which isn’t even the most important piece of information
- The MOST important things about statistics are the things that tell you the TRUTH
- We don’t need to know “Is there gravity?”, we want to know “What’s the chance of gravity crashing our spacecraft?”