Midterm 1 Flashcards
What are the four warnings in creating a histogram?
Choice of bin size has a big effect
Changing axis range
Burying explanatory factors
How data is scaled
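The bin-size warning can be seen with a quick count. This is a minimal sketch using made-up data; the helper `bin_counts` and the bin widths are assumptions chosen only for illustration.

```python
from collections import Counter

def bin_counts(values, bin_width):
    """Count values per bin; each bin is labelled by its lower edge.

    Bins start at 0 and empty bins are simply left out of the result.
    """
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical data with two clusters.
data = [1, 2, 2, 3, 7, 8, 8, 9]

narrow = bin_counts(data, 2)   # narrow bins keep the two clusters visible
wide = bin_counts(data, 10)    # one wide bin hides the structure entirely
print(narrow)
print(wide)
```

Same data, two very different pictures: the narrow bins show the gap between the clusters, the wide bin does not.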
How are strip charts better than histograms?
Better for comparing multiple data series
What is the second step in sizing up data?
Calculate numerical descriptors: mean, median, mode, quantiles, variance, standard deviation, min and max
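As a quick illustration, the descriptors listed above can be computed with Python's standard `statistics` module. The sample values are made up for this sketch.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(data)                  # arithmetic average
median = statistics.median(data)              # middle value of the sorted data
mode = statistics.mode(data)                  # most frequent value
quartiles = statistics.quantiles(data, n=4)   # three cut points: Q1, Q2, Q3
variance = statistics.variance(data)          # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)              # square root of the sample variance
smallest, largest = min(data), max(data)

print(mean, median, mode, quartiles, variance, std_dev, smallest, largest)
```

Note that different quantile conventions exist; `statistics.quantiles` uses its default "exclusive" method here, so other software may report slightly different quartiles.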
What are boxplots?
Graphical form of the quantiles
What does the line inside a box in a boxplot represent?
Line indicates the median
What does the box hold in a boxplot?
Box holds the middle 50% of points
What are the whiskers in a boxplot?
The whiskers hold remaining points
What is a null hypothesis?
Conservative statement saying that there isn’t an expected effect
What is a p-value?
A measure of the strength of the evidence against a null hypothesis
How do we find the confidence level?
(1-p) x 100
What is sums of squares and how is it measured?
SSY is how we measure variability
Sum of the squared differences between each value and the grand mean
What is the relationship between SSY and n?
SSY always increases with n
How to find variance?
SSY / (n - 1)
What is the equation for standard deviation?
Square root of variance
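The chain from SSY to variance to standard deviation can be sketched in a few lines. The function names and the sample are assumptions made for this example.

```python
import math

def ssy(values):
    """Sum of squared differences between each value and the grand mean."""
    grand_mean = sum(values) / len(values)
    return sum((v - grand_mean) ** 2 for v in values)

def variance(values):
    """Sample variance: SSY divided by the degrees of freedom, n - 1."""
    return ssy(values) / (len(values) - 1)

def standard_deviation(values):
    """Square root of the sample variance."""
    return math.sqrt(variance(values))

sample = [2, 4, 4, 5, 7, 9, 11]   # hypothetical data
print(ssy(sample), variance(sample), standard_deviation(sample))

# Adding more points adds more squared terms, so SSY grows with n,
# which is why variance divides by n - 1.
print(ssy(sample * 2))
```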
What are the three kinds of variability arising in a data set?
Variability of the population (sigma)
Variability of the sample (s)
Variability of the estimated mean
Why is standard error of the mean important?
SEM can give us confidence intervals for our estimate of the population mean
How to find confidence intervals?
Mean ± t_crit × (SD / square root of sample size)
Relationship between estimate range and confidence level
A wider estimate range gives you a higher confidence level
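A minimal sketch of the confidence interval calculation, assuming the critical t values are looked up in a standard two-sided t table (2.776 for 95% and 4.604 for 99%, both at df = 4). The function name and the sample are hypothetical.

```python
import math
import statistics

def confidence_interval(values, t_crit):
    """Return (low, high) = mean -/+ t_crit * SEM, with SEM = SD / sqrt(n)."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    margin = t_crit * sem
    return mean - margin, mean + margin

sample = [4.0, 5.0, 6.0, 7.0, 8.0]   # hypothetical data, n = 5, so df = 4

# Critical values taken from a t table for df = 4 (assumed, not computed here).
T_CRIT_95 = 2.776
T_CRIT_99 = 4.604

low95, high95 = confidence_interval(sample, T_CRIT_95)
low99, high99 = confidence_interval(sample, T_CRIT_99)
print(low95, high95)
print(low99, high99)
```

The 99% interval comes out wider than the 95% interval, which matches the card above: more confidence costs a wider range.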
What is ANOVA?
Analyze the difference among group means. Compare differences in values between treatments to the variation within a treatment group
What is the response variable?
A continuous variable that is being influenced
What is an explanatory variable?
Categorical or continuous variable that influences the response
In ANOVA, how do you find the total mean square?
SSY / df, where the total df is n - 1
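To make the mean squares concrete, here is a rough sketch of a one-way ANOVA partition in plain Python. The function name, dictionary keys, and the three treatment groups are assumptions for illustration only.

```python
def one_way_anova(groups):
    """Partition total variability into between-group and within-group parts."""
    all_values = [v for group in groups for v in group]
    n = len(all_values)          # total number of observations
    k = len(groups)              # number of treatment groups
    grand_mean = sum(all_values) / n

    # Total sum of squares (SSY): every value against the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    # Within-group (error) sum of squares: every value against its own group mean.
    ss_within = sum(
        sum((v - sum(group) / len(group)) ** 2 for v in group) for group in groups
    )
    # What is left over is explained by differences among group means.
    ss_between = ss_total - ss_within

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return {
        "ms_total": ss_total / (n - 1),
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Hypothetical data: three treatments, three replicates each.
result = one_way_anova([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(result)
```

A large F means the differences between treatments are large compared with the variation within a treatment group.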
What is linear regression?
Asks whether the value of the response variable (y) can be predicted by the explanatory variable (x)
Differences between ANOVA and regression?
ANOVA: discrete x values, values are names, values are unordered
Regression: continuously varying, values have number meaning, values are ordered
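For the regression side, a least-squares line can be fitted by hand. This is a sketch with made-up x and y values; `fit_line` is a hypothetical helper name.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares of x about their means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: x values are ordered numbers, not just group names.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
slope, intercept = fit_line(x, y)
print(slope, intercept)
```

Because x carries numeric meaning, the fitted line can predict y at x values that were never measured, which ANOVA on unordered group names cannot do.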
What is statistical elimination?
Including a second explanatory variable allows us to eliminate its influence from the rest of our model
What are the four principles of experimental design?
Replication
Randomization
Blocking
Orthogonality
What is replication?
Multiple measures of the same thing
Appears in the number of error degrees of freedom (residuals)
Have at least 10 df for error
What is randomization?
Treatments need to be applied to experimental units randomly
Use uniformly distributed random numbers
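One way to randomize in practice is to shuffle a balanced list of treatment labels. The sketch below assumes equal replication per treatment; the plot names, treatment names, and `assign_treatments` helper are all hypothetical.

```python
import random
from collections import Counter

def assign_treatments(units, treatments, seed=None):
    """Randomly assign treatments to units with equal replication."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of number of treatments")
    rng = random.Random(seed)   # seeded only so the example is repeatable
    # Build a balanced list of labels, then shuffle it uniformly at random.
    labels = list(treatments) * (len(units) // len(treatments))
    rng.shuffle(labels)
    return dict(zip(units, labels))

plots = [f"plot{i}" for i in range(1, 9)]
assignment = assign_treatments(plots, ["control", "fertilizer"], seed=1)
counts = Counter(assignment.values())
print(assignment)
print(counts)
```

Letting the random number generator decide, rather than picking plots by eye, keeps the assignment random instead of haphazard.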
What cardinal sins does randomization avoid?
Systematic design: similarity between plots that undermines replication
Unconscious bias in assigning treatment groups
Using haphazard vs. random design
What is blocking?
Tool to minimize error variation
Distribute individual data points into different “blocks” to minimize biases due to known common features of subsets of the points
Acts as another explanatory variable
What are the rules for block design?
Blocks used to account for a factor that could influence response
Blocks should be as internally homogeneous as possible
If possible, all treatments should be included in all blocks
What is Latin square design?
2-way blocking. Blocking so that each treatment appears exactly once in each row and column
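A Latin square can be built by cyclically shifting the treatment list one position per row. This is only a sketch of the layout (the treatment labels are hypothetical); a real experiment would typically also randomize the order of rows and columns.

```python
def latin_square(treatments):
    """Cyclic Latin square: row r is the treatment list shifted left by r."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]

square = latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
```

Each letter lands once in every row and once in every column, so both row and column effects are blocked at the same time.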
What is orthogonality?
Knowing the value of one variable tells you nothing about the other variable
What is the benefit of orthogonal design?
There is no statistical elimination between orthogonal explanatory variables
Which variables are easier for orthogonality?
Easier for categorical variables than continuous variables
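A rough way to see orthogonality is to check that two factors have zero covariance in a balanced design, where every combination of levels appears equally often. The numeric level codes below are an assumption made only so covariance can be computed; the helper names are hypothetical.

```python
from itertools import product

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def factor_covariance(design):
    """Covariance between the two factor columns of a design."""
    factor_a = [row[0] for row in design]
    factor_b = [row[1] for row in design]
    return covariance(factor_a, factor_b)

# Balanced: every level of A (0, 1) is paired with every level of B (0, 1, 2) once.
balanced = list(product([0, 1], [0, 1, 2]))
# Unbalanced: two extra runs of one combination break the balance.
unbalanced = balanced + [(1, 2), (1, 2)]

print(factor_covariance(balanced))
print(factor_covariance(unbalanced))
```

With categorical factors the experimenter chooses the combinations, so balance is easy to arrange; with continuous variables measured as they come, it usually is not.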
What is the first step in sizing up data?
Make a graph
With continuous explanatory variables, what do the p-values in the ANOVA table represent?
They test the null hypothesis that each explanatory variable has no influence on the response variable
With continuous variables, what do the p-values represent in the coefficients table?
They test the null hypothesis that each specific coefficient value equals zero
With continuous variables, what does the overall p-value mean?
It tests the null hypothesis that neither explanatory variable can be used to predict the response variable
With categorical variables, what does the p-value mean in the ANOVA table?
It tests the null hypothesis that each explanatory variable has no influence on the response variable
With categorical variables, what does the p-value mean in the coefficients table?
It tests the null hypothesis that each specific coefficient value equals 0
With categorical variables, what does the overall p-value mean in the coefficients table?
It tests the null hypothesis that neither of the variables can be used to predict the response variable
How does blocking affect residuals?
Blocking helps by reducing the size of the residuals, which increases F and lowers p
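The effect of blocking on F can be sketched by analysing the same made-up table twice, once ignoring blocks and once removing the block sum of squares from the error. The function name and the data are assumptions; the sketch covers only the simple case of one observation per block and treatment, and it computes F only (the p-value would come from an F table).

```python
def f_ratios(table):
    """Return (F ignoring blocks, F with blocks) for table[block][treatment]."""
    n_blocks = len(table)
    n_treat = len(table[0])
    values = [v for row in table for v in row]
    grand = sum(values) / len(values)

    ss_total = sum((v - grand) ** 2 for v in values)
    treat_means = [sum(row[t] for row in table) / n_blocks for t in range(n_treat)]
    block_means = [sum(row) / n_treat for row in table]
    ss_treat = n_blocks * sum((m - grand) ** 2 for m in treat_means)
    ss_block = n_treat * sum((m - grand) ** 2 for m in block_means)

    ms_treat = ss_treat / (n_treat - 1)
    # Without blocking, block-to-block differences stay in the residuals.
    ms_error_unblocked = (ss_total - ss_treat) / (len(values) - n_treat)
    # With blocking, the block sum of squares is removed from the residuals.
    ms_error_blocked = (ss_total - ss_treat - ss_block) / ((n_blocks - 1) * (n_treat - 1))
    return ms_treat / ms_error_unblocked, ms_treat / ms_error_blocked

# Hypothetical data: three blocks (rows) that differ a lot, two treatments (columns).
table = [[10, 13], [20, 21], [30, 32]]
f_unblocked, f_blocked = f_ratios(table)
print(f_unblocked, f_blocked)
```

Here most of the scatter is between blocks, so removing it shrinks the residuals and F rises sharply.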
What is an interaction?
Two x-variables interact if the effect of one x-variable on y depends on the level of the other
Regarding interactions, what do non-parallel lines indicate?
There is an interaction
Regarding interactions, what do two parallel lines indicate?
There are no interactions
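The parallel-lines idea can be checked numerically with a "difference of differences" on a 2 x 2 table of cell means. The function name and both tables are hypothetical, and real data would need a significance test rather than an exact comparison with zero.

```python
def interaction_contrast(cell_means):
    """Difference of differences for cell_means[level of A][level of B].

    Zero means the effect of A is the same at both levels of B
    (parallel lines); non-zero means the lines are not parallel.
    """
    effect_of_a_at_b0 = cell_means[1][0] - cell_means[0][0]
    effect_of_a_at_b1 = cell_means[1][1] - cell_means[0][1]
    return effect_of_a_at_b1 - effect_of_a_at_b0

# Hypothetical cell means.
parallel = [[10, 14], [12, 16]]       # A adds 2 at both levels of B
non_parallel = [[10, 14], [12, 22]]   # A adds 2 at B0 but 8 at B1

print(interaction_contrast(parallel))
print(interaction_contrast(non_parallel))
```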
In which case are R-squared values high and low, with and without interactions?
Models without interactions have a lower R-squared value, while models with interactions have a higher R-squared value