Stats Year 2 - Woman Flashcards
The larger the sample the … the sampling error
Smaller
The … variable is the variable measured as the response (outcome).
Dependent variable
Type I error
Rejecting H0 when it is actually true (a false positive).
Type II error
Accepting H0 when it is actually false (a false negative).
Correlation requires two variables that are?
Continuous
Parametric tests assume?
Normality
Tests for normality
- Shapiro-Wilk
- Kolmogorov-Smirnov
- Anderson-Darling
(H0 for all is that the distribution is normal)
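A minimal sketch of running these three tests in Python with SciPy, on hypothetical simulated data (the function names `shapiro`, `kstest`, and `anderson` are SciPy's; the data are made up for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical sample drawn from a normal distribution.
rng = np.random.default_rng(0)
sample = rng.normal(loc=10, scale=2, size=200)

# Shapiro-Wilk: H0 = data are normal; a large p-value fails to reject normality.
sw_stat, sw_p = stats.shapiro(sample)

# Kolmogorov-Smirnov against a normal with the sample's own mean and sd.
ks_stat, ks_p = stats.kstest(sample, "norm",
                             args=(sample.mean(), sample.std(ddof=1)))

# Anderson-Darling: compare the statistic to the returned critical values
# (significance levels 15%, 10%, 5%, 2.5%, 1%).
ad = stats.anderson(sample, dist="norm")

print(sw_p, ks_p, ad.statistic)
```

Note that all three share the same H0 (normality), so a *small* p-value is evidence of non-normality.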
One sample t-test
Compares a sample mean to a prestipulated value (population mean)
H0 is that the means are equal
Result for one-tailed t-test
(t)
Values of t follow the t-distribution, so the closer t is to 0, the more likely the sample mean and the stipulated mean are the same
The t-distribution is determined by the degrees of freedom (df) of the sample
Can be made non-directional (two-tailed) by looking at the probability of the modulus of t, |t|
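A sketch of the one-sample t-test in SciPy, using hypothetical measurements and a stipulated mean of 50 (the `alternative` keyword selects the directional test; for a positive t, the one-tailed p is half the two-tailed p):

```python
import numpy as np
from scipy import stats

# Hypothetical sample: do these measurements differ from a stipulated mean of 50?
sample = np.array([51.2, 49.8, 52.5, 50.9, 48.7, 53.1, 50.4, 51.8])

# Two-tailed (non-directional) test: uses |t|. H0: mean == 50.
t_two, p_two = stats.ttest_1samp(sample, popmean=50)

# One-tailed (directional) test. H1: mean > 50.
t_one, p_one = stats.ttest_1samp(sample, popmean=50, alternative="greater")

print(t_two, p_two, p_one)
```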
Two sample t-test
Compares the means of two samples
H0 is that there is no difference between the means of the two samples.
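A sketch of the two-sample test with SciPy's `ttest_ind`, on hypothetical control and treatment data (`equal_var=False` selects Welch's variant, which does not assume equal variances):

```python
import numpy as np
from scipy import stats

# Hypothetical independent samples, e.g. control vs treatment.
a = np.array([12.1, 11.8, 13.0, 12.6, 11.5, 12.9])
b = np.array([14.2, 13.8, 15.1, 14.6, 13.9, 14.4])

# Welch's t-test: H0 is that the two population means are equal.
t_stat, p_val = stats.ttest_ind(a, b, equal_var=False)
print(t_stat, p_val)
```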
Univariate vs Multivariate
How many dependent variables are being analysed?
One or many?
One-way or multi-way?
How many independent variables are being used to test the dependent variables?
One or multiple?
Independent or repeated measures
How many times is the same subject used?
Assumptions of ANOVA
- Normality
- Homogeneity of variance
- All data points are independent
Normality in ANOVA
Assumes normality but fairly robust if sample sizes and variances are similar.
If not, use Kruskal-Wallis
Equation for one-way ANOVA
F = variance between group means / average variance within groups
If groups are from the same population F should = 1
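The F ratio above can be computed by hand and cross-checked against SciPy's `f_oneway`; the three groups below are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical samples from three groups.
groups = [np.array([4.1, 5.0, 4.6, 4.8]),
          np.array([6.2, 5.8, 6.5, 6.0]),
          np.array([5.1, 4.9, 5.4, 5.2])]

k = len(groups)                      # number of groups
n = sum(len(g) for g in groups)      # total observations
grand_mean = np.concatenate(groups).mean()

# Variance between group means: SS_between / (k - 1)
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ms_between = ss_between / (k - 1)

# Average variance within groups: SS_within / (n - k)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_within = ss_within / (n - k)

F = ms_between / ms_within

# Cross-check against SciPy's implementation.
F_scipy, p = stats.f_oneway(*groups)
print(F, F_scipy, p)
```

Here the group means differ substantially, so F comes out well above 1.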
Hypotheses of one-way ANOVA
- Null hypothesis – population means are equal
- Alternative hypothesis – at least one mean is not equal.
Tukey's HSD test
- Post-Hoc test
- Popular but conservative
- Pairwise comparisons with p < 0.05 are significant
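A sketch using SciPy's `tukey_hsd` (available in SciPy 1.8+) on the same kind of hypothetical three-group data you might follow up after a significant ANOVA:

```python
import numpy as np
from scipy import stats

# Hypothetical three treatment groups following a significant one-way ANOVA.
a = np.array([4.1, 5.0, 4.6, 4.8])
b = np.array([6.2, 5.8, 6.5, 6.0])
c = np.array([5.1, 4.9, 5.4, 5.2])

# Tukey's HSD compares every pair of groups while controlling
# the family-wise error rate; p < 0.05 marks a significant pair.
res = stats.tukey_hsd(a, b, c)
print(res.pvalue)  # k x k matrix of pairwise p-values
```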
R^2
Indicates how much of the variance in observations can be explained by the model
R^2 = SS[between]/SS[total]
SS
Sum of Squares = Σ(x − x̄)²
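These two cards fit together: the sums of squares partition as SS[total] = SS[between] + SS[within], and R² is the between-group share. A sketch on hypothetical two-group data:

```python
import numpy as np

# Hypothetical groups; SS = sum of squared deviations from a mean.
groups = [np.array([4.1, 5.0, 4.6, 4.8]),
          np.array([6.2, 5.8, 6.5, 6.0])]
all_x = np.concatenate(groups)

# Total SS: deviations of every point from the grand mean.
ss_total = ((all_x - all_x.mean()) ** 2).sum()
# Between SS: group means vs the grand mean, weighted by group size.
ss_between = sum(len(g) * (g.mean() - all_x.mean()) ** 2 for g in groups)
# Within SS: each point vs its own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Proportion of variance explained by group membership.
r_squared = ss_between / ss_total
print(r_squared)
```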
Two-way ANOVA
Tests whether two factors affect the dependent variable independently or if they interact with each other
Hypotheses of the two-way anova
Null 1: the population means are equal across all levels of factor 1
Null 2: the population means are equal across all levels of factor 2
Null 3: the effect of one factor is the same across all levels of the other factor.
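The third null can be sketched numerically: in a 2×2 design, the interaction is the "difference of differences" between cell means, and a non-zero value means the effect of one factor depends on the level of the other. The cell means below are hypothetical:

```python
import numpy as np

# Hypothetical cell means: factor A (rows) x factor B (columns).
cell_means = np.array([[10.0, 12.0],
                       [11.0, 17.0]])

# Main effect of A: difference between the row means.
effect_A = cell_means.mean(axis=1)[1] - cell_means.mean(axis=1)[0]
# Main effect of B: difference between the column means.
effect_B = cell_means.mean(axis=0)[1] - cell_means.mean(axis=0)[0]
# Interaction: effect of B at level 2 of A minus effect of B at level 1 of A.
interaction = ((cell_means[1, 1] - cell_means[1, 0])
               - (cell_means[0, 1] - cell_means[0, 0]))

print(effect_A, effect_B, interaction)
```

Here the effect of B is +2 at the first level of A but +6 at the second, so the interaction term (+4) is non-zero and null 3 would be in doubt.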
Scaling interaction
The magnitude of the effect of one variable depends on the strength of the other variable.
Crossing interaction
The direction of the effect of one variable changes depending on the level of the other; the lines on an interaction plot cross.
Analysis of Covariance
ANCOVA
Used when we want to determine the effect of factors (discrete variables) and covariates (continuous variables) on dependent variables
Improves detection of factors by removing variance (error)
Covariate
A variable that you think affects the dependent variable but cannot manipulate, and so want to account for.
Error Variability
Comes from each subject's deviation from the mean of its group.
The adjusted error is smaller than the original error, because it uses the distance from the regression line rather than from the group mean.
It is calculated as a sum of squares.
Assumptions of ANCOVA
- Normality of treatment levels
- Independence of variance estimates
- Other same assumptions as with ANOVA
- Linear relationship between the covariate and the dependent variable
- The regression slope is the same for all groups (homogeneity of regression slopes)
In regression the residual sum of squares is…
… is based on the deviation of each score from the regression line
* The residual sum of squares will be smaller than the unadjusted sum of squares.
* This regression shrinks the variance of each data group so that they can be compared better
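This shrinkage can be shown directly: with least-squares regression, the residual SS about the fitted line is never larger than the SS about the mean. A sketch with a hypothetical covariate x and response y:

```python
import numpy as np

# Hypothetical covariate x and response y with a linear trend plus noise.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 30)
y = 2.0 + 0.8 * x + rng.normal(scale=1.0, size=30)

# Unadjusted SS: deviations of y from its own mean.
ss_unadjusted = ((y - y.mean()) ** 2).sum()

# Residual SS: deviations of y from the fitted regression line.
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)
ss_residual = (residuals ** 2).sum()

print(ss_residual, ss_unadjusted)
```

Because the covariate carries real information about y here, ss_residual comes out well below ss_unadjusted, which is exactly the variance ANCOVA removes.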
Fixed Effects
If I repeated the experiment would I use the same values?
Values stated in the hypothesis.
Random Effects
Exact values of this don’t matter, we could use any of a selection.
Look at changes over a range of temperatures for example.
Why do random vs fixed effects matter?
In addition to the error in a test or model there is also error in the random sampling of temperatures for example.
It will affect whether or not something is significant; software can be used to take this into account.