Statistics II Flashcards
What does ANOVA stand for?
Analysis of Variance
What are ANOVA models?
Statistical models used to analyze the differences between group means and their associated procedures (such as “variation” among and between groups).
What does ANOVA provide in its simplest form?
ANOVA provides a statistical test of whether or not the means of several groups are equal, and therefore generalizes the t-test to more than two groups.
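As a rough illustration of the test described above, the sketch below computes the one-way ANOVA F statistic by hand: the between-group mean square divided by the within-group mean square. The function name `one_way_anova_f` and the sample data are made up for this example, and the p-value lookup is left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA for a list of groups."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical data: three groups of three measurements each
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_anova_f(groups)
print(round(f_stat, 3))
```

For this data the between-group sum of squares is 26 and the within-group sum of squares is 6, so F works out to 13 (up to floating-point rounding). A large F suggests that the group means are not all equal.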
Why should a t-test not be used if there are more than two groups?
Doing multiple two-sample t-tests would result in an increased chance of committing a type I error.
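A small sketch of why the error chance grows: if every pair of groups is compared with its own t-test at level alpha, and the tests are assumed to be independent (a simplification, since pairwise tests share data), the chance of at least one type I error is 1 - (1 - alpha)^m for m tests. The function name `familywise_error_rate` is chosen only for this example.

```python
from math import comb


def familywise_error_rate(n_groups, alpha=0.05):
    """Chance of at least one type I error across all pairwise tests.

    Assumes the tests are independent, which is only an approximation.
    """
    n_tests = comb(n_groups, 2)   # number of pairwise comparisons
    return 1 - (1 - alpha) ** n_tests


for k in (2, 3, 4, 5):
    print(k, comb(k, 2), round(familywise_error_rate(k), 3))
```

With four groups there are already six pairwise tests, and under this assumption the chance of at least one false positive is roughly 26% instead of 5%.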
What does MANOVA stand for?
Multivariate Analysis of Variance
When is MANOVA used instead of ANOVA?
It is used when there are two or more dependent variables.
What questions can MANOVA help to answer?
- Do changes in the independent variable(s) have significant effects on the dependent variables?
- What are the interactions among the dependent variables?
- What are the interactions among the independent variables?
What does MANCOVA stand for?
Multivariate Analysis of Covariance
What does it mean to have a problem with an unbalanced design in an ANOVA test?
the groups are of different sizes
What is a fixed factor?
A factor that takes only values fixed in advance.
e.g. medicine (1, 2, 3)
species (raven, crow)
What is a random factor?
A factor whose values are randomly selected from a population of possible values.
When repeating an experiment, what factors would have the same values?
the fixed factors
What does “Paired Comparison” mean?
before/after - design
both measurements must be taken from the same subject
e.g. blood pressure before and after training
behavior in environment A and environment B
What does it mean to have “Repeated Measures”?
Several treatments are measured on each subject.
e.g. several drugs on each subject
What are the advantages of “paired comparison” and “repeated measurement” designs?
Variance between individuals can be ignored.
Smaller effects can be detected that would otherwise be masked by inter-subject variance.
Explain the following abbreviations in paired comparisons:
Xi1
Xi2
Di
Xi1 … value for individual i before
Xi2 … value for individual i after
Di = Xi1 - Xi2 … difference for individual i
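The notation above can be sketched in a few lines. The blood pressure values are invented for illustration, and the paired t statistic (mean difference divided by its standard error) is shown as one common way to use the differences; the cards themselves do not name a specific test.

```python
from math import sqrt
from statistics import mean, stdev


def paired_differences(before, after):
    """Return the list of D_i = X_i1 - X_i2, one per individual."""
    return [x1 - x2 for x1, x2 in zip(before, after)]


def paired_t_statistic(before, after):
    """t statistic of the mean difference (assumed analysis, see lead-in)."""
    d = paired_differences(before, after)
    return mean(d) / (stdev(d) / sqrt(len(d)))


# Hypothetical blood pressure before (X_i1) and after (X_i2) training
before = [140, 150, 135, 160]
after = [132, 145, 133, 150]

print(paired_differences(before, after))
print(round(paired_t_statistic(before, after), 3))
```

Because each difference is taken within one subject, the large variation between subjects (135 versus 160 at baseline here) does not enter the calculation.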
Basic branches of applied statistics:
- descriptive statistics
- inferential statistics (hypothesis testing, confirmatory ... )
- exploratory analysis, modeling, data mining
Why is it called inferential statistics?
Inferences about the whole population are drawn from a sample.
Examples of common quantiles?
- median
- upper quartile
- lower quartile
- deciles
- percentiles
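A minimal sketch of these quantiles using Python's standard `statistics` module. The data set (the integers 1 to 99) is made up for the example, and note that several interpolation conventions for quantiles exist; this uses the module's default ("exclusive") method.

```python
from statistics import median, quantiles

# Hypothetical data: the integers 1..99
data = list(range(1, 100))

quartiles = quantiles(data, n=4)       # 3 cut points: lower quartile, median, upper quartile
deciles = quantiles(data, n=10)        # 9 cut points
percentiles = quantiles(data, n=100)   # 99 cut points

print("median:", median(data))
print("lower quartile:", quartiles[0])
print("upper quartile:", quartiles[2])
print("number of deciles:", len(deciles))
print("number of percentiles:", len(percentiles))
```

The median is the same value as the second quartile, the fifth decile, and the 50th percentile.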
“Bad” data can usually be classified as either … or …
incomplete or incorrect
List two potentially serious weaknesses of discarding incomplete records in a data set!
1) possible distortion through selection bias
2) dramatic reduction in the size of data set
Two ways of handling missing data:
1) discard incomplete records
2) insert substitute values
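The two options can be sketched as follows. The records and the field name `weight` are invented for the example, and substituting the mean of the observed values is only one possible choice of substitute; the cards do not prescribe a method.

```python
from statistics import mean

# Hypothetical records; None marks a missing value
records = [
    {"id": 1, "weight": 70.0},
    {"id": 2, "weight": None},
    {"id": 3, "weight": 80.0},
    {"id": 4, "weight": None},
    {"id": 5, "weight": 90.0},
]


def discard_incomplete(rows, key):
    """Option 1: keep only the records where the value is present."""
    return [r for r in rows if r[key] is not None]


def substitute_mean(rows, key):
    """Option 2: fill gaps with the mean of the observed values (assumed rule)."""
    fill = mean(r[key] for r in rows if r[key] is not None)
    return [{**r, key: fill if r[key] is None else r[key]} for r in rows]


complete_only = discard_incomplete(records, "weight")
filled = substitute_mean(records, "weight")

print(len(records), len(complete_only), len(filled))
print([r["weight"] for r in filled])
```

Discarding shrinks this data set from five records to three, while substitution keeps all five but two of the values were never actually measured.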
One problem with using substituted values for incomplete data:
Essentially we would be making up data.
An outlier is …
… a value that is very different from the others, or from what is expected.
Difference between experimental and observational studies (and data)?
In experimental studies objects are manipulated (e.g. subjects taking different amounts of a drug).
In observational studies data is just recorded (e.g. telephone surveys, data about distant galaxies).
What does it mean if the allocation of subjects to test groups is “double blind”?
During the experiment, neither the subject nor the experimenter knows which group the subject is in.
An experiment with two conditions can be regarded as a …
… simple two-group experiment.