3 - Fundamental Skills Flashcards
What are a covariate and a factor?
Covariate - an independent variable that is not the primary variable. It is measured, so it is quantitative. It can be discrete (count) or continuous (measurement).
Factor - a categorical, qualitative variable used to sort the data into groups. It can be nominal (class, no order) or ordinal (ordered, e.g. size).
What is the p-value?
The probability of getting a test statistic at least as extreme as the one observed, if the null hypothesis is true.
When should we use the mean versus the median?
Median - may be more representative when the data are skewed, as outliers can pull the mean away from the bulk of the data.
Mean - when the data are roughly normally distributed.
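A minimal sketch of the outlier effect, using Python's standard `statistics` module. The sample values are hypothetical and chosen only to include one extreme point.

```python
from statistics import mean, median

# Hypothetical sample: five typical values plus one extreme outlier.
values = [10, 11, 12, 13, 14, 100]

sample_mean = mean(values)      # pulled upward by the outlier
sample_median = median(values)  # stays with the bulk of the data

print("mean:", sample_mean)
print("median:", sample_median)
```

Here the median stays between 12 and 13 while the mean is dragged well above every typical value, which is why the median may be the more representative summary for skewed data.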
What is the difference in variation shown by the range and IQR?
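Range - variation across the whole data set (maximum minus minimum), so it is sensitive to outliers.
IQR - variation in the middle 50% of the data (Q3 minus Q1), so it is less susceptible to outliers.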
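A short sketch comparing the two measures on a hypothetical sample with one outlier. The helper names `data_range` and `iqr` are illustrative, and the quartiles use the "inclusive" method of `statistics.quantiles`; other software may use a slightly different quartile convention and give slightly different values.

```python
from statistics import quantiles

# Hypothetical sample: five typical values plus one extreme outlier.
values = [10, 11, 12, 13, 14, 100]

def data_range(xs):
    """Range: maximum minus minimum, so it covers the whole data set."""
    return max(xs) - min(xs)

def iqr(xs):
    """Interquartile range: Q3 minus Q1, the spread of the middle 50%."""
    q1, _q2, q3 = quantiles(xs, n=4, method="inclusive")
    return q3 - q1

print("range:", data_range(values))
print("IQR:", iqr(values))
```

The single outlier dominates the range but barely moves the IQR.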
How do we test correlation?
Plot two covariates against each other in a scatterplot.
How do we test causality?
Test whether one covariate (explanatory) affects another (response) with a GLM, shown in a scatterplot.
How do we test association?
Test whether two factors are associated with a chi-square test, shown in a bar plot.
How do we test the means?
Test whether the group means are statistically different with a t-test, shown in a box plot.
What kinds of response and explanatory variables are tested in models?
GLM - response (covariate), explanatory (covariate or factor).
T-test - explanatory (factor).
(Multiple) Regression - explanatory (covariate).
What is partitioning?
Splitting the total variation into the part that is explained by a particular variable and the part that is not.
What is R-squared?
The proportion of the response variation that is explained by the explanatory variable - a standardised measure of the variation explained by the model.
When would we use the adjusted R-squared?
If there are multiple explanatory variables.
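A sketch of the usual adjustment, assuming the common textbook formula 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of explanatory variables. The function name and the example numbers are illustrative and not taken from the flashcards.

```python
def adjusted_r_squared(r_squared, n_obs, n_explanatory):
    """Penalise R-squared for the number of explanatory variables.

    Assumes the common formula 1 - (1 - R^2) * (n - 1) / (n - p - 1).
    """
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_explanatory - 1)

# Hypothetical model: R-squared of 0.5 from 11 observations and 2 explanatory variables.
print(adjusted_r_squared(0.5, 11, 2))
```

Because each extra explanatory variable can only raise the plain R-squared, the adjusted version gives a fairer comparison between models with different numbers of variables.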
What is the F-ratio?
The mean SS for each explanatory variable divided by the mean RSS. Each explanatory variable has its own F-ratio, and in GLMs the p-value is the probability of F being that high or higher if the null is true.
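A minimal sketch of the F-ratio for the simplest case, one factor with several groups. The data and the name `f_ratio` are hypothetical. "Mean SS" is taken here to be a sum of squares divided by its degrees of freedom (number of groups minus 1 for the explanatory variable, number of observations minus number of groups for the residual). The p-value is not computed, since that needs the F distribution from a statistics library.

```python
# Hypothetical data: a response measured in three groups of one factor.
groups = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}

def f_ratio(groups):
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)

    # Explained SS: squared distance of each group mean from the grand mean,
    # counted once per observation in that group.
    ess = sum(
        len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
        for vals in groups.values()
    )
    # Residual SS: squared distance of each value from its own group mean.
    rss = sum(
        (v - sum(vals) / len(vals)) ** 2
        for vals in groups.values()
        for v in vals
    )

    df_explained = len(groups) - 1
    df_residual = len(all_values) - len(groups)

    # Mean SS (explained) divided by mean RSS.
    return (ess / df_explained) / (rss / df_residual)

print(f_ratio(groups))
```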
Sum of Squares.
- Calculate the deviation of each value from the mean (value - mean)
- Square each deviation (so they no longer total 0)
- Total (sum) the squared deviations
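The three steps can be sketched in a few lines of Python. The function name and the sample values are illustrative only.

```python
def sum_of_squares(values):
    mean_value = sum(values) / len(values)
    # Step 1: deviation of each value from the mean (these total 0).
    deviations = [v - mean_value for v in values]
    # Step 2: square each deviation so positives and negatives no longer cancel.
    squared = [d ** 2 for d in deviations]
    # Step 3: total the squared deviations.
    return sum(squared)

# Hypothetical sample with mean 4: deviations are -2, 0, 2.
print(sum_of_squares([2, 4, 6]))
```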
R-squared calculation.
ESS (explained)/TSS (total)
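A worked sketch of the ratio, assuming a single covariate fitted with an ordinary least-squares straight line. The data and variable names are hypothetical. It also shows the partitioning of the total variation: TSS = ESS + RSS.

```python
# Hypothetical data: one explanatory covariate x and a response y.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least-squares slope and intercept for the line y = intercept + slope * x.
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * xi for xi in x]

tss = sum((yi - mean_y) ** 2 for yi in y)               # total variation
ess = sum((fi - mean_y) ** 2 for fi in fitted)          # explained by the line
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # left unexplained

r_squared = ess / tss
print("R-squared:", r_squared)
print("TSS:", tss, "ESS + RSS:", ess + rss)
```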