Year 1 Flashcards
What are the four scales of measurement?
Nominal, ordinal, interval, ratio
What distinguishes ordinal and interval?
Interval data has equal, measurable intervals between values; ordinal data only guarantees an ordering/ranking
What distinguishes interval and ratio?
Interval does not have an absolute zero point but ratio data does
What are the 4 moments of a distribution?
- Central tendency
- Dispersion
- Skewness
- Kurtosis
Negative skew?
Mode > mean (the tail extends towards lower values)
Positive skew?
Mode < mean (the tail extends towards higher values)
platykurtic, mesokurtic, leptokurtic?
Platykurtic (k<3)
Mesokurtic (k~3)
Leptokurtic (k>3)
Parametric data analysis requirements?
- continuous data
- n>30
- normally distributed
Non-parametric data analysis requirements?
- not continuous
- n<30
- non-normal
What process allows us to begin conducting arithmetic on parametric data?
Normalisation or standardisation
What happens to the mean and SD after you carry out normalisation on a distribution?
Mean = 0 SD = 1
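As a quick check, standardisation can be sketched in Python (the data values here are made up, purely for illustration):

```python
import numpy as np

# Hypothetical sample values, purely for illustration
data = np.array([12.0, 15.0, 9.0, 20.0, 14.0, 11.0, 18.0, 13.0])

# Standardisation: subtract the mean and divide by the standard deviation
z = (data - data.mean()) / data.std()

print(z.mean(), z.std())  # mean becomes 0, SD becomes 1
```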
What way is a null hypothesis always phrased?
Negatively
who decides the level of significance associated with hypothesis testing?
The analyst/user, based on judgement and the characteristics of the distribution (0.05 is the common convention)
When is something classed as not statistically significant?
If the test's p-value falls above the chosen significance threshold (e.g. p > 0.05).
What are the 2 ways of determining whether a distribution is normally distributed?
Q-Q plot
K-S test
How does a q-q plot work?
Sample quantiles are plotted against theoretical normal quantiles; the points should lie closely along the line (representing a normal distribution) and be evenly distributed either side of it.
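The Q-Q plot's straight-line check can also be read off numerically with scipy's probplot, which returns the correlation of the points with the reference line (the sample below is simulated, not real data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=5, size=200)  # simulated, roughly normal data

# probplot pairs each ordered sample value with the quantile expected
# under a normal distribution, plus a least-squares line through the points
(osm, osr), (slope, intercept, r) = stats.probplot(sample, dist="norm")

# r near 1 -> points hug the reference line -> consistent with normality
print(round(r, 3))
```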
How does a K-S test work?
Hypothesis testing: if the returned p-value lies above the significance threshold then there is NOT a statistically significant difference between the investigated distribution and a normal distribution, i.e. it can be treated as normally distributed
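A sketch of the K-S test using scipy (the sample here is an artificial, perfectly normal-shaped one built from the normal quantile function, just for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical sample: in practice you would pass your own measurements
sample = stats.norm.ppf(np.linspace(0.005, 0.995, 199))

# Compare the sample's empirical CDF against a standard normal CDF;
# a large p-value means no significant departure from normality
stat, p = stats.kstest(sample, "norm")
print(p > 0.05)
```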
What are inferential statistics?
Tests of difference, either between two samples or between a sample and a population
What are the 3 parametric inferential statistics?
One-sample t-test = sample and population
Two-sample t-test = sample and sample
ANOVA = 2+ samples
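All three parametric tests are available in scipy; a minimal sketch on simulated samples (the means and sample sizes are made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
a = rng.normal(100, 10, 40)  # sample from a population with mean 100
b = rng.normal(115, 10, 40)  # sample from a population with a higher mean
c = rng.normal(115, 10, 40)

# One-sample t-test: one sample vs a known population mean
t1, p1 = stats.ttest_1samp(a, 100)

# Two-sample t-test: sample vs sample
t2, p2 = stats.ttest_ind(a, b)

# One-way ANOVA: two or more samples
f, p3 = stats.f_oneway(a, b, c)
print(p1, p2, p3)
```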
What are relational statistics?
Testing for a relationship between variables i.e. correlation.
What is the parametric relational statistic test?
Pearson’s correlation coefficient
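A small Pearson example with made-up, near-linear data:

```python
import numpy as np
from scipy import stats

# Invented data, roughly y = 2x with small noise
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.8, 6.2, 7.9, 10.1, 11.9])

# r near +1 indicates a strong positive linear relationship
r, p = stats.pearsonr(x, y)
print(round(r, 3), p < 0.05)
```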
What are the 4 non-parametric statistical tests?
One-way chi-square = sample and population
Two-way chi-square = 2+ samples
Mann-Whitney U (MWU) = comparison of two samples (non-parametric counterpart of the two-sample t-test)
Kruskal-Wallis = non-parametric counterpart of ANOVA (2+ samples)
What is the non-parametric relational statistic test?
Spearman’s Rank
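Sketches of the non-parametric tests in scipy, run on made-up samples:

```python
from scipy import stats

# Two small invented samples with clearly different levels
g1 = [3, 4, 2, 5, 4, 3]
g2 = [8, 9, 7, 10, 9, 8]

# Mann-Whitney U: non-parametric comparison of two independent samples
u, p_u = stats.mannwhitneyu(g1, g2, alternative="two-sided")

# Kruskal-Wallis: non-parametric ANOVA counterpart (2+ samples)
h, p_kw = stats.kruskal(g1, g2)

# One-way chi-square: observed counts vs expected (uniform by default)
chi, p_chi = stats.chisquare([18, 22, 20, 20])

# Spearman's rank correlation on two short ranked series
rho, p_rho = stats.spearmanr([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(p_u < 0.05, p_chi > 0.05, rho)
```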
What is the ‘least squares criterion’?
the principle that the regression line is fitted so that the sum of the squared differences (residuals) between the points and the line is as small as possible, with the residuals balancing out either side of the line.
What is the f-ratio?
the ratio of explained variance to unexplained variance
What is the coefficient of explanation for linear regression?
R squared
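The least squares fit, F-ratio, and R squared from the last three cards can be tied together in one numpy sketch (the x/y values are invented):

```python
import numpy as np

# Made-up data with a clear linear trend plus a little noise
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1, 12.0, 13.8, 16.2])

# Least-squares fit: the slope/intercept minimising the sum of squared residuals
slope, intercept = np.polyfit(x, y, 1)
fitted = slope * x + intercept
residuals = y - fitted

# Partition the total sum of squares into explained and unexplained parts
ss_total = ((y - y.mean()) ** 2).sum()
ss_explained = ((fitted - y.mean()) ** 2).sum()
ss_residual = (residuals ** 2).sum()

# R squared: proportion of variance explained by the line
r_squared = ss_explained / ss_total

# F-ratio: explained variance over unexplained variance (k = 1 predictor)
n, k = len(x), 1
f_ratio = (ss_explained / k) / (ss_residual / (n - k - 1))
print(round(r_squared, 4), round(f_ratio, 1))
```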
What are two sources of regression error?
Standard error = error reflecting natural scatter in the data that is inherently difficult for any line to represent
Sampling error = error arising when the fitted line's characteristics (slope, intercept) are incorrect or poor at representing the population because the sample is unrepresentative
What is homoscedasticity?
When residuals are consistently (evenly) spread either side of the regression line along its whole length.
Why is homoscedasticity important?
Because an even spread of residuals either side of the regression line is one of the assumptions underlying regression analysis
How do we test for homoscedasticity/heteroscedasticity in SPSS?
- scatter plot needs to be well scattered with no clear patterns
- P-plot needs to have similar amount of points either side of line and be well tied to the line
- histogram needs to be normal in style (Gaussian)
What is autocorrelation?
When the successive x and y observations (or their residuals) are not independent, i.e. each value is influenced by neighbouring values rather than standing alone
What is wrong with autocorrelation?
Our parametric tests assume it does not occur, so its presence violates their independence assumption
What test do we use for testing for autocorrelation?
durbin watson
What is the range of values for significant positive autocorrelation, no autocorrelation and significant negative autocorrelation?
positive autocorrelation = 0-1.475
no autocorrelation = 1.566-2.434
negative autocorrelation = 2.525 - 4
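The Durbin-Watson statistic itself is simple to compute from the residuals (this is a from-scratch sketch, not the SPSS output; statsmodels also provides a durbin_watson function):

```python
import numpy as np

def durbin_watson(residuals):
    # DW = sum of squared successive differences / sum of squared residuals
    # Ranges from 0 to 4; values near 2 suggest no autocorrelation
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Alternating residuals: strong negative autocorrelation (DW well above 2)
dw_alt = durbin_watson([1, -1, 1, -1, 1, -1])

# Independent random residuals: DW lands near 2
rng = np.random.default_rng(0)
dw_rand = durbin_watson(rng.normal(size=500))
print(round(dw_alt, 2), round(dw_rand, 2))
```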
what is the difference between autocorrelation and multicollinearity?
Autocorrelation involves successive observations of the same variable (and hence its correlation with y) not being independent, whereas multicollinearity is when different predictors are linked to each other, so that distinguishing their individual impact on y is difficult
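Multicollinearity can be screened by correlating the predictors with each other, or via the variance inflation factor (VIF); a sketch on simulated predictors (the 0.95 mixing weight and the VIF > ~10 rule of thumb are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
x1 = rng.normal(size=200)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=200)  # nearly a copy of x1
x3 = rng.normal(size=200)                     # unrelated predictor

# Screen for multicollinearity: correlation between predictors
r12 = np.corrcoef(x1, x2)[0, 1]  # close to 1 -> collinear
r13 = np.corrcoef(x1, x3)[0, 1]  # close to 0 -> fine

# Variance inflation factor for x2 against x1 (VIF > ~10 flags trouble)
vif_x2 = 1.0 / (1.0 - r12 ** 2)
print(round(r12, 3), round(r13, 3), round(vif_x2, 1))
```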