Week 3 - Parametric test assumptions Flashcards
What are the features of a parametric test?
- assess group means
- data must have normal distribution (+CLT)
- assumes equal (homogeneous) variances across groups
- more powerful
What are the features of a non-parametric test?
e.g. rank-based tests such as Spearman's correlation
- assess group MEDIANS
- data doesn’t need to be normally distributed
- can handle small sample size
Questions to ask yourself when deciding to use a parametric test or not
- sample size
- best way to measure central tendency (e.g. median or mean?)
What are the parametric test assumptions? (4)
- Additivity and linearity
- Normality (Gaussian distribution/Bell curve)
- Homogeneity of variances
- Independence of observations
Describe the assumption of Additivity and linearity
Involves a standard linear model/ equation (describing a straight line)
What is the Standard linear model equation
Yi = b0 + b1X1 + Ei
Yi= the ith person’s score on the outcome variable
b0 = Y-intercept. Value of Y when X = 0; the point at which the regression line crosses the y-axis
b1 = regression coefficient for the first predictor (b2 for the second predictor).
- Gradient (slope/ rise over run) of the regression
- Direction/ strength of relationship
Ei= the difference between the actual and predicted value of Y for the ith person
- residual/ error
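The equation above can be sketched numerically. A minimal illustration with made-up data (the values and variable names mirror the card's notation but are not from the course):

```python
import numpy as np

# Hypothetical data (illustrative only): X1 predicts Y
rng = np.random.default_rng(42)
X1 = rng.normal(50, 10, size=100)            # predictor scores
Y = 2.0 + 0.5 * X1 + rng.normal(0, 3, 100)   # outcome with random error

# Estimate b0 (intercept) and b1 (slope) by least squares
b1, b0 = np.polyfit(X1, Y, deg=1)

predicted = b0 + b1 * X1    # predicted Y for each person
residuals = Y - predicted   # Ei = actual minus predicted value

print(b0, b1)               # b1 should land near the true slope of 0.5
print(residuals.mean())     # least-squares residuals average ~0
```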
What does it mean for data to be linear and additive?
- X1 and X2 predict Y.
- The outcome is a linear function of the predictors (X1 + X2)
- predictors are added together and their effects do not depend on the values of other variables (unlike in a multiplicative model)
The outcome Y is an additive combination of the effects of X1 and X2. e.g. as both X1 and X 2 increase, Y increases also
True or false:
The outcome Y is an additive combination of the effects of X1 and X2. e.g. as both X1 and X 2 increase, Y increases also
true
How can we assess linearity?
- plot observed vs predicted values (points should fall symmetrically around a diagonal line)
- plot residuals vs predicted values (points should fall symmetrically around a horizontal line at 0)
How can we fix non-linearity?
- apply a nonlinear transformation to the variables
- add a nonlinear function of a regressor (e.g. a polynomial term) to fit a curve
- examine moderators
Describe the assumption of Normality
relevant to:
- parameters (sampling distribution)
- residuals/ error terms
- -> confidence intervals around parameter
- -> Null hypothesis significance testing
What is Central Limit Theorem (CLT)?
As the sample size increases toward infinity (gets larger), the sampling distribution approaches normal.
–> sample means will be normally distributed thus you don’t need to worry too much about the distribution that the samples came from.
–> distribution of means from many samples and re-samples
–>sample size must be AT LEAST 30
For CLT to apply, what size must the sample size be?
At least 30
True or false
According to CLT -
Even if the data is not normal, the sampling distribution of the mean will be approximately normal
True
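The CLT can be seen in a quick simulation. A sketch with made-up, clearly non-normal (exponential) data, not from the lecture:

```python
import numpy as np

# CLT sketch: sample repeatedly (n = 30 each time) from a positively
# skewed exponential population with mean 1.0
rng = np.random.default_rng(0)

sample_means = np.array([
    rng.exponential(scale=1.0, size=30).mean()   # one sample of n = 30
    for _ in range(5000)
])

# The distribution of the 5000 sample means is roughly normal and
# centred on the population mean, even though the raw data are skewed
print(sample_means.mean())   # close to 1.0
```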
True or false
Positively skewed data gathers on the left side and scores bunch at the low values with tails pointing to high values
true
True or false
Negatively skewed data gathers on the left side and scores bunch at the low values
false - it gathers on the RIGHT side (e.g. as you grow, conditions get "worse" in life)
scores bunch at the high values with tails pointing to low values
What is kurtosis?
The amount which data clusters in either the tails (ends) or the peak (tallest part) of the distribution
- heaviness of tails
Draw the following:
Negative Kurtosis
Positive Kurtosis
Normal distribution
Leptokurtic (heavy tails)
Mesokurtic
Platykurtic (light tails)
draw on paper
What are properties of frequency distributions?
- Skewness
- Kurtosis
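Both properties can be computed numerically. A sketch with simulated data (note scipy's `kurtosis` reports *excess* kurtosis, which is 0 for a normal distribution):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
normal_data = rng.normal(size=5000)
skewed_data = rng.exponential(size=5000)  # bunches at low values, tail to the right

print(stats.skew(normal_data))      # ~0: symmetric
print(stats.skew(skewed_data))      # clearly positive skew
print(stats.kurtosis(normal_data))  # excess kurtosis ~0 (mesokurtic)
```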
Checking the distribution to determine if the assumption of normality is met is important. Which graphical displays are used to test for normality?
Q-Q plots (dots on straight line = normal)
Histograms
What is the name for the software (e.g. JASP) based method for testing for normality?
Shapiro-Wilk test
Describe the Shapiro-Wilk test and what a p value of <0.05 means
- tests if data is different from normal distribution
- p < 0.05 = data varies significantly from normal distribution thus normality is violated
In the Shapiro-Wilk test, what does a p value >0.05 mean?
Data does not vary significantly from a normal distribution, thus the normality assumption is not violated
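In Python the same check can be run with `scipy.stats.shapiro` (a sketch with simulated samples; the data are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
normal_sample = rng.normal(size=100)
skewed_sample = rng.exponential(size=100)  # clearly non-normal

w_norm, p_norm = stats.shapiro(normal_sample)
w_skew, p_skew = stats.shapiro(skewed_sample)

print(p_skew < 0.05)  # True: normality violated for the skewed sample
```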
Describe the assumption of homogeneity of variance
Assumes all groups or data points have the same or equal variances = the assumption of equal variances
What does homoscedasticity mean?
All groups have equal/ similar variances
What does heteroscedasticity mean?
All data points/ groups do NOT have equal variances. = unequal variances
Define the “error”
The deviation of an observed value from the regression line
- the difference between what we predicted Y would be based on its X value and what we actually observed in the data
Describe the assumption of independence of observation
Assumes that you do not have repeated measures of data.
- residuals (errors) are unrelated
- assume based on study design
According to the assumption of independence of observations, what happens when observations are non-independent?
results in downwardly biased standard errors (too small), thus incorrect statistical inferences (p values < 0.05 when they should be > 0.05)
–> false significant p values
—> this is why it is important to know study design
—> important for mean values of the outcome to come from a different person or other unit (e.g. family, school)
What is a univariate outlier?
outlier when considering only the distribution of the variable it belongs to
What is a bivariate outlier?
outlier when considering the joint distribution of two variables
- breaking away from the pattern of the association between two variables
What is a multivariate outlier?
outliers when simultaneously considering multiple variables.
What type of outlier is difficult to assess using numbers or graphs?
multivariate outliers
What types of outliers bias the mean and inflate the standard deviation?
Univariate outliers
What types of outliers bias the RELATIONSHIP between two variables e.g. change the strength
bivariate outliers
What are the three ways to deal with outliers?
REMOVE the case or trim the data
TRANSFORM the data
CHANGE the score (winsorizing) pulling the data in e.g. biological data (must be transparent about it when reporting results)
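A minimal winsorizing sketch in Python (the scores and the percentile cut-offs are illustrative choices, not a fixed rule):

```python
import numpy as np

# Hypothetical scores with one extreme high outlier
scores = np.array([3, 4, 4, 5, 5, 6, 6, 7, 40], dtype=float)

# Winsorize: cap scores beyond chosen percentiles, "pulling the data in"
lo, hi = np.percentile(scores, [5, 95])
winsorized = np.clip(scores, lo, hi)

print(winsorized.max() < scores.max())  # True: the 40 has been pulled in
```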
What are some reasons for transforming data?
- ease of interpretation - standardisation e.g. z-scores allow for simpler comparisons
- reducing skewness - closer to normality
- equalising spread/ improving homogeneity of variances
- linearising relationships between variables - to fit non-linear relationships into linear models
- making relationships additive therefore fulfilling assumptions for certain tests
Do linear transformations change the shape of the distribution ?
What do they change?
No
Changes the value of the mean/ SD but shape remains unchanged
How do linear transformations work?
- adding constant to each number, x + 1
- converting raw scores to z-scores (x-m)/SD
- mean centring, x- m
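The three linear transformations above, sketched with toy numbers:

```python
import numpy as np

x = np.array([10.0, 12.0, 14.0, 16.0, 18.0])

shifted = x + 1                     # adding a constant
z = (x - x.mean()) / x.std(ddof=1)  # z-scores: mean 0, SD 1
centred = x - x.mean()              # mean centring: mean 0, SD unchanged

# The shape is unchanged: only the mean (and, for z-scores, the SD) moves
print(z.mean(), z.std(ddof=1))
print(shifted.std(ddof=1), x.std(ddof=1))  # spread identical after shifting
```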
What type of transformation changes the shape of the distribution?
non-linear transformations
- Log, log(X) or ln(x)
- Square root of x
- Reciprocal, 1/x
When would you use a log transformation [log(x)]?
- reduce positive skew
- stabilise variance
- only defined for positive values
When would you use a square root transformation?
- reduce positive skew
- stabilise variance
- defined for zero/ positive values
When would you use a reciprocal transformation?( 1/x)
- reduce impact of large scores
- stabilise variance
- it reverses the scores; this can be avoided by reversing the scores before transforming: 1/(Xhighest - X)
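The three non-linear transformations compared on simulated positively skewed data (a sketch; the scale and sample size are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.exponential(scale=2.0, size=1000)  # positively skewed, all values > 0

log_x = np.log(x)    # log transformation
sqrt_x = np.sqrt(x)  # square root transformation
recip_x = 1 / x      # reciprocal - note it reverses the ordering of scores

print(stats.skew(x) > stats.skew(sqrt_x))  # True: skew reduced
print(np.argmax(x) == np.argmin(recip_x))  # True: largest score becomes smallest
```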
What are the negatives of transforming data?
- non-linear transformations (used to normalise distributions e.g. log, square root, reciprocal) CHANGE the data & results -> a 1-unit increase on the natural log scale is not the same as a 1-unit increase in the raw scores
- transformation can hinder analysis if the wrong transformation is applied
- makes interpretation difficult (dealing with both raw scores and transformed scores)