Parametric test assumptions Flashcards

1
Q

Define

Parametric test

A

tests that make assumptions about the parameters of the population distribution from which the sample is drawn

2
Q

Define

Outlier

A

a data point that differs significantly from other observations

3
Q

Define

Linear transformation

A

a function from one vector space to another that respects the underlying (linear) structure of each vector space

4
Q

Define

Non-parametric test

A

tests that don’t assume that your data follow a specific distribution

5
Q

Define

Central limit theorem

A

states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normally distributed
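For illustration, a minimal Python sketch (assuming numpy is available, with a made-up population) of the theorem in action: repeated samples are drawn from a skewed exponential population, and the resulting sample means cluster approximately normally around μ with spread σ/√n.

```python
import numpy as np

rng = np.random.default_rng(42)

# A clearly non-normal (right-skewed) population: exponential with mean 2.
mu = 2.0          # population mean (for the exponential, sigma = mu as well)
n = 50            # size of each random sample

# Draw 10,000 random samples of size n and compute each sample's mean.
sample_means = rng.exponential(scale=mu, size=(10_000, n)).mean(axis=1)

# The sample means are approximately normal, centred on mu,
# with standard deviation close to sigma / sqrt(n).
print("mean of sample means:", round(sample_means.mean(), 3))        # ~2.0
print("std of sample means: ", round(sample_means.std(ddof=1), 3))   # ~2/sqrt(50) ≈ 0.28
```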

6
Q

Define

Normality

A

the assumption that the sampling distribution of the mean is normal, i.e. that the distribution of means across samples is normal

7
Q

Define

Homogeneity of variance

A

the assumption that all groups have the same or similar variance
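A small Python sketch (assuming scipy is available, with made-up group data) of one common way to check this assumption, Levene's test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Two hypothetical groups with similar spread (scale = 2 in both).
group_a = rng.normal(loc=10, scale=2, size=40)
group_b = rng.normal(loc=12, scale=2, size=40)

# Levene's test: H0 = the groups have equal variances.
# A non-significant result (p > .05) is consistent with homogeneity of variance.
stat, p = stats.levene(group_a, group_b, center="median")
print(f"Levene W = {stat:.3f}, p = {p:.3f}")
```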

8
Q

Define

Independence

A

means that your observations are not connected in any way (or at least not in ways that you haven’t accounted for in your model)

9
Q

Define

Residual

A

The difference between the observed value of the dependent variable (y) and the predicted value (ŷ)

10
Q

Define

Kurtosis

A

a measure of the combined weight of a distribution’s tails relative to the center of the distribution

11
Q

Define

Leptokurtic

A

having greater kurtosis than the normal distribution; more concentrated about the mean

12
Q

Define

Mesokurtic

A

having the same kurtosis as the normal distribution

13
Q

Define

Platykurtic

A

a statistical distribution in which the excess kurtosis value is negative

14
Q

Define

Shapiro-Wilk test

A

a test that examines if a variable is normally distributed in a population
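For illustration, a minimal scipy sketch (on simulated data) of running the test and reading its p-value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
scores = rng.normal(loc=50, scale=10, size=100)   # simulated, roughly normal scores

# H0: the data come from a normal distribution.
# p < .05 would suggest the normality assumption is violated.
stat, p = stats.shapiro(scores)
print(f"W = {stat:.3f}, p = {p:.3f}")
```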

15
Q

Define

Q-Q Plot

A

a scatterplot created by plotting two sets of quantiles against one another
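A minimal Python sketch (assuming scipy and matplotlib are available) that draws a normal Q-Q plot for simulated data; points lying close to the reference line suggest approximate normality:

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
scores = rng.normal(size=200)                 # simulated data to check

# Plot the sample quantiles against the quantiles of a normal distribution.
fig, ax = plt.subplots()
stats.probplot(scores, dist="norm", plot=ax)
fig.savefig("qq_plot.png")
```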

16
Q

Define

Univariate outlier

A

outlier when considering only the distribution of the variable it belongs to

17
Q

Define

Bivariate outlier

A

outlier when considering the joint distribution of two variables

18
Q

Define

Multivariate outlier

A

outliers when simultaneously considering multiple variables

19
Q

Define

Log transformation

A

A type of transformation that can be used to reduce positive skew and stabilise variance. It is only defined for positive values (x > 0)

20
Q

Define

Square root transformation

A

A type of transformation that can be used to reduce positive skew and stabilise variance. It is defined for zero and positive values

21
Q

Define

Reciprocal transformation

A

A type of transformation that can reduce the impact of large scores and stabilise variance. The transformation reverses the order of the scores, but this can be avoided by reversing the scores before transforming
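To make the three transformation cards above concrete, a small Python sketch (assuming numpy/scipy, with simulated positively skewed data) comparing the skewness of the raw, log- and square-root-transformed scores, and showing how the reciprocal reverses their order:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0, sigma=1, size=1_000)   # positively skewed, all values > 0

log_x = np.log(x)       # log: only defined for x > 0
sqrt_x = np.sqrt(x)     # square root: defined for x >= 0

for name, values in [("raw", x), ("log", log_x), ("sqrt", sqrt_x)]:
    print(f"{name:5s} skew = {stats.skew(values):+.2f}")

# Reciprocal: reduces the impact of large scores but reverses their order,
# e.g. the largest raw score becomes the smallest transformed score.
recip_x = 1 / x
print("largest raw value maps to smallest reciprocal:",
      np.argmax(x) == np.argmin(recip_x))
```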

22
Q

Definition

tests that make assumptions about the parameters of the population distribution from which the sample is drawn

A

Parametric test

23
Q

Definition

a data point that differs significantly from other observations

A

Outlier

24
Q

Definition

a function from one vector space to another that respects the underlying (linear) structure of each vector space

A

Linear transformation

25
# Definition tests don't assume that your data follow a specific distribution
Non-parametric test
26
# Definition states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normally distributed
Central limit theorem
27
# Definition the assumption that the sampling distribution of the mean is normal, i.e. that the distribution of means across samples is normal
Normality
28
# Definition the assumption that all groups have the same or similar variance
Homogeneity of variance
29
# Definition means that your observations are not connected in any way (or at least not in ways that you haven't accounted for in your model)
Independence
30
# Definition The difference between the observed value of the dependent variable (y) and the predicted value (ŷ)
Residual
31
# Definition a measure of the combined weight of a distribution's tails relative to the center of the distribution
Kurtosis
32
# Definition having greater kurtosis than the normal distribution; more concentrated about the mean
Leptokurtic
33
# Definition having the same kurtosis as the normal distribution
Mesokurtic
34
# Definition a statistical distribution in which the excess kurtosis value is negative
Platykurtic
35
# Definition a test that examines if a variable is normally distributed in a population
Shapiro-Wilk test
36
# Definition a scatterplot created by plotting two sets of quantiles against one another
Q-Q Plot
37
# Definition outlier when considering only the distribution of the variable it belongs to
Univariate outlier
38
# Definition outlier when considering the joint distribution of two variables
Bivariate outlier
39
# Definition outliers when simultaneously considering multiple variables
Multivariate outlier
40
# Definition A type of transformation that can be used to reduce positive skew and stabilise variance. It is only defined for positive values (x > 0)
Log transformation
41
# Definition A type of transformation that can be used to reduce positive skew and stabilise variance. It is defined for zero and positive values
Square root transformation
42
# Definition A type of transformation that can reduce the impact of large scores and stabilise variance. The transformation reverses the order of the scores, but this can be avoided by reversing the scores before transforming
Reciprocal transformation
43
What is the difference between parametric and non-parametric tests?
**Parametric:**

* Assess group means
* Require that your data follow the normal distribution (except for large sample sizes, due to the central limit theorem)
* Can deal with unequal variances across groups
* More powerful

**Non-parametric:**

* Assess group medians
* Don’t require that your data follow the normal distribution
* Can deal with small sample sizes
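As an illustrative pairing (not from the original deck), a scipy sketch on simulated groups comparing a parametric independent-samples t-test with a non-parametric counterpart, the Mann-Whitney U test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
group_a = rng.normal(loc=10, scale=2, size=30)   # simulated group scores
group_b = rng.normal(loc=12, scale=2, size=30)

# Parametric: Welch's t-test on group means (tolerates unequal variances).
t, p_t = stats.ttest_ind(group_a, group_b, equal_var=False)

# Non-parametric: Mann-Whitney U test, rank-based, no normality assumption.
u, p_u = stats.mannwhitneyu(group_a, group_b)

print(f"t-test:       t = {t:.2f}, p = {p_t:.4f}")
print(f"Mann-Whitney: U = {u:.1f}, p = {p_u:.4f}")
```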
44
When deciding between a parametric and non-parametric test what questions should you ask yourself?
What is the best central tendency measure for your data?
What is your sample size?
45
Parametric tests are based on the normal distribution and have what assumptions?
1. Additivity and linearity
2. Normality
3. Homogeneity of variance
4. Independence
46
What is the standard linear model equation and what do the variables represent?
*Yi = b0 + b1x1 + b2x2 + ei*

* Yi = outcome variable
* b0 = y-intercept
* x1 & x2 = predictor variables
* b1 & b2 = slopes of the predictors
* ei = error
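For illustration, a minimal numpy sketch (with simulated data and made-up coefficients) that fits this model by least squares and computes the residuals ei = Yi − Ŷi:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(scale=0.3, size=n)   # simulated outcome

# Design matrix with a column of ones for the intercept b0.
X = np.column_stack([np.ones(n), x1, x2])

# Least-squares estimates of b0, b1, b2.
b, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ b              # predicted values (y-hat)
residuals = y - y_hat      # ei = observed - predicted

print("estimated coefficients:", np.round(b, 2))   # ~[1.0, 2.0, 0.5]
print("mean residual:", round(float(residuals.mean()), 4))
```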
47
True or False: In the standard linear model, the slope (effect) of one predictor does not depend on the values of the other variables.
True
48
What does linear and additive mean about variables x1, x2 and y?
Linear and additive mean that x1 and x2 predict y. The outcome y is an additive combination of the effects of x1 and x2; that is, y increases as both x1 and x2 increase.
49
How can we assess linearity?
* Plot of observed vs predicted values (points should be symmetrically distributed around the diagonal line)
* Plot of residuals vs predicted values (points should be symmetrically distributed around a horizontal line)
* Look out for a bow shape, which indicates the assumption has been violated
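A matplotlib sketch (on simulated data) of the second check: residuals plotted against predicted values, where a symmetric cloud around the horizontal zero line supports linearity and a bow shape signals a violation:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=200)
y = 2 + 1.5 * x + rng.normal(scale=1.0, size=200)   # simulated, genuinely linear data

# Fit a simple linear model and compute predicted values and residuals.
b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x
residuals = y - y_hat

# Residuals vs predicted values: look for symmetry around the zero line.
fig, ax = plt.subplots()
ax.scatter(y_hat, residuals, s=10)
ax.axhline(0, color="black")
ax.set_xlabel("Predicted values")
ax.set_ylabel("Residuals")
fig.savefig("residuals_vs_predicted.png")
```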
50
How do you fix it when additivity and linearity are violated?
* Apply nonlinear transformation to variables
* Add another regressor that is a nonlinear function – polynomial curve
* Examine moderators
51
What sample size is large enough for the central limit theorem to apply?
> 30 participants
52
Is this positively or negatively skewed?
Negative
53
What are the three types of kurtosis (in order of increasing central value height)?
Platykurtic (negative)
Mesokurtic (normal distribution)
Leptokurtic (positive)
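For illustration, a scipy sketch (on simulated data) showing the three types via excess kurtosis, which scipy reports relative to the normal distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
samples = {
    "uniform (platykurtic)": rng.uniform(size=10_000),           # light tails
    "normal (mesokurtic)":   rng.normal(size=10_000),            # reference
    "t, df=5 (leptokurtic)": rng.standard_t(df=5, size=10_000),  # heavy tails
}

# fisher=True returns *excess* kurtosis: ~0 for normal data,
# negative for platykurtic data, positive for leptokurtic data.
for name, data in samples.items():
    print(f"{name:22s} excess kurtosis = {stats.kurtosis(data, fisher=True):+.2f}")
```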
54
When assessing normality what graphical displays do we use to check data or residuals?
Q-Q plot
Histogram
55
What are the two main tests of normality?
Shapiro-Wilk test
Q-Q plot
56
What does a Shapiro-Wilk test do?
* Tests if the data differ from a normal distribution
* Statistically significant (p < .05) → the data vary significantly from a normal distribution (i.e., the normality assumption is violated)
* Not statistically significant (p > .05) → the data do not vary significantly from a normal distribution (i.e., the normality assumption is not violated)
57
What does a Q-Q plot do?
Plots the quantiles of the sample data against the quantiles of a theoretical (e.g., normal) distribution; if the points fall close to the diagonal reference line, the data approximately follow that distribution.
58
What are the three types of outliers?
Univariate
Bivariate
Multivariate
59
What type of outlier is this?
Univariate for both variables
60
What type of outlier is this?
Bivariate
61
How do you deal with outliers?
* Remove the case or trim the data
* Transform the data
* Change the score (known as winsorizing):
  * Change the score to the next highest value plus some small number (e.g., 1, or whatever is appropriate to the scale of the data)
  * Convert the score to that expected for a z-score of ±3.29
  * Convert the score to the mean plus 2 or 3 standard deviations
  * Convert the score to a percentile of the distribution (e.g., the 0.5th or 99.5th percentile)
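To illustrate the z-score option from the list above, a small Python sketch (assuming numpy/scipy, with one artificial extreme score) that flags values beyond |z| = 3.29 and caps them at that boundary:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
scores = np.append(rng.normal(loc=50, scale=10, size=99), 200.0)   # one extreme score

# Flag scores whose z-score exceeds ±3.29 ...
z = stats.zscore(scores)              # uses the sample mean and SD
print("flagged outliers:", scores[np.abs(z) > 3.29])

# ... and cap them at the values corresponding to z = ±3.29 (winsorizing-style fix).
upper = scores.mean() + 3.29 * scores.std()
lower = scores.mean() - 3.29 * scores.std()
cleaned = np.clip(scores, lower, upper)
print("max before/after:", scores.max(), round(float(cleaned.max()), 1))
```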
62
Why is it a good idea to transform data?
1. For convenience or ease of interpretation – standardisation, e.g. z-scores allow for simpler comparisons
2. Reducing skewness – helps get closer to meeting the normality assumption
3. Equalising spread or improving homogeneity of variance – produces approximately equal spreads
4. Linearising relationships between variables – to fit non-linear relationships into linear models
5. Making relationships additive and therefore fulfilling assumptions for certain tests
63
What is the difference between a linear and non-linear transformation?
**Linear transformations** do not change the shape of the distribution. They may change the mean and/or standard deviation, but the shape of the distribution remains unchanged. **Non-linear transformations** change the shape of the distribution.
64
What are some examples of linear transformations?
Adding a constant to each number, x + 1
Converting raw scores to z-scores, (x – m)/SD
Mean centring, x – m
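A quick numpy/scipy sketch (on simulated skewed data) confirming that these linear transformations change the mean and SD but not the shape: the skewness stays identical:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.exponential(scale=2.0, size=1_000)   # a positively skewed variable

z = (x - x.mean()) / x.std(ddof=1)           # z-scores
centred = x - x.mean()                       # mean centring

# The shape (skewness) is unchanged by linear transformations.
print("skew raw:     ", round(float(stats.skew(x)), 3))
print("skew z-scores:", round(float(stats.skew(z)), 3))
print("skew centred: ", round(float(stats.skew(centred)), 3))
```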
65
What are some examples of non-linear transformations?
Log, log(x) or ln(x)
Square root, √x
Reciprocal, 1/x