Parametric test assumptions Flashcards

1
Q

Define

Parametric test

A

tests that make assumptions about the parameters of the population distribution from which the sample is drawn

2
Q

Define

Outlier

A

a data point that differs significantly from other observations

3
Q

Define

Linear transformation

A

a function from one vector space to another that respects the underlying (linear) structure of each vector space

4
Q

Define

Non-parametric test

A

tests that don’t assume that your data follow a specific distribution

5
Q

Define

Central limit theorem

A

states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normal

6
Q

Define

Normality

A

the assumption that the sampling distribution of the mean is normal, i.e. that the distribution of means across samples is normal

7
Q

Define

Homogeneity of variance

A

the assumption that all groups have the same or similar variance

8
Q

Define

Independence

A

means that your data isn’t connected in any way (at least, in ways that you haven’t accounted for in your model)

9
Q

Define

Residual

A

The difference between the observed value of the dependent variable (y) and the predicted value (ŷ)

10
Q

Define

Kurtosis

A

a measure of the combined weight of a distribution’s tails relative to the center of the distribution

11
Q

Define

Leptokurtic

A

having greater kurtosis than the normal distribution; heavier tails and a sharper peak about the mean

12
Q

Define

Mesokurtic

A

having the same kurtosis as the normal distribution

13
Q

Define

Platykurtic

A

a statistical distribution in which the excess kurtosis value is negative

14
Q

Define

Shapiro-Wilk test

A

a test that examines if a variable is normally distributed in a population

15
Q

Define

Q-Q Plot

A

a scatterplot created by plotting two sets of quantiles against one another

16
Q

Define

Univariate outlier

A

outlier when considering only the distribution of the variable it belongs to

17
Q

Define

Bivariate outlier

A

outlier when considering the joint distribution of two variables

18
Q

Define

Multivariate outlier

A

outliers when simultaneously considering multiple variables

19
Q

Define

Log transformation

A

A type of transformation that can be used to reduce positive skew and stabilise variance. It is only defined for positive values (x > 0)

20
Q

Define

Square root transformation

A

A type of transformation that can be used to reduce positive skew and stabilise variance. It is defined for zero and positive values

21
Q

Define

Reciprocal transformation

A

A type of transformation that can reduce the impact of large scores and stabilise variance. The transformation reverses the order of the scores; this can be avoided by reversing the scores before transforming

22
Q

Definition

tests that make assumptions about the parameters of the population distribution from which the sample is drawn

A

Parametric test

23
Q

Definition

a data point that differs significantly from other observations

A

Outlier

24
Q

Definition

a function from one vector space to another that respects the underlying (linear) structure of each vector space

A

Linear transformation

25
Q

Definition

tests that don’t assume that your data follow a specific distribution

A

Non-parametric test

26
Q

Definition

states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normal

A

Central limit theorem

27
Q

Definition

the assumption that the sampling distribution of the mean is normal, i.e. that the distribution of means across samples is normal

A

Normality

28
Q

Definition

the assumption that all groups have the same or similar variance

A

Homogeneity of variance

29
Q

Definition

means that your data isn’t connected in any way (at least, in ways that you haven’t accounted for in your model)

A

Independence

30
Q

Definition

The difference between the observed value of the dependent variable (y) and the predicted value (ŷ)

A

Residual

31
Q

Definition

a measure of the combined weight of a distribution’s tails relative to the center of the distribution

A

Kurtosis

32
Q

Definition

having greater kurtosis than the normal distribution; heavier tails and a sharper peak about the mean

A

Leptokurtic

33
Q

Definition

having the same kurtosis as the normal distribution

A

Mesokurtic

34
Q

Definition

a statistical distribution in which the excess kurtosis value is negative

A

Platykurtic

35
Q

Definition

a test that examines if a variable is normally distributed in a population

A

Shapiro-Wilk test

36
Q

Definition

a scatterplot created by plotting two sets of quantiles against one another

A

Q-Q Plot

37
Q

Definition

outlier when considering only the distribution of the variable it belongs to

A

Univariate outlier

38
Q

Definition

outlier when considering the joint distribution of two variables

A

Bivariate outlier

39
Q

Definition

outliers when simultaneously considering multiple variables

A

Multivariate outlier

40
Q

Definition

A type of transformation that can be used to reduce positive skew and stabilise variance. It is only defined for positive values (x > 0)

A

Log transformation

41
Q

Definition

A type of transformation that can be used to reduce positive skew and stabilise variance. It is defined for zero and positive values

A

Square root transformation

42
Q

Definition

A type of transformation that can reduce the impact of large scores and stabilise variance. The transformation reverses the order of the scores; this can be avoided by reversing the scores before transforming

A

Reciprocal transformation

43
Q

What is the difference between parametric and non-parametric tests?

A

Parametric:

  • Assess group means
  • Require that your data follow the normal distribution
    • Except with large sample sizes, where the central limit theorem applies
  • Can deal with unequal variances across groups (with a suitable correction, e.g. the Welch t-test)
  • More powerful when their assumptions are met

Non-parametric:

  • Assess group medians
  • Don’t require that your data follow the normal distribution
  • Can deal with small sample sizes
44
Q

When deciding between a parametric and non-parametric test what questions should you ask yourself?

A

What is the best central tendency measure for your data?

What is your sample size?

45
Q

Parametric tests are based on the normal distribution and have what assumptions?

A
  1. Additivity and linearity
  2. Normality
  3. Homogeneity of variance
  4. Independence
46
Q

What is the standard linear model equation and what do the variables represent?

A

Yi = b0 + b1x1 + b2x2 + ei

  • Yi = outcome variable
  • b0 = y-intercept
  • x1 & x2 = predictor variables
  • b1 & b2 = slopes of the predictors
  • ei = error
47
Q

True or False:

In the standard linear model, the slope (effect) of one predictor does not depend on the values of the other variables.

A

True
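
The model equation and its additivity can be illustrated with a minimal Python sketch. The coefficient values and the function names `predict` and `residual` are hypothetical, chosen only for illustration; they do not come from the cards.

```python
def predict(b0, b1, b2, x1, x2):
    """Predicted outcome from the two-predictor linear model."""
    return b0 + b1 * x1 + b2 * x2


def residual(y_observed, y_predicted):
    """Residual e_i: observed minus predicted outcome."""
    return y_observed - y_predicted


# Hypothetical coefficients, chosen only for illustration.
b0, b1, b2 = 2.0, 0.5, 1.5

# Raising x1 by one unit changes the prediction by b1,
# regardless of the value x2 is held at.
effect_when_x2_is_0 = predict(b0, b1, b2, 1, 0) - predict(b0, b1, b2, 0, 0)
effect_when_x2_is_10 = predict(b0, b1, b2, 1, 10) - predict(b0, b1, b2, 0, 10)
```

Both differences equal b1, which is what "the slope of one predictor does not depend on the other variables" means in practice.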

48
Q

What does linear and additive mean about variables x1, x2 and y?

A

Linear and additive data mean that x1 and x2 each predict y along a straight line.

The outcome y is an additive combination of the effects of x1 and x2; for example, y increases as both x1 and x2 increase.

49
Q

How can we assess linearity?

A
  • Plot of observed vs predicted values (points should be symmetrically distributed around the diagonal line)
  • Plot of residuals vs predicted values (points should be symmetrically distributed around the horizontal line)
    • a bow (curved) shape indicates that the assumption has been violated
50
Q

How do you fix violations of additivity and linearity?

A
  • Apply nonlinear transformation to variables
  • Add another regressor that is a nonlinear function of a predictor (e.g. a polynomial term)
  • Examine moderators (interactions between predictors)
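
As a rough sketch of the residual check and the polynomial fix, the Python below fits a straight line to made-up quadratic data and inspects the residuals. The helper names `fit_line` and `residuals` are hypothetical. For simplicity the example swaps the predictor for its square rather than adding a second regressor, which is a simplification of the approach listed above.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """Observed minus predicted values from the fitted line."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up data with a curved (quadratic) relationship.
x = [0, 1, 2, 3, 4, 5, 6]
y = [xi ** 2 for xi in x]

# Straight-line fit: residuals are positive at both ends and negative
# in the middle, the bow shape that signals non-linearity.
bowed = residuals(x, y)

# Using x squared as the regressor makes the relationship linear,
# so the residuals collapse to (approximately) zero.
x_squared = [xi ** 2 for xi in x]
flat = residuals(x_squared, y)
```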
51
Q

What sample size is large enough for the central limit theorem to apply?

A

More than 30 participants (a common rule of thumb; strongly skewed data may need more)
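
A small simulation can make the theorem concrete. The sketch below is illustrative only: the exponential population, the seed, and the counts of 5000 scores and 2000 samples are arbitrary assumptions. It compares the skewness of raw scores with the skewness of means of samples of size 30.

```python
import random
import statistics


def skewness(values):
    """Population skewness: the mean cubed z-score."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


random.seed(1)  # fixed seed so the run is repeatable

# A strongly right-skewed population: exponential with mean 1.
raw_scores = [random.expovariate(1.0) for _ in range(5000)]

# Means of 2000 random samples, each of size 30.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(30))
    for _ in range(2000)
]

skew_raw = skewness(raw_scores)      # strongly positive
skew_means = skewness(sample_means)  # much closer to 0, i.e. closer to normal
```

The sample means are far less skewed than the raw scores even though the population itself is not normal.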

52
Q

Is this positively or negatively skewed?

A

Negative

53
Q

What are the three types of kurtosis (in order of increasing central value height)?

A

Platykurtic (negative)

Mesokurtic (normal distribution)

Leptokurtic (positive)
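
To see the three types numerically, here is a hedged sketch that computes excess kurtosis (kurtosis minus 3, so a normal distribution scores about 0) for three simulated samples. The distributions, seed, and sample size are assumptions made for illustration.

```python
import random
import statistics


def excess_kurtosis(values):
    """Mean fourth-power z-score minus 3, so a normal distribution scores 0."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 4 for v in values) / len(values) - 3


random.seed(2)  # fixed seed so the run is repeatable
n = 20000

flat_topped = [random.uniform(0, 1) for _ in range(n)]  # platykurtic
bell_shaped = [random.gauss(0, 1) for _ in range(n)]    # mesokurtic
# The difference of two exponentials follows a Laplace distribution,
# which has heavier tails than the normal: leptokurtic.
heavy_tailed = [random.expovariate(1) - random.expovariate(1) for _ in range(n)]

k_platy = excess_kurtosis(flat_topped)   # negative
k_meso = excess_kurtosis(bell_shaped)    # near zero
k_lepto = excess_kurtosis(heavy_tailed)  # positive
```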

54
Q

When assessing normality what graphical displays do we use to check data or residuals?

A

Q-Q plot

Histogram

55
Q

What are the two main tests of normality?

A

Shapiro-Wilk test

Q-Q plot (a graphical check rather than a formal test)

56
Q

What does a Shapiro-Wilk test do?

A
  • Tests whether the data differ from a normal distribution
  • Statistically significant (p < .05) → data differ significantly from a normal distribution (i.e., normality assumption is violated)
  • Not statistically significant (p > .05) → data do not differ significantly from a normal distribution (i.e., normality assumption is not violated)
57
Q

What does a Q-Q plot do?

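A

Plots the quantiles of the data against the quantiles of a normal distribution; points that fall close to the diagonal line suggest the data are approximately normal.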
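
One way to build the points of a normal Q-Q plot by hand is sketched below using Python's standard library. The function name `normal_qq_points`, the plotting positions (i - 0.5) / n, and the scores are assumptions made for illustration; statistical packages may use slightly different conventions.

```python
from statistics import NormalDist, fmean, stdev


def normal_qq_points(values):
    """Pair each sorted observation with the matching normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    reference = NormalDist(fmean(ordered), stdev(ordered))
    # Plotting positions (i - 0.5) / n keep probabilities strictly inside (0, 1).
    theoretical = [reference.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Made-up scores for illustration.
scores = [4.1, 5.0, 5.2, 5.9, 6.3, 6.8, 7.4, 8.0, 9.5]
points = normal_qq_points(scores)
```

Plotting the first element of each pair on the x-axis and the second on the y-axis gives the Q-Q plot; roughly normal data fall near the line y = x.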
58
Q

What are the three types of outliers?

A

Univariate

Bivariate

Multivariate

59
Q

What type of outlier is this?

A

Univariate for both variables

60
Q

What type of outlier is this?

A

Bivariate

61
Q

How do you deal with outliers?

A
  • Remove the case or trim the data
  • Transform the data
  • Change the score (known as winsorizing):
    • Change the score to the next highest value plus some small number (e.g., 1, or whatever is appropriate to the scale of the data)
    • Convert the score to that expected for a z-score of ±3.29
    • Convert the score to the mean plus 2 or 3 standard deviations
    • Convert the score to a percentile of the distribution (e.g., 0.5th or 99.5th percentile)
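
The z-score option can be sketched as follows. This is a minimal illustration with made-up data; the function name `winsorize_by_z` is hypothetical, and the mean and standard deviation are computed with the extreme score included, which is an assumption rather than something the cards specify.

```python
from statistics import fmean, stdev


def winsorize_by_z(values, z_limit=3.29):
    """Pull scores beyond +/- z_limit standard deviations back to that limit."""
    mean = fmean(values)
    sd = stdev(values)
    lower = mean - z_limit * sd
    upper = mean + z_limit * sd
    return [min(max(v, lower), upper) for v in values]


# Made-up data: 30 ordinary scores and one extreme score.
scores = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10] * 3 + [100]
adjusted = winsorize_by_z(scores)
```

Only the extreme score changes: it is replaced by the value that corresponds to z = 3.29, while the other 30 scores are left alone. In very small samples a single outlier inflates the standard deviation so much that no score can reach z = 3.29, so this rule suits larger samples.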
62
Q

Why is it a good idea to transform data?

A
  1. For convenience or ease of interpretation – standardisation, e.g. z scores allow for simpler comparisons
  2. Reducing skewness – help get closer to meeting normality assumption
  3. Equalising spread or improving homogeneity of variance – produce approximately equal spreads
  4. Linearising relationships between variables – to fit non-linear relationships into linear models
  5. Making relationships additive and therefore fulfilling assumptions for certain tests
63
Q

What is the difference between a linear and non-linear transformation?

A

Linear transformations do not change the shape of the distribution. They may change the value of the mean and/or standard deviation, but the shape of the distribution remains unchanged.

Non-linear transformations change the shape of the distribution

64
Q

What are some examples of linear transformations?

A

Adding a constant to each number, x + 1

Converting raw scores to z-scores, (x – m)/SD

Mean centring, x – m
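
The three examples can be checked with a short sketch. The scores are made up, and the population standard deviation (`pstdev`) is used for SD as an assumption. Skewness, a measure of shape, comes out the same before and after each transformation.

```python
from statistics import fmean, pstdev


def skewness(values):
    """Population skewness: the mean cubed z-score."""
    mean = fmean(values)
    sd = pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


scores = [1, 2, 2, 3, 3, 3, 4, 9, 15]  # made-up, positively skewed

mean = fmean(scores)
sd = pstdev(scores)

shifted = [x + 1 for x in scores]             # add a constant
centred = [x - mean for x in scores]          # mean centring
z_scores = [(x - mean) / sd for x in scores]  # convert to z-scores
```

The mean and standard deviation change (the z-scores have mean 0 and SD 1), but `skewness` returns the same value for all four lists.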

65
Q

What are some examples of non-linear transformations?

A

Log, log(x) or ln(x)

Square root, √x

Reciprocal, 1/x
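
As a hedged illustration with made-up, strictly positive scores, the sketch below applies all three transformations and compares skewness before and after. The data and the helper `skewness` are assumptions for the example.

```python
import math
from statistics import fmean, pstdev


def skewness(values):
    """Population skewness: the mean cubed z-score."""
    mean = fmean(values)
    sd = pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


scores = [1, 2, 2, 3, 3, 3, 4, 9, 15]  # made-up, positively skewed, all > 0

logged = [math.log(x) for x in scores]   # natural log; needs x > 0
rooted = [math.sqrt(x) for x in scores]  # square root; needs x >= 0
reciprocal = [1 / x for x in scores]     # needs x != 0; reverses the order

skew_raw = skewness(scores)
skew_root = skewness(rooted)
skew_log = skewness(logged)
```

For these scores the square root reduces the positive skew and the log reduces it further, so the shape of the distribution changes. The reciprocal turns the largest raw score into the smallest transformed score, which is the order reversal mentioned for that transformation.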