Parametric Tests and Assumptions Flashcards

1
Q

What do parametric tests assess?

What is required to run them?

A
  • Parametric tests assess group means
  • Require the data to follow a normal distribution
  • Can deal with unequal variances across groups
  • Are generally more powerful
  • Can still produce reliable results with continuous data that is not normally distributed, if sample size requirements are met (CLT)
2
Q

If data does not meet parametric assumptions, what non-parametric tests would you use?

A
  • Non-parametric versions of the same tests. For correlation, for example, a Spearman's correlation test.
  • Non-parametric tests assess group medians rather than means, and they don't require a normal distribution.
3
Q

What is the loophole with parametric tests when continuous data is not normally distributed (and therefore, according to the assumptions, perhaps a non-parametric test should be chosen)?

A
  • The loophole: if sample size requirements are met, the central limit theorem applies and the test can still produce reliable results.
4
Q

What do non-parametric tests assess?

How is this different to parametric tests?

A
  • Group MEDIANS
  • Don't require the data to be normally distributed
  • Can handle small sample sizes

This differs from parametric tests, which assess group means and require larger sample sizes.

5
Q

What is one easy question to ask ourselves when figuring out whether to choose parametric or non-parametric?

A

What is the sample size we're working with?

Non-parametric tests can deal with small sample sizes; parametric tests not so much.

6
Q

What are the four parametric test assumptions?

A
  • Additivity and linearity
  • Normality
  • Homogeneity of variance
  • Independence of observations
7
Q

What is this equation?

y(i) = b(0) + b(1)X(1) + e(i)

A

This is the standard linear model (it describes a straight line); we see it when looking at additivity and linearity.

8
Q

What do the Y(i), B(0), B(1) and E(i) stand for in the below?

y(i) = b(0) + b(1)X(1) + e(i)

A
Y(i) = the ith person's score on the outcome variable
B(0) = the Y intercept - the value of Y when X = 0
B(1) = the regression coefficient for the first predictor - the gradient (slope) of the regression line and the strength of the relationship
e(i) = the difference between the actual and predicted value of Y for the ith person. (See the sketch below.)
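
Not from the deck itself, but here is a minimal sketch of this model in Python (numpy assumed available; the data and coefficient values are made up for illustration):

```python
import numpy as np

# Hypothetical data: one predictor X1 and an outcome y for 50 people
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, 50)
y = 2.0 + 0.5 * x1 + rng.normal(0, 1, 50)  # true b(0) = 2, b(1) = 0.5, plus error

# Least-squares estimates of b(0) (intercept) and b(1) (slope)
b1, b0 = np.polyfit(x1, y, 1)

# e(i): each person's residual = actual y minus predicted y
residuals = y - (b0 + b1 * x1)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, mean residual = {residuals.mean():.3f}")
```
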
9
Q

What does the standard linear model equation describe?

A

Both the direction and the strength of the ASSOCIATION between the X and Y variables. It always has an error term at the end.

10
Q

What does the E at the end of the standard regression equation represent

A

The difference between the actual observed data point and the LINE we drew through the data points. That's each data point's (or person's) residual or error.

11
Q

In parametric tests, are we adding terms together or multiplying? Why?

A

Adding, because each predictor's effect does not DEPEND on the values of the other variables.
We use additive data, so x1 and x2 predict Y.
So the predictors (variables) and their effects, added together, lead to an outcome: Y is a linear function of the predictors, x1 + x2.
Basically, linear and additive data say that X1 and X2 predict Y.

12
Q

Basically, what do linear and additive allude to?

A

That x1 and x2 predict y

13
Q

Why are variables not multiplied in linear equations?

A

Because we are looking at linear relationships, which involve adding terms together, not multiplying. Adding the predictors together says that the outcome (DV) is a linear function of the predictors AND their effects:

y(i) = b(0) + b(1)X(1) + e(i)

14
Q

How do we deal with assumptions for ANOVA?

A
  • Independence of observations – violated by repeated measures; use an analysis that models this (e.g., repeated-measures ANOVA)
  • Normality – transform the data or use Kruskal-Wallis
  • Homogeneity of variance – test with Levene's test; if violated, use Brown-Forsythe or Welch F (see the sketch below)
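
As a hedged sketch, two of these checks can be run with scipy.stats (the group scores below are hypothetical; a Welch or Brown-Forsythe F would need another package, so it is not shown):

```python
from scipy import stats

# Three hypothetical groups of scores
g1 = [5.1, 6.2, 5.8, 7.0, 6.5]
g2 = [4.9, 5.5, 6.1, 5.0, 5.7]
g3 = [7.2, 8.1, 6.9, 7.5, 8.0]

# Levene's test: p < .05 suggests homogeneity of variance is violated
print(stats.levene(g1, g2, g3))

# Kruskal-Wallis: non-parametric alternative to one-way ANOVA when normality fails
print(stats.kruskal(g1, g2, g3))
```
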
15
Q

How do we deal with assumptions for correlations?

A
  • Normality – use a Spearman correlation
  • Linearity – if the relationship is monotonic, use Spearman; otherwise transform (see the sketch below)
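
For instance, a minimal sketch with scipy.stats, using hypothetical numbers chosen so the relationship is monotonic but not linear:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 36]  # y = x**2: monotonic, but not a straight line

rho, p_s = stats.spearmanr(x, y)  # rank-based, so rho = 1.0 here
r, p_p = stats.pearsonr(x, y)     # Pearson r is below 1, for comparison
print(rho, r)
```
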

16
Q

How do we deal with assumptions for regression?

A

• Continuous outcome (otherwise use nonlinear methods)
• Non-zero variance in predictors
• Independent observations – violated by repeated measures
• Linearity – check with partial regression plots; try transforming
• Independent errors: for any pair of observations, the error terms should be uncorrelated
• Normally distributed errors: the errors (i.e., residuals) should be random and normally distributed with a mean of 0
• Homoscedasticity: for each value of the predictors, the variance of the error term should be constant

17
Q

How do we deal with assumptions for multiple regression?

Refer to Multiple
Regression lecture
slides #19-32

A

The above, plus multicollinearity – delete or combine the collinear predictors.

18
Q

How do we deal with assumptions for moderation?

A
  • One IV must be continuous (if both X and M are categorical, use factorial ANOVA)
  • Each IV and Y, and the interaction term and Y, should be linearly related – try transforming
19
Q

Why would the best central tendency measure for your data sometimes be a median, and other times be a mean?

A

Generally the mean is best, but the median is the preferred measure of central tendency when there are a few extreme scores in the distribution (a single outlier can have a great effect on the mean).

Or, perhaps, when there are some missing values in the data. See the sketch below.
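
A quick illustration in Python (made-up scores):

```python
import numpy as np

scores = np.array([4, 5, 5, 6, 6, 7, 48])  # one extreme score

print(np.mean(scores))    # ~11.6: dragged upwards by the single outlier
print(np.median(scores))  # 6.0: unaffected by the extreme value
```
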

20
Q

What does the Gaussian distribution or bell curve mean?

A

Normal distribution.

21
Q

What are the four assumptions for parametric tests?

A

Additivity and linearity
Normality
Homogeneity of variance
Independence of observations

22
Q

y(i) = b(0) + b(1)X(1) + e(i)

What is this equation telling us? Which parametric test assumption is it associated with?

A

THE STANDARD LINEAR MODEL for additivity and linearity

Y(i) = the ith person's score on the outcome variable
B(0) = the Y intercept - the value of Y when X = 0
B(1) = the regression coefficient for the first predictor - the gradient (slope) of the regression line and the strength of the relationship
e(i) = the difference between the actual and predicted value of Y for the ith person.
23
Q

With the standard linear model, how many X variables can be added to an equation for a straight line?

A

As many as you like!

24
Q

What is Y in the standard linear model equation?

A

The outcome variable

25
Q

What does the little i next to the y(i) in the below equation represent?

y(i) = b(0) + b(1)X(1) + e(i)

A

Each individual.

26
Q

What does the b(0) represent in the below equation?

y(i) = b(0) + b(1)X(1) + e(i)

A

The Y-intercept (the value of Y when X = 0).

Most importantly, it is the POINT at which the regression line crosses the Y axis.

27
Q

What does b(1) represent in the below equation?

y(i) = b(0) + b(1)X(1) + e(i)

A

It's the regression coefficient for the first predictor. It's the EFFECT: regression coefficient = effect.

It's the SLOPE of the regression line, giving the direction and strength of the relationship.

In other words, it's the direction and magnitude of the ASSOCIATION between the x and y variables. We would repeat this for another predictor x2, so b(2) would become the effect for that X.

That's why it's the effect.

28
Q

What does the e(i) represent in the below equation?

y(i) = b(0) + b(1)X(1) + e(i)

A

The e(i) is the difference between the actual and predicted value of Y for the ith person.

It's the DIFFERENCE between the actual data point and the line that we drew through the data points - it's each person's residual or error.

29
Q

Why are error terms and residuals important?

A

Because we can't predict everything perfectly. The true data points won't always follow a straight line; they will fall a bit off the line.

30
Q

What is the outcome y telling us about x1 and x2 and the association?

A

That X1 and X2 predict Y, and that Y is an outcome of the additive combination of the EFFECTS of X1 and X2.

31
Q

So we’ve looked at what additivity means, but how can we assess linearity? How do we know if a relationship is a straight line?

A
  • By plotting observed vs. predicted values, where we want to see points symmetrically distributed around a diagonal line (like a Q-Q plot)
  • By plotting residuals vs. predicted values, where we want a horizontal line with dots symmetrically distributed around it (see the sketch below)
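
A minimal sketch of the residuals-vs-predicted check, assuming numpy and matplotlib are available (simulated data):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 1.0 + 0.8 * x + rng.normal(0, 1, 100)  # simulated linear relationship

b1, b0 = np.polyfit(x, y, 1)
predicted = b0 + b1 * x
residuals = y - predicted

# We want a flat band of dots around zero, with no bow shape
plt.scatter(predicted, residuals)
plt.axhline(0)
plt.xlabel("Predicted values")
plt.ylabel("Residuals")
plt.show()
```
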
32
Q

When observing the Q-Q and residuals-vs-predicted plots, what would tell us the assumption is violated?

A

Look out for a bow shape, or in general for the dots curving away from the diagonal line.

33
Q

What do we do when linearity appears to be violated due to a bow shape?

A
  • Apply a NONLINEAR transformation to the variables
  • Add another regressor that is a nonlinear function of the first - a polynomial curve
  • Examine moderators
34
Q

So now that we have looked at additivity and linearity, what is normality when it comes to parametric test assumptions?

A

It is not only about the data being normally distributed. The normal distribution is relevant to:

  • Parameters (the sampling distribution)
  • Residuals/error terms (confidence intervals around a parameter, or null hypothesis significance testing)
35
Q

Why, when looking at the assumption of normality, is it not enough to say the data is normally distributed so that's fine?

A

Because the CLT says that as the SAMPLE size grows larger (towards infinity), it is the sampling distribution, NOT the data, that approaches normality.

36
Q

What does the central limit theorem say and how does this influence how we interpret normality for the parametric test assumption?

A

As the sample size increases towards infinity, the sampling distribution approaches normal. Take, for example, a uniform distribution, where there is an equal probability of selecting any value between 0 and 1.

In bold: the CLT says the MEANS are normally distributed.

So the means were calculated using data from a uniform distribution, but the means themselves are NOT uniformly distributed. Instead, the means are NORMALLY distributed.

If you collect samples from distributions of whatever type, the means will be normally distributed; the CLT says it doesn't matter where your data come from.

The sample means will always be normally distributed, so we don't need to worry about the distribution of the raw data. That's why we look at normality in a different way for this parametric test assumption. See the sketch below.
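
A minimal simulation of this idea in Python (the sample size and the number of samples are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(42)

# 10,000 samples of n = 30, each drawn from a UNIFORM distribution on [0, 1]
sample_means = rng.uniform(0, 1, size=(10_000, 30)).mean(axis=1)

# The raw data are uniform, but the sample means pile up around 0.5
# in a roughly bell-shaped (normal) distribution, as the CLT predicts
print(sample_means.mean())  # ~0.5
print(sample_means.std())   # ~0.053, i.e. sqrt(1/12) / sqrt(30)
```
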

37
Q

What can the sample means collected from data sets be used for?

A
  • Make confidence intervals
  • Do T tests that ask if there is a difference between the means from two samples
  • ANOVA where we ask if there is a difference among the means of three or more samples
  • and any other statistical test that uses a sample mean
38
Q

True or false: sample means will always be normally distributed?

A

True

39
Q

For the central limit theorem the sample size needs to be at least 30: True or False?

A

False. This is a rule of thumb that is generally considered safe but you can break the rule - Michelle used a sample size of 20 once.

40
Q

What is the fine print for the CLT?

A

In order for it to work at all, you have to be able to calculate a mean from your sample.

41
Q

True or false: even if data itself is not normally distributed, the means from the sampling distribution are normally distributed

A

TRUE

42
Q

True or false: the individual distribution of the data needs to be normally distributed according to the CLT

A

FALSE. The sampling distribution will approach normality even if Y is not normally distributed.

43
Q

If the bulk of the data is pushed off to the right (so the long tail points to the left), is this negative or positive skew?

A

Negative (the skew is named after the direction of the long tail).

44
Q

If the distribution appears quite flat, so it has half the usual height, what type of kurtosis is this?

A

Negative kurtosis

45
Q

If the distribution appears quite tall, so it is about a third taller than usual, what type of kurtosis is this?

A

Positive kurtosis

46
Q

When describing how "fat" a distribution looks (its kurtosis), what are the three options?

A

Leptokurtic, mesokurtic, or platykurtic.

47
Q

What is Leptokurtic?

A

When the distribution is looking tall AND has heavy tails (positive kurtosis).

48
Q

What is Mesokurtic?

A

When the distribution has the same kurtosis as a normal distribution - normal height, normal tails.

49
Q

What is Platykurtic?

A

When the distribution is looking short and flat - like a platypus could fit under it. Light tails (negative kurtosis).

50
Q

What is kurtosis?

A

The "heaviness" of the tails.

51
Q

What is skewness?

A

The symmetry of the distribution

52
Q

What are skewness and kurtosis both falling under?

A

Properties of frequency distributions
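
A small sketch computing both properties with scipy.stats (simulated data; the exponential distribution is used only as a convenient right-skewed example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
normal = rng.normal(size=5000)
right_skewed = rng.exponential(size=5000)  # long tail pointing right

print(stats.skew(normal), stats.skew(right_skewed))          # ~0 vs ~+2
print(stats.kurtosis(normal), stats.kurtosis(right_skewed))  # excess kurtosis: ~0 vs ~6
```
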

53
Q

What are we looking at when we’re talking about a distribution and graphing it with a histogram?

A

We are looking at the FREQUENCY - how often data in each range occur.

54
Q

Do we know the sampling distribution for our data?

A

No

55
Q

How do we test the data to establish whether it meets the normality assumption?

A
  • Check the data or residuals using a Q-Q plot or histograms.
56
Q

What does the Q-Q plot assess?

A

It compares the sample quantiles to the quantiles of a normal distribution.

It checks the data/errors: the more closely the points follow a straight diagonal line, the closer the data are to normal.

57
Q

So we can assess normality by visual inspection - but what about tests? Which ones can we use?

A

Shapiro-Wilk test

Q-Q Plot

58
Q

What does a Shapiro-Wilk test... test?

A

Whether the data differ from a normal distribution. It tests against the null hypothesis that you DO have a normal distribution.

If p < .05, the data vary significantly from a normal distribution, so the normality assumption is violated.

If p > .05, the result is not statistically significant: the data do NOT vary significantly from a normal distribution, i.e., normality is not violated. So we want to see p above .05. See the sketch below.
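
A minimal sketch of both checks with scipy.stats (simulated, normally distributed data, so the test should not reject):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=50, scale=10, size=100)

# Shapiro-Wilk: H0 = "the data come from a normal distribution"
stat, p = stats.shapiro(data)
print(p)  # p > .05 here, so normality is NOT rejected

# Q-Q plot: points should hug the diagonal line if the data are normal
stats.probplot(data, dist="norm", plot=plt)
plt.show()
```
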

59
Q

If we have a p less than .05, what is that saying about the data?

A

That the data vary significantly from a normal distribution - we can be at least 95% confident that this departure is not due to chance.

60
Q

Okay, we have been through Additivity + Linearity, and Normality. What does homogeneity of variance assume?

A

That all groups or data points have the same or similar variance = the assumption of equal variances.

If they have equal variances, there is homoscedasticity! If they do not, there is heteroscedasticity.

61
Q

What is the variance from the regression line called?

A

The error, or the residual

62
Q

What does the variance, or the error term from the regression line tell us?

A

The difference between what we predicted y would be, based on its x value, AND what we ACTUALLY observed in the true data.

So: predicted versus observed - the error (residual).

63
Q

Is heteroscedasticity equal variance?

A

NO. That’s homoscedasticity

64
Q

What does the final assumption, independence of observations, tell us?

A
  • It assumes that the residuals are unrelated to (independent of) each other, which mostly means you don't have repeated-measures data
  • It is typically assumed based on the study design, as it is difficult to assess without knowing how the data were collected
  • IF OBSERVATIONS ARE NON-INDEPENDENT (the data are correlated because data points come from the same person/unit), we would see downwardly biased standard errors
65
Q

When looking at the assumption of independence regarding parametric tests, what would we see if observations are non independent?

A

It means the data are correlated with each other, so standard errors are downwardly biased - the observations rely on one another. It is OK if this is the case, but you need to do something different analytically.

66
Q

What is a univariate outlier?

A

An outlier when considering ONLY the distribution of the variable it belongs to

67
Q

What is a bivariate outlier

A

An outlier when considering the JOINT distribution of two variables

68
Q

What is a multivariate outlier?

A

Outliers when simultaneously considering multiple variables. Difficult to assess using numbers or graphs

69
Q

What type of outliers bias the mean and inflate the standard deviation?

A

Univariate outliers

70
Q

What is a regression coefficient?

A

Regression coefficients are estimates of the unknown population parameters and describe the relationship between a predictor variable and the response. In linear regression, coefficients are the values that multiply the predictor values. Suppose you have the following regression equation: y = 3X + 5. In this equation, +3 is the coefficient, X is the predictor, and +5 is the constant.

The sign of each coefficient indicates the direction of the relationship between a predictor variable and the response variable.

A positive sign indicates that as the predictor variable increases, the response variable also increases.
A negative sign indicates that as the predictor variable increases, the response variable decreases.
The coefficient value represents the mean change in the response given a one unit change in the predictor. For example, if a coefficient is +3, the mean response value increases by 3 for every one unit change in the predictor.

71
Q

What is one of the key visual differences between a univariate and a bivariate outlier?

A

A bivariate outlier is BREAKING AWAY FROM THE PATTERN OF THE ASSOCIATION BETWEEN YOUR TWO VARIABLES. You would usually see a clear straight-line pattern between variables A and B through the data points, but this data point would be way off that pattern.

72
Q

How do we deal with outliers?

A

Remove it, or trim the data

Transform the data

Change the score through winsorizing

73
Q

When winsorizing to deal with outliers, what are the four things you can do?

A
  • Change the score to the next highest value plus some small number (e.g., 1, or whatever is appropriate to the data)
  • Convert the score to that expected for a z-score of ±3.29
  • Convert the score to the mean plus 2 or 3 SD
  • Convert the score to a percentile of the distribution (e.g., the 0.5th or 99.5th percentile)
74
Q

What is winsorizing?

A

A predefined quantum of the smallest and/or largest values is replaced by less extreme values; the substitute values are the most extreme retained values. See the sketch below.
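
scipy ships a ready-made version of this; a minimal sketch (the scores are made up):

```python
import numpy as np
from scipy.stats.mstats import winsorize

scores = np.array([2, 4, 5, 5, 6, 6, 7, 42])

# Replace the most extreme 12.5% in each tail with the nearest retained value
clean = winsorize(scores, limits=[0.125, 0.125])
print(clean)  # 42 becomes 7 and 2 becomes 4; nothing is deleted
```
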

75
Q

Is winsorizing getting rid of a data point?

A

No. It is changing the value of the data point. The goal is to keep the data point in without it driving the effect.

76
Q

Why would you transform your data?

5 points

A
  • For convenience or ease of interpretation - standardisation, e.g., z-scores, allows for simpler comparisons
  • Reducing skewness - helps get closer to the normality assumption
  • Equalising spread, or improving homogeneity of variance - produces approximately equal spreads
  • Linearising relationships between variables - to fit non-linear relationships into linear models
  • Making relationships additive and therefore fulfilling assumptions for certain tests
77
Q

Transformations can be

A. Linear.
B. Non-linear
C. Linear and non-linear
D. None of the above

A

C

78
Q

Linear transformations..

A. Change the shape of the distribution
B. Change the shape of the distribution, can change the value of the mean/SD
C. Do not change the shape of the distribution and can change the value of the mean and/or SD.

A

C

79
Q

How do we make linear transformations?

A
  1. Adding a constant to each number e.g x + 1
  2. Converting raw scores to z-scores (x - m)/SD
  3. Mean centring (x - m)
80
Q

How do we make non-linear transformations?

A
  1. Log, log(x) or ln(x)
  2. Square root of x
  3. Reciprocal, 1/x (see the sketch below)
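
A compact sketch of both families of transformation in numpy (the values are arbitrary):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

# Linear transformations: the shape of the distribution is unchanged
z = (x - x.mean()) / x.std()  # z-scores
centred = x - x.mean()        # mean centring

# Non-linear transformations: these DO change the shape
logged = np.log(x)   # needs x > 0
rooted = np.sqrt(x)  # needs x >= 0
recip = 1 / x        # needs x != 0
print(logged, rooted, recip)
```
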
81
Q

True or false: non-linear transformations change the shape of the distribution

A

TRUE

82
Q

What non-linear transformation would we use to help with positive skew and stabilising variance?

A

Log. Note that log is in general defined only for positive values (you can't have negative values or zero in the data set).

83
Q

True or false: Log transformations don’t work if negative values or zero in data set

A

TRUE. Can only have positive

84
Q

When making non-linear transformations, does square rooting work for negative values?

A

NO. Only zero or positive.

85
Q

When can a non-linear reciprocal transformation be used?

A

It can reduce the impact of large scores and stabilise variance.

Defined for POSITIVE values only (1/x is undefined at zero).

86
Q

Log transformations, square root and reciprocal can all be used for:
A. reducing positive skew
B. reducing positive skew and stabilising variance
C. reducing negative skew and stabilising variance
D. all of the above

A

B

87
Q

Why would we see difference in original and non-linear transformations in terms of how the distributions of data look?

A

Because non-linear transformations change the shape of the distribution

88
Q

What type of transformations can normalise a distribution?

A

Non-linear transformations only.

Log, square root, reciprocal

89
Q

True or false: non-linear transformations change the data and therefore change the results

A

True

90
Q

What is the difference between linear and non-linear transformations in terms of what happens when increasing the data by a certain value?

A

In a linear transformation, we might add 1 and all values go up by exactly one. But in a non-linear (log) transformation, a 1-unit increase from 1 to 2 is a 0.693 increase on the log scale, while an increase from 10 to 11 is only a 0.095 increase. So a 1-unit increase in the raw data is not guaranteed to mean a constant change after transformation. See the sketch below.
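
The worked numbers, reproduced in Python (natural log assumed, as in the card):

```python
import numpy as np

# On the log scale, equal 1-unit steps in x give unequal increases
print(np.log(2) - np.log(1))    # 0.693
print(np.log(11) - np.log(10))  # 0.095
```
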

91
Q

Why is it important to choose the most appropriate transformation - whether linear or non-linear?

A

Because the wrong transformation can hinder rather than help. For example, after a log transformation you are no longer truly adding 1 unit; you might be adding 0.693.

Transformations can also make interpretation more difficult

92
Q

When transforming an X variable into Log X, how should this new data be treated in terms of interpretation and presenting?

A

It should be referred to as the log variable; do not refer back to the original raw values - only the log values now.

93
Q

Why is graphing important when checking assumptions?

A

Because you can visualise your distributions as well as identify any outliers.

The go-to plot is a histogram.

94
Q

How many variables is a Histogram referring to?

A

ONE. We are not plotting one variable against another, as in a scatterplot.

95
Q

What does the Y axis on a histogram show?

A

The frequency - how many people had each score (not another outcome variable).

96
Q

What does the X axis on histograms/boxplots show?

A

Values of the variable

97
Q

What is the middle line (two middle lines) in a box plot showing?

A

The median

98
Q

True or false: the box is made up of the IQR, starting from lower quartile (25th percentile) and going to upper quartile (75th percentile)

A

TRUE

99
Q

Can a boxplot show statistical significance?

A

No, but you can see where the distribution lies, the interquartile range, the total range (the whiskers), and any outliers.

100
Q

Bar plots need error bars. Why?

A

Error bars show the uncertainty/error in the data, which tells us about the variability in the data.

101
Q

What is the most common error bar?

A

Confidence intervals

102
Q

What is a 95% CI saying?

A

A 95% CI is the range that we are 95% confident includes the population value. It is a measure of error and uncertainty: the population value could still reasonably fall anywhere within that range. For example, a 95% CI might go from 14 to 16 around a mean of 15; and if the CIs for two groups (say, male and female participants) overlap a lot, that suggests little difference between them. See the sketch below.
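
A minimal sketch of computing a 95% CI for a mean with scipy (hypothetical scores chosen to give a mean of 15):

```python
import numpy as np
from scipy import stats

scores = np.array([14, 15, 16, 15, 14, 16, 15, 15])

mean = scores.mean()
sem = stats.sem(scores)  # standard error of the mean
ci = stats.t.interval(0.95, df=len(scores) - 1, loc=mean, scale=sem)
print(mean, ci)  # CI of roughly (14.4, 15.6) around the mean of 15
```
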

103
Q

What do scatterplots show us?

A

Associations between two variables - important for seeing whether there is a significant relationship, as a scatterplot shows us whether, as data points increase on ONE variable, they increase or decrease on the other.