Module 5 Practice Questions Flashcards

1
Q

What is data fishiness?

A

Properties of data or statistical tests that suggest potential problems

2
Q

What are the three assumptions to evaluate?

A
  1. Normality
  2. Homogeneity of variance
  3. Independence of observations
3
Q

Using an NHST approach, what are the statistical tests that can assess normality?

A

Kolmogorov-Smirnov test
Shapiro-Wilk test

These tests compare the distribution of our data to a normal distribution. If the data do not differ significantly from a normal distribution, the null hypothesis (that the data are normal) is retained

4
Q

Using a descriptive approach, what are the descriptive statistics that can assess normality?

A

Skew
- tells us how asymmetrical our distribution is

Kurtosis

  • tells us the prevalence of extreme scores in tails
  • a certain number of extreme scores is normal
  • too much = positive kurtosis, heavy tails
  • too little = negative kurtosis, light tails
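
As a rough illustration of the two statistics above, the sketch below computes skew and excess kurtosis from simple moment formulas. The function name and the two small datasets are made up for this example, and statistical packages often apply small-sample corrections that will give slightly different numbers.

```python
def skew_and_excess_kurtosis(data):
    """Return (skew, excess kurtosis) using simple moment-based formulas."""
    n = len(data)
    mean = sum(data) / n
    # Central moments, dividing by n (no small-sample correction)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    # Subtracting 3 makes a normal distribution score 0
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# Hypothetical data: one symmetric set, one with a single extreme high score
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_tailed = [1, 1, 2, 2, 3, 3, 20]

print(skew_and_excess_kurtosis(symmetric))
print(skew_and_excess_kurtosis(right_tailed))
```

The symmetric set should show a skew of about 0 and negative kurtosis (light tails), while the set with one extreme score should show positive skew and positive kurtosis (heavy tail).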
5
Q

What are the limitations of statistical tests of the assumption of normality? (There are three)

A

Role of sample size

  • The tests tolerate violations of normality in small samples (power is low, so even big violations can go undetected)
  • The tests are very sensitive to violations of normality in large samples (even the slightest violations are detected)

Logic of test is flawed

  • We should not be asking whether the deviation from normality is exactly 0, but whether the deviation is large enough to matter
  • This is because the deviation is unlikely to be exactly 0 in real data

Does not take into account type of non-normality
- Some types of deviation are more problematic than others
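
The role of sample size can be sketched numerically. The example below assumes the common large-sample approximation for the Kolmogorov-Smirnov critical value at alpha = .05, roughly 1.358 / sqrt(n), which applies when the comparison distribution is fully specified (critical values differ when the mean and standard deviation are estimated from the data). The deviation of D = 0.05 and the function names are hypothetical.

```python
import math


def ks_critical_value(n, coefficient=1.358):
    """Approximate large-sample critical value of the K-S statistic D at alpha = .05."""
    return coefficient / math.sqrt(n)


def is_flagged(d_statistic, n):
    """True if a deviation of size D would be declared significant at sample size n."""
    return d_statistic > ks_critical_value(n)


observed_d = 0.05  # hypothetical, modest deviation from normality

for n in (30, 100, 1000, 5000):
    print(n, round(ks_critical_value(n), 4), is_flagged(observed_d, n))
```

The same modest deviation goes undetected in the smaller samples and is flagged in the larger ones, which is the pattern the first limitation describes.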

6
Q

Which is worse, positive or negative kurtosis?

A

Positive kurtosis tends to be more problematic than negative kurtosis since it produces larger distortions

7
Q

Using a graphical approach, what are the two visual displays that can assess normality?

A

q-q and p-p plots

8
Q

What is a problem with using the visual display approach?

A

Element of subjectivity

  • Easy to judge good versus bad
  • Difficult to judge in ambiguous situations
9
Q

____________ approach makes more sense than ________ approach

A

descriptive; NHST

10
Q

Using an NHST approach, what are the statistical tests that can assess homogeneity of variance?

A

Levene’s test
Hartley’s F-max test (also called the variance-ratio test)

11
Q

Similar to the assumption of normality, what are some drawbacks to the statistical tests of the assumption of homogeneity of variance?

A

Role of sample size (same as normality)

Logic of the test is flawed: we should be asking whether the difference in variances is large enough to matter, not whether it is exactly 0 (same as normality)

12
Q

Describe the descriptive approach to testing the assumption of homogeneity of variance and provide examples

A

Take the largest variance and the smallest variance and compute their ratio

A 3:1 ratio has been advocated as a threshold
- this means that as long as the largest variance is less than 3 times the smallest variance, the test is OK
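
A minimal sketch of this check, assuming each condition's scores are held in a plain list. The function names and both datasets are invented for illustration, and the 3:1 cutoff is the threshold mentioned above, passed in as a parameter rather than a fixed rule.

```python
from statistics import variance


def variance_ratio(groups):
    """Ratio of the largest sample variance to the smallest across groups."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


def within_threshold(groups, threshold=3.0):
    """True if the largest variance is less than `threshold` times the smallest."""
    return variance_ratio(groups) < threshold


# Hypothetical scores for two pairs of conditions
similar_spread = [[1, 2, 3, 4, 5], [2, 3, 5, 6, 8]]
different_spread = [[4, 5, 6, 5, 4, 6], [2, 5, 8, 4, 9, 3]]

print(variance_ratio(similar_spread), within_threshold(similar_spread))
print(variance_ratio(different_spread), within_threshold(different_spread))
```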

13
Q

What is the difference between a qq plot and a normal qq plot? What are they each used for? What does it mean if the scatterplot line is straight with a slope of 1? For qq plots what should the intercept be equal to? What does it mean when the slope of the line is different from 1?

A

Normal q-q plot

  • Scatterplot with the observed values from your dataset on the y axis and the corresponding values generated from a normal distribution on the x axis
  • If the data are normal, the dots should cluster along a straight line
  • A graphical display used to assess normality

q-q plot

  • One condition is on the x axis, the other on the y axis
  • Intercept should equal the difference between the two condition means (when the slope is 1)
  • Slope of 1 indicates the variances are equal, with data points clustering around a straight line
  • Slope different from 1 indicates unequal variances, with data points forming a spread-out cloud
  • A graphical display used to assess homogeneity of variance
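
The coordinates behind both plots can be sketched without a plotting library. This example assumes equal-sized conditions and uses one common plotting-position convention, (i + 0.5) / n, for the normal quantiles. All function names and data are hypothetical.

```python
from statistics import NormalDist, mean


def normal_qq_points(sample):
    """Pair each theoretical standard-normal quantile (x) with a sorted observation (y)."""
    ordered = sorted(sample)
    n = len(ordered)
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


def two_sample_qq_points(condition_x, condition_y):
    """Pair the sorted scores of two equal-sized conditions."""
    if len(condition_x) != len(condition_y):
        raise ValueError("this sketch assumes equal-sized conditions")
    return list(zip(sorted(condition_x), sorted(condition_y)))


def fit_line(points):
    """Least-squares (slope, intercept) for a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


condition_a = [10, 12, 13, 15, 16, 18, 20]
condition_b = [x + 5 for x in condition_a]  # same spread, mean shifted by 5
condition_c = [2 * x for x in condition_a]  # spread doubled

print(fit_line(two_sample_qq_points(condition_a, condition_b)))
print(fit_line(two_sample_qq_points(condition_a, condition_c)))
print(normal_qq_points(condition_a))
```

With these made-up data, the first pair should give a slope of about 1 and an intercept of about 5 (the mean difference), and the second pair a slope of about 2, reflecting the unequal variances.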
14
Q

What are the results of positive and negative correlations among data points?

A

Positive correlation among data points = inflated alpha (Type I error) rates

Negative correlation among data points = inflated beta (Type II error) rates

15
Q

How do we evaluate the assumption of independence of observations?

A

Intraclass correlation (ICC)

  • A value of 0 indicates independence exists
  • A value that is not 0 indicates a violation
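
One way to compute such a value is the one-way ANOVA form of the intraclass correlation, often written ICC(1). The sketch below assumes equal-sized clusters (for example, participants nested in groups); the function name and data are hypothetical, and other ICC variants exist.

```python
from statistics import mean


def icc1(clusters):
    """One-way ANOVA intraclass correlation; assumes equal-sized clusters."""
    k = len(clusters[0])  # observations per cluster
    n_clusters = len(clusters)
    grand_mean = mean([x for c in clusters for x in c])
    cluster_means = [mean(c) for c in clusters]
    # Mean square between clusters and mean square within clusters
    ms_between = k * sum((m - grand_mean) ** 2 for m in cluster_means) / (n_clusters - 1)
    ms_within = sum(
        (x - m) ** 2 for c, m in zip(clusters, cluster_means) for x in c
    ) / (n_clusters * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical data: scores within a cluster are very similar to one another
clustered = [[1, 2, 3], [11, 12, 13], [21, 22, 23]]
# Hypothetical data: cluster membership explains nothing beyond chance
unrelated = [[1, 3], [2, 4], [3, 5]]

print(icc1(clustered))
print(icc1(unrelated))
```

The first dataset should give a value near 1 (strong dependence within clusters), and the second a value of 0.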
16
Q

What are the four methods for addressing violations of each assumption?

A
  1. Use alternative statistical procedures that don’t require the specific assumption
    - common across all 3 assumptions
  2. Transform data to normalize the distribution
    - unique to assumption of normality
  3. Identify and remove outliers
    - common across assumption of normality and homogeneity of variance
  4. Evaluate level of measurement
    - common across assumption of normality and homogeneity of variance
17
Q

What is an outlier? What are they often responsible for?

A

Outliers are extreme scores in a dataset

They are often responsible for violations of normality and homogeneity of variance

18
Q

What are some quick ways to weed out outliers?

A

Generate histograms
Generate normal q-q plots
Compute standardized residuals or studentized deleted residuals
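
As a simple sketch of the last option, the code below standardizes each score's residual from the sample mean (a z-score) and flags values beyond a cutoff. The cutoff of 3, the function names, and the scores are assumptions for illustration. Studentized deleted residuals, which recompute the mean and standard deviation with each case left out, are not shown.

```python
from statistics import mean, stdev


def standardized_residuals(scores):
    """Residual from the sample mean divided by the sample standard deviation."""
    centre = mean(scores)
    spread = stdev(scores)
    return [(x - centre) / spread for x in scores]


def flag_outliers(scores, cutoff=3.0):
    """Return the scores whose absolute standardized residual exceeds the cutoff."""
    residuals = standardized_residuals(scores)
    return [x for x, z in zip(scores, residuals) if abs(z) > cutoff]


# Hypothetical scores with one extreme value
scores = [12, 14, 15, 15, 16, 17, 17, 18, 18, 19, 20, 21, 22, 23, 60]

print(flag_outliers(scores))
```

In very small samples a single extreme score inflates the standard deviation enough that its own standardized residual cannot reach a high cutoff, which is one reason the deleted version is sometimes preferred.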

19
Q

What do tails look like in q-q plots as a result of outliers?

A

Steep (aka thick) tails indicate more extreme values than is acceptable

20
Q

What are the two primary approaches to dealing with/ responding to outliers?

A

They are “trimmed” or “capped” to the most extreme acceptable value
- reduces the disproportionate impact of the observation

They are treated as missing data
- eliminates the impact of the observation altogether
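
The two responses can be sketched as follows. The acceptable range of 0 to 30 and the scores are hypothetical, and `None` stands in for a missing value.

```python
def cap_outliers(scores, lower, upper):
    """Pull values outside [lower, upper] back to the nearest acceptable value."""
    return [min(max(x, lower), upper) for x in scores]


def outliers_to_missing(scores, lower, upper):
    """Replace values outside [lower, upper] with None (treated as missing)."""
    return [x if lower <= x <= upper else None for x in scores]


scores = [12, 15, 17, 18, 20, 60]

print(cap_outliers(scores, 0, 30))
print(outliers_to_missing(scores, 0, 30))
```

Capping keeps the observation in the analysis with reduced influence, while the missing-data approach removes its influence entirely.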

21
Q

What are the 3 perspectives for dealing with outliers?

A

Minimalist perspective

  • Data set should be minimally altered
  • Distributions should have some extreme values
  • Getting rid of outliers or altering them can create its own distortions

Maximalist perspective

  • It is routine to alter or delete outliers
  • Results are hard to interpret with outliers present
  • Outliers create violations of assumptions

Intermediate perspective
- Altering or removing outliers is justifiable given clear rules and procedures as well as high thresholds for outlier status

22
Q

What are the 4 levels of measurement?

A

Nominal, ordinal, interval, ratio

23
Q

Which levels of measurement are appropriate for t-tests and ANOVAs?

A

It has been argued that t-tests and ANOVAs are only meaningful when the DV has at least an interval level of measurement

24
Q

How do rating scales fit into this? What are the basic guidelines for dealing with quasi-interval data?

A

Some data can be ambiguous with respect to level of measurement (e.g., are traditional 5-point or 7-point rating scales ordinal or interval?)

Thus, rating scales have been deemed “quasi-interval”

Standard statistical procedures can function reasonably well for quasi-interval data if a sufficient number of response categories is provided and distributional assumptions are reasonably well satisfied (fewer than 5 points is problematic, 5 points is ambiguous, 7 points is sufficient)