Exam 2 - Oct 28 Flashcards

1
Q

What type of test should be run on a data set with one categorical explanatory variable, two independent groups, and met assumptions?

A

Two-Sample T

2
Q

What type of test should be run on a data set with one categorical explanatory variable, two dependent groups, and met assumptions?

A

(1-Sample) Paired T

3
Q

What should be done with the results from a test run on a data set with one categorical explanatory variable and more than two groups (met assumptions)?

A

If fail to reject null → stop

If reject null → Perform Pairwise Comparisons using Tukey

4
Q

What type of test should be run on a data set with one categorical explanatory variable, two dependent groups, and unmet assumptions?

A

Wilcoxon Signed-Rank Test

5
Q

What type of test should be run on a data set with no explanatory variable?

A

One-Sample T

6
Q

What type of test should be run on a data set with one categorical explanatory variable, two independent groups, and unmet assumptions?

A

Mann-Whitney

7
Q

What type of test should be run on a data set with one categorical explanatory variable and more than two independent groups? (General)

A

1-Way ANOVA

8
Q

What should be done with a data set with one categorical explanatory variable and more than two groups that does meet assumptions?

A

Global F-Test

9
Q

What should be done with a data set with one categorical explanatory variable and more than two groups that does not meet the normality and/or constant variance assumptions?

A

Transform the response variable (Ln, Box-Cox, etc.)
If assumptions are still not met, use an alternative analysis
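
The selection rules on the cards so far can be collected into one lookup. This is a minimal sketch: the helper name `choose_test` and its arguments are hypothetical, and the returned labels are simply the card answers.

```python
def choose_test(n_groups, paired=False, assumptions_met=True):
    """Return the procedure these cards pair with each study design.

    n_groups: 1 (no explanatory variable), 2, or more than 2 groups.
    paired:   True when the two groups are dependent (matched pairs).
    """
    if n_groups == 1:
        return "One-Sample T"
    if n_groups == 2:
        if paired:
            return "(1-Sample) Paired T" if assumptions_met else "Wilcoxon Signed-Rank Test"
        return "Two-Sample T" if assumptions_met else "Mann-Whitney"
    # More than two groups: 1-Way ANOVA
    if assumptions_met:
        return "1-Way ANOVA Global F-Test (Tukey pairwise comparisons if Ho is rejected)"
    return "Transform response (Ln, Box-Cox); if still unmet, alternative analysis"


print(choose_test(2, paired=True, assumptions_met=False))
```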

10
Q

What are the hypotheses for the Tukey Pairwise Comparisons?

A

Ho: μi - μj = 0 or Ha: μi - μj (≠, less than, greater than) 0

11
Q

What are the hypotheses for the Mann-Whitney test?

A

Ho: Median1 - Median2 = 0 or Ha: Median1 - Median2 (≠, less than, greater than) 0

12
Q

What are the hypotheses for the (1-Sample) Paired T?

A

Ho: μDiff = 0 or Ha: μDiff (≠, less than, greater than) 0

13
Q

What are the hypotheses for the 2-Sample T?

A

Ho: μ1 - μ2 = 0 or Ha: μ1 - μ2 (≠, less than, greater than) 0
Also need to determine whether the population variances are equal or unequal

14
Q

What are the hypotheses for the 1-Sample T?

A

Ho: μ = μ0 or Ha: μ (≠, less than, greater than) μ0

15
Q

What are the hypotheses for the ANOVA Global F-Test?

A

Ho: μ1 = μ2 = … = μI or Ha: Not all μi are equal

Check R^2 and the pooled variance

16
Q

What are the hypotheses for the Wilcoxon Signed-Rank test?

A

Ho: MedianDiff = 0 or Ha: MedianDiff (≠, less than, greater than) 0

17
Q

Null Hypothesis (Ho)

A

the initial claim that is assumed to be true

18
Q

Alternative Hypothesis (Ha)

A

an assertion contrary to Ho

19
Q

Test Statistic

A

a numerical summary of a dataset that reduces the data to one value that can be used to perform a hypothesis test OR
a statistic calculated from the data under the assumption that the null hypothesis is true, used to measure the plausibility of the alternative

20
Q

Rejection Region

A

set of all test statistic values for which Ho is rejected, based on 𝞪

21
Q

P-Value

A

the probability of observing a test statistic at least as extreme as the one observed, by random chance alone, if the null hypothesis is true OR
area under the t-distribution curve beyond the test statistic (in the direction of Ha); a p-value closer to 1 means less evidence for Ha, i.e., the results are more consistent with random chance alone

22
Q

One-Sided Hypothesis Test, hypotheses and p-value

A

less than or greater than

P-value describes the area under the t-distribution curve to the right or left of the test statistic value

23
Q

Two-Sided Hypothesis Test, hypotheses and p-value

A

not equal to

P-value is the area in both tails of the t-distribution further out than the absolute value of the test statistic
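
A rough sketch of how the one-sided and two-sided p-values relate. It uses the standard normal curve as a stand-in for a t-distribution with many degrees of freedom (an assumption made to stay within the Python standard library; an exact p-value needs the t-distribution with the right df). The function name `p_values` is hypothetical.

```python
from statistics import NormalDist


def p_values(test_stat):
    """Tail areas for a test statistic under a standard normal curve."""
    z = NormalDist()                    # large-df approximation to the t curve
    left = z.cdf(test_stat)             # area to the left  (Ha: less than)
    right = 1 - left                    # area to the right (Ha: greater than)
    two_sided = 2 * min(left, right)    # both tails beyond |test_stat| (Ha: not equal)
    return {"less": left, "greater": right, "two-sided": two_sided}


p = p_values(1.96)
print(round(p["greater"], 3), round(p["two-sided"], 3))  # 0.025 0.05
```

By symmetry, the two-sided p-value is twice the smaller one-sided tail area.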

24
Q

Meaning and chance of type I error

A

False positive, null hypothesis is rejected even though it is actually true
Probability of error occurring is 𝞪 (when Ho is true)

25
Q

Meaning and chance of type II error

A

False negative, failure to reject incorrect null hypothesis

Probability of error occurring is 𝜷

26
Q

How does changing 𝞪 change the chances that the researcher will make a type I or II error?

A

Decreasing 𝞪 lowers the probability of a type I error but raises the probability of a type II error

27
Q

How does changing the sample size change the chances that the researcher will make a type I or II error?

A

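With 𝞪 held fixed, increasing the sample size lowers the probability of a type II error; the probability of a type I error stays at 𝞪 (a larger sample does allow a smaller 𝞪 to be chosen without losing power)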

28
Q

Meaning and calculation for power

A

Probability to correctly reject a false null hypothesis or discover a true positive
1 - 𝜷, where 𝜷 is the chance of a type II error

29
Q

How does the significance level and sample size affect power?

A

Raised by increasing 𝞪 or sample size
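
A minimal sketch of why power rises with 𝞪 and with sample size, assuming a two-sided comparison of two independent groups with a known common σ and a normal approximation. The function name and the example numbers are made up for illustration.

```python
from statistics import NormalDist


def approx_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power = 1 - beta for detecting a true difference delta."""
    z = NormalDist()
    se = sigma * (2 / n_per_group) ** 0.5      # SE of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)          # two-sided critical value
    # Ignores the negligible chance of rejecting in the wrong tail.
    return 1 - z.cdf(z_crit - abs(delta) / se)


base = approx_power(delta=5, sigma=10, n_per_group=64)
print(round(base, 2))                                      # about 0.81
print(approx_power(5, 10, n_per_group=100) > base)         # larger n: True
print(approx_power(5, 10, 64, alpha=0.10) > base)          # larger alpha: True
```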

30
Q

Calculation for confidence level

31
Q

What are some positive things an increased sample size can do for a study?

A

decrease the width of confidence intervals, allow for the application of the CLT, and increase power (lower the probability of a Type II error; the Type I error rate stays at 𝞪 unless 𝞪 is also lowered)

32
Q

In general, how do you determine the sample size for a 2-Sample Independent T-Test?

A

Determined by effect size relative to known SD, 𝞪, and power

Can be lowered by increasing effect size or using a lower level of power

33
Q

What is Effect Size?

A

Magnitude of effect/signal (△) relative to noise (σ)

△/σ, i.e., |μ2 - μ1| / σ
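
A sketch of how effect size, 𝞪, and power combine into a sample size. It assumes the common normal-approximation formula n per group ≈ 2(z for 1 - 𝞪/2 + z for power)^2 / (effect size)^2 for a two-sided 2-Sample T with equal group sizes, which is not necessarily the method used in the course. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def effect_size(mu1, mu2, sigma):
    """Signal relative to noise: |mu2 - mu1| / sigma."""
    return abs(mu2 - mu1) / sigma


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate observations needed in each group (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


d = effect_size(mu1=100, mu2=105, sigma=10)   # 0.5
print(n_per_group(d))                         # 63
print(n_per_group(2 * d))                     # larger effect size -> smaller n
print(n_per_group(d, power=0.70))             # lower power -> smaller n
```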

34
Q

What are the assumptions for a One-Sample T?

A

Sample is randomly drawn from a normally distributed population (or CLT)
Observations in sample are independent

35
Q

What are the assumptions for a Two-Sample T?

A

Samples are randomly drawn from normally distributed populations (or CLT)
Samples are independent
Observations within each sample are independent
Standard deviations are the same for the two populations

36
Q

What are some tests for normality?

A

Histogram - Plot the data into a histogram and superimpose a normal curve
Normal Curve - Compare data with the 68-95-99.7 rule
Normal Probability Plot - Comparison of observed versus expected left tail percentages (Anderson-Darling P-value should be large → lack of evidence for non-normality)

37
Q

What are some tests for independence?

A

Look for cluster effects (responses collected from a specific subgroup) and serial effects (responses collected close together in time or space)
Residuals vs Time/Order Plot (Sequence Plot)

38
Q

How can you determine if the standard deviations for two groups are similar enough to use pooled variance when conducting a Two-Sample T?

A

The pooled SD (Sp) assumes a common SD → If the ratio S1/S2 (with S1 the larger SD) is under 2, we can use the pooled procedure (assuming sample sizes are roughly equal); otherwise, use the procedure that allows unequal standard deviations
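
A minimal sketch of the SD-ratio check and the pooled calculation it permits. The function name and the data are made up for illustration, and the p-value step (a t-distribution with n1 + n2 - 2 df) is left out.

```python
import math
import statistics


def pooled_two_sample_t(x, y):
    """Return (SD ratio, pooled SD, t statistic, df) for two independent samples."""
    n1, n2 = len(x), len(y)
    s1, s2 = statistics.stdev(x), statistics.stdev(y)
    ratio = max(s1, s2) / min(s1, s2)          # larger SD over smaller SD
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    se = sp * math.sqrt(1 / n1 + 1 / n2)       # SE of the difference in means
    t = (statistics.mean(x) - statistics.mean(y)) / se
    return ratio, sp, t, n1 + n2 - 2


ratio, sp, t, df = pooled_two_sample_t([10, 12, 14, 16, 18], [8, 10, 11, 12, 14])
print(ratio < 2)   # True, so the pooled procedure is reasonable by the rule above
print(round(sp, 3), round(t, 3), df)
```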

39
Q

Definition of robust

A

A statistical procedure is robust if it remains valid even when its assumption(s) are not met

40
Q

Definition of valid

A

Actual confidence levels and p-values are nearly equal to their stated rates

41
Q

What are the usual cutoffs for sample size and normality?

A

N < 15 → use t-procedures only with normal distribution
15 < N < 40 → use t-procedures with no outliers or strong skewness
N > 40 → use t-procedures with no outliers

42
Q

What is the procedure that should be followed in the event of extreme outliers?

A

If including and excluding the outliers give identical results → include the outliers
Otherwise, check for errors
Check for subpopulations → exclude and report the reason
Otherwise, use a resistant analysis or report results both with and without the outliers

43
Q

Describe the basics of the Mann-Whitney test. What is it resistant to? What does it replace?

A

Resistant to outliers, nonnormality, and censored observations

Order (rank) all observations; the sum of ranks for one group is the test statistic. The P-value uses the average and sample standard deviation of the ranks

Based on medians, replacement for 2-Sample T
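
The rank-sum steps above can be sketched as follows. This assumes the normal approximation built from the average and sample SD of the combined ranks, with no continuity correction; the function names and data are hypothetical.

```python
import math
import statistics


def average_ranks(values):
    """Rank from 1..n; tied values share the average of the ranks they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def rank_sum_z(group1, group2):
    """Return (rank sum for group1, approximate z statistic)."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    t = sum(ranks[:n1])                                  # test statistic
    mean_t = n1 * statistics.mean(ranks)                 # expected rank sum under Ho
    sd_t = statistics.stdev(ranks) * math.sqrt(n1 * n2 / (n1 + n2))
    return t, (t - mean_t) / sd_t


t, z = rank_sum_z([1.1, 2.2, 3.3], [4.4, 5.5, 6.6, 7.7])
print(t, round(z, 3))   # 6.0 -2.121
```

Because only ranks enter the calculation, replacing 7.7 with 7700 leaves the result unchanged, which is the resistance to outliers described on the card.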

44
Q

What are the assumptions for the (1-Sample) Paired T-test?

A

Differences are a random sample from a normally distributed population (or CLT)
Differences between individuals are independent

45
Q

Describe the basics of the Sign test. What is it resistant to? What does it replace?

A

Resistant to outliers, but does not use all data and may be inconclusive

Count the number of pairs where one measurement exceeds the other; the p-value comes from comparing that count to n/2

Based on medians, replacement for (1-Sample) Paired T
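
A small sketch of the Sign test using an exact two-sided binomial p-value, where under Ho each non-tied pair is positive with probability 1/2. Dropping tied pairs is an assumption of this sketch, and the function name and data are made up.

```python
from math import comb


def sign_test(before, after):
    """Return (number of positive differences, pairs used, two-sided p-value)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]   # ties are dropped
    n = len(diffs)
    k = sum(d > 0 for d in diffs)                              # count of positive signs
    # Probability of a count at least this far from n/2 in one tail, then doubled.
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return k, n, min(1.0, 2 * tail)


before = [10, 12, 9, 11, 13, 8, 10, 12]
after = [12, 14, 10, 13, 15, 9, 9, 14]
print(sign_test(before, after))   # (7, 8, 0.0703125)
```

Only the direction of each difference is used, which is why the test does not use all of the data.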

46
Q

Describe the basics of the Wilcoxon Signed-Rank Test. What is it resistant to? What does it replace?

A

Resistant to outliers and more powerful than the Sign test, but requires more computation

Order the absolute differences between pairs and assign ranks, then calculate the sum of ranks for the pairs where the difference is positive

Based on medians, replacement for (1-Sample) Paired T
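
The signed-rank steps above can be sketched as below. The z statistic uses the usual no-ties normal approximation, mean n(n+1)/4 and SD sqrt(n(n+1)(2n+1)/24), which is rough for small samples. Dropping zero differences is an assumption, and the names and data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank from 1..n; tied values share the average of the ranks they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def signed_rank(before, after):
    """Return (sum of ranks of positive differences, approximate z statistic)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]   # zero differences dropped
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])             # rank the magnitudes
    s = sum(r for r, d in zip(ranks, diffs) if d > 0)          # test statistic
    mean_s = n * (n + 1) / 4
    sd_s = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return s, (s - mean_s) / sd_s


s, z = signed_rank([10, 12, 9, 11, 13], [13, 11, 14, 13, 17])
print(s, round(z, 3))   # 14.0 1.753
```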

47
Q

What is the linear model formula for a 1-Way ANOVA?

A

Yij = μi + εij → Observation = Group Mean + Error

i is the group number, j is the observation number

48
Q

Which components of the 1-Way ANOVA table follow the Between + Within = Total rule?

49
Q

Describe the F distribution (1-Way ANOVA)

A

F-distribution starts at 0, different F-distribution for each pair of df’s
P-Value is right tail of F-distribution

50
Q

What does the R-Sq or R^2 value in a 1-Way ANOVA table mean? How is it calculated?

A

Estimates the strength of the relationship between model and response variable, percent of variation explained by group variation

SS(between) / SS(total)
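
A minimal sketch of the sums of squares behind the F statistic and R^2, on made-up data. The function name and dictionary keys are hypothetical, and the p-value (right tail of the F-distribution) is not computed.

```python
import statistics


def one_way_anova(groups):
    """Return the main entries of a 1-Way ANOVA table for a list of groups."""
    all_obs = [y for g in groups for y in g]
    grand = statistics.mean(all_obs)
    ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((y - statistics.mean(g)) ** 2 for g in groups for y in g)
    ss_total = sum((y - grand) ** 2 for y in all_obs)
    df_between = len(groups) - 1
    df_within = len(all_obs) - len(groups)
    ms_within = ss_within / df_within                  # pooled variance estimate
    return {
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "df_between": df_between, "df_within": df_within,
        "F": (ss_between / df_between) / ms_within,
        "R_sq": ss_between / ss_total,
    }


table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(table["F"], 2), round(table["R_sq"], 4))   # 13.0 0.8125
```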

51
Q

What are the assumptions for the 1-Way ANOVA?

A

Each group is a random sample from a different normally distributed population
The population standard deviations are all the same
Observations are independent

52
Q

What is the symbolic model for the 1-Way ANOVA?

A

εij ~IID N (0,σ^2)
IID → independent and identically distributed
~ N → follows a normal distribution
(0,σ^2) → mean of zero and variance of σ^2

53
Q

How can you check the 1-Way ANOVA assumptions?

A

Normality → Normal Probability Plot of Residuals (Anderson-Darling P-value should be large → lack of evidence for non-normality)

Population SD → Residuals vs Group Average Plot
* Check for an increase in spread with larger group averages (Levene's Test P-value should be large → lack of evidence for unequal variances)

Independence → Residuals vs Time/Order (Sequence plot), should be no pattern

54
Q

What could indicate a need for a Ln transformation on 1-Way ANOVAs? What changes need to be made for the reported results?

A

In the Residuals vs Group Average plot, larger means leading to larger variances may indicate a need for a Ln(Y) transformation

Report estimated differences and confidence intervals as "___ in Ln(response variable unit) is # units greater"

Or report as X times greater, where X is e^(difference in natural log units), with a confidence interval in "X times greater" units from e^(lower bound in natural log units) to e^(upper bound in natural log units)

Note in methods that a ln transformation was used
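
A short sketch of the back-transformation described above. The function name and the numbers are made up for illustration.

```python
import math


def back_transform(diff_ln, lower_ln, upper_ln):
    """Convert a difference and CI on the Ln scale to 'X times greater' form."""
    return math.exp(diff_ln), (math.exp(lower_ln), math.exp(upper_ln))


# Hypothetical estimate: difference of 0.69 Ln units, CI from 0.41 to 0.97
ratio, (low, high) = back_transform(0.69, 0.41, 0.97)
print(round(ratio, 2), round(low, 2), round(high, 2))
```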

55
Q

What type of unmet assumptions is the 1-Way ANOVA resistant to?

A

Mild deviations from normality with large group sizes
Equal variance and independence are critical
Not resistant to extreme outliers or very skewed distributions with different/small sample sizes

56
Q

How do you interpret a Box-Cox procedure output value?

A

Lambda of 1 is no transformation, lambda of 0 is natural log, other lambdas are Y^(lambda)
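
A sketch of how a lambda value maps to a transformation, following the simplified convention on this card. The full Box-Cox formula is usually written (Y^lambda - 1)/lambda, which differs only by shifting and scaling. The function name is hypothetical and the response values are assumed to be positive.

```python
import math


def power_transform(y, lam):
    """Apply the transformation implied by a Box-Cox lambda to positive data."""
    if lam == 1:
        return list(y)                        # no transformation
    if lam == 0:
        return [math.log(v) for v in y]       # natural log
    return [v ** lam for v in y]              # Y^lambda, e.g. 0.5 is square root


print(power_transform([4, 9], 0.5))   # [2.0, 3.0]
```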

57
Q

Between Tukey and Bonferroni, which is more conservative? More powerful? How do they deal with confidence levels?

A

Bonferroni is more conservative, Tukey is more powerful

The actual familywise 𝞪 under Bonferroni may be much lower than stated; Tukey specifies an exact family significance level for comparing all pairs

58
Q

Define familywise confidence levels (1-Way ANOVA, pairwise comparisons)

A

Success rate of a family of confidence intervals / hypothesis tests

Controlled when using pairwise comparison procedures

59
Q

When should the Bonferroni pairwise comparison method be used?

A

Small number of planned comparisons
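
A minimal sketch of why Bonferroni suits a small number of planned comparisons: the per-comparison 𝞪 is the family 𝞪 divided by the number of comparisons, so it shrinks quickly as comparisons are added. The function names are hypothetical.

```python
def n_pairs(n_groups):
    """Number of pairwise comparisons among n_groups groups."""
    return n_groups * (n_groups - 1) // 2


def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison significance level that keeps the family rate at most family_alpha."""
    return family_alpha / n_comparisons


print(bonferroni_alpha(0.05, 3))             # 3 planned comparisons
print(bonferroni_alpha(0.05, n_pairs(6)))    # all 15 pairs among 6 groups
```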

60
Q

When can you use a confidence interval instead of a t-test?

A

The interval and test are both two-sided and the alphas match (confidence level = 1 - 𝞪)

61
Q

In what general scenarios might nonparametric tests be used?

A

Outliers, Non-Normality, Censored Data

62
Q

What will be the typical changes in T-statistic and P-value if a 2-Sample T test is used when a Paired T test should have been used?

A

Lower t-statistic and higher p-value
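
A small demonstration on made-up paired data, where subjects differ a lot from each other but each changes by a similar amount. The function names are hypothetical; the two-sample version uses the pooled SD.

```python
import math
import statistics


def paired_t(x, y):
    """t statistic from the within-pair differences (y - x)."""
    d = [b - a for a, b in zip(x, y)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


def pooled_t(x, y):
    """Pooled two-sample t statistic, which ignores the pairing."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return (statistics.mean(y) - statistics.mean(x)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


before = [60, 70, 80, 90, 100]
after = [62, 73, 82, 93, 102]
print(round(paired_t(before, after), 2))   # about 9.8
print(round(pooled_t(before, after), 2))   # about 0.24
```

The subject-to-subject variation inflates the two-sample standard error, so the t-statistic drops and the p-value rises.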

63
Q

What is the difference between the Sign test and the Wilcoxon Signed-Rank test?

A

The sign test uses only sign/direction

Wilcoxon uses sign and magnitude (rank)

64
Q

What part of a confidence interval equation is the margin of error?

A

Estimate +/- Critical-value * SE

MOE is Critical-value * SE
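
A tiny sketch of the interval arithmetic. The function name and the numbers are made up, and the critical value is supplied by the user rather than looked up.

```python
def confidence_interval(estimate, critical_value, se):
    """Return (lower bound, upper bound, margin of error)."""
    moe = critical_value * se
    return estimate - moe, estimate + moe, moe


print(confidence_interval(estimate=10, critical_value=2, se=1.5))   # (7.0, 13.0, 3.0)
```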