Chi-square and t-tests (wk 4) Flashcards

1
Q

What is a chi-square test?

A

The chi-square (χ²) test is a test of difference among categorical (nominal/ordinal) variables. There are two types: the goodness-of-fit test and the test of association (or test of independence).

2
Q

Describe the chi-square goodness-of-fit test:

A

-Proportions with more than two levels
-Tests how well the proportions in the data fit fixed (expected) proportions. While the binomial test is limited to dichotomous variables (heads/tails, success/failure), the chi-square test can handle more than two categories.
-Benford’s law -> The frequencies of the first digits of naturally occurring numerical data (prices, populations, lengths, etc.) follow a particular set of proportions. The chi-square test for Benford’s law tests whether the first-digit frequencies of the data follow these known proportions. If the data fit Benford’s law, this is consistent with naturally occurring numbers. If the fit is rejected, the data set may have been fabricated.
-Reporting the outcome -> The χ² value with its df (degrees of freedom), followed by the p-value. For a given df, a bigger χ² means a bigger difference between the observed and expected proportions.
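
The Benford check above can be sketched in a few lines of Python. This is a minimal illustration, not a full test: the first-digit counts are made up, the function names `benford_expected` and `chi_square_gof` are hypothetical, and only the χ² statistic and df are computed (the p-value would come from a chi-square table or a statistics package).

```python
import math

def benford_expected(n_total):
    """Expected first-digit counts under Benford's law: P(d) = log10(1 + 1/d)."""
    return [n_total * math.log10(1 + 1 / d) for d in range(1, 10)]

def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical first-digit counts for digits 1..9 (made up for illustration).
observed = [295, 180, 130, 95, 80, 70, 55, 50, 45]
expected = benford_expected(sum(observed))

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1  # number of categories minus 1

print(f"chi2({df}) = {chi2:.2f}")
```

With nine digit categories, df = 8, and the critical χ² value at the 0.05 level is about 15.51, so a statistic above that would reject the fit to Benford’s law.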

3
Q

Describe the chi-square test of association and McNemar’s test:

A

-Comparing proportions across two or more groups (test of association)
-The test of association checks whether the proportions of one categorical variable differ across the levels of another.
-Checking the association between two nominal/ordinal variables, e.g. whether the proportion of Conservative/Labour voters differs depending on the region of the UK.
-Descriptive statistics for the chi-square test of association can be summarised as a contingency table.
-Reporting the outcome -> Typically the result is reported as the χ² value with df and N (the sample size), followed by the p-value.
-McNemar’s test -> The paired-samples counterpart of the test of association. Paired samples mean that data points are paired across two groups. McNemar’s test is only available for two dichotomous variables (i.e. a 2-by-2 contingency table).
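
A rough sketch of both statistics, assuming made-up counts. The helper names `chi_square_association` and `mcnemar_statistic` are hypothetical, the McNemar version shown omits the continuity correction, and `b` and `c` stand for the two discordant cells of the paired 2-by-2 table (pairs that changed in one direction or the other).

```python
def chi_square_association(table):
    """Chi-square statistic, df and N for an r-by-c contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, n

def mcnemar_statistic(b, c):
    """McNemar chi-square (no continuity correction) from the discordant cells."""
    return (b - c) ** 2 / (b + c)

# Hypothetical contingency table: rows = two regions, columns = two parties.
votes = [[30, 20],
         [20, 30]]
chi2, df, n = chi_square_association(votes)
print(f"chi2({df}, N = {n}) = {chi2:.2f}")

# Hypothetical paired data: 15 pairs changed one way, 5 changed the other way.
print(f"McNemar chi2(1) = {mcnemar_statistic(15, 5):.2f}")
```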

4
Q

What is a t-test, and what are the three types of t-test?

A

-T-test -> Tests a difference in a measured quantity (interval or ratio variables). Compares the means of populations (for three or more means we use a different test). The null hypothesis is that the means are equal.
-Three types of t-test, each corresponding to a test for nominal/ordinal variables that we have already learned:
1. One-sample t-test ~ binomial test or chi-square goodness-of-fit test
2. Independent (unpaired) samples t-test ~ chi-square test of association
3. Paired samples t-test ~ McNemar’s test
-For each t-test, you can decide whether to do a one-tailed or two-tailed test, just like the binomial test

5
Q

What is a one-sample t-test?

A

-Compares the mean of one sample group against a fixed value
-No significant difference in score -> the observed difference can plausibly be explained by sampling error
-Significant difference in score -> the observed difference is unlikely to be due to sampling error alone
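
As a minimal sketch of the calculation, assuming made-up scores and a hypothetical helper `one_sample_t`: the t statistic is the distance of the sample mean from the fixed value, measured in standard errors. Only t and df are computed here; the p-value would come from the t distribution with that df.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t statistic and df for testing whether the sample mean differs from mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample SD (n - 1 in the denominator)
    standard_error = sd / math.sqrt(n)
    t = (mean - mu0) / standard_error
    return t, n - 1

# Hypothetical test scores compared against a fixed value of 50.
scores = [52, 55, 49, 61, 58, 54, 57, 50]
t, df = one_sample_t(scores, mu0=50)
print(f"t({df}) = {t:.2f}")
```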

6
Q

What is an independent (two-sample) t-test?

A

-Comparing a measure across two groups -> independent
-Compares the means of two independent samples or categories and tests whether the observed difference between them is significant. Because the data come from different groups, we say that the samples are independent.
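
A sketch of the classic (Student’s) version, which assumes the two groups have equal variances and therefore pools them. The data and the helper name `independent_t` are made up for illustration.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Student's t statistic and df for two independent groups (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variances (n - 1 denominator)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2                    # sample size minus number of groups
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

# Hypothetical scores from two different groups of people.
control = [12, 15, 11, 14, 13, 16]
treatment = [17, 19, 15, 18, 20, 16]
t, df = independent_t(control, treatment)
print(f"t({df}) = {t:.2f}")
```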

7
Q

What is a paired t-test?

A

-Comparing a measure across two sets of scores -> paired
-Compares the mean difference of one group measured on two occasions
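
One way to see this: a paired t-test is a one-sample t-test on the per-person differences, tested against 0. The sketch below assumes made-up before/after scores and a hypothetical helper `paired_t`.

```python
import math
import statistics

def paired_t(before, after):
    """t statistic and df for paired measurements (same people, two occasions)."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    t = statistics.mean(diffs) / standard_error
    return t, n - 1

# Hypothetical scores for the same six people on two occasions.
before = [70, 65, 80, 75, 60, 72]
after = [74, 66, 85, 79, 63, 71]
t, df = paired_t(before, after)
print(f"t({df}) = {t:.2f}")
```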

8
Q

What is the assumption of normality for t-tests?

A

-Normality -> The sampling distribution of the mean is normal: if you repeatedly draw samples of size n from the distribution and calculate the mean of each sample, those means are normally distributed. This holds when the sample size n is sufficiently large.
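
A small simulation can illustrate the idea. This is only a sketch under stated assumptions: the population is an exponential distribution (clearly skewed, chosen for illustration), the sample size of 50 and the 2000 repetitions are arbitrary, and `skewness` is a hypothetical helper that measures asymmetry (0 for a symmetric distribution such as the normal).

```python
import random
import statistics

def skewness(values):
    """Simple skewness: the mean of the cubed z-scores."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Raw draws from a right-skewed population (exponential with mean 1).
raw = [rng.expovariate(1.0) for _ in range(2000)]

# Means of 2000 sample groups, each of size n = 50.
n = 50
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(2000)
]

print(f"skewness of raw data:     {skewness(raw):.2f}")
print(f"skewness of sample means: {skewness(sample_means):.2f}")
```

The sample means should come out much less skewed than the raw data, which is the sense in which the sampling distribution of the mean approaches normality as n grows.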

9
Q

How are the statistical assumptions for t-tests checked?

A

-Statistical tests based on the normality assumption are called parametric tests. Normality should not always be assumed: it can be checked with a normality test (e.g. the Shapiro-Wilk test), where a violation of normality is indicated by a low p-value.

10
Q

How is the significance of a difference in variances reported using p-values?

A

-The significance of a difference in variances is reported as a p-value:
1. p < 0.05 -> variances are not equal
2. p > 0.05 -> no significant difference, so variances are treated as equal
+ If variances aren’t equal, the Welch t-test can be used
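
A sketch of the Welch version, with made-up data and a hypothetical helper `welch_t`. Unlike the pooled t-test, each group keeps its own variance, and the df is estimated with the Welch-Satterthwaite formula, so it is usually not a whole number.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Welch's t statistic and approximate df (no equal-variance assumption)."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error contributed by each group separately.
    se2_a = statistics.variance(group_a) / n_a
    se2_b = statistics.variance(group_b) / n_b
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical groups with visibly different spreads.
narrow = [20, 21, 19, 20, 22, 20]
wide = [15, 30, 22, 10, 28, 35]
t, df = welch_t(narrow, wide)
print(f"t({df:.1f}) = {t:.2f}")
```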

11
Q

What is t-statistic?

A

-T-statistic -> T-tests are based on the t-statistic. The variable t is similar to the z-score, but it uses the mean and SD of the sample, not of the population. The t-value is evaluated against the degrees of freedom = sample size - number of groups. Normally, a greater absolute t-value means a greater difference.
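
For comparison, the two statistics can be written side by side. This is a sketch for the one-sample case only, with assumed notation: $\bar{x}$ is the sample mean, $\mu_0$ the fixed value being tested, $\sigma$ the population SD, $s$ the sample SD and $n$ the sample size.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
\qquad
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}},
\qquad
df = n - 1
```

The only change from z to t is that the unknown population SD is replaced by the sample SD. With two independent groups, the same rule (sample size minus number of groups) gives $df = n_1 + n_2 - 2$.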
