Lectures 5-6: Central Limit, Confidence Intervals, Crosstabs, Chi-squared, Cramer's V, T & Z tests Flashcards

1
Q

What types of tests are T and Z tests?

A

Hypothesis tests. The value specified in the null hypothesis is taken as the benchmark.

A T test allows us to test whether a sample mean (of a normally-distributed interval variable) significantly/reliably differs from a hypothesised value, or whether the difference is just due to chance.
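As a rough illustration of the one-sample T test described above, the sketch below computes the t statistic by hand with Python's standard library. The sample values, the hypothesised mean of 5.0 and the variable names are made up for the example; 2.262 is the two-tailed 5% critical value of t for 9 degrees of freedom, as found in a standard t table.

```python
import math
import statistics

# Hypothetical sample of 10 measurements (illustrative values only).
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.5, 5.9, 5.2]
hypothesised_mean = 5.0  # benchmark value stated in the null hypothesis

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)       # sample standard deviation (divides by n - 1)
standard_error = sample_sd / math.sqrt(n)  # estimated standard error of the mean

# t = distance between sample mean and benchmark, in standard-error units
t_statistic = (sample_mean - hypothesised_mean) / standard_error

critical_t = 2.262  # two-tailed, alpha = 0.05, df = n - 1 = 9
reject_null = abs(t_statistic) > critical_t

print(f"t = {t_statistic:.3f}, reject H0 at 5%: {reject_null}")
```

If the absolute t statistic exceeds the critical value, the sample mean differs from the benchmark by more than chance alone would usually produce.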

2
Q

For what type of variables is the T test used?

A

Normally-distributed interval variables.

3
Q

When should you use the T test as opposed to the Z test?

A

When you do not know the standard deviation of the population, or when the sample size is small (under 30 observations).

4
Q

What does “statistically significant” mean?

A

It means that the trends in the sample are likely to be representative of trends in the population. In other words, your result is unlikely to have happened by chance.

5
Q

What does statistical significance depend on?

A

It depends on the association intensity (measured by Cramer’s V) and sample size.

The larger the sample, the better; however, the strength of the association is the most important factor.

6
Q

What are inferential statistics?

A

Statistics which describe our sample, but which also tell us what we can expect in new samples that we do NOT even have, allowing us to generalise our findings to a population. T tests are inferential statistics.

Descriptive statistics, by contrast, simply describe the data that you have.

7
Q

In the context of the T test, what does the P value mean?

A

The P value gives the probability that a pattern at least as strong as the one in the sample could be produced by RANDOM data, i.e. if the null hypothesis were true. Rejecting the null hypothesis when it is in fact true is a Type 1 error; the chosen significance level (alpha) is the probability of making that error.

A P of .01 means that there is a 1% chance of getting results at least this extreme with random data.
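One way to see what "produced by random data" means is to simulate it. The sketch below is not a T test; it uses a made-up coin-flip example (60 heads in 100 flips, with a fair coin as the null hypothesis) and counts how often purely random data look at least as extreme as the observed result. All names and numbers are assumptions for illustration.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

observed_heads = 60   # hypothetical result: 60 heads in 100 flips
n_flips = 100
n_simulations = 5000

def simulate_heads(flips):
    """Count heads in `flips` tosses of a fair coin (the null hypothesis)."""
    return sum(random.random() < 0.5 for _ in range(flips))

# "At least as extreme" = at least as far from the expected 50 heads.
observed_distance = abs(observed_heads - n_flips / 2)
extreme_runs = sum(
    abs(simulate_heads(n_flips) - n_flips / 2) >= observed_distance
    for _ in range(n_simulations)
)

# Share of random runs that look as extreme as the observed data.
p_value = extreme_runs / n_simulations
print(f"simulated two-sided p = {p_value:.3f}")
```

A small simulated p means random data rarely reproduce the observed pattern, which is the intuition behind the P value reported by a T test.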

8
Q

What is a Paired T test?

A

A Paired T test compares the means of the same group measured twice.

Eg. To test the balance of a group of people before and after drinking alcohol.
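A minimal sketch of the paired case, assuming hypothetical balance scores for six people before and after drinking: the test works on the per-person differences, so it reduces to a one-sample T test of whether the mean difference is 0. The scores are invented; 2.571 is the two-tailed 5% critical t for 5 degrees of freedom from a standard t table.

```python
import math
import statistics

# Hypothetical balance scores for the same six people (illustrative only).
before = [82, 75, 90, 68, 77, 85]
after = [78, 70, 88, 60, 75, 80]

# Work on the change within each person, not on the two groups separately.
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
mean_difference = statistics.mean(differences)
sd_difference = statistics.stdev(differences)
standard_error = sd_difference / math.sqrt(n)

# Null hypothesis: the mean difference is 0.
t_statistic = mean_difference / standard_error

critical_t = 2.571  # two-tailed, alpha = 0.05, df = n - 1 = 5
reject_null = abs(t_statistic) > critical_t

print(f"mean difference = {mean_difference:.2f}, t = {t_statistic:.3f}")
```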

9
Q

What is an Independent T test?

A

An Independent T test compares the means of two independent groups.

Eg. To measure the cholesterol level of a group which has taken a medication versus a group which has taken the placebo. The groups are different, but the variable (cholesterol) is the same.
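The comparison of two independent groups can be sketched as follows, assuming the pooled-variance (equal variances) form of the test, which is one common variant and not necessarily the one used in the lectures. The cholesterol readings are invented; 2.306 is the two-tailed 5% critical t for 8 degrees of freedom from a standard t table.

```python
import math
import statistics

# Hypothetical cholesterol readings (illustrative values only).
medication = [190, 185, 200, 178, 192]
placebo = [210, 205, 220, 198, 215]

n1, n2 = len(medication), len(placebo)
mean1, mean2 = statistics.mean(medication), statistics.mean(placebo)
var1, var2 = statistics.variance(medication), statistics.variance(placebo)

# Pooled variance: weighted average of the two sample variances.
pooled_variance = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

t_statistic = (mean1 - mean2) / standard_error

critical_t = 2.306  # two-tailed, alpha = 0.05, df = n1 + n2 - 2 = 8
reject_null = abs(t_statistic) > critical_t

print(f"t = {t_statistic:.3f}, reject H0 at 5%: {reject_null}")
```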

10
Q

What does Bivariate describe?

A

Bivariate describes relationships between two variables. Eg. education & income.

11
Q

When testing whether two variables are independent, when should you use a crosstabulation?

A

Crosstabulation is used when both variables are categorical.

12
Q

Can you use a crosstabulation when the dependent variable is continuous and the independent variable is dichotomous?

A

No, crosstabulation is only for the case when both the dependent and independent variables are categorical. Other tests comparing differences in means or differences in proportions are used when the dependent and independent variables are not both categorical.

13
Q

What does the Central Limit Theorem state?

A

The larger the sample size, the closer the distribution of the means of repeated samples will draw to a normal distribution, whatever the shape of the population distribution.

14
Q

In the Central Limit Theorem the mean of the sampling distribution is the population mean. True or False?

A

True. The mean of the sampling distribution is the population mean.

15
Q

According to the Central Limit Theorem when is the sampling distribution approximately normal?

A

The sampling distribution is approximately normal if n is high (>30) or if the population distribution is normal.
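The Central Limit Theorem cards can be checked with a small simulation. The sketch below assumes a strongly skewed population (exponential, with mean 1 and standard deviation 1), draws many samples of n = 50, and looks at the distribution of their means. The population choice, sample counts and names are assumptions made for the illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

sample_size = 50   # n > 30, so the sample means should look roughly normal
n_samples = 2000   # number of repeated samples drawn from the population

def draw_sample_mean(n):
    """Mean of one sample of size n from a skewed (exponential) population."""
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

sample_means = [draw_sample_mean(sample_size) for _ in range(n_samples)]

mean_of_means = statistics.mean(sample_means)  # should sit near the population mean, 1
sd_of_means = statistics.stdev(sample_means)   # spread of the sampling distribution

# In a normal distribution about 95% of values lie within 1.96 SD of the mean.
share_within = sum(
    abs(m - mean_of_means) <= 1.96 * sd_of_means for m in sample_means
) / n_samples

print(f"mean of sample means = {mean_of_means:.3f}")
print(f"SD of sample means   = {sd_of_means:.3f}")
print(f"share within 1.96 SD = {share_within:.3f}")
```

Even though the population itself is far from normal, the sample means cluster around the population mean in a roughly bell-shaped way.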

16
Q

In the Central Limit Theorem what’s the Standard Error?

A

The Standard Error (SE) is the standard deviation of the sampling distribution. It indicates the range of possible error in the sample mean.

17
Q

The larger the sample size, the greater the Standard Error. True or False.

A

False. As the sample size increases, the Standard Error decreases.

18
Q

What’s the relationship between the population standard deviation and the standard deviation of the sampling distribution (Standard Error)?

A

They are directly proportional: SE = σ / √n, where σ is the population standard deviation and n is the sample size.
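The two Standard Error cards above follow from the usual formula SE = σ / √n. A short sketch, with a hypothetical function name and made-up numbers, shows both effects: a larger sample shrinks the SE, and a larger population standard deviation grows it in proportion.

```python
import math

def standard_error(population_sd, sample_size):
    """Standard error of the mean: population SD divided by sqrt(n)."""
    return population_sd / math.sqrt(sample_size)

# Larger sample, same population SD: the standard error goes down.
print(standard_error(10, 25))    # 2.0
print(standard_error(10, 100))   # 1.0

# Same sample size, doubled population SD: the standard error doubles.
print(standard_error(20, 100))   # 2.0
```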

19
Q

How do you calculate the degrees of freedom in cross-tabulations?

A

one less than the number of rows, multiplied by one less than the number of columns

df = (r – 1)(c – 1)

20
Q

What are cross-tabulations/ contingency tables?

A

Bivariate statistics for qualitative variables

Describe relationships between 2 variables (ex. Voter location and political choice)

Conventionally DV in columns and IV in rows.

21
Q

What is the X2 Test (basically the same as chi-squared test)?

A

Uses a sample to make an inference about a population. Involves classification of an independent variable into 2 or more categories (nominal data).

For example, when looking at voting results, were there any significant differences between voter location (city, town, rural) and political choice? Assesses data in cross-tabulations/ contingency tables.

22
Q

What is the X2 Test (basically the same as chi-squared test) testing?

What does a X2 value of 0 mean?

What does a high X2 value mean?

A

It tests whether any difference between categories is statistically significant. It compares observed frequencies (fo) with those that would be expected if there were no relationship between the variables (fe).

If the observed and expected agree exactly, X2 = 0.

The greater the discrepancy between the observed and expected frequencies, the larger the X2 value (and thus the greater the chance that you will reject the null hypothesis).
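The comparison of observed and expected frequencies can be sketched as below. The table of voter location by political choice is invented for the example; expected frequencies use the usual rule fe = (row total × column total) / n, and X2 is the sum of (fo – fe)² / fe over all cells.

```python
# Hypothetical contingency table: rows = voter location, columns = left / right.
observed = [
    [60, 40],  # city
    [45, 55],  # town
    [30, 70],  # rural
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
expected = []
for i, row in enumerate(observed):
    expected_row = []
    for j, fo in enumerate(row):
        # Frequency expected if location and choice were unrelated.
        fe = row_totals[i] * col_totals[j] / n
        expected_row.append(fe)
        chi_square += (fo - fe) ** 2 / fe
    expected.append(expected_row)

print(f"expected frequencies: {expected}")
print(f"X2 = {chi_square:.3f}")
```

In this made-up table the town row matches its expected frequencies exactly and contributes 0, while the city and rural rows account for the whole X2 value.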

23
Q

What are criteria for which X2 approximates the chi-square distribution?

A

The sampling distribution of X2 approximates the chi-square distribution very closely, provided the expected frequency in each cell of the contingency table is at least 5.

If the cell counts are too small, collapse categories or use Fisher’s Exact Test.

24
Q

What are the hypotheses in the chi-squared test?

A

Null hypothesis (H0): no difference in effect of IV categories on DV. For example, there is no difference in political choice with respect to voter location.

Alternative hypothesis (Ha) is mutually exclusive with H0: there is a difference in effect of IV categories on DV. For example, people in cities are more likely to vote left-wing.

25
Q

What do the alpha levels mean in the chi-squared test?

A

(interpretation of the p-value)

α level = degree of certainty

  • 0.1 = acceptable in some cases, not in scientific journals
  • 0.05 = the current norm in the social sciences
  • 0.01 and 0.001 = high statistical significance
26
Q

How do you calculate the X2/chi-squared test value?

A
  • Adopt a significance level (5% for example)
  • Calculate X2, then compare this number with the number defined in the table for the correct df and 5% significance level
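The two steps above can be sketched as a small lookup, assuming a 5% significance level. The critical values are the standard chi-square table entries for 1 to 4 degrees of freedom; the function names and the example X2 values are hypothetical.

```python
# Critical chi-square values at the 5% significance level, keyed by df
# (from a standard chi-square table).
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def degrees_of_freedom(rows, cols):
    """df = (r - 1)(c - 1) for a contingency table."""
    return (rows - 1) * (cols - 1)

def reject_null(chi_square, rows, cols):
    """True if X2 exceeds the 5% critical value for this table's df."""
    df = degrees_of_freedom(rows, cols)
    return chi_square > CRITICAL_5_PERCENT[df]

# A 3 x 2 table has df = 2, so the critical value is 5.991.
print(reject_null(18.18, rows=3, cols=2))  # X2 above the table value
print(reject_null(2.50, rows=2, cols=2))   # X2 below the table value
```

The table here only covers small tables; a larger table would need more entries from the same chi-square table.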
27
Q

If your X2/chi-squared test produces results that are statistically significant, what do you still want to investigate?

A

The strength of the association. Cramer’s V!

28
Q

What if your chi-square result is larger than the value in the table?

A

Reject the null hypothesis

29
Q

What if your chi-square result is smaller than the value in the table?

A

Null hypothesis cannot be rejected

30
Q

How do you know if your chi-square result is statistically significant?

A

If the calculated value exceeds the critical value listed in the table for your degrees of freedom and chosen significance level, the result can be deemed to be statistically significant. A lower p-value indicates greater statistical significance.

It is relatively easy to get a statistically significant result with large samples.

31
Q

What is Cramer’s V used for? (What is the Cramer’s V method?)

A

Cramer’s V is one of the methods used to investigate bivariate statistics for QUALITATIVE variables

32
Q

What does Cramer’s V indicate/measure?

A

Cramer’s V measures the strength of association between two variables (used after Chi-square determines statistical significance)

33
Q

What is the range of the values taken by Cramer’s V?

A

Cramer’s V varies only between 0 and 1 (Minimum value is 0 and maximum value is 1)

34
Q

What does a value of 0 for Cramer’s V mean?

A

If Cramer’s V is equal to 0, it means there is no association between the two variables

35
Q

What is the measure of a strong association (in Social Science) when using Cramer’s V?

A

In the social sciences, an association is considered to be moderately strong if Cramer’s V is higher than 0.1

36
Q

How is Cramer’s V derived/defined?

A

Cramer’s V is derived using the Chi-square (X-square) value. It is (mathematically) defined as:

V = Square-root (Chi-square / theoretical maximum Chi-square)

Theoretical maximum Chi-square is equal to n × (k – 1), where n is the number of observations and k is the smaller of the number of rows and the number of columns (please see Notes document for formula snapshot)
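As a quick worked example of the definition above, the sketch below computes Cramer’s V for a hypothetical 3 × 2 table with 300 observations and a chi-square value of 18.182. The function name and the input numbers are assumptions for illustration.

```python
import math

def cramers_v(chi_square, n, rows, cols):
    """Cramer's V = sqrt(chi-square / (n * (min(rows, cols) - 1)))."""
    max_chi_square = n * (min(rows, cols) - 1)  # theoretical maximum chi-square
    return math.sqrt(chi_square / max_chi_square)

# Hypothetical 3 x 2 table: 300 observations, chi-square of 18.182.
v = cramers_v(18.182, n=300, rows=3, cols=2)
print(f"Cramer's V = {v:.3f}")
```

Because the chi-square value can never exceed its theoretical maximum, V always stays between 0 and 1.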