Lecture 6 Flashcards

1
Q

association = ?

A

patterned/systematic variation

2
Q

example of an association?

A

association: gender is linked to patterns of work hours

no association: gender is not linked to patterns of work hours

3
Q

hypothesis = ?

A

informed speculation to be tested

could be a possible relationship between 2 or more variables

4
Q

what are the two options with regards to systematic variation?

A

there is systematic variation

there is not systematic variation

5
Q

what are the tools for demonstrating association for discrete variables?

A

descriptive statistics (contingency tables)

inferential statistics (chi-squared test)
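
As a rough illustration of the descriptive step, a contingency table can be built by counting how often each pair of categories occurs. The sketch below assumes a tiny made-up data set (the records and the helper name `contingency_table` are invented for illustration and are not from the lecture).

```python
from collections import Counter

# Hypothetical raw observations: (gender, work pattern) pairs, invented for illustration.
records = [
    ("woman", "part-time"), ("woman", "part-time"), ("woman", "full-time"),
    ("man", "full-time"), ("man", "full-time"), ("man", "part-time"),
]

def contingency_table(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # One list per row category, one count per column category.
    return rows, cols, [[counts[(r, c)] for c in cols] for r in rows]

rows, cols, table = contingency_table(records)
```

Each inner list of `table` is one row category, with one count per column category in the order given by `cols`.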

6
Q

alpha = ?

A

the threshold for statistical significance

7
Q

what are the necessary steps to establishing causality?

A
  1. theoretical rationale linking potential cause and effect
  2. demonstrate that the cause happened before the outcome
  3. show that there’s an association between the cause and the outcome
  4. rule out (control for) any other factors that could be related to the outcome
8
Q

what type of variables are the cause and the outcomes?

A

cause = independent variable

effect = dependent variable

9
Q

what is the link between dependent and independent variables?

A

dependent variables are dependent on the independent variables

dependent variables are the effect and independent variables are the cause

10
Q

what are two types of distributions?

A

marginal distribution
conditional distribution

11
Q

marginal distribution = ?

A

focuses on one variable at a time (univariate)

12
Q

conditional distribution = ?

A

the distribution of one variable given something is true about the other variable

13
Q

how are marginal and conditional distributions represented?

A

can be represented as counts or percentages

standard practice for conditional distribution is to present it as percentages
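
To make the two distributions concrete, here is a minimal sketch that derives a marginal distribution (as counts) and a conditional distribution (as percentages) from one table. The counts and the function names are assumptions made up for this example.

```python
# Hypothetical counts: rows are gender, columns are work pattern (invented numbers).
observed = {
    "women": {"part-time": 30, "full-time": 20},
    "men": {"part-time": 10, "full-time": 40},
}

def marginal_columns(table):
    """Marginal distribution of the column variable, as counts."""
    totals = {}
    for row in table.values():
        for col, n in row.items():
            totals[col] = totals.get(col, 0) + n
    return totals

def conditional_rows(table):
    """Distribution of the column variable within each row, as percentages."""
    return {
        name: {col: 100 * n / sum(row.values()) for col, n in row.items()}
        for name, row in table.items()
    }

marginal = marginal_columns(observed)
conditional = conditional_rows(observed)
```

Here `marginal` ignores gender entirely, while `conditional` gives the work-pattern percentages separately for each gender, which is what makes the two groups comparable.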

14
Q

weird samples = ?

A

samples that are quite different from the underlying population

samples aren’t representative of the wider population

15
Q

what are the two types of hypotheses?

A

null hypothesis (H0)

alternative hypothesis (Ha)

16
Q

χ² = ?

A

chi-squared

17
Q

what is chi-squared all about?

A

comparing two differences

  1. differences we observe in our sample
  2. differences we expect if null hypothesis were to be true
18
Q

what is the chi-square test?

A

a hypothesis test used when you want to determine whether there’s a relationship between two categorical variables

19
Q

categorical variables = ?

A

nominal or ordinal level data that can be grouped into categories

20
Q

what is the significance level threshold?

A

5%

for p values smaller than 0.05, the null hypothesis is rejected

if the p value is 0.05 or higher, we fail to reject the null hypothesis (this is not the same as accepting it)

21
Q

how do you calculate the chi-square test?

A

you need the observed frequencies and the expected frequencies

(observed frequency - expected frequency) squared / expected frequency

calculate this for all cells and sum it up to get your chi-square value
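
The calculation above can be sketched as a short function. The counts below are invented for illustration, and the expected counts are assumed to have been worked out already under the null hypothesis.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over every cell of two same-shaped tables."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total

# Hypothetical 2x2 observed counts and their expected counts under the null hypothesis.
observed = [[30, 20], [10, 40]]
expected = [[20.0, 30.0], [20.0, 30.0]]
statistic = chi_square_statistic(observed, expected)
```

With these made-up numbers the four cells contribute 5, 3.33, 5 and 3.33, so the statistic is about 16.67.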

22
Q

how do you calculate expected cell count?

A

(row total * column total) / grand total
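
A minimal sketch of this formula applied to every cell of a table. The observed counts are hypothetical, and `expected_counts` is a name chosen for this example.

```python
def expected_counts(observed):
    """Expected count for each cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical 2x2 table of observed counts (invented numbers).
observed = [[30, 20], [10, 40]]
expected = expected_counts(observed)
```

In this example both rows total 50 and the columns total 40 and 60 out of 100, so every row gets expected counts of 20 and 30.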

23
Q

p-value < 0.001 = ?

A

there is less than a 0.1% chance of obtaining your value of chi-squared or larger if there’s really no relationship between these variables in the population
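
For a rough sense of where such a p value comes from, the sketch below covers only the simplest case, a 2x2 table, which has 1 degree of freedom. In that special case the upper-tail probability can be written with the complementary error function; larger tables need a general chi-squared distribution, which statistical software provides. The function name and the example values are assumptions for illustration.

```python
import math

def p_value_one_df(chi_squared):
    """Upper-tail probability of a chi-squared value with 1 degree of freedom."""
    # With 1 degree of freedom, chi-squared is a squared standard normal,
    # so P(X >= x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_squared / 2))

p_borderline = p_value_one_df(3.841)  # close to the conventional 5% cut-off
p_tiny = p_value_one_df(16.667)       # a much larger chi-squared value
```

The larger the chi-squared value, the smaller the p value: the second example falls well below 0.001.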

24
Q

what does a 5% threshold indicate?

A

that the null hypothesis is rejected only when the p value falls below 0.05

25
Q

what’s the purpose of p values?

A

the p value helps us weigh the inferential value of descriptive evidence

always a number between 0 and 1

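tells us how probable results at least as extreme as ours would be if the null hypothesis were true (not how probable the null hypothesis itself is)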

a high p value means that the sample data are compatible with true null hypothesis

a low p value means that the sample data aren’t compatible with true null hypothesis

26
Q

how are p values used?

A
  • we get a p value
  • we compare the p value to our α (alpha) significance level (e.g., 0.05 / 5%)
  • if the p-value < α, then we reject the null hypothesis
  • if the p-value ≥ α, then we fail to reject the null hypothesis
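
The decision rule above can be sketched in a few lines. The function name `decide` and the default alpha of 0.05 are assumptions for illustration.

```python
def decide(p_value, alpha=0.05):
    """Compare a p value with the significance level alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    # A p value at or above alpha never "accepts" the null; we only fail to reject it.
    return "fail to reject the null hypothesis"

decision_low = decide(0.003)
decision_high = decide(0.27)
```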
27
Q

what are some chi-squared assumptions?

A
  • the chi squared test doesn’t work well with very small samples
  • expected cell counts need to be at least 1
  • no more than 20% of the expected cell counts should be less than 5
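
The two rules about expected cell counts can be checked mechanically. This is a minimal sketch with a made-up function name; it does not cover the more general point about very small samples.

```python
def expected_counts_ok(expected):
    """Check the two expected-count rules for a table given as a list of rows."""
    cells = [e for row in expected for e in row]
    # Rule 1: every expected cell count is at least 1.
    all_at_least_one = all(e >= 1 for e in cells)
    # Rule 2: no more than 20% of expected cell counts are below 5.
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return all_at_least_one and share_below_five <= 0.20

ok_large = expected_counts_ok([[20.0, 30.0], [20.0, 30.0]])
ok_sparse = expected_counts_ok([[4.0, 10.0], [10.0, 10.0]])
```

In the second table one of four cells (25%) is below 5, so the check fails even though every cell is at least 1.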