Wk 3 - Research Questions for Associations - Contingency Tables and Correlation Flashcards
What is a confidence interval?
A confidence interval defines a range of plausible values for the unknown population parameter that we are interested in making inferences about from our single observed sample statistic.
What data is required to construct a confidence interval?
- The observed sample statistic
- The standard error for the sample statistic
- The critical test statistic (defined by alpha)
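As a rough illustration, the three ingredients above combine as: sample statistic plus or minus (critical value x standard error). The sketch below assumes a two-tailed critical value from the standard normal distribution (for small samples a t distribution would normally be used instead); the function name and the example numbers are made up for illustration.

```python
from statistics import NormalDist

def confidence_interval(sample_statistic, standard_error, alpha=0.05):
    """Return (lower, upper) as statistic +/- critical value * standard error."""
    # Assumption: two-tailed critical value taken from the standard normal.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    margin = critical * standard_error
    return sample_statistic - margin, sample_statistic + margin

# Hypothetical example: sample mean of 50 with a standard error of 2.
lower, upper = confidence_interval(sample_statistic=50.0, standard_error=2.0)
print(round(lower, 2), round(upper, 2))
```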
What is the level of confidence?
100 x (1 - alpha)%
So with an alpha level of 0.05, the level of confidence will be 95%
What does a confidence interval allow us to do?
Make inferences about a population parameter based on a sample statistic.
What does a 95% confidence interval tell us about the corresponding population parameter?
That we can be 95% confident that the population parameter corresponding to the observed sample statistic lies between the lower and upper bounds of the CI.
When does the value of p increase in a p-value function plot (when conducting multiple null hypothesis tests)?
The closer the null hypothesised value is to the observed sample statistic (e.g. the sample mean), the larger the p value. The further away it is, the smaller the p value (and the more likely p will be less than 0.05)
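A small sketch of this pattern, assuming a two-tailed z test and made-up numbers (observed mean 50, standard error 2): the same sample is tested against several different null values, and p is largest where the null value equals the observed statistic.

```python
from statistics import NormalDist

def two_tailed_p(sample_statistic, null_value, standard_error):
    """Two-tailed p value for a z test of one null hypothesised value."""
    z = (sample_statistic - null_value) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical observed statistic and standard error.
observed, se = 50.0, 2.0
null_values = [44, 46, 48, 50, 52, 54, 56]
p_values = [two_tailed_p(observed, h0, se) for h0 in null_values]

for h0, p in zip(null_values, p_values):
    print(h0, round(p, 3))
```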
Why is a confidence interval more informative than a null hypothesis significance test?
Because it defines a range of plausible values for the population parameter we are interested in, rather than just a single value (as in NHST).
What can’t a p-value from a NHST tell us?
It can’t tell us about the range of plausible values for the corresponding population parameter. The p value is only indirectly relevant when making inferences about the population, while the CI is directly relevant.
Is a confidence interval an expression of probability?
No! It doesn’t tell us that there’s a 95% chance that a given population parameter lies between the lower and upper bounds of the CI.
Only in the long run, over repeated sampling, will 95% of CIs contain the population parameter.
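The long-run reading can be illustrated with a small simulation. This is only a sketch under assumed values (a normal population with mean 100 and SD 15, samples of 50, a normal critical value); it draws many samples, builds a 95% CI from each, and counts how often the interval captures the true mean.

```python
import random
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable

# Assumed population and design, chosen only for illustration.
POP_MEAN, POP_SD, N, REPS = 100.0, 15.0, 50, 2000
critical = NormalDist().inv_cdf(0.975)  # two-tailed, alpha = 0.05

hits = 0
for _ in range(REPS):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_mean = sum(sample) / N
    variance = sum((x - sample_mean) ** 2 for x in sample) / (N - 1)
    standard_error = (variance / N) ** 0.5
    lower = sample_mean - critical * standard_error
    upper = sample_mean + critical * standard_error
    if lower <= POP_MEAN <= upper:
        hits += 1

coverage = hits / REPS  # proportion of intervals containing the true mean
print(coverage)
```

Any single interval either contains the population mean or it does not; the proportion close to 0.95 only emerges across the repeated samples.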
What kind of statistical techniques are used for ASSOCIATIONS?
- For categorical data, a contingency table, chi-squared and odds ratio
- For continuous data, correlation.
What kind of association is investigated using correlation?
Associations between continuous variables
What does an association imply?
A relationship between variables (a systematic co-occurrence between two variables)
What is meant by categorical data?
Nominal or ordinal scales of measurement. Use of numbers to label categories is arbitrary.
What is meant by continuous data?
Values imply some sort of meaningful order. Low scores imply that a person has less of a construct (and vice versa for high scores)
What are the four components of a research question?
- A question mark
- Type of association
- Identifying the relevant population
- Defining and measuring constructs
What is chi-squared used to measure?
An association between categorical variables.
What is correlation used to measure?
An association between continuous variables.
What is meant by a contingency between two variables?
An association in which there is a dependency between the frequencies of one categorical variable and the frequencies of the other.
What is an independent relationship between variables?
No association.
What is a dependent relationship between variables?
An association in which the frequencies in one category co-occur with frequencies in another category.
What is a chi-squared test?
A null hypothesis significance test of the association between two categorical variables, based on their frequencies.
What does the null hypothesis for a chi-squared NHST say?
That there is no relationship between variables. That observed and expected frequencies are the same.
What does the chi-squared statistic measure?
The difference between observed & expected frequencies.
What is the formula for calculating chi squared?
The sum, across all cells of the table, of (observed - expected)^2 / expected, where observed and expected are the cell frequencies
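A minimal sketch of this sum, assuming the observed and expected cell frequencies are already available as flat lists in the same cell order. The counts are invented for illustration.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical 2 x 2 table flattened row by row.
observed = [30, 10, 20, 40]
expected = [20.0, 20.0, 30.0, 30.0]

chi2 = chi_squared(observed, expected)
print(round(chi2, 3))
```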
How are expected frequencies determined?
(row marginal frequency x column marginal frequency) / total sample size
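For illustration, the expected frequency of each cell can be computed from the marginal totals of a contingency table, dividing the product of the row and column totals by the total sample size. The table values below are hypothetical.

```python
def expected_frequencies(table):
    """Expected cell counts under independence for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # total sample size
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical observed 2 x 2 contingency table.
observed = [[30, 10],
            [20, 40]]
expected = expected_frequencies(observed)
print(expected)
```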
What is the effect of large effects and large sample sizes on Tobs?
The observed test statistic will be larger when the effect is bigger and/or the sample size is bigger (preferably both)
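A sketch of both influences, using chi-squared as the observed test statistic and invented counts: doubling every cell keeps the same pattern of proportions but doubles the statistic, and a table with a stronger association at the same sample size also gives a larger value.

```python
def chi_squared_from_table(table):
    """Chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed - expected) ** 2 / expected
    return total

# Hypothetical tables: same proportions with double the sample,
# and the same sample size with a stronger association.
baseline = [[30, 10], [20, 40]]
bigger_sample = [[2 * count for count in row] for row in baseline]
bigger_effect = [[35, 5], [15, 45]]

chi_base = chi_squared_from_table(baseline)
chi_sample = chi_squared_from_table(bigger_sample)
chi_effect = chi_squared_from_table(bigger_effect)
print(round(chi_base, 2), round(chi_sample, 2), round(chi_effect, 2))
```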
What is an Odds Ratio?
The odds of an event occurring in one category relative to the odds of the same event occurring in a second, different category.
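As a hedged example, assuming the usual definition of an odds ratio for a 2 x 2 table (the odds of the event in one group divided by the odds of the same event in the other group), with made-up counts:

```python
def odds_ratio(event_a, no_event_a, event_b, no_event_b):
    """Odds of the event in group A divided by the odds in group B."""
    odds_a = event_a / no_event_a
    odds_b = event_b / no_event_b
    return odds_a / odds_b

# Hypothetical counts: group A has 30 events and 10 non-events,
# group B has 20 events and 40 non-events.
ratio = odds_ratio(30, 10, 20, 40)
print(ratio)
```

An odds ratio of 1 would indicate the same odds in both groups, i.e. no association.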
What are odds?
The probability of some event occurring relative to the probability of it not occurring.
What are the odds of an event occurring?
P / (1 - P), where P is the probability of the event occurring
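A short sketch of this conversion from probability to odds. The function name and the example probabilities are chosen for illustration only.

```python
def odds(p):
    """Convert a probability p (0 <= p < 1) into odds, p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must be at least 0 and less than 1")
    return p / (1 - p)

# A probability of 0.75 means the event is three times as likely
# to occur as not to occur.
print(odds(0.75))
print(odds(0.5))
```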