W3: RQ for Associations (Page 22-34) Flashcards
What is the margin of error in confidence intervals?
Are they symmetrical?
Margins of Error
- Distances from the sample estimate to the lower bound and to the upper bound of the confidence interval.
- They can be asymmetrical
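A minimal sketch of the two margins, using made-up numbers (the estimate and bounds below are hypothetical, not from the lecture). The helper name `margins_of_error` is an assumption for illustration.

```python
def margins_of_error(estimate, lower, upper):
    """Return (lower margin, upper margin) of a CI around a point estimate."""
    return estimate - lower, upper - estimate

# Hypothetical point estimate 2.0 with a 95% CI of [1.25, 3.5]
lower_margin, upper_margin = margins_of_error(2.0, 1.25, 3.5)
print(lower_margin, upper_margin)  # 0.75 1.5 (the two margins differ)
```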
What is a two-way contingency table?
Two-way contingency table:
Frequency counts of participants who belong jointly to each category of one variable and each category of a second variable
What does a joint cell in a contingency table contain?
Frequency count of all participants who belong to both category a of one variable and category b of the other variable
What are the row and column totals on a contingency table called?
Marginal Cells.
Note: The row marginal cells and the column marginal cells will each add up to the sample size
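A small sketch of joint and marginal cells, assuming a hypothetical 2x3 table of counts (the numbers are invented for illustration):

```python
# Hypothetical joint cells: rows = categories of variable A,
# columns = categories of variable B
table = [
    [20, 15, 5],
    [10, 25, 25],
]

row_totals = [sum(row) for row in table]        # row marginal cells
col_totals = [sum(col) for col in zip(*table)]  # column marginal cells
n = sum(row_totals)                             # sample size

print(row_totals, col_totals, n)  # [40, 60] [30, 40, 30] 100
```

Both sets of marginal cells sum to the same sample size.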
What can help us understand a two-way table visually
Mosaic Plots
What is Cramer’s V? Include details such as its range and what it does not do.
- Measure of strength of association in a contingency table where at least 1 variable has 3+ categories
- No direction (no intrinsic ordering to 3 or more categories)
- Range: 0 to 1
- Both a sample statistic and a population parameter (the population parameter is unknown, hence a CI is calculated for the population parameter based on the sample statistic)
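The cards do not give a formula, so the sketch below assumes the usual chi-square-based definition, V = sqrt(chi2 / (n * (k - 1))) with k the smaller of the number of rows and columns. The function name `cramers_v` and the tables are hypothetical, and this gives only the sample statistic, not the CI.

```python
import math

def cramers_v(table):
    """Sample Cramer's V for a table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical tables: the first has identical row profiles, the second
# has every participant on the diagonal
print(cramers_v([[10, 20, 30], [20, 40, 60]]))
print(cramers_v([[10, 0, 0], [0, 10, 0], [0, 0, 10]]))
```

The first table should give a value at the bottom of the range and the second a value at the top, with no sign in either case.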
Does Cramer’s V use words like significant? What are some issues raised in the lecture?
- No use of words like significant
- Because significant has a strict statistical meaning
- No statement that says there is no association
- Sample data will never tell us the true state of affairs at the population level
- No p-value
- Cramer’s V does not provide one; a CI is more informative as it contains any information in a p-value
What is an estimator and an estimate?
Estimator:
- Function applied to sample data to obtain an estimate for a population parameter (an estimate, not a calculation*)
Estimate:
- Point: Sample statistic value
- CI: Interval estimate
How many estimators are there for the same population parameter. Why are there different estimators?
- We can have different estimators for the same population parameter and obtain different values
- Point estimate: Different
- Interval estimate: Different
- They have different properties and are based on different assumptions
How can estimators of a population parameter vary in 3 properties: What does a 95% unbiased interval estimator do?
Unbiasedness
Captures the true population parameter value 95% of the time on average over the long run
- Bias does not depend on sample size
How can estimators of a population parameter vary in 3 properties. What does a 95% consistent interval estimator do?
Consistency
- Gets increasingly closer to capturing the true population parameter value 95% of the time on average over the long run as SAMPLE SIZE increases
- Consistency relates to sample size
How can estimators of a population parameter vary in 3 properties. What does a 95% efficient interval estimator do?
Efficiency
Produces a more narrow confidence interval on average over the long run compared to some competing estimator
What is the ideal interval estimator? What sometimes happens?
Ideal Interval Estimator:
- Unbiased and very efficient
- Sometimes a consistent + efficient estimator is preferred over an unbiased + inefficient one
What are examples of effect sizes? What is an effect size and how is it estimated? When is it more useful?
- Quantitative measure of the strength of relationship between constructed measures.
- Estimated by sample statistics and can be applied to population parameters.
- It is more useful when accompanied by a CI.
- Correlations
- Cramer’s V
- Regression coefficients
- Means
- Mean differences
- Standardised mean differences
- R-Squared
When both variables in a contingency table contain 2 categories, what is often used to report effect size
Odds Ratio
What is the odds ratio and what does it do? Range? Is it SS or PP?
- Ratio of 2 odds: (a) the odds of one category of a variable within the first category of the other variable and (b) the odds of that same category within the second category of the other variable
- Comparison of a category in one variable relative to the 2 categories in the other variable
- Measure of strength of association between 2 variables both containing 2 categories (only 2x2)
- Range: 0 to infinity
- 1 = No association
- Below 1 (towards 0) = Reduction in odds (negative association)
- Above 1 (towards +infinity) = Increase in odds (positive association)
- Sample Statistic and Population Parameter
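A short sketch of the sample odds ratio for a 2x2 table. The counts and the name `odds_ratio` are hypothetical; rows are the two categories of one variable and columns the two categories of the other.

```python
def odds_ratio(table):
    """Sample odds ratio for a 2x2 table [[a, b], [c, d]].

    Odds in row 1 are a/b, odds in row 2 are c/d, so the ratio
    (a/b) / (c/d) simplifies to (a*d) / (b*c).
    """
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Hypothetical counts: odds are 30/10 = 3 in row 1 and 20/40 = 0.5 in row 2
print(odds_ratio([[30, 10], [20, 40]]))  # 6.0
# Same odds in both rows gives 1 (no association)
print(odds_ratio([[10, 20], [30, 60]]))  # 1.0
```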
What are odds
Probability of an event occurring relative to the probability of it not occurring
(p)/(1-p)
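The formula can be checked with a few probabilities. This is a minimal sketch; the function name `odds` is chosen for illustration.

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return p / (1 - p)

print(odds(0.5))   # 1.0: the event is as likely to occur as not
print(odds(0.75))  # 3.0: three times as likely to occur as not
```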
What happens to the odds ratio if the category order on one variable is reversed?
The new value will just be the reciprocal of the original (1 / odds ratio)
When we have an odds ratio <1, what do we usually do?
Reverse the ordering of the two categories in one of the variables in the contingency table and recalculate the odds ratio (which will then be >1)
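The reciprocal relationship can be verified on a hypothetical 2x2 table. The counts and the helper `odds_ratio_2x2` are assumptions for illustration.

```python
import math

def odds_ratio_2x2(table):
    """Sample odds ratio (a*d) / (b*c) for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

original = [[20, 40], [30, 10]]             # hypothetical counts
reversed_rows = original[::-1]              # reverse category order of the row variable
reversed_cols = [row[::-1] for row in original]  # or of the column variable

or_original = odds_ratio_2x2(original)
print(or_original)                    # below 1
print(odds_ratio_2x2(reversed_rows))  # above 1

# Reversing either variable gives the reciprocal of the original value
assert math.isclose(odds_ratio_2x2(reversed_rows), 1 / or_original)
assert math.isclose(odds_ratio_2x2(reversed_cols), 1 / or_original)
```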
How will increasing sample size affect confidence intervals
CI will be narrower and more precise.
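As a rough illustration, assume a simple normal-theory 95% interval for a mean, estimate ± 1.96 × SD / √n. This interval is a stand-in chosen here, not necessarily one used in the lecture, and the SD of 10 is hypothetical.

```python
import math

def ci_width(sd, n, z=1.96):
    """Full width of the interval estimate +/- z * sd / sqrt(n)."""
    return 2 * z * sd / math.sqrt(n)

# Hypothetical SD of 10: quadrupling the sample size halves the width
for n in (25, 100, 400):
    print(n, round(ci_width(10.0, n), 2))
```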
What is the difference between Cramer’s V compared to Chi-Square
- Chi-square only tells us about the significance of an association (not its strength or direction)
- Cramer’s V measures STRENGTH of association (not direction)
How do we know which interval estimator to use?
Context. Need a simulation study.
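A minimal sketch of what such a simulation study might look like. It assumes normally distributed data and a z-based 95% interval for a mean built from the sample SD. The estimator, the population values and the function name `simulate_interval` are all chosen for illustration only.

```python
import math
import random

def simulate_interval(n, reps=2000, true_mean=50.0, true_sd=10.0, seed=1):
    """Estimate coverage and average width of mean +/- 1.96 * s / sqrt(n)."""
    rng = random.Random(seed)
    hits = 0
    total_width = 0.0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = sum(sample) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        half_width = 1.96 * sd / math.sqrt(n)
        # Count the replications whose interval captures the true mean
        if mean - half_width <= true_mean <= mean + half_width:
            hits += 1
        total_width += 2 * half_width
    return hits / reps, total_width / reps

for n in (10, 100):
    coverage, avg_width = simulate_interval(n)
    print(f"n={n}: coverage={coverage:.3f}, average width={avg_width:.2f}")
```

Coverage relates to bias and consistency (how close it is to 95%, and whether it approaches 95% as n grows), while the average width relates to efficiency when two estimators are compared at the same sample size.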
Bias:
The mean of the sampling distribution is not equal to the population parameter value (irrespective of sample size)
What do we look at in Mosaic Plots?
Relative difference in heights of boxes.
Greater difference of same colour = Indicative of an association.
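The box heights can be thought of as the proportions of one variable within each category of the other. The sketch below assumes that typical mosaic layout and uses a hypothetical table; `column_proportions` is an illustrative helper, not a plotting function.

```python
def column_proportions(table):
    """For each column, return each row's share of that column's total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [row[j] / col_totals[j] for row in table]
        for j in range(len(col_totals))
    ]

# Hypothetical counts: rows = categories of variable A (the colours),
# columns = categories of variable B (the bars)
table = [
    [20, 15, 5],
    [10, 25, 25],
]

for j, shares in enumerate(column_proportions(table)):
    print(f"column {j}:", [round(s, 3) for s in shares])
```

If the shares for the same row differ a lot from column to column, the same-coloured boxes differ in height, which points to an association.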
Consistency:
The mean of the sampling distribution gets closer to the true population parameter value as sample size gets larger.
Efficiency:
The average width of a confidence interval is smaller for a more efficient interval estimator (for a given sample size and irrespective of whether the interval estimator is biased and/or consistent).
What is the limitation of a correlation matrix? What should we use instead to get past this limitation?
Might have too much detail to readily discern patterns. Use a correlation plot for pattern identification instead.