Exam 2 Two-Way tables Flashcards
Bias
The design of a statistical study is biased if it systematically favors certain outcomes.
Association
Values of one variable tend to occur with certain values of another variable; detected when the conditional distributions differ from the marginal distributions and from each other.
Bivariate data
Data collected on two variables for each individual in a study.
The only arithmetic operation that makes sense for categorical data is
Counting
A two-way table is a
Table giving counts (or %) for two categorical variables
In two-way tables the ….. is the row variable, and the ….. is the column variable
Explanatory ….. response
Marginal distribution
Distribution of the column variable separately (or the row variable separately), expressed in counts or percents
Both margins sum to the
Same overall total
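The two cards above can be checked with a minimal Python sketch. The table, its labels ("Group A", "Yes", and so on), and its counts are made up for illustration and do not come from the course.

```python
# Hypothetical two-way table of counts:
# rows = explanatory variable, columns = response variable.
table = {
    "Group A": {"Yes": 30, "No": 20},
    "Group B": {"Yes": 10, "No": 40},
}

# Row margin: total count in each row.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}

# Column margin: total count in each column.
col_totals = {}
for cells in table.values():
    for col, count in cells.items():
        col_totals[col] = col_totals.get(col, 0) + count

# Both margins add up to the same overall total.
grand_total = sum(row_totals.values())

# Marginal distributions expressed as percents of the overall total.
row_marginal = {row: 100 * n / grand_total for row, n in row_totals.items()}
col_marginal = {col: 100 * n / grand_total for col, n in col_totals.items()}

print(row_marginal)
print(col_marginal)
```

Each marginal distribution uses only the totals in one margin, so it describes one variable at a time and ignores the other.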
Conditional Distributions
Distribution of percents under a specific condition (either in a row or in a column)
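As a rough sketch of the card above, the code below conditions on each row of a made-up table and turns that row's counts into percents of the row total. The function name `conditional_by_row` and all counts are hypothetical.

```python
# Hypothetical two-way table of counts (rows = explanatory, columns = response).
table = {
    "Group A": {"Yes": 30, "No": 20},
    "Group B": {"Yes": 10, "No": 40},
}


def conditional_by_row(table):
    """Return the distribution of the column variable within each row, in percent."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())  # the "condition" is being in this row
        result[row] = {col: 100 * count / row_total for col, count in cells.items()}
    return result


cond = conditional_by_row(table)
for row, dist in cond.items():
    print(row, dist)
```

Conditioning on a column instead works the same way, with each count divided by its column total.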
There is no association between categorical variables when all
Conditional distributions look like the marginal distribution and each other
There is a potential association between categorical variables when
One or more of the conditional distributions look very different from the marginal distributions or each other
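One way to make "look very different" concrete is to measure the largest gap, in percentage points, between any row's conditional distribution and the marginal distribution of the column variable. This is only an illustrative sketch: the helper names and both tables are made up, and how large a gap counts as "very different" is a judgment call these cards do not settle.

```python
def column_percents(counts):
    """Turn a dict of counts into a dict of percents."""
    total = sum(counts.values())
    return {col: 100 * n / total for col, n in counts.items()}


def max_gap_from_marginal(table):
    """Largest gap (percentage points) between a row's conditional
    distribution and the marginal distribution of the column variable."""
    # Marginal counts of the column variable: add the rows together.
    marginal_counts = {}
    for cells in table.values():
        for col, n in cells.items():
            marginal_counts[col] = marginal_counts.get(col, 0) + n
    marginal = column_percents(marginal_counts)

    gaps = [
        abs(pct - marginal[col])
        for cells in table.values()
        for col, pct in column_percents(cells).items()
    ]
    return max(gaps)


# Hypothetical tables: one with a potential association, one without.
associated = {"Group A": {"Yes": 30, "No": 20}, "Group B": {"Yes": 10, "No": 40}}
not_associated = {"Group A": {"Yes": 20, "No": 30}, "Group B": {"Yes": 40, "No": 60}}

print(max_gap_from_marginal(associated))
print(max_gap_from_marginal(not_associated))
```

In the second table every row has the same percent of "Yes" as the table overall, so the conditional distributions match the marginal distribution and each other.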
Simpson’s paradox happens
When a third, lurking variable is associated with the other two.
What is Simpson’s paradox
A situation where the relationship between two variables is in one direction when a third, lurking variable is included but in the opposite direction when that lurking variable is ignored.
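The reversal can be seen in a small sketch with made-up counts. Here the lurking variable is case difficulty ("easy" or "hard"), and options "A" and "B" are hypothetical; the numbers were chosen only to produce the effect.

```python
# (successes, trials) for each option within each level of the lurking variable.
# All counts are invented for illustration.
counts = {
    "easy cases": {"A": (9, 10), "B": (80, 100)},
    "hard cases": {"A": (30, 100), "B": (2, 10)},
}

# Success rates with the lurking variable included (one comparison per level).
within = {
    level: {option: s / n for option, (s, n) in options.items()}
    for level, options in counts.items()
}

# Success rates with the lurking variable ignored (levels pooled together).
combined = {}
for option in ("A", "B"):
    successes = sum(counts[level][option][0] for level in counts)
    trials = sum(counts[level][option][1] for level in counts)
    combined[option] = successes / trials

for level, rates in within.items():
    print(level, rates)
print("combined", combined)
```

Option A has the higher success rate within both easy and hard cases, yet option B has the higher rate once the cases are pooled, because A was used mostly on hard cases and B mostly on easy ones.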
Measures affected by outliers
Mean, standard deviation, correlation, R-sq, slope, and y-intercept
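A short sketch shows how a single outlier moves some of these measures. The data points and the helper `pearson_r` are made up for illustration; the helper implements the usual formula for the correlation r.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data lying exactly on a line.
x_clean = [1, 2, 3, 4, 5]
y_clean = [2, 4, 6, 8, 10]

# The same data with one outlier added at (6, 0).
x_out = x_clean + [6]
y_out = y_clean + [0]

print("mean of y:", mean(y_clean), "->", mean(y_out))
print("SD of y:  ", stdev(y_clean), "->", stdev(y_out))
print("r:        ", pearson_r(x_clean, y_clean), "->", pearson_r(x_out, y_out))
```

Because the least squares slope and y-intercept are computed from the means, standard deviations, and r, they shift along with those values.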
Rules of Data Analysis
- Always plot your data
- Always describe shape, center, spread of distributions
- Don’t use the normal distribution to model data that is not normal
- Always draw and label the normal curve when finding % of x values
- For bivariate data, always examine a scatterplot before computing r or modeling with a least squares regression equation
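One reason for the last rule can be sketched in a few lines: r measures only linear association, so a strong curved pattern can produce r = 0. The data and the helper `pearson_r` below are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: y is completely determined by x, but along a curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)
```

A scatterplot of these points shows a clear U shape, which the value of r alone would hide.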