module 4 Flashcards
contingency tables
- shows frequency of sampling units
- tables of data frequencies within the different levels of categorical variables
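A minimal sketch of how a two-way contingency table could be tallied from raw data. The sample (habitat and colour of each sampling unit) is hypothetical and only there to illustrate the counting.

```python
from collections import Counter

# Hypothetical sample: each tuple is one sampling unit (habitat, colour).
observations = [
    ("forest", "brown"), ("forest", "brown"), ("forest", "green"),
    ("meadow", "green"), ("meadow", "green"), ("meadow", "brown"),
    ("meadow", "green"),
]

# Count how many sampling units fall in each combination of levels.
counts = Counter(observations)

rows = sorted({habitat for habitat, _ in observations})
cols = sorted({colour for _, colour in observations})

# Two-way table as a nested dict: table[row_level][column_level] = frequency.
table = {r: {c: counts[(r, c)] for c in cols} for r in rows}

for r in rows:
    print(r, [table[r][c] for c in cols])
```

A one-way table would be the same idea with a single variable: Counter applied to one column of labels.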
types of contingency tables
- one-way and two-way tables
calculate marginal distributions as frequencies
- Row: sum frequencies across all columns for each row
- Column: sum frequencies across all rows for each column
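The row and column sums can be sketched as follows. The table values are hypothetical (rows = habitat, columns = colour).

```python
# Hypothetical two-way table: table[row_level][column_level] = frequency.
table = {
    "forest": {"brown": 2, "green": 1},
    "meadow": {"brown": 1, "green": 3},
}

# Row marginals: sum across all columns for each row.
row_totals = {r: sum(cells.values()) for r, cells in table.items()}

# Column marginals: sum across all rows for each column.
columns = list(next(iter(table.values())))
col_totals = {c: sum(table[r][c] for r in table) for c in columns}

print(row_totals)  # {'forest': 3, 'meadow': 4}
print(col_totals)  # {'brown': 3, 'green': 4}
```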
calculate marginal distributions as proportions
- Table total: sum all frequencies in the table
- Row: sum frequencies across all columns for each row and divide by table total
- Column: sum frequencies across all rows for each column and divide by table total
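A sketch of the same calculation as proportions, again on a hypothetical table. Each set of marginal proportions should sum to 1.

```python
# Hypothetical two-way table: table[row_level][column_level] = frequency.
table = {
    "forest": {"brown": 2, "green": 1},
    "meadow": {"brown": 1, "green": 3},
}

# Table total: sum all frequencies in the table.
table_total = sum(sum(cells.values()) for cells in table.values())

# Row marginal proportions: row sum divided by table total.
row_props = {r: sum(cells.values()) / table_total for r, cells in table.items()}

# Column marginal proportions: column sum divided by table total.
columns = list(next(iter(table.values())))
col_props = {c: sum(table[r][c] for r in table) / table_total for c in columns}

print(table_total)  # 7
print(row_props)
print(col_props)
```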
marginal distribution
- row and column sums of a two-way contingency table.
- can be shown as frequencies or proportions
what do marginal distributions show
- how many sampling units are in each level of one categorical variable, without regard to the other categorical variable
- they describe overall patterns in the sample
conditional distributions
- two-way tables that show the proportion of sampling units for one variable within each level of the second variable
- shows the relationship between two variables
- shown as a separate table
how do you create a conditional distribution table
- calculated from contingency table and marginal distribution
- select one of the categorical variables to be primary and one to be secondary (aka conditional)
- take the frequency from the contingency table and divide it by the marginal distribution of the primary variable
- basically: take the value and divide it by the sum of the row/column
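The steps above can be sketched as a small function. The function name, the `primary` argument, and the table values are all hypothetical, and the sketch assumes no row or column total is zero.

```python
def conditional_distribution(table, primary="row"):
    """Divide each cell by the marginal total of the primary variable."""
    if primary == "row":
        # Primary variable is on the rows: divide each cell by its row total.
        return {
            r: {c: n / sum(cells.values()) for c, n in cells.items()}
            for r, cells in table.items()
        }
    if primary == "column":
        # Primary variable is on the columns: divide each cell by its column total.
        col_totals = {}
        for cells in table.values():
            for c, n in cells.items():
                col_totals[c] = col_totals.get(c, 0) + n
        return {
            r: {c: n / col_totals[c] for c, n in cells.items()}
            for r, cells in table.items()
        }
    raise ValueError("primary must be 'row' or 'column'")


# Hypothetical two-way table: rows = habitat, columns = colour.
table = {
    "forest": {"brown": 2, "green": 1},
    "meadow": {"brown": 1, "green": 3},
}

by_row = conditional_distribution(table, primary="row")
by_col = conditional_distribution(table, primary="column")

print(by_row["meadow"])           # {'brown': 0.25, 'green': 0.75}
print(by_col["meadow"]["green"])  # 0.75
```

With the row variable as primary, each row of the result sums to 1; with the column variable as primary, each column does.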
how do you choose primary and secondary variables in calculating a conditional distribution
- depends on the question being asked
- ex. are there more _____ (primary) than _____ (primary) in the _____ (secondary) category? or how many people like _____ (secondary) when doing _____ (primary)?
what do the primary and secondary variables in calculating conditional distributions determine
- whether you use the row or the column marginal distribution
what do conditional distributions show
- relative frequency of the secondary variable within each level of the primary variable
- shows how the secondary variable changes across the primary variable
t or f: bar graphs are only used to visualize single variable categorical data
- false, they are used for both single-variable and two-variable categorical data
t or f: bar graphs are good at visualizing numerical data
- false, only acceptable in one case, since a bar shows only the average numerical value
- acceptable: stat datasets have categorical info on many sampling units, data is not statistical in nature
- not acceptable: if the data is from a statistical population with one numerical and one categorical variable
t or f: bar graphs can be horizontal or vertical
- true, the choice depends on the focus of the research question, with the more relevant info on the horizontal axis
how do you display data w two categorical measurement variables in a bar graph
- designate one variable as the grouping variable (the base of the figure; levels of the other variable are shown within it), choosing whichever variable shows the info more clearly
- decide whether to create the figure as a grouped or stacked bar graph
what are the two types of two variable bar graphs
- grouped: levels of the second variable are shown as separate bars side by side, in one group for each level of the grouping variable on the x axis
- stacked: levels of the second variable are stacked on top of each other, with just one bar for each level of the grouping variable on the x axis
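A rough sketch of how the two layouts differ, computing only the bar geometry rather than drawing anything. The table values, the bar width, and the tuple layout are hypothetical choices for illustration; a plotting library would take these positions and heights as input.

```python
# Hypothetical counts: grouping variable = habitat (x axis), other variable = colour.
table = {
    "forest": {"brown": 2, "green": 1},
    "meadow": {"brown": 1, "green": 3},
}
groups = list(table)
levels = ["brown", "green"]
bar_width = 0.4  # assumed width, so two bars fit within one unit on the x axis

# Grouped: one bar per (group, level), placed side by side within each group.
# Each entry is (group, level, x position, height).
grouped = [
    (g, lvl, gi + li * bar_width, table[g][lvl])
    for gi, g in enumerate(groups)
    for li, lvl in enumerate(levels)
]

# Stacked: one bar per group; each level starts where the previous one ended.
# Each entry is (group, level, x position, bottom, height).
stacked = []
for gi, g in enumerate(groups):
    bottom = 0
    for lvl in levels:
        stacked.append((g, lvl, gi, bottom, table[g][lvl]))
        bottom += table[g][lvl]

for bar in grouped:
    print("grouped", bar)
for bar in stacked:
    print("stacked", bar)
```

In the grouped layout every bar starts at zero, so individual levels are easy to compare; in the stacked layout the top of each bar shows the group total.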