Module 4 Flashcards
What is a contingency table?
A table of data frequencies or proportions within the different levels of a categorical variable.
What are one-way and two-way contingency tables?
- they just refer to the number of categorical variables you observe for each sampling unit (one variable for a one-way table, two for a two-way table)
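As a minimal sketch of the difference, the Python below tallies a one-way and a two-way table from a small sample. The variable names (habitat, colour) and the data are made up for illustration only.

```python
from collections import Counter

# Hypothetical sample: each sampling unit is recorded with two
# categorical variables (habitat, colour). The data are made up.
units = [
    ("forest", "brown"), ("forest", "brown"), ("forest", "grey"),
    ("meadow", "brown"), ("meadow", "grey"), ("meadow", "grey"),
    ("meadow", "grey"),
]

# One-way table: frequencies for a single categorical variable.
one_way = Counter(habitat for habitat, _ in units)

# Two-way table: frequencies for each combination of the two variables.
two_way = Counter(units)

print(one_way)
print(two_way)
```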
What are marginal distributions?
- one way to see overall patterns in the data
- calculate row and column frequencies
- the row and column sums of a two-way contingency table. They can be shown as frequencies or proportions.
How do you find marginal distributions for rows vs. columns?
rows: sum frequencies across all columns for each row
columns: sum frequencies across all rows for each column
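The row and column sums can be sketched in a few lines of Python. The table below is hypothetical (made-up habitat and colour counts), and storing it as a dictionary of rows is just one convenient assumption.

```python
# Hypothetical two-way contingency table of frequencies:
# rows are habitat, columns are colour (made-up data).
table = {
    "forest": {"brown": 2, "grey": 1},
    "meadow": {"brown": 1, "grey": 3},
}

# Row marginals: sum frequencies across all columns for each row.
row_totals = {row: sum(cols.values()) for row, cols in table.items()}

# Column marginals: sum frequencies across all rows for each column.
col_totals = {}
for cols in table.values():
    for col, freq in cols.items():
        col_totals[col] = col_totals.get(col, 0) + freq

# The same marginals shown as proportions of all sampling units.
grand_total = sum(row_totals.values())
row_props = {row: n / grand_total for row, n in row_totals.items()}
col_props = {col: n / grand_total for col, n in col_totals.items()}

print(row_totals)
print(col_totals)
```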
What are conditional distributions?
Two-way tables that show the proportion of sampling units for one variable within each level of the second variable. They show the interaction between the categorical variables (shown as a separate table).
How do you create a conditional distribution?
select one of the categorical variables to be the primary variable and the other one to be the secondary (conditional) variable
How are conditional distributions calculated?
calculated as the frequency from the contingency table divided by the marginal total of the primary variable
- identify the primary and secondary variables
- for each cell in the new table, divide the value from the contingency table by the marginal total for that cell's level of the primary variable
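The steps above can be sketched as follows. This is a minimal example that assumes the primary variable is stored as the rows of a hypothetical table; the function name `conditional_distribution` and the data are made up for illustration.

```python
# Hypothetical two-way contingency table of frequencies.
# Assumption: rows (habitat) are the primary variable and
# columns (colour) are the secondary variable.
table = {
    "forest": {"brown": 2, "grey": 1},
    "meadow": {"brown": 1, "grey": 3},
}

def conditional_distribution(table):
    """Divide each cell by the marginal total of its row (primary variable)."""
    result = {}
    for row, cols in table.items():
        row_total = sum(cols.values())
        result[row] = {col: freq / row_total for col, freq in cols.items()}
    return result

cond = conditional_distribution(table)

for row, cols in cond.items():
    print(row, cols)
```

Within each level of the primary variable the proportions sum to 1, which is what makes the levels comparable even when their totals differ.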
What do conditional distributions show us about the variables?
allow us to see how the secondary variable changes across the levels of the primary variable
What is a bar graph?
used to visualize categorical data
vertical or horizontal orientation
What are two-variable bar graphs?
- can be used to display data with two categorical measurement variables
- designate one variable as the grouping variable (forms the base of the figure, and levels of the other variable are shown within each level)
- next step: do we create it as a grouped bar chart or a stacked bar chart?
What type of variable is good for a grouping variable?
ordinal categorical variables
What are grouped bar charts?
- levels of the second variable are shown beside each other within each level of the grouping variable
- levels of grouping variable are separated using a large gap
What is a stacked bar chart?
- levels of the second variable are stacked on top of one another within each level of the grouping variable
- just one bar for each level of the grouping variable (color is used to separate the levels of the second variable)
What are histograms?
- visualize numerical data
- split numerical data into bins of equal size and display the number of sampling units in each bin
What are the three steps for creating a histogram?
- divide the numerical variable into a number of bins of equal size
- count how many sampling units fit within each bin (frequency)
- create a plot where each bin has a bar with a height equal to the frequency of that bin, and make sure there are no gaps between the bars
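A rough sketch of the three steps in plain Python follows. The function name `histogram_counts`, the data, and the choice to place the maximum value in the last bin are all assumptions made for illustration; plotting libraries may handle bin edges differently.

```python
def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("need at least two distinct values to form bins")
    # Step 1: divide the range into bins of equal size.
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    # Step 2: count how many sampling units fit within each bin.
    for v in values:
        # Assumption: the maximum value goes in the last bin.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts

measurements = [1.0, 1.2, 1.9, 2.0, 2.5, 3.1, 4.8, 5.0]  # made-up data
counts = histogram_counts(measurements, 4)

# Step 3: a text "plot" with one bar per bin, its length equal to the
# bin's frequency. Consecutive rows stand in for bars with no gaps.
for i, count in enumerate(counts):
    print(f"bin {i}: " + "#" * count)
```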