1.2-1.4 Categorical Variables Flashcards
What is the difference between Quantitative and Categorical data?
Quantitative data are numerical responses for which arithmetic, such as averaging, makes sense. Categorical data place individuals into groups. The values can be numbers as well, but only if the numbers act more like a label (a zip code or jersey number, for example).
How can you tell who/what the individuals of a study are?
Who/what did they record data about? If you were to make a small table, what would each row be? Those are the individuals.
What makes something a statistics problem?
Does it involve gathering or analyzing data instead of just reporting a fact or answer? If so, it is a statistics problem.
What is the difference between these vocab words: variables and values?
A variable is any characteristic observed (height, grade, gender, etc.). Values are the possible “answers” a variable could have (68 in., soph., male, etc.).
What is the difference between a one-way and two-way table?
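A one-way table shows the counts for the values of a single variable. A two-way table has two different variables. Each box has the count of how many individuals have that combination of values. When each variable has only two values, it would also make sense to show a two-way table as a Venn diagram.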
What is the difference between frequency and relative frequency?
Frequency is just a count of how many individuals had a certain value. Relative frequency is that count divided by the total number of individuals, written as a proportion or percent.
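A minimal Python sketch of the difference. The list of responses is made up purely for illustration, and the variable names are hypothetical.

```python
from collections import Counter

# Hypothetical responses, made up for illustration only.
grades = ["fresh", "soph", "soph", "junior", "fresh", "soph", "senior", "junior"]

# Frequency: how many individuals had each value.
frequency = Counter(grades)

# Relative frequency: each count divided by the total number of individuals.
total = sum(frequency.values())
relative_frequency = {value: count / total for value, count in frequency.items()}

for value in frequency:
    print(f"{value}: {frequency[value]} ({relative_frequency[value]:.1%})")
```

The relative frequencies should always add to 1 (100%), which is a quick way to check the work.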
What is a marginal distribution?
It’s what you get when you put the totals as a new column and row in a two-way table. It essentially turns a two-way table into two different one-way tables. The marginal distribution of “something” is the counts (or percents of the grand total) for that variable only. So the marginal distribution of grade level would just be how many freshmen, sophomores, etc. there were in the study.
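As a rough sketch, the totals can be computed like this in Python. The two-way table below uses made-up counts, and the roles and answers are hypothetical labels chosen only for illustration.

```python
# Hypothetical two-way table (made-up counts): rows = role, columns = answer.
table = {
    "student": {"yes": 30, "no": 10},
    "teacher": {"yes": 4, "no": 6},
}

# Marginal distribution of role: the row totals.
role_totals = {role: sum(cells.values()) for role, cells in table.items()}

# Marginal distribution of answer: the column totals.
answer_totals = {}
for cells in table.values():
    for answer, count in cells.items():
        answer_totals[answer] = answer_totals.get(answer, 0) + count

# Both sets of totals must add up to the same grand total.
grand_total = sum(role_totals.values())

print(role_totals, answer_totals, grand_total)
```

Each set of totals is a one-way table for a single variable, ignoring the other variable completely.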
What would count as evidence that there is an association between two variables?
The conditional distributions (as a table or graph) are noticeably different from one group to the next. If so, highlight the main differences. If they are all about the same, there is no evidence of an association.
What is a conditional distribution?
The relative frequencies of one variable, found as if your population was only the individuals with a certain value of the other variable. Essentially, use a column or row total as the denominator instead of the total for the whole study.
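A small Python sketch of the "use the row total as the denominator" idea. The counts are made up, and the roles and answers are hypothetical labels.

```python
# Hypothetical two-way table (made-up counts): rows = role, columns = answer.
table = {
    "student": {"yes": 30, "no": 10},
    "teacher": {"yes": 4, "no": 6},
}

# Conditional distribution of answer for each role:
# divide each cell by its own row total, not by the grand total.
conditional = {}
for role, cells in table.items():
    row_total = sum(cells.values())
    conditional[role] = {answer: count / row_total for answer, count in cells.items()}

for role, dist in conditional.items():
    summary = ", ".join(f"{answer}: {share:.0%}" for answer, share in dist.items())
    print(f"{role} -> {summary}")
```

Each conditional distribution adds to 100% on its own. Comparing the rows of `conditional` side by side is how you would look for an association in this made-up data.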
What are some common ways that graphs can be misleading?
A vertical axis that does not start at 0. Pie charts whose percentages do not add to 100%. 3D effects or pictures that make things look out of proportion to the actual values.
What is a segmented bar graph?
A relative frequency graph where each bar goes up to 100% and is split into segments stacked according to their percentages within it. With one bar per group, it is useful for comparing the conditional distributions of a variable across the values of another variable.
What is a mosaic plot?
It is a segmented bar graph where the x-axis is also a segmented bar graph: the width of each bar is proportional to how many individuals are in that group. So the vertical bar for students would be wider than the bar for teachers if there were more students in the study.
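A rough Python sketch of the numbers behind a mosaic plot, without drawing it. The counts are made up, and the roles and answers are hypothetical labels.

```python
# Hypothetical two-way table (made-up counts): rows = role, columns = answer.
table = {
    "student": {"yes": 30, "no": 10},
    "teacher": {"yes": 4, "no": 6},
}

grand_total = sum(sum(cells.values()) for cells in table.values())

# Bar width: the group's share of all individuals (its marginal proportion).
widths = {role: sum(cells.values()) / grand_total for role, cells in table.items()}

# Segment height: the share within the group (its conditional proportion).
heights = {
    role: {answer: count / sum(cells.values()) for answer, count in cells.items()}
    for role, cells in table.items()
}

# Area of each tile = width * height = that cell's share of the whole study.
areas = {
    role: {answer: widths[role] * h for answer, h in heights[role].items()}
    for role in table
}

print(widths)
print(areas)
```

Because width times height gives each cell's share of the whole study, the tile areas in a mosaic plot add to 100% of the individuals.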