1.1 Categorical Data Flashcards
Statistics
Statistics is a way of collecting and analyzing information or data to understand things better. It helps us make sense of numbers and figures by organizing them and finding patterns.
For example, if you want to know how many kids in a class have blue eyes, you can use statistics to count and compare the numbers.
Element
refers to a single item or individual that we are studying or collecting data about.
It could be a person, an object, or any other thing we want to learn more about.
For example, if we are studying the heights of students in a class, each student would be considered an element.
Variable
is something that can change or vary. It’s like having different options or choices.
Let’s say we want to see how much candy different kids eat in a week. The amount of candy each kid eats is a variable because it can be different for each child. Some kids might eat a lot of candy, while others might eat just a little. The variable here is the amount of candy each kid eats, and it can have different values depending on the child.
Data
refers to information or facts that we collect and use to learn about something.
It’s like having puzzle pieces that we put together to understand the whole picture.
For example, if we want to know what kinds of pets people have, we can collect data by asking them and recording their answers. The data would include information like “2 people have dogs, 3 people have cats, and 1 person has a fish.” We can use this data to learn about the different types of pets people own.
Dataset
is a collection of data.
It’s like having a bunch of different pieces of information that we put together. Imagine you have a toy box with different types of toys inside. Each toy represents a piece of data, like the color, shape, or size of something you’re studying. When you gather all those toys and put them together, you have a dataset. It helps us understand things better because we can see patterns or make comparisons based on the different pieces of information in the dataset.
Inference
is like making a guess or drawing a conclusion based on the information or data we have.
It’s like being a detective and using clues to figure something out. Let’s say you have a friend who always wears a hat to school. Based on this information, you can make an inference and guess that your friend likes hats. In statistics, we use inference to make predictions or draw conclusions about a whole group of things based on a smaller sample or subset of data. It helps us make educated guesses about things we don’t have direct information about.
Descriptive Statistics
Summarize collected data using graphs, charts, number lines, and percentages.
Categorical (Qualitative) Data
Data whose values describe some characteristic of the element (case). The values are labels rather than measurements, and they don’t necessarily have an order.
- gender
- race
- year in school
- type of car you drive
- favorite color
*Notice that each has values that are verbal descriptions.
Quantitative Data
Data that takes on numerical values, as in a count or a measurement, and can be used to compute averages and ranges. There are two types:
Discrete: countable or listable values, each one a specific, separate number (typically whole/counting numbers)
- # of accidents a year
- # of students in a class
- # of cars in parking lot
Continuous: data that can assume infinitely many possible values within a domain or interval
- income
- height
- time
- weight
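As a rough illustration of how quantitative data supports averages and ranges, here is a minimal Python sketch. The two lists and the helper `value_range` are invented for this example and are not part of the original cards.

```python
from statistics import mean

# Hypothetical data, invented for illustration only.
students_per_class = [24, 30, 27, 22, 30]   # discrete: counts of students
heights_cm = [152.4, 160.1, 148.9, 171.3]   # continuous: measured heights


def value_range(values):
    """Return the range: the largest value minus the smallest value."""
    return max(values) - min(values)


avg_students = mean(students_per_class)
range_students = value_range(students_per_class)
avg_height = mean(heights_cm)
range_height = value_range(heights_cm)

print("students:", avg_students, range_students)
print("heights: ", round(avg_height, 2), round(range_height, 2))
```

The same two summaries apply to both types. The difference is in the values themselves: the class sizes can only be whole numbers, while a height could fall anywhere within an interval.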
What tables are best for categorical data?
(I) Frequency Table
(II) Relative Frequency Table
(III) Cumulative Frequency Table
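The three tables can be sketched in a few lines of Python. This example reuses the pet counts from the Data card (2 dogs, 3 cats, 1 fish). The function name `frequency_tables` and the alphabetical category order are assumptions made for illustration. A cumulative column is usually most meaningful when the categories have a natural order.

```python
from collections import Counter
from itertools import accumulate

# Pet answers matching the Data card example: 2 dogs, 3 cats, 1 fish.
answers = ["dog", "cat", "cat", "fish", "dog", "cat"]


def frequency_tables(values):
    """Return categories with frequency, relative and cumulative frequency."""
    counts = Counter(values)
    categories = sorted(counts)              # alphabetical order (an assumption)
    total = sum(counts.values())
    freq = [counts[c] for c in categories]   # how many times each category occurs
    rel = [f / total for f in freq]          # share of the whole for each category
    cum = list(accumulate(freq))             # running total of the frequencies
    return categories, freq, rel, cum


categories, freq, rel, cum = frequency_tables(answers)
print(f"{'pet':<6}{'freq':>6}{'rel':>8}{'cum':>6}")
for c, f, r, k in zip(categories, freq, rel, cum):
    print(f"{c:<6}{f:>6}{r:>8.1%}{k:>6}")
```

The relative frequencies add up to 1 (100%), and the last cumulative value equals the total number of answers.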
When to use a Pie Chart
Use only when you want to emphasize each category’s relation to the whole: each slice is a percentage, and the slices must total 100%.
When to use a bar graph
Use when you want to compare quantities measured in the same unit; the bars are easy to read and compare.
What are contingency tables?
Two-way tables that organize data by two categorical variables, showing how many elements fall into each combination of categories.
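A two-way table can be built by counting pairs of values. The sketch below is a minimal Python example. The survey records and the function name `contingency_table` are hypothetical, made up only to show the idea.

```python
from collections import Counter

# Hypothetical survey records: (year in school, type of pet). Invented data.
records = [
    ("freshman", "dog"), ("freshman", "cat"), ("sophomore", "cat"),
    ("sophomore", "cat"), ("freshman", "dog"), ("sophomore", "dog"),
]


def contingency_table(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # A Counter returns 0 for combinations that never occur.
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    return rows, cols, table


rows, cols, table = contingency_table(records)
print(f"{'':<10}" + "".join(f"{c:>6}" for c in cols))
for r in rows:
    print(f"{r:<10}" + "".join(f"{table[r][c]:>6}" for c in cols))
```

Each cell holds the number of elements that share both categories, so the cells together add up to the total number of records.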