Class 2 Spring Flashcards
What are the types of variables considered when buying a second-hand bicycle?
- Make (brand)
- Type (road bike, hybrid, mountain bike, etc.)
- Number of gears
- Size of frame
- Color
- Age
- Condition (excellent/good/poor)
- Price
What are the two main classifications of variables?
- Categorical
- Numerical
What is a frequency table used for?
To sort and summarize data to make sense of them.
Why is it important to convert raw counts to percentages in categorical data?
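To make the big picture clearer: percentages show each category's share of the total and allow comparison between groups of different sizes.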
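A minimal sketch of a frequency table with counts converted to percentages. The condition ratings are made-up illustration data, and Python's standard library is used here as one possible tool, not as the course's method.

```python
from collections import Counter

# Hypothetical condition ratings for ten second-hand bicycles (made-up data).
conditions = ["good", "poor", "good", "excellent", "good",
              "poor", "good", "excellent", "good", "poor"]

# Frequency table: count how often each category occurs.
counts = Counter(conditions)

# Convert raw counts to percentages of the total.
total = sum(counts.values())
percentages = {category: 100 * n / total for category, n in counts.items()}

# Print categories from most to least frequent.
for category, n in counts.most_common():
    print(f"{category:<10} {n:>3} {percentages[category]:>6.1f}%")
```

The percentage column reads the same way whether the table covers ten bicycles or ten thousand, which is what makes the overall pattern easier to see.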
What is a grouped frequency distribution?
An arrangement that clarifies the pattern of data while sacrificing some detail.
What is a histogram?
A graphical representation made from a grouped frequency distribution that shows patterns clearly.
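A rough sketch of how a grouped frequency distribution turns into a histogram. The prices and the bin width of 50 are assumptions chosen for illustration; a different bin width would give a different picture.

```python
from collections import Counter

# Hypothetical bicycle prices (made-up data).
prices = [45, 60, 75, 80, 95, 110, 120, 130, 150, 175, 210, 240]
bin_width = 50  # assumed bin width; other choices are equally valid

# Map each price to the lower edge of its bin: 0-49 -> 0, 50-99 -> 50, ...
grouped = Counter((p // bin_width) * bin_width for p in prices)

# Print the grouped frequency distribution as a text histogram,
# keeping the bins in number-line order.
for lower in sorted(grouped):
    upper = lower + bin_width - 1
    print(f"{lower:>3}-{upper:<3} {'#' * grouped[lower]}")
```

The individual prices are no longer visible in the output, only the count per bin, which is the detail given up in exchange for a clearer pattern.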
What is the primary purpose of bar charts?
To display distributions of categorical variables.
What differentiates histograms from bar charts?
- Histograms group numerical data into bins
- Bar charts display categorical data without binning
What are the measures of central tendency for numerical data?
- Mean
- Median
What is the mode, and when is it primarily useful?
The value with the most occurrences; primarily useful for categorical data.
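As a small illustration, the mean, median, and mode can be computed with Python's standard `statistics` module. The ages and brand labels below are hypothetical.

```python
import statistics

# Hypothetical bicycle ages in years (numerical) and brand labels (categorical).
ages = [2, 3, 3, 5, 12]
makes = ["BrandA", "BrandB", "BrandA", "BrandC", "BrandA"]

mean_age = statistics.mean(ages)      # sum of values divided by their count
median_age = statistics.median(ages)  # middle value of the sorted data
mode_make = statistics.mode(makes)    # most frequent category

print(mean_age, median_age, mode_make)
```

The single old bicycle (12 years) pulls the mean above the median, while the mode is the only one of the three that makes sense for the brand labels.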
What are the measures of dispersion?
- Range
- Standard deviation
- Interquartile Range (IQR)
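A minimal sketch of the three dispersion measures, using made-up prices. Note the assumptions: `statistics.stdev` is the sample standard deviation, and `statistics.quantiles` uses its default "exclusive" method; other quartile conventions can give slightly different IQR values.

```python
import statistics

# Hypothetical bicycle prices (made-up data, one unusually expensive bicycle).
prices = [50, 60, 70, 80, 90, 100, 110, 200]

# Range: distance between the largest and smallest value.
price_range = max(prices) - min(prices)

# Standard deviation: typical distance of values from the mean (sample version).
std_dev = statistics.stdev(prices)

# IQR: width of the middle half of the data (third quartile minus first quartile).
q1, q2, q3 = statistics.quantiles(prices, n=4)
iqr = q3 - q1

print(price_range, round(std_dev, 1), iqr)
```

The range depends entirely on the two most extreme values, whereas the IQR ignores them, so it is less affected by the one expensive bicycle.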
How is the mean calculated in continuous distributions?
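By weighting each value by its probability: for a discrete variable, sum each value times the probability that x takes that value; for a continuous variable, integrate x times the density f(x).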
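The "value times probability" rule can be sketched for a discrete variable, where the products are simply added up. In a continuous distribution the sum is replaced by an integral over the density. The gear counts and probabilities below are hypothetical.

```python
# Hypothetical distribution: number of gears -> assumed probability.
distribution = {1: 0.1, 7: 0.2, 18: 0.4, 21: 0.3}

# Probabilities must add up to 1 (checked with a tolerance for rounding).
assert abs(sum(distribution.values()) - 1.0) < 1e-9

# Mean (expected value): each value times its probability, summed.
mean = sum(value * prob for value, prob in distribution.items())

print(round(mean, 2))
```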
What does central tendency describe in a dataset?
Where the "middle" of the dataset is.
What is dispersion in the context of data analysis?
How spread out or "wide" the dataset is.
What are the key questions to answer when describing a dataset?
- Central Tendency: where is the "middle"?
- Dispersion: how spread out is the data?
Fill in the blank: A _______ is made up of groups of data called bins.
[histogram]
True or False: Bar charts can display numerical variables.
False
What does a time series graph represent?
Historical data plotted over time.
What is the significance of the x-axis in a histogram?
It is a number line, and the order of the bars cannot be changed.
What is the relationship between higher bars in bar charts and histograms?
Higher bars indicate higher counts or greater probability of occurrence.
What is the trade-off when choosing between showing detail and overall patterns in statistics?
Some detail is sacrificed to clarify the overall pattern.
What is the function of R in data visualization?
To create visual representations of data easily.