Exam 1 Flashcards
Cases
the objects described by a set of data. These can be companies, customers, etc. Basically anything that is described by the set of data collected.
Variable
A special characteristic of a case. Such as age, sex, gender, etc. This is a specific attribute of the cases/individual.
Categorical variables
variables which describe the characteristic of the individual or category.
Quantitative variables
variables which takes numerical values on which arithmetic operations make sense.
Data set
the collection of raw statistics and information collected by a research study.
Graphs for categorial variables examples
Bar graphs & pie charts
Graphs for quantitative variables examples
Stemplots and Histograms
Individual
Object described by data.
We have a data set where the cases are college students. One of the variables in the data set is “gender.” The values of gender are 1 if the student is male and 2 if the student is female. What type of variable is gender?
Categorical(because it doesn’t make sense to do any arithmetic with group 1 and group 2).
Units of measurement are an important part of the description of what type of variables?
Quantitative
The first day of class, the professor collects information on each student to make a data set that will be analyzed throughout the semester. The information asked includes hometown, GPA, number of classes taking, number of siblings, and favorite subject. How many quantitative variables are in this data set?
three (number of classes, GPA, number of siblings. All of these can be averaged and/or put through other arithmetic operations).
Consider the following data, which describe the amount of time in minutes that students spend studying for a quiz:
10, 11, 11, 12, 12, 14, 15, 18, 19, 20, 22, 24, 39, 40, 41, 44, 46, 50, 52, 52, 53, 55, 70
What numbers make up the leaf of the first stem?
0,1,1,2,2,4,5,8,9
classes
intervals on histograms of equal width.
Right skewed
When the right side of the distribution is “longer” aka not as large as the left.
What is a time series?
A data set which tracks a sample overtime.
When examining a distribution of a quantitative variable, which of the following features do we look for?
- Overall shape, center, and spread
- Symmetry or skewness
- Deviations from overall patterns, such as outliers
- The number of peaks or modes
Pie charts are useful for…
Seeing which part of the whole a group forms.
What is the interquartile range for a set of data?
The interquartile range is the difference between the 75th and 25th percentiles of data.
IQR = Q3 - Q1
A reporter wishes to portray baseball players as overpaid. Which measure of center should he report as the average salary of major league players?
The mean, as higher pays will pull the mean in their direction.
If the median is much larger than the mean, this is indicative of what type of stemplot?
The stemplot of the data would be skewed left
If the skew is symmetrical, what can be said about the median and mean?
They are equal
What does the five number summary consist of for a set of data?
- Minimum
- Q1
- Median
- Q3
- Maximum
Define Histogram
a diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval.
A boxplot is
A graph of the five number summary.
Box plot’s central box spans Q1 and Q3 with a line in the box indicating the median and lines extend out from the box to the smallest and largest observations.
What is the 1.5-IQR formula?
Outlier = 1.5 * IQR