1: What is Stats, Frequency Data, Numerical Measures Flashcards
What are three reasons to study statistics?
- ___
- ___
- ___
(1) data requires statistical knowledge to make the information useful
(2) statistical techniques are used to make professional and personal decisions
(3) you will need a knowledge of statistics in any career
___ is the set of knowledge and skills used to organize, summarize, and analyze data; the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making more effective decisions.
Statistics
What are the two types of statistics?
(1) descriptive statistics
(2) inferential statistics
___ are methods of organizing, summarizing, and presenting data in an informative way.
Descriptive statistics
___ are methods used to estimate a property of a population on the basis of a sample.
Inferential statistics
A ___ is an entire set of individuals or objects of interest or the measurements obtained from all individuals or objects of interest.
Population
A ___ is a portion, or part, of the population of interest.
Sample
What are the two types of variables in statistics?
E.g., in car sales?
(1) qualitative (location of sale and type of vehicle sold)
(2) quantitative (age of buyer, profit earned on the sale of vehicle, and number of previous purchases)
Observations of a ___ variable can assume any value within a specific range.
E.g., amount of milk in a glass, temperature in New Orleans.
Continuous
___ variables can assume only certain values, and there are “gaps” between the values.
E.g., number of apples in a box, number of customers who use the ATM in a day
Discrete
What are the four levels of measurements in statistics?
(1) nominal
(2) ordinal
(3) interval
(4) ratio
With the ___, data are recorded as labels or names. They have no order and can only be classified and counted.
E.g., types of fruit in a grocery
Nominal level of measurement
With the ___, data are based on a relative ranking or rating of items according to a defined attribute or qualitative variable. They can only be ranked or counted.
E.g., movie ratings
Ordinal level of measurement
With the ___, the distance between values is meaningful because the data are based on a scale with a known unit of measurement.
E.g., shoe size
Interval level of measurement
With the ___, data are based on a scale with a known unit of measurement and a meaningful interpretation of zero on the scale.
E.g., stock prices
Ratio level of measurement
A ___ is a grouping of qualitative data into mutually exclusive and collectively exhaustive classes showing the number of observations in each class; a relative frequency captures the relationship between a class frequency and the total number of observations.
Frequency table
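A frequency table with relative frequencies can be sketched in a few lines of Python. The vehicle-type data and the function name `frequency_table` are made up for illustration; they are not from the cards.

```python
from collections import Counter

def frequency_table(observations):
    """Return {class: (frequency, relative frequency)} for qualitative data."""
    counts = Counter(observations)
    total = len(observations)
    return {label: (count, count / total) for label, count in counts.items()}

# Hypothetical sample: type of vehicle sold
sales = ["sedan", "SUV", "sedan", "truck", "SUV", "sedan", "compact", "sedan"]
table = frequency_table(sales)

for label, (freq, rel) in table.items():
    print(f"{label:8s} {freq:2d} {rel:.3f}")
```

Each observation falls in exactly one class, so the frequencies add up to the number of observations and the relative frequencies add up to 1.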
A ___ is a graph that shows qualitative classes on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are proportional to the heights of the bars.
Bar graph
A ___ is a chart that shows the proportion or percentage that each class represents of the total number of frequencies.
Pie chart
A ___ is a grouping of quantitative data into mutually exclusive and collectively exhaustive classes showing the number of observations in each class.
Frequency distribution
___ is simply a listing of the individual and observed profits.
Raw or ungrouped data
The number of observations in each class is called the ___.
Class frequency
___ is halfway between the lower or upper limits of two consecutive classes; computed by adding the lower or upper limits of consecutive classes and dividing by 2.
Class midpoint
A ___ is a graph in which the classes are marked on the x-axis and the class frequencies on the y-axis (represented by the heights of the bars). You can observe the shape of the distribution, the concentration and spread of the data, and the approximate number of observations. Used for quantitative variables at the interval and ratio levels.
Histogram
What are the four steps of constructing a frequency distribution?
(1) Decide on number of classes
(2) Determine class width
(3) Set individual class limits
(4) Tally number of observations in each class
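Once the number of classes, the class width, and the lower limit of the first class are chosen, the tally in step (4) might look like the sketch below. The profit figures, the lower limit of 200, the width of 600, and the function name `tally` are all assumptions made for the example.

```python
def tally(observations, lower, width, k):
    """Count observations in k classes of equal width.

    Class i covers lower + i*width up to, but not including, lower + (i+1)*width,
    so the classes are mutually exclusive.
    """
    counts = [0] * k
    for x in observations:
        i = int((x - lower) // width)
        if not 0 <= i < k:
            # The classes must be collectively exhaustive
            raise ValueError(f"{x} does not fall in any class")
        counts[i] += 1
    return counts

# Hypothetical profits on vehicle sales
profits = [294, 1500, 880, 2100, 640, 1999, 2000]
class_counts = tally(profits, lower=200, width=600, k=4)
print(class_counts)
```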
A useful recipe to determine the number of classes (k) is the ___ rule. Is k the largest or the smallest number that satisfies it?
“2 to the k” rule
Choose the smallest number k such that 2 raised to the power of k is greater than the number of observations (n).
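A minimal sketch of the rule: keep increasing k until 2 to the power of k exceeds the number of observations. The class-width helper uses the common textbook suggestion (high value minus low value, divided by k), which the cards do not state; both function names are hypothetical.

```python
def number_of_classes(n):
    """Smallest k such that 2**k is greater than n observations."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

def minimum_class_width(high, low, k):
    """Assumed textbook suggestion: width of at least (high - low) / k.

    In practice the result is rounded up to a convenient number.
    """
    return (high - low) / k

k = number_of_classes(180)  # 2**7 = 128 is too small, 2**8 = 256 is enough
print(k)
```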
A ___ allows us to compare, directly, two or more frequency distributions, shows the shape of a distribution, and consists of line segments connecting the points formed by the intersections of the class midpoints and the class frequencies. X-axis = midpoint of each class; Y-axis = class frequencies.
Frequency polygon
A ___ is a type of frequency polygon that shows cumulative frequencies. In other words, the cumulative frequencies are plotted from left to right at the upper limit of each class.
Cumulative frequency polygon
In a ___, each entry is the sum of the frequency of a class and the frequencies of all classes below it. In other words, adding up a value and all of the values that came before it.
Cumulative frequency distribution
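A ___ is a graph of cumulative relative frequencies, with each point plotted at the upper limit of its class.
Recall: a relative frequency captures the relationship between a class frequency and the total number of observations.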
Cumulative relative frequency polygon
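A ___ shows, for each class, the cumulative frequency divided by the total number of observations.
Recall: a relative frequency captures the relationship between a class frequency and the total number of observations.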
Cumulative relative frequency distribution
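Cumulative and cumulative relative frequencies can be computed with a running total. The class frequencies below are hypothetical and are listed from the lowest class to the highest.

```python
from itertools import accumulate

# Hypothetical class frequencies, lowest class first
class_frequencies = [8, 11, 23, 38, 45, 32, 19, 4]

# Running total: each entry is its own class plus all classes below it
cumulative = list(accumulate(class_frequencies))

# Divide by the total number of observations to get cumulative relative frequencies
total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]

print(cumulative)
print([round(r, 3) for r in cumulative_relative])
```

The last cumulative frequency always equals the total number of observations, so the last cumulative relative frequency is 1.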
___ occurs when classes do not overlap.
Mutually exclusive classes
___ occurs when there is a class for each observation.
Collectively exhaustive classes
___ are single values that are typical of the data and pinpoint the center of a distribution; also called averages. Examples: arithmetic mean, weighted mean, median, mode, and geometric mean.
Measures of location
___ are values that show the spread of a data set. Examples: range, variance, and standard deviation.
Measures of dispersion
Dispersion is also called ___ or ___.
Variation
Spread
A ___ is a characteristic of a population.
Parameter
A ___ is a characteristic of a sample.
Statistic
What are the major properties of a mean?
- ___
- ___
- ___
- ___
- The data must be measured at the interval or ratio level
- All the values are included in computing the mean
- The mean is unique
- The sum of the deviations of each value from the mean is zero
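The last property is easy to check numerically. A small sketch with made-up values:

```python
from statistics import mean

values = [3, 8, 4]  # hypothetical observations
m = mean(values)

# Deviation of each value from the mean
deviations = [x - m for x in values]

print(m, deviations, sum(deviations))
```

The negative deviations (values below the mean) exactly cancel the positive ones, so the sum is zero.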
The ___ is the midpoint of the values after they have been ordered from low to high values; must be at least an ordinal level of measurement
Median
What are the major properties of a median?
- ___
- ___
- It is not affected by extremely large or small values
- It can be computed for ordinal-level data or higher
The ___ is the value of the observation that appears most frequently; not affected by extreme high or low values.
Mode
There is no mode if no value appears more than once.
A distribution is ___ when it has two modes.
E.g., 27 and 31 both appear most often, the same number of times.
Bimodal
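A short sketch of the median and the mode, including the "no mode" and bimodal cases. The ages are made up, and `modes` is a hypothetical helper written for this example (the standard library's `statistics.multimode` would instead return every value when nothing repeats).

```python
from collections import Counter
from statistics import median

def modes(values):
    """Return the most frequent value(s); an empty list means there is no mode."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # no value appears more than once
    return [value for value, count in counts.items() if count == top]

ages = [27, 27, 29, 31, 31, 35]  # hypothetical data
middle = median(ages)            # even count: average of the two middle values
most_common = modes(ages)

print(middle, most_common)
```

Two values tie for the highest count here, so the data set is bimodal.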
A ___ occurs when a distribution is non-symmetrical; the relationship among the measures changes.
Skewed distribution
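The ___ is a convenient way to compute the arithmetic mean when there are several observations of the same value; multiply each value by its weight (the number of times it occurs), add the products, and divide by the sum of the weights.
E.g., three sales at $4 and two sales at $9: (3×4 + 2×9)/(3 + 2) = 6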
Weighted mean
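A minimal sketch of the weighted mean, where each weight is the number of times a value occurs. The prices and counts are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical: three sales at $4 and two sales at $9
prices = [4, 9]
counts = [3, 2]
average_price = weighted_mean(prices, counts)
print(average_price)
```

This gives the same result as listing all five sales and taking the ordinary arithmetic mean.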
The ___ is useful in finding the average change of percentages, ratios, indexes, or growth rates over time; often find the percentage changes in sales, salaries, or economic figures.
Geometric mean
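As a sketch, the geometric mean of n growth factors is the nth root of their product. The percentages below are made up, and `average_rate_of_change` is a hypothetical helper for the related case where only the starting and ending values are known.

```python
from statistics import geometric_mean  # available in Python 3.8 and later

# Hypothetical raises of 5% and 15%, expressed as growth factors
growth_factors = [1.05, 1.15]
average_factor = geometric_mean(growth_factors)
average_percent_increase = (average_factor - 1) * 100

def average_rate_of_change(start_value, end_value, periods):
    """Average per-period rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / periods) - 1

print(round(average_percent_increase, 3))
print(round(average_rate_of_change(100, 121, 2), 3))
```

The geometric mean comes out slightly below the arithmetic mean of the two raises (10%), because growth compounds.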
What are the two reasons to study dispersion?
- ___
- ___
1.
2. To compare the spread in two or more distributions
The ___ is the simplest measure of dispersion; the difference between the high and low values in a data set. Its limitation is that it uses only those two values and ignores all the others.
Range
___ is the arithmetic mean of the squared deviations from the mean; it describes how much the values in a population, or sample, vary from their mean.
Variance
___ is the mean of the squared difference between each value and the mean; for populations whose values are near the mean, the variance will be small; for populations whose values are dispersed away from the mean, the population variance will be large.
It overcomes the weakness of the range by using all the values in the population, not just the high and low.
Population variance
The square root of the population variance is the ___.
Population standard deviation
The ___ measures how varied a sample is; it is the sum of the squared differences from the sample mean, divided by n − 1 (one less than the number of observations).
Sample variance
The ___ is used as an estimator of the population standard deviation and is the square root of the sample variance.
Sample standard deviation
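The population and sample versions differ only in the divisor: N for a population, n − 1 for a sample. A sketch with made-up data, using hypothetical function names:

```python
import math

def population_variance(values):
    """Mean of the squared deviations from the population mean (divide by N)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

pop_var = population_variance(data)
pop_sd = math.sqrt(pop_var)       # population standard deviation
samp_var = sample_variance(data)
samp_sd = math.sqrt(samp_var)     # sample standard deviation

print(pop_var, pop_sd, round(samp_var, 3), round(samp_sd, 3))
```

Python's `statistics` module offers the same four quantities as `pvariance`, `pstdev`, `variance`, and `stdev`.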
___ states that regardless of the shape of the distribution, at least 1 − 1/k² of the observations will be within k standard deviations of the mean, where k is greater than 1.
Chebyshev’s theorem
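The bound is a one-line formula. A minimal sketch, with a hypothetical function name:

```python
def chebyshev_min_proportion(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

for k in (2, 3):
    print(k, round(chebyshev_min_proportion(k), 3))
```

So at least 75% of the observations lie within two standard deviations of the mean, whatever the shape of the distribution.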
The ___ states that for a bell-shaped distribution about 68% of the values will be within one standard deviation of the mean, 95% within two, and virtually all within three.
Empirical Rule
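To see what the rule implies for a particular data set, the three intervals can be computed from the mean and standard deviation. The mean of 100 and standard deviation of 15 are made up, and the rule only applies if the distribution is roughly bell-shaped.

```python
def empirical_rule_intervals(mean, sd):
    """Intervals of 1, 2, and 3 standard deviations around the mean."""
    return {
        "about 68%": (mean - sd, mean + sd),
        "about 95%": (mean - 2 * sd, mean + 2 * sd),
        "virtually all": (mean - 3 * sd, mean + 3 * sd),
    }

intervals = empirical_rule_intervals(100, 15)  # hypothetical mean and sd
for share, (low, high) in intervals.items():
    print(f"{share}: {low} to {high}")
```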
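The ___ is computed from a frequency distribution (class midpoints and class frequencies) and is an estimate of the corresponding actual value; statistical software packages make it easy to calculate the actual value from the raw data, even for large data sets.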
Standard deviation of grouped data
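A sketch of the usual grouped-data calculation, assuming the sample formula: every observation in a class is treated as if it sat at the class midpoint. The midpoints, frequencies, and function name are hypothetical.

```python
import math

def grouped_mean_and_sd(midpoints, frequencies):
    """Estimate the sample mean and standard deviation from a frequency distribution."""
    n = sum(frequencies)
    mean = sum(f * m for m, f in zip(midpoints, frequencies)) / n
    squared = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, frequencies))
    sd = math.sqrt(squared / (n - 1))  # sample formula: divide by n - 1
    return mean, sd

# Hypothetical frequency distribution
midpoints = [10, 20, 30]
frequencies = [2, 5, 3]
est_mean, est_sd = grouped_mean_and_sd(midpoints, frequencies)
print(est_mean, round(est_sd, 3))
```

Because the individual values inside each class are unknown, the result is only an estimate of the standard deviation of the raw data.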