7.1: Describing Data Sets Flashcards
Describe the term “statistics”.
The term statistics refers both to data and to the methods used to analyze data.
What are the two categories of statistical methods? Describe.
- Descriptive statistics summarize the important characteristics of large data sets.
- Inferential statistics pertain to procedures used to make forecasts, estimates or judgments about a large set of data on the basis of a sample.
What is a population?
A population is the set of all possible members of a stated group.
What is a sample?
A sample is defined as a subset of the population of interest. A sample is drawn from the population, and the sample’s characteristics can be used to describe the population as a whole.
What are the four types of measurement scales? NOIR
- Nominal scales (least information, no particular order)
- Ordinal scales (relative ranking, e.g., #1 as best, #10 as worst)
- Interval scales (relative ranking + assurance that differences between scale values are equal, e.g., temperature in Celsius or Fahrenheit)
- Ratio scales (relative ranking + equal differences between scale values + true zero point as the origin, e.g., money)
What is a parameter?
A parameter is a measure used to describe a characteristic of a population (e.g., mean return or standard deviation).
What is a sample statistic?
A sample statistic is a measure used to describe a characteristic of a sample.
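A minimal sketch of the difference between a parameter and a sample statistic, assuming a small hypothetical population of returns (the numbers and the choice of the first four values as the sample are illustrative only):

```python
from statistics import mean

# Hypothetical population: every member of the stated group (illustrative data).
population = [4, 8, 6, 10, 12, 2, 14, 16]

# Hypothetical sample: a subset drawn from the population.
sample = population[:4]

# Parameter: describes a characteristic of the whole population.
population_mean = mean(population)

# Sample statistic: describes the same characteristic of the sample only.
sample_mean = mean(sample)

print(population_mean, sample_mean)
```

The two values generally differ, which is why a sample statistic is treated as an estimate of the parameter rather than as the parameter itself.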
What is a frequency distribution?
A frequency distribution is a tabular presentation of statistical data that aids the analysis of large data sets. It summarizes the data by assigning each observation to a specific group (interval or class). Data used in a frequency distribution may be measured on any type of measurement scale.
What are the 3 steps (DTC) used to construct a frequency distribution? What are the 5 conditions of the first step?
Step 1: Define the intervals. Conditions for the range values for each interval:
- Lower and upper limit
- All inclusive
- Nonoverlapping
- Mutually exclusive
- The total set of intervals should cover the total range of values for the entire population
Step 2: Tally the observations (assign to appropriate interval).
Step 3: Count the observations (the number of observations that are assigned to each interval).
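The three steps can be sketched in code. This is only an illustration: the return data, the interval width, and the helper name interval_index are assumptions, not part of the flashcards.

```python
from collections import Counter

# Hypothetical annual returns in percent (illustrative data only).
returns = [-8.5, -3.2, 1.4, 2.7, 4.9, 5.0, 7.3, 9.8, 12.1, 14.6]

# Step 1: define the intervals as [lower, upper) so that they do not overlap
# and together cover the full range of the data.
lower_bound, width, n_intervals = -10.0, 10.0, 3
intervals = [
    (lower_bound + i * width, lower_bound + (i + 1) * width)
    for i in range(n_intervals)
]

def interval_index(value):
    """Return the index of the one interval that contains value (hypothetical helper)."""
    for i, (lo, hi) in enumerate(intervals):
        if lo <= value < hi:
            return i
    raise ValueError(f"{value} is outside the defined intervals")

# Step 2: tally each observation into its interval.
tally = Counter(interval_index(r) for r in returns)

# Step 3: count the observations assigned to each interval.
absolute_frequency = [tally.get(i, 0) for i in range(n_intervals)]

for (lo, hi), count in zip(intervals, absolute_frequency):
    print(f"[{lo}, {hi}): {count}")
```

Using half-open intervals is one way to keep the intervals mutually exclusive: a value on a boundary, such as 10.0, belongs to exactly one interval.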
What is absolute frequency?
Absolute frequency is the actual number of observations that fall within a given interval.
What is modal interval?
Modal interval is the interval with the greatest frequency in a frequency distribution.
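As a small sketch, the modal interval can be picked out by finding the largest absolute frequency. The intervals and counts below are hypothetical:

```python
# Hypothetical intervals (in percent) and their absolute frequencies.
intervals = [(-10, 0), (0, 10), (10, 20)]
absolute_frequency = [2, 6, 2]

# Index of the interval with the greatest absolute frequency.
# If several intervals tie, max() returns the first of them.
modal_index = max(range(len(intervals)), key=lambda i: absolute_frequency[i])
modal_interval = intervals[modal_index]

print(modal_interval)
```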
What is relative frequency?
Relative frequency is the percentage of total observations falling within each interval, calculated as the absolute frequency of the interval divided by the total number of observations.
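A short sketch of the calculation, assuming hypothetical absolute frequencies of 2, 6, and 2 across three intervals:

```python
# Hypothetical absolute frequencies for three intervals.
absolute_frequency = [2, 6, 2]

# Total number of observations across all intervals.
total = sum(absolute_frequency)

# Relative frequency = absolute frequency / total number of observations.
relative_frequency = [count / total for count in absolute_frequency]

# Express each relative frequency as a percentage with one decimal place.
for count, share in zip(absolute_frequency, relative_frequency):
    print(f"{count} of {total} observations = {share:.1%}")
```

The relative frequencies of all intervals add up to 100% of the observations.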
What is cumulative absolute frequency?
For any given interval, cumulative absolute frequency is the sum of the absolute frequencies from the lowest interval up to and including the given interval.
What is cumulative relative frequency?
Cumulative relative frequency is the sum of the relative frequencies from the lowest interval up to and including the given interval.
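Both cumulative measures are running totals, which can be sketched as follows. The counts are hypothetical and the intervals are assumed to be ordered from lowest to highest:

```python
from itertools import accumulate

# Hypothetical absolute frequencies, ordered from the lowest interval to the highest.
absolute_frequency = [2, 6, 2]
total = sum(absolute_frequency)

# Cumulative absolute frequency: running sum of the absolute frequencies.
cumulative_absolute = list(accumulate(absolute_frequency))

# Cumulative relative frequency: running sum of the relative frequencies,
# computed here as cumulative absolute frequency / total, which is equivalent.
cumulative_relative = [c / total for c in cumulative_absolute]

print(cumulative_absolute)
print(cumulative_relative)
```

For the highest interval, the cumulative absolute frequency equals the total number of observations and the cumulative relative frequency equals 100%.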
What is a histogram? What is its attractive feature? What is on the x-axis? Y-axis?
A histogram is the graphical presentation of an absolute frequency distribution: a bar chart of continuous data that has been classified into a frequency distribution.
Its attractive feature is that it allows us to see where most of the observations are concentrated.
The x-axis shows the intervals. The y-axis shows the absolute frequency.
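A rough text sketch of a histogram, using hypothetical interval labels and counts. Note that this rendering is turned on its side for convenience: the intervals run down the page and the bar length shows the absolute frequency, whereas a conventional histogram places the intervals on the x-axis.

```python
# Hypothetical interval labels (in percent) and their absolute frequencies.
intervals = ["[-10, 0)", "[0, 10)", "[10, 20)"]
absolute_frequency = [2, 6, 2]

# Pad the labels to a common width so that the bars line up.
label_width = max(len(label) for label in intervals)

# One bar per interval; the bar length equals the absolute frequency.
lines = [
    f"{label:>{label_width}} | {'#' * count} ({count})"
    for label, count in zip(intervals, absolute_frequency)
]

print("\n".join(lines))
```

Even in this crude form, the longest bar makes it easy to see where most of the observations are concentrated.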