222) 3) Chapter 3: Descriptive Statistics Flashcards
What two key features quantitative variables describe numerically according to descriptive statistics?
The center of the data: typical observation. The variability of the data: the spread around the center.
Comparing qualitative categories using relative frequencies?
Relative frequency is the proportion or percentage of the observations that fall in that category.
Histograms?
A graph of a relative frequency distribution for a quantitative variable is called a histogram.
The Mean?
AVERAGE. The best known and most commonly used measure of the center is the mean. Mean is the sum of the observations divided by the number of observations.
What denotes this symbol?
Sample Size.
What is denoted by this symbol?
The variable.
What is denoted by this symbols? y1 y2 ….
Observations.
The n sample observations on a variable y are denoted by y1 for the first observation, y2 for the second and so forth.
What is denoted by this symbol?
Read as y-bar.
Denotes sample mean.
What says the definition of sample mean?
Sample mean is y-bar.
It says that:
Shortened expression for the samle mean of n observations?
Sample mean is equal to sum of all variables divided by number of observations.
What is outlier for the mean?
When one observation is much larger or smaller that the others, such as in highly skewed distrbutions.
Properties of the mean?
- Formula for the mean uses numerical values for the observations so it is appropriate only for quantitative variables.
- The mean can be higly influenced by an observation that falls well above or well below the bulk of the data, called an outlier.
What is weighted average for combined set of observations (2 sets of data with sample sizes n1;n2)
Expressed by sample mean multiplied by sample size of first data set + sample mean multiplied by sample size of second data set divided by sum of sample sizes of both data sets.
What is median?
The median is the observations that falls in the middle of the ordered sample. When the sample size n is odd, a single observation occurs in the middle. When the sample size is even two middle observations occur, and the median is the midpoint between two.
Index of middle observation?
Properties of the median?
- Median like a mean is appropriate for quantitative variables.
Also valid for ordinal-scale (requires oredered observations).
Not appropriate for nominal-scale data.
- For symmetric distributions median and the mean are identical
- For skewed distributions mean lies toward the direction of skew (longer tail) relative to the median.
- The median is insensitive to the distance of the observations (variable values).
- Is not affected by outliers.