STAT Ch 1 Flashcards
Population
a well-defined collection of objects
Data
collections of facts
Census
When desired information is available for all objects in the population
Sample
a subset of the population
Variable
any characteristic whose value may change from one object to another
univariate data set
consists of observations on a single variable. For example, we might determine the type of transmission, automatic (A) or manual (M), on each of ten automobiles recently purchased at a certain dealership, resulting in the categorical data set: M A A A M A A M A A
bivariate data
when observations are made on each of two variables
Multivariate data
when observations are made on more than one variable
descriptive statistics
Summarizing and describing important features of the data, e.g., with a graph or a mean
Inferential statistics
Techniques for generalizing from a sample to a population
hypothetical population
A population conceived of as consisting of all possible observations that might be made under similar experimental conditions
confidence interval or interval estimate
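An interval of plausible values for a population characteristic (e.g., the population mean), computed from sample data with a stated level of confidence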
lower prediction bound
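A value that a single future observation is expected to exceed, with a stated level of confidence. It concerns one data point rather than the population mean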
The relationship between probability and inferential statistics
probability reasons from the population to the sample (deductive reasoning), whereas inferential statistics reasons from the sample to the population (inductive reasoning)
Enumerative studies
interest is focused on a finite, identifiable, unchanging collection of individuals or objects that make up a population
Sampling frame
a listing of the individuals or objects to be sampled
Analytic study
A study that is not enumerative in nature
simple random sample
A sample for which any particular subset of the specified size (e.g., a sample of size 100) has the same chance of being selected as any other subset of that size
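A minimal sketch of drawing a simple random sample, assuming the sampling frame is just the labels 1 to 1000 (a hypothetical frame, not from the flashcards). `random.sample` selects without replacement, so every subset of the requested size is equally likely.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seed is optional, used here only for repeatability
    return rng.sample(list(frame), n)

# Hypothetical sampling frame: units labelled 1..1000.
frame = range(1, 1001)
sample = simple_random_sample(frame, 100, seed=1)

print(len(sample))       # sample size n
print(len(set(sample)))  # same as n, since no unit is drawn twice
```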
stratified sampling
entails separating the population units into nonoverlapping groups and taking a sample from each one
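A rough sketch of stratified sampling, assuming a hypothetical list of 30 cars stratified by transmission type and an equal number drawn from each stratum (other allocations, such as proportional ones, are also common). The names `stratified_sample`, `stratum_of`, and `n_per_stratum` are made up for illustration.

```python
import random

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Split units into nonoverlapping strata, then sample within each one."""
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    # Take a simple random sample inside every stratum.
    return {
        label: rng.sample(members, min(n_per_stratum, len(members)))
        for label, members in strata.items()
    }

# Hypothetical population: (id, transmission) pairs, 20 automatic and 10 manual.
cars = [(i, "A" if i % 3 else "M") for i in range(1, 31)]
picked = stratified_sample(cars, stratum_of=lambda car: car[1], n_per_stratum=4, seed=0)

for label, members in sorted(picked.items()):
    print(label, members)
```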
Sample size
The number of observations in a single sample, often denoted by “n”
Truncating
Dropping the digits beyond a chosen place without rounding (e.g., 9.87 becomes 9.8), so that every number in the set is shortened in the same way
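A small sketch contrasting truncating with rounding, assuming that truncating here means dropping digits past a chosen decimal place (as is often done before building a stem-and-leaf display). The helper `truncate` and the values are hypothetical.

```python
import math

def truncate(x, decimals=0):
    """Drop digits beyond `decimals` places without rounding."""
    factor = 10 ** decimals
    return math.trunc(x * factor) / factor

# Hypothetical measurements.
values = [9.87, 12.34, 15.99]
truncated = [truncate(v, 1) for v in values]
rounded = [round(v, 1) for v in values]

print(truncated)  # [9.8, 12.3, 15.9]
print(rounded)    # [9.9, 12.3, 16.0]
```

Because the scaling uses floating-point arithmetic, values that sit exactly on a digit boundary may need decimal arithmetic instead.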
Dot plot
An attractive summary of numerical data when the data set is reasonably small or there are relatively few distinct data values. Each observation is represented by a dot above the corresponding location on a horizontal measurement scale. When a value occurs more than once, there is a dot for each occurrence, and these dots are stacked vertically. As with a stem-and-leaf display, a dot plot gives information about location, spread, extremes, and gaps.
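The stacking rule described above can be sketched as a plain-text plot. This is an illustration only: it assumes small, single-digit integer data (made up here) so that each value occupies one column.

```python
from collections import Counter

def text_dotplot(data):
    """Return a text dot plot for small integer data (single-digit values assumed)."""
    counts = Counter(data)
    lo, hi = min(data), max(data)
    scale = range(lo, hi + 1)  # horizontal measurement scale
    rows = []
    # Build rows from the top of the tallest stack down to the axis.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("*" if counts[v] >= level else " " for v in scale)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in scale))
    return "\n".join(rows)

# Hypothetical data, not from the flashcards.
data = [2, 3, 3, 4, 4, 4, 5, 7, 7]
plot = text_dotplot(data)
print(plot)
```

In the printed plot the column for 6 is empty, which is the kind of gap a dot plot makes visible.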
Discrete
A numerical variable is called this if its set of possible values either is finite or else can be listed in an infinite sequence
Continuous
A numerical variable is called this if its possible values consist of an entire interval on the number line.
Frequency
the number of times that a value occurs in the data set
Relative frequency
the fraction or proportion of times the value occurs: (number of times the value occurs) / (number of observations in the data set)
Frequency distribution
a tabulation of the frequencies and/or relative frequencies
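As a quick sketch, the frequency distribution for the transmission data from the univariate example (M A A A M A A M A A) can be tabulated with `collections.Counter`. The variable names are illustrative.

```python
from collections import Counter

# Transmission types from the univariate data set example.
data = "M A A A M A A M A A".split()

n = len(data)                                    # number of observations
freq = Counter(data)                             # frequency of each value
rel_freq = {value: count / n for value, count in freq.items()}

# Tabulate: value, frequency, relative frequency.
for value in sorted(freq):
    print(f"{value}  {freq[value]:>2}  {rel_freq[value]:.2f}")
```

For these ten observations A occurs 7 times and M occurs 3 times, so the relative frequencies are 0.7 and 0.3 and sum to 1.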
unimodal histogram
A histogram that rises to a single peak and then declines
bimodal histogram
A histogram that has two different peaks
multimodal
A histogram with more than two peaks
When is a histogram symmetric?
if the left half is a mirror image of the right half
When is a unimodal histogram positively skewed?
if the right or upper tail is stretched out compared with the left or lower tail
When is a unimodal histogram negatively skewed?
if the left or lower tail is stretched out compared with the right or upper tail
qualitative
Categorical
Mean
The arithmetic average of the set. Often referred to as the sample mean and represented by x̄.
point estimate
a single number that is our “best” guess for the value of a population characteristic (e.g., x̄ as an estimate of μ)
Population mean
The average of all values in the population. Denoted as μ. When there are N values in the population (a finite population), μ = (sum of the N population values) / N.
median
the middle value once the observations are ordered from smallest to largest. When n is even, it is the average of the two middle values. The sample median is denoted x̃ (x-tilde).
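A short sketch of the sample mean and sample median using Python's standard `statistics` module. The two small data sets are hypothetical and chosen to show the odd-n and even-n cases of the median.

```python
import statistics

# Hypothetical samples (unordered on purpose; median sorts internally).
odd = [3, 1, 7, 5, 9]    # n = 5, ordered: 1 3 5 7 9
even = [3, 1, 7, 5]      # n = 4, ordered: 1 3 5 7

mean_odd = statistics.mean(odd)        # (3 + 1 + 7 + 5 + 9) / 5 = 5
median_odd = statistics.median(odd)    # single middle value: 5
median_even = statistics.median(even)  # average of 3 and 5: 4.0

print(mean_odd, median_odd, median_even)
```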
Range
the difference between the largest and smallest sample values
population median
a middle value in the population. Denoted μ̃ (μ-tilde)
deviations from the mean
Obtained by subtracting x̄ from each of the n sample observations. The deviations always sum to zero, so the average deviation is always zero.
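A minimal check of that property on a hypothetical sample. With floating-point data the sum may differ from zero by a tiny rounding error, so the comparison uses a tolerance.

```python
import math

# Hypothetical sample with mean 6.
data = [2, 4, 6, 8, 10]

xbar = sum(data) / len(data)            # sample mean
deviations = [x - xbar for x in data]   # x_i - xbar for each observation

print(deviations)
# The positive and negative deviations cancel out.
print(math.isclose(sum(deviations), 0.0, abs_tol=1e-12))
```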
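sample variance
The sum of the squared deviations from the mean divided by n - 1. Denoted s², so s² = Σ(xᵢ - x̄)² / (n - 1).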