Exam 2-3 Flashcards
Sample
A subset of individuals in the population; the group about which we actually collect information.
Central Limit Theorem
When sampling from a non-Normal population, the sampling distribution of x bar is approximately Normal whenever the sample is large and random
Theoretical sampling distribution of x bar
The distribution of the x bar values from all possible samples of the same size from the same population
Approximate sampling distribution of x bar
The distribution of x bar values obtained from repeatedly taking SRSs of the same size from the same population
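A minimal simulation sketch of the approximate sampling distribution, assuming a made-up right-skewed population (exponential) and a hypothetical helper named simulate_xbar; neither comes from the course material. It takes many SRSs of the same size and compares the resulting x bar values with mu and sigma over the square root of n.

```python
import random
import statistics

def simulate_xbar(population, n, reps, seed=0):
    """Return the x bar values from `reps` simple random samples of size n."""
    rng = random.Random(seed)
    # rng.sample draws without replacement, i.e. an SRS from the population
    return [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]

# Hypothetical non-Normal population: 10,000 exponential values (right-skewed)
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 50
xbars = simulate_xbar(population, n=n, reps=2_000)

print("population mean (mu):", round(mu, 3))
print("mean of x bars:      ", round(statistics.fmean(xbars), 3))
print("sigma / sqrt(n):     ", round(sigma / n ** 0.5, 3))
print("std dev of x bars:   ", round(statistics.pstdev(xbars), 3))
```

The two pairs of printed values should come out close to each other, and a histogram of xbars should look roughly Normal even though the population is skewed.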
Sampling distribution of x bar
- Center
- Spread
- Shape
- mean of x bar = population mean; valid for all sample sizes and populations of all shapes
- Standard deviation of x bar = standard deviation of the population divided by the square root of n
- Normal population: shape of the x bar distribution is exactly Normal for any n; non-Normal population: shape of the sampling distribution of x bar is approximately Normal when n (sample size) is large
Facts about Sampling Distribution of x bar
- Mean = mu regardless of population shape or sample size
- Standard dev of x bar is always less than the standard deviation of the population for samples of any size where n > 1
- Standard dev of x bar gets smaller as n increases, in proportion to 1 over the square root of n. To cut the standard dev in half, quadruple the sample size
- Shape is Normal for any sample size if the population is Normal
- Shape is approximately Normal if we take a large random sample from a non-Normal population
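A small sketch of the "quadruple the sample size" fact, assuming a hypothetical population standard deviation of 12 and a helper named sd_of_xbar (both made up for illustration). Each step multiplies n by 4, so each printed value should be half of the one before it.

```python
import math

def sd_of_xbar(sigma, n):
    """Standard deviation of x bar for samples of size n: sigma / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma / math.sqrt(n)

sigma = 12.0  # hypothetical population standard deviation

# n is quadrupled at each step, so the result is halved at each step
for n in (1, 4, 16, 64):
    print(n, sd_of_xbar(sigma, n))  # 12.0, 6.0, 3.0, 1.5
```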
Standard deviation of x bar (standard deviation of the sampling distribution of x bar)
A measure of the variability of the values of the statistic x bar about mu; a measure of the variability of the sampling distribution of x bar; in other words, roughly the average amount that the statistic (x bar) deviates from its mean. Computed as sigma over the square root of n
Predicting sampling distribution of x bar
Take only one sample of size n
Use results to make inference about the population
This works because the mean of x bar = mu, the standard deviation of x bar = sigma over the square root of n, and the shape is approximately Normal if the sample is random and large, according to the CLT
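A sketch of what the predicted sampling distribution lets us compute without resampling, assuming hypothetical values mu = 100, sigma = 15, and n = 36 (none of these are from the flashcards) and treating the shape as Normal. It uses NormalDist from Python's statistics module.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical values chosen only for illustration
mu, sigma, n = 100.0, 15.0, 36

# Predicted sampling distribution of x bar: center mu, spread sigma / sqrt(n)
xbar_dist = NormalDist(mu, sigma / sqrt(n))  # standard deviation = 15 / 6 = 2.5

# Chance that a single sample mean lands within 5 units of mu
# (5 units is 2 standard deviations of x bar here)
p_within = xbar_dist.cdf(105) - xbar_dist.cdf(95)
print(round(p_within, 4))  # 0.9545
```

So a single large random sample is expected to give an x bar close to mu most of the time, which is what justifies using it for inference.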
R-sq is the fraction of the
Variation in the values of y that is explained by the least squares regression of y on x
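A sketch of R-sq computed directly from that definition, assuming a made-up data set and a hypothetical helper named r_squared. It fits the least-squares line of y on x, then takes 1 minus the unexplained variation (sum of squared residuals) over the total variation in y.

```python
import statistics

def r_squared(x, y):
    """Fraction of the variation in y explained by the least-squares line of y on x."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Unexplained variation: squared residuals about the fitted line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation: squared deviations of y about its mean
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst

# Made-up data that lies close to a straight line
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(r_squared(x, y), 4))
```

For data that falls exactly on a line the function returns 1; the more scatter about the line, the closer the result gets to 0.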
Outliers in the y direction of a scatterplot have …… residuals, but other outliers need not have large residuals
Large
Outliers in the x direction of a scatterplot are often ….. for the least-squares regression line
Influential
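A sketch of why an outlier in the x direction can be influential without having a large residual, assuming made-up data and a hypothetical helper named least_squares. The line is fitted with and without one point far out in x; the point pulls the line toward itself, so the slope changes a lot while the point's own residual stays modest.

```python
import statistics

def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

# Made-up data: five points near y = x, plus one outlier in the x direction
x = [1, 2, 3, 4, 5]
y = [1.2, 1.9, 3.1, 3.9, 5.0]
x_out, y_out = 20, 5.0

a0, b0 = least_squares(x, y)                      # fit without the outlier
a1, b1 = least_squares(x + [x_out], y + [y_out])  # fit with the outlier
resid_out = y_out - (a1 + b1 * x_out)             # residual of the outlier itself

print("slope without the point:", round(b0, 3))
print("slope with the point:   ", round(b1, 3))
print("residual of the point:  ", round(resid_out, 3))
```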
To add a categorical variable to a scatterplot
Add different color or symbol for each category