Penn State Stats Review Flashcards
Define a population
A population is any large collection of objects or individuals, such as Americans, students, or trees about which information is desired.
https://newonlinecourses.science.psu.edu/statprogram/reviews/statistical-concepts/terminology
Define a parameter
A parameter is any summary number, like an average or percentage, that describes the entire population.
What is the symbol for and pronunciation of the population mean?
μ (the Greek letter “mu”)
Ex. We might be interested in learning about µ, the average weight of all middle-aged female Americans. The population consists of all middle-aged female Americans, and the parameter is µ.
What is the population proportion?
Symbol p
Ex. We might be interested in learning about p, the proportion of likely American voters approving of the president’s job performance. The population comprises all likely American voters, and the parameter is p.
Define a sample
A sample is a representative group drawn from the population.
Define a statistic
A statistic is any summary number, like an average or percentage, that describes the sample.
What is the symbol for the sample mean?
X-bar, or x̄
We might use x̄, the average weight of a random sample of 100 middle-aged female Americans, to estimate µ, the average weight of all middle-aged female Americans.
What is the symbol for the sample proportion?
P-hat or p̂
We might use p̂, the proportion in a random sample of 1000 likely American voters who approve of the president’s job performance, to estimate p, the proportion of all likely American voters who approve of the president’s job performance.
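A minimal sketch of how the two statistics above are computed from sample data. The weights and poll responses are made-up illustration values, and the function names are hypothetical, not taken from the course material.

```python
from statistics import mean

# Hypothetical sample of weights in pounds (made-up values for illustration).
weights = [148.0, 162.5, 155.0, 171.2, 139.8]

# Hypothetical poll responses: True means the sampled voter approves.
approvals = [True, False, True, True, False, False, True, False]


def sample_mean(values):
    """Return x-bar, the average of the sampled values."""
    return mean(values)


def sample_proportion(responses):
    """Return p-hat, the fraction of sampled responses that are True."""
    return sum(responses) / len(responses)


x_bar = sample_mean(weights)          # estimates the population mean mu
p_hat = sample_proportion(approvals)  # estimates the population proportion p
print(f"x-bar = {x_bar:.2f}")
print(f"p-hat = {p_hat:.3f}")
```

Both numbers describe only the sample; they serve as estimates of the unknown parameters µ and p.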
How do you learn about a population parameter?
Two ways
1) We can use CONFIDENCE INTERVALS to estimate parameters.
“We can be 95% confident that the proportion of Penn State students who have a tattoo is between 5.1% and 15.3%.”
2) We can use HYPOTHESIS TESTS to test and ultimately draw conclusions about the value of a parameter.
“There is enough statistical evidence to conclude that the mean normal body temperature of adults is lower than 98.6 degrees F.”
What is the principle behind confidence intervals?
Suppose we want to estimate an actual population mean μ. As you know, we can only obtain x̄, the mean of a sample randomly selected from the population of interest. We can use x̄ to find a range of values:
Lower value < population mean (μ) < Upper value
that we can be really confident contains the population mean μ. The range of values is called a “confidence interval.”
In general, the narrower the confidence interval, the more information we have about the value of the population parameter. Therefore, we want all of our confidence intervals to be as narrow as possible.
Define the general form for most confidence intervals
Sample estimate +/- margin of error
Define the t-interval for population mean
The formula for the confidence interval in words is
Sample mean ± (t-multiplier × standard error)
The quantity to the right of the ± sign, i.e., “t-multiplier × standard error,” is just a more specific form of the margin of error. That is, the margin of error in estimating a population mean µ is calculated by multiplying the t-multiplier by the standard error of the sample mean.
The formula is only appropriate if a certain assumption is met, namely that the data are normally distributed.
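The word formula can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive implementation: the temperature readings are made up, the function name t_interval is hypothetical, and the t-multiplier is passed in as a value looked up in a t table (2.262 is the standard table value for 95% confidence with 9 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(sample, t_multiplier):
    """Return (lower, upper) = sample mean -/+ t-multiplier * standard error.

    Assumes the data are roughly normally distributed and that
    t_multiplier was looked up for n - 1 degrees of freedom.
    """
    n = len(sample)
    x_bar = mean(sample)
    standard_error = stdev(sample) / sqrt(n)  # s / sqrt(n)
    margin_of_error = t_multiplier * standard_error
    return x_bar - margin_of_error, x_bar + margin_of_error


# Hypothetical body temperatures in degrees F (made-up values), n = 10.
temps = [98.2, 98.6, 97.9, 98.4, 98.0, 98.7, 98.1, 98.3, 98.5, 98.3]

# Table value for 95% confidence (alpha/2 = 0.025) and 9 degrees of freedom.
T_MULTIPLIER_95_DF9 = 2.262

lower, upper = t_interval(temps, T_MULTIPLIER_95_DF9)
print(f"95% t-interval: ({lower:.2f}, {upper:.2f})")
```

Passing the multiplier in as an argument keeps the sketch free of third-party libraries; a statistics package could compute it instead of a table lookup.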
Define the t-multiplier
Denoted as (symbols after t are subscript):
t α/2,n-1
Depends on the sample size through n - 1 (called the “degrees of freedom”) and on the confidence level, (1 - α) × 100%, through α/2
Define “degrees of freedom”
n-1
n = sample size
Another way to say this is that the number of degrees of freedom equals the number of “observations” minus the number of required relations among the observations (e.g., the number of parameter estimates). For a 1-sample t-test, one degree of freedom is spent estimating the mean, and the remaining n - 1 degrees of freedom estimate variability.
Define standard error
The “standard error,” which is s divided by square root of n, quantifies how much the sample means vary from sample to sample. That is, the standard error is just another name for the estimated standard deviation of all the possible sample means.
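A small simulation can make this concrete. The sketch below assumes a hypothetical normal population (mean 150, standard deviation 20), draws many random samples of size 25, and compares the spread of the resulting sample means with the standard error computed from a single sample. All names and numbers are illustration choices, not part of the course material.

```python
import random
from math import sqrt
from statistics import mean, stdev

# Assumed population values, chosen only for illustration.
POP_MEAN = 150.0
POP_SD = 20.0
N = 25              # size of each sample
NUM_SAMPLES = 2000  # number of repeated samples

rng = random.Random(0)  # fixed seed so the run is repeatable


def draw_sample(n):
    """Draw one random sample of size n from the assumed normal population."""
    return [rng.gauss(POP_MEAN, POP_SD) for _ in range(n)]


def standard_error(sample):
    """Return s / sqrt(n) for one sample."""
    return stdev(sample) / sqrt(len(sample))


# Spread of the sample means across many repeated samples.
sample_means = [mean(draw_sample(N)) for _ in range(NUM_SAMPLES)]
sd_of_means = stdev(sample_means)

# Standard error estimated from just one sample.
se_one_sample = standard_error(draw_sample(N))

print(f"SD of {NUM_SAMPLES} sample means: {sd_of_means:.2f}")
print(f"Standard error from one sample: {se_one_sample:.2f}")
print(f"Population SD / sqrt(n): {POP_SD / sqrt(N):.2f}")
```

The first and third printed values should be close to each other, while the single-sample standard error is a noisier estimate of the same quantity.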