Chapter 5 Flashcards
The mean of the sample means (of all possible samples) equals:
It cannot be determined because you don’t know the shape of the population
Twice the population mean
Half of the size of the population mean
The same size as the population mean
The same as the population mean
What is the mean and standard deviation of the standard normal distribution?
mean = 1, standard deviation = 1
mean = 0, standard deviation = 0
mean = 0, standard deviation = 1
mean=0
SD=1
Which of the following best describes a sampling distribution of a statistic?
It is the probability that the sample statistic equals the parameter of interest
It is the histogram of sample statistics from all possible samples of all possible sizes
It is the distribution of all the statistics calculated from all possible samples of the same size
It is the probability distribution of all the values that are contained in all possible samples of the same size
It is the distribution of all the statistics calculated from all possible samples of the same size
Loosely speaking, what does the Central Limit Theorem say?
The area under a normal density curve is 1.
Measures of central tendency should always be computed with and without outliers.
The percentage of values that fall within 1 standard deviation of the mean is about 68%.
The sampling distribution of x̄ is approximately normal for large sample sizes.
According to the Central Limit Theorem, if the sample size is over 30, the sampling distribution of the sample mean will be approximately normal regardless of the shape of the population (skewed or symmetric).
When comparing two normal distributions with the same mean, the density curve of the distribution with a smaller standard deviation will always:
Have a smaller median
Have a shorter height at the peak
Have a larger median
Have a larger height at the peak
Have a larger height at the peak. In a normal distribution, the mean and the median are both at the center, so two distributions with equal means also have equal medians. The total area under each curve is 1, so the curve with the smaller standard deviation is narrower and must therefore be taller at its peak.
Which of the following characteristics does not apply to a theoretical normal distribution?
The distribution is bell-shaped.
The distribution is bimodal.
The area under the normal curve is equal to 1.
The mean and median are the same.
bimodal
The standard deviation of all possible sample means equals what?
The population standard deviation divided by the population mean
The square root of the population standard deviation
The population standard deviation
The population standard deviation divided by the square root of the sample size
The population standard deviation divided by the square root of the sample size
Which of the following best describes a sampling distribution of a statistic?
It is the probability distribution of all the values that are contained in all possible samples of the same size
It is the probability that the sample statistic equals the parameter of interest
It is the histogram of sample statistics from all possible samples of all possible sizes
It is the distribution of all the statistics calculated from all possible samples of the same size
It is the distribution of all the statistics calculated from all possible samples of the same size.
A sampling distribution refers to the distribution of a sample statistic (such as the mean, proportion, or standard deviation) based on all possible samples of a given size drawn from a population. This concept is essential in inferential statistics, as it helps estimate population parameters and assess variability.
What is the z score formula?
When do you use it?
z = (x - μ) / σ
“x” is a data point, “μ” is the population mean, and “σ” is the population standard deviation. The z-score tells you how many standard deviations a data point is away from the mean, which lets you compare data points from different distributions and identify outliers within a dataset. Use a z-score when you want to understand how far a specific data point is from the average value.
Used with normally distributed data
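Steps
1. Use the formula to convert each x-value to a z-score
2. Find the cumulative area for each z-score in the table
3. Subtract the smaller area from the larger area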
The mean length of a pregnancy is 267 days with a standard deviation of 10 days. Find the probability that a new mother will have a pregnancy last between 285 and 294 days.
0.0325
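The three steps above can be sketched in Python for the pregnancy card. This is only an illustration: it uses the standard library's `statistics.NormalDist` in place of the printed table, so the last digit can differ slightly from a table lookup.

```python
from statistics import NormalDist

# Pregnancy lengths from the card above: mean 267 days, SD 10 days.
mu, sigma = 267, 10

# Step 1: convert each endpoint to a z-score with z = (x - mu) / sigma.
z_low = (285 - mu) / sigma
z_high = (294 - mu) / sigma

# Steps 2 and 3: take the cumulative (left) area for each z, then subtract.
standard = NormalDist()  # standard normal: mean 0, SD 1
p_between = standard.cdf(z_high) - standard.cdf(z_low)

print(z_low, z_high, round(p_between, 4))
```

The computed probability rounds to the 0.0325 given on the card.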
What is a continuous variable?
What is a continuous probability distribution?
What is the most important continuous probability distribution?
A continuous random variable has an infinite number of possible values that can be represented by an interval on the number line.
A continuous probability distribution is a statistical model that describes the likelihood of a random variable taking on a value within a specified range.
The most important continuous probability distribution in statistics is the normal distribution, where the total area under the curve represents a probability of 1.00.
What are the properties of a normal distribution (5 things)
- Mean = median = mode
- bell-shaped curve that is symmetric about the mean
- total area under the normal curve = 1
- curve approaches but never touches the x-axis
- inflection points at μ - σ and μ + σ (one SD below and above the mean)
How to Interpret Normal Distributions
A normal distribution can have any mean and any positive standard deviation.
The mean gives the location of the line of symmetry.
The standard deviation describes the spread of the data
Standard normal distribution
A normal distribution centered around a mean of 0 with a SD of 1. Total area under the curve is still 1.00. Often produced by converting to z-scores.
Any x-value can be transformed into a z-score by using the formula:
z = (Value - Mean) / Standard deviation
What table do we use to find the standard normal distribution?
The standard normal table (z-table)
Find the cumulative area that corresponds to a z-score of 1.15.
Start with the row for 1.1, then use the column for 0.05 to reach z = 1.15
The area to the left of z = 1.15 is 0.8749
What do you do if you need the area to the right of z, using the Standard Normal Table?
How do you know it’s to the right?
ex. Use the table to find the area to the right of z = 1.23.
Because the table is LEFT oriented (it gives the cumulative area to the left of z), you need to subtract from 1 to get the RIGHT area.
- First find the area to the left of z = 1.23, which is 0.8907.
- Subtract to find the area to the right of z = 1.23: 1 - 0.8907 = 0.1093.
How do you find the area between two z scores?
Example: Use the table to find the area between z = -0.75 and z = 1.23.
- Find the area corresponding to each z-score in the Standard Normal Table.
- The area to the left of z = 1.23 is 0.8907.
- The area to the left of z = -0.75 is 0.2266.
- Subtract the smaller area from the larger area: 0.8907 - 0.2266 = 0.6641.
Note: Because the table is LEFT oriented, you need to subtract the smaller LEFT area from the larger LEFT area.
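The three table lookups on these cards (left area, right area, area between) can be sketched as small helper functions. The function names are illustrative, not from the course material, and `statistics.NormalDist` stands in for the printed table, so results can differ from table arithmetic in the fourth decimal.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution (mean 0, SD 1)

def area_left(z):
    """Cumulative area to the left of z, as a z-table reports it."""
    return Z.cdf(z)

def area_right(z):
    """Area to the right of z: 1 minus the left area."""
    return 1 - Z.cdf(z)

def area_between(z_low, z_high):
    """Area between two z-scores: larger left area minus smaller left area."""
    return Z.cdf(z_high) - Z.cdf(z_low)

print(round(area_left(1.15), 4))            # table value: 0.8749
print(round(area_right(1.23), 4))           # table value: 0.1093
print(round(area_between(-0.75, 1.23), 3))  # table arithmetic gives 0.6641
```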
How do you convert an x value to a z score and back?
Any x-value can be transformed into a z-score by using the formula:
z = (Value - Mean) / Standard deviation
To go back from a z-score to an x-value, use x = μ + zσ.
A national study found that college students with jobs worked an average of 25 hours/week (SD = 11 hours). A college student with a job is selected at random. Find the probability that the student works for less than 5 hours per week. Assume that work hours in college students are normally distributed and are represented by the variable x.
z = (5 - 25) / 11 ≈ -1.82; look up this z in the table: P(x < 5) ≈ 0.0344
A survey indicates that for each trip to a supermarket, a shopper spends an average of 41 min. (SD = 12 min) in store. The lengths of time spent in the store are normally distributed.
1. Find the probability that a shopper will be in the store between 20 and 50 min. When 200 shoppers enter the store, how many shoppers would you expect to be in the store for between 20 and 50 minutes?
2. Find the probability that a shopper will be in the store for > 35 min. When 200 shoppers enter the store, how many shoppers would you expect to be in the store for > 35 min?
1. P(20 < x < 50) ≈ 0.733, so about 0.733 × 200 ≈ 147 shoppers
2. P(x > 35) ≈ 0.692, so about 0.692 × 200 ≈ 138 shoppers
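As a rough check on the supermarket card, the expected counts are just the probability times the number of shoppers. The sketch below uses `statistics.NormalDist` rather than the table; the variable names are made up for illustration.

```python
from statistics import NormalDist

store_time = NormalDist(mu=41, sigma=12)  # minutes in store, from the card
shoppers = 200

# Probability of staying between 20 and 50 minutes (difference of left areas).
p_20_to_50 = store_time.cdf(50) - store_time.cdf(20)
# Probability of staying more than 35 minutes (1 minus the left area).
p_over_35 = 1 - store_time.cdf(35)

# Expected count = probability * number of shoppers.
expected_20_to_50 = shoppers * p_20_to_50
expected_over_35 = shoppers * p_over_35

print(round(p_20_to_50, 3), round(expected_20_to_50))
print(round(p_over_35, 3), round(expected_over_35))
```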
In the US, the number of physicians involved in patient care per state is normally distributed, with a mean of 280 physicians per 100,000 residents and a SD of 78. You randomly select a state. What is the probability that the state has between 300 and 350 physicians per 100,000 residents?
Interpretation: The probability that the state has between 300 and
350 physicians per 100,000 residents is about 21.3%.
Given a probability value, how do you find the z-score or random variable x
Example: Find the z-score that corresponds to a cumulative area (or probability value) of 0.3632
You must scan the body of the table to find the exact (or closest) area, then read off the z-value for that row and column.
ex. An area of 0.3632 corresponds to z = -0.35
Find the z-score that has 10.75% of the distribution’s area to its right
Because the area to the right of this value is 0.1075, the cumulative area up to this point on the curve is 1 - 0.1075 = 0.8925, which corresponds to z ≈ 1.24.
To transform a standard z-score to a data value x in a given population, use the
formula:
x = μ + zσ
Scores for the California Peace Officer Standards and Training test are normally
distributed, with a mean of 50 and a SD of 10. An agency will only hire applicants with scores in the
top 10%. What is the lowest score you can get and still be hired by the agency?
An exam score in the top 10% means the area to the left needs to be 0.900.
Find the z-score that corresponds to a cumulative area of 0.900.
The closest table entry (i.e., area) in Table 4 Appendix B is 0.8997, so the z-score that corresponds to an area of 0.9 is about 1.28.
Using the equation x = μ + zσ:
x = 50 + 1.28(10) ≈ 62.8
However, you need to round up to answer the question, so the lowest score to still get hired is 63.
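Going from an area back to a z-score, and then to a data value, can be sketched with `NormalDist.inv_cdf`, which plays the role of scanning the body of the table. This is an illustrative sketch using the numbers from the cards above; the variable names are not from the source.

```python
import math
from statistics import NormalDist

standard = NormalDist()  # standard normal: mean 0, SD 1

# Area -> z-score (the reverse table lookup).
z_for_03632 = standard.inv_cdf(0.3632)       # cumulative area 0.3632
z_for_right_tail = standard.inv_cdf(1 - 0.1075)  # 10.75% of area to the right

# Top 10% on a test with mean 50 and SD 10: left area must be 0.90.
mu, sigma = 50, 10
z_cut = standard.inv_cdf(0.90)
x_cut = mu + z_cut * sigma           # x = mu + z * sigma
lowest_passing = math.ceil(x_cut)    # round up to the next whole score

print(round(z_for_03632, 2), round(z_for_right_tail, 2))
print(round(z_cut, 2), round(x_cut, 1), lowest_passing)
```

Rounding up rather than to the nearest integer matters here: a score of 62 would fall just below the cutoff.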
What is a sampling distribution?
Sampling distribution = The probability distribution of a sample statistic that is formed when random samples of size n are repeatedly taken from a population.
For example, the distribution of sample means around the true population mean is the sampling distribution of sample means.
Sampling Distribution of Sample Means
The mean of the sample means, μₓ̄, is equal to the population mean μ.
The standard deviation of the sample means, σₓ̄, is equal to the population standard deviation σ divided by the square root of the sample size n: σₓ̄ = σ / √n.
This special standard deviation is called the standard error of the mean.
What is the Central Limit Theorem (CLT)?
What happens when the population is already normal?
The CLT states that if you repeatedly take random samples of size n ≥ 30 from any population (even one that is not normally distributed), the sample means will be approximately normally distributed. The larger the sample size, the closer the sampling distribution is to normal.
If the original population is already normally distributed, then the sample means are normally distributed for any sample size.
What are the properties of the Sampling Distribution of the Sample Mean
Mean of the sample means (μₓ̄) = Population mean (μ)
Standard deviation of the sample means (σₓ̄) = Population standard deviation (σ) divided by the square root of the sample size (n):
σₓ̄ = σ / √n
Variance of the sample means = Population variance (σ²) divided by n:
σ²ₓ̄ = σ² / n
Larger samples make the distribution narrower → The more data points in a sample, the closer the sample mean is to the true population mean
Interpreting the Central Limit Theorem:
A study found that the mean sleep time was 6.9 hours, with a SD of 1.5 hours. Random samples of 100 sleep times are drawn from this population, and the mean of each sample is determined. Find the mean and SE of the sampling distribution of sample means.
The sample size is n = 100 ≥ 30, so the CLT applies. The mean of the sampling distribution equals the population mean:
μₓ̄ = μ = 6.9 hours
SE of the mean = σ / √n = 1.5 / √100 = 0.15 hours
The standard error is small because the sample is large, so the sample means show very little variability.
The training heart rates of all 20-years old athletes are normally distributed, with a mean of
135 beats per minute and SD of 18 beats per minute. Random samples of size 4 are drawn
from this population, and the mean of each sample is determined. Find the mean and
standard error of the mean of the sampling distribution
The population is normally distributed, so the sampling distribution of sample means is normal even for a small sample.
The mean of the sample means:
μₓ̄ = μ = 135 beats per minute
The standard deviation of the sample means (aka SE):
SE of the mean = σ / √n = 18 / √4 = 9 beats per minute
A small sample gives a larger standard error, so the sample mean is a less precise estimate and the sampling distribution is wider.
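The standard error calculation in the sleep and heart-rate cards is the same formula with different inputs. A minimal sketch (the helper name `standard_error` is just for illustration):

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: population SD divided by sqrt(sample size)."""
    return sigma / math.sqrt(n)

se_sleep = standard_error(1.5, 100)  # sleep study: SD 1.5 hours, n = 100
se_heart = standard_error(18, 4)     # heart rates: SD 18 bpm, n = 4

print(round(se_sleep, 2), round(se_heart, 2))
```

Comparing the two results shows the point of the cards: the large sample shrinks the spread of the sample means far more than the small one does.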
The figure shows the mean distances traveled by drivers each day. You randomly select 50 drivers ages 16 to 19 (mean in this age category = 20.7 miles). What is the probability that the mean distance traveled each day is between 19.4 and 22.5 miles? Assume σ = 6.5 miles.
The sample size is n = 50 ≥ 30, so the CLT applies.
Mean of the sample means: μₓ̄ = μ = 20.7 miles
Standard error: σₓ̄ = σ / √n = 6.5 / √50 ≈ 0.9 miles
To transform into z-scores: z = (value - mean) / standard error
z = (19.4 - 20.7) / (6.5 / √50) ≈ -1.41
z = (22.5 - 20.7) / (6.5 / √50) ≈ 1.96
Look up both areas in the table and subtract the smaller from the larger: 0.9750 - 0.0793 = 0.8957
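The driver card can be checked with a short sketch. It assumes the 6.5 miles on the card is the population standard deviation σ (the card's wording is garbled at that point) and uses `statistics.NormalDist` instead of the table, so the result agrees with 0.8957 only to about three decimals.

```python
import math
from statistics import NormalDist

# Assumption: 6.5 miles is the population SD; 20.7 miles is the population mean.
mu, sigma, n = 20.7, 6.5, 50

# Standard error of the mean, then z-scores measured in standard errors.
se = sigma / math.sqrt(n)
z_low = (19.4 - mu) / se
z_high = (22.5 - mu) / se

# Sampling distribution of the sample mean: normal with mean mu and SD se.
sample_means = NormalDist(mu, se)
p_between = sample_means.cdf(22.5) - sample_means.cdf(19.4)

print(round(se, 2), round(z_low, 2), round(z_high, 2), round(p_between, 3))
```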
What is Normal Approximation for a Binomial Distribution?
The Binomial Distribution deals with situations where there are only two outcomes (success or failure) in repeated trials (like flipping a coin).
If the sample size (n) is large enough, the binomial distribution starts to look like a normal distribution.
This allows us to use normal probability methods to approximate binomial probabilities.
Normal Approximation for Binomial Distributions
When Can We Use This Approximation?
The Central Limit Theorem (CLT) applies to binomial distributions too!
If both np ≥ 5 and nq ≥ 5, where:
n = number of trials
p = probability of success
q = probability of failure (q = 1 - p)
Then, the binomial distribution can be approximated by a normal distribution.
Mean and Standard Deviation for Normal Approximation, and why it works
Mean (μ):
μ = np
Standard Deviation (σ):
σ = √(npq)
Why Does This Work?
As n increases, the binomial distribution gets smoother and more symmetric, making it closer to a normal distribution.
This makes calculations easier because normal probability tables (Z-tables) can be used instead of binomial formulas.
Determine whether you can use a normal distribution to approximate the distribution of x, the
number of people who reply yes. If you can, find the mean and standard deviation
In a survey of high schools in a certain state, it was reported that 40% of
students failed at least one class taken through distance learning. You
randomly select 45 students from that state and ask them whether they
failed at least one class taken through distance learning.
Check parameters using n = 45, p = 0.40, and q = 0.60.
np = 45(0.40) = 18
nq = 45(0.60) = 27.
Because np and nq are both at least 5, you can use a normal distribution to approximate the distribution of x, with
μ = np = 18 and
σ = √(npq) = √(45 · 0.40 · 0.60) ≈ 3.29
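The np ≥ 5 and nq ≥ 5 check, followed by μ = np and σ = √(npq), can be sketched as one helper. The function name and the choice to return `None` when the check fails are illustrative assumptions, not part of the course material.

```python
import math

def normal_approx_params(n, p, threshold=5):
    """Return (mean, sd) of the approximating normal, or None if np or nq < threshold."""
    q = 1 - p
    if n * p < threshold or n * q < threshold:
        return None
    return n * p, math.sqrt(n * p * q)

# Distance-learning survey: n = 45, p = 0.40.
params = normal_approx_params(45, 0.40)
if params is not None:
    mean, sd = params
    print(round(mean, 2), round(sd, 2))

# A case where the check fails: n = 16, p = 0.25 gives np = 4 < 5.
print(normal_approx_params(16, 0.25))
```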
Correction for Continuity
A binomial distribution is discrete and is represented by a probability histogram.
To calculate exact binomial probabilities, the binomial formula is used for each value of x and the results are added.
However, when you use a continuous normal distribution to approximate a binomial one, you need to move 0.5 unit to the left and right of the midpoint to include all possible x-values in the interval (continuity correction).
So, if I’m interested in P(x = 4) at the discrete level, I’m really talking about P(3.5 < x < 4.5) at the normal distribution level.
Use a continuity correction to convert each binomial probability to a normal distribution probability.
1. The probability of getting between 270 and 310 successes, inclusive
The corresponding interval for the continuous normal distribution is 269.5 < x < 310.5. The normal distribution probability is P(269.5 < x < 310.5).
Correction for Continuity in Normal Approximation of a Binomial Distribution explained
When using a normal distribution to approximate a binomial distribution, we apply a continuity correction to improve accuracy. This is because:
A binomial distribution is discrete (it only takes whole number values, like 0, 1, 2, etc.).
A normal distribution is continuous (it includes all real numbers).
To make the approximation more accurate, we adjust for this difference by adding or subtracting 0.5 when converting a binomial probability to a normal probability. This is called the continuity correction.
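To see what the continuity correction buys, one can compare an exact binomial probability with its normal approximation. The sketch below is illustrative only: the interval 15 to 20 and the reuse of n = 45, p = 0.40 are arbitrary choices, and the helper names are made up.

```python
import math
from statistics import NormalDist

def binomial_between(n, p, lo, hi):
    """Exact P(lo <= X <= hi): add the binomial formula over each value of x."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(lo, hi + 1))

def normal_approx_between(n, p, lo, hi):
    """Normal approximation with continuity correction: P(lo - 0.5 < X < hi + 0.5)."""
    approx = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
    return approx.cdf(hi + 0.5) - approx.cdf(lo - 0.5)

n, p = 45, 0.40  # np = 18 and nq = 27, so the approximation is allowed
exact = binomial_between(n, p, 15, 20)
approx = normal_approx_between(n, p, 15, 20)

print(round(exact, 4), round(approx, 4))
```

The two printed values should be close; dropping the ±0.5 adjustment generally makes the approximation noticeably worse for intervals of whole numbers.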
A z-score is considered unusual if it is less than …. or greater than ….
Less than -2 or greater than 2.
Assume the random variable x is normally distributed with mean=89 and standard deviation sigma=5. Find the indicated probability.
P(x<83)
0.1151
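When finding a probability, what do you do for
P(x < #)
vs
P(x > #)?
For both:
- use the formula z = (x - mean) / SD
- find the area in the z-table
For P(x < #), the table area is the answer.
For P(x > #), subtract the table area from 1.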
The amounts of time per workout an athlete uses a stairclimber are normally distributed, with a mean of 20 minutes and a standard deviation of 5 minutes. Find the probability that a randomly selected athlete uses a stairclimber for (a) less than 16 minutes, (b) between 20 and 27 minutes, and (c) more than 30 minutes.
(a) P(x < 16) = 0.2119
(b) P(20 < x < 27) = 0.9192 - 0.5000 = 0.4192
(c) P(x > 30) = 1 - 0.9772 = 0.0228
Use the standard normal table to find the z-score that corresponds to the given percentile. If the area is not in the table, use the entry closest to the area. If the area is halfway between two entries, use the z-score halfway between the corresponding z-scores. If convenient, use technology to find the z-score.
P75 (the 75th percentile)
z ≈ 0.67
The weights of bags of baby carrots are normally distributed, with a mean of 32 ounces and a standard deviation of 0.32 ounce. Bags in the upper 4.5% are too heavy and must be repackaged. What is the most a bag of baby carrots can weigh and not need to be repackaged?
- 95.5% are OK if only the upper 4.5% aren’t
- look for 0.955 in the table
- the closest value gives you z (here z ≈ 1.70)
- substitute into the formula z = (x - 32) / 0.32 and solve for x: x = 32 + 1.70(0.32)
Ans = 32.54 ounces
A population has a mean =139 and a standard deviation =25. Find the mean and standard deviation of the sampling distribution of sample means with sample size n=53.
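Mean = 139
Standard deviation = 25 / √53 ≈ 3.434
***Reminder: the mean of the sample means always equals the population mean; with sample size > 30, the CLT also makes the sampling distribution approximately normal
SD of the sample means = population SD / √n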
The lengths of lumber a machine cuts are normally distributed with a mean of 87 inches and a standard deviation of 0.6 inch.
(a) What is the probability that a randomly selected board cut by the machine has a length greater than 87.23 inches?
(b) A sample of 42 boards is randomly selected. What is the probability that their mean length is greater than 87.23 inches?
(a) The probability is 0.352.
(b) The probability is 0.0066.
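The contrast between parts (a) and (b) is that a single board uses the population SD, while a sample mean uses the standard error σ / √n. A minimal sketch with `statistics.NormalDist` (results differ slightly from the table answers because z is not rounded to two decimals):

```python
import math
from statistics import NormalDist

mu, sigma, n = 87, 0.6, 42  # inches; values from the card

# (a) One board: spread is the population SD.
single_board = NormalDist(mu, sigma)
p_single = 1 - single_board.cdf(87.23)

# (b) Mean of 42 boards: spread is the standard error sigma / sqrt(n).
mean_of_sample = NormalDist(mu, sigma / math.sqrt(n))
p_mean = 1 - mean_of_sample.cdf(87.23)

print(round(p_single, 3), round(p_mean, 4))
```

A single board exceeding 87.23 inches is fairly common, but a mean of 42 boards exceeding it is rare, because averaging shrinks the spread.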
The sample size n, probability of success p, and probability of failure q are given for a binomial experiment. Decide whether you can use the normal distribution to approximate the random variable x.
n=16
p=0.25
q=0.75
Can the normal distribution be used to approximate the random variable x?
A. Yes, because np ≥ 5 and nq ≥ 5.
B. No, because np < 5 and nq < 5.
C. No, because np < 5.
D. No, because nq < 5.
C. Here np = 16(0.25) = 4 < 5, while nq = 16(0.75) = 12 ≥ 5.
Use the correction for continuity and determine the normal probability statement that corresponds to the binomial probability statement.
Binomial probability: P(x ≥ 109)
Which of the following is the normal probability statement that corresponds to the binomial probability statement?
A. P(x ≥ 108.5)
B. P(x ≤ 109.5)
C. P(x ≥ 109.5)
D. P(x ≤ 108.5)
A. Since we are using a continuous normal distribution to approximate a discrete binomial distribution, we adjust x = 109 by subtracting 0.5 to include the entire probability mass of 109 and above.
Decide whether you can use the normal distribution to approximate the binomial distribution. If you can, find the mean and standard deviation. If you cannot, explain why.
A survey of adults found that 48% have used a multivitamin in the past 12 months. You randomly select 40 adults and ask them if they have used a multivitamin in the past 12 months.
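A. No, because np < 5.
B. No, because nq < 5.
C. Yes, the mean is ___ and the standard deviation is ___. (Round to two decimal places as needed.)
C. Here np = 40(0.48) = 19.2 and nq = 40(0.52) = 20.8, so both are at least 5.
mean = np = 19.2
sd = √(npq) = √(40 · 0.48 · 0.52) ≈ 3.16
If np or nq is less than 5, you cannot use the normal distribution.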