Sampling Distributions Confidence Intervals Flashcards

1
Q

What is Statistical Inference?

A
  • Statistical inference is the process of selecting a sample in order to learn something about, or draw conclusions regarding, a larger group of items (the population)
2
Q

Study the process of Inferential Statistics

A

https://docs.google.com/document/d/1r_ttbYs-4jXdkBbVGPH9vk1swjRRmRJUWdllcJXdaAI/edit?usp=sharing

3
Q

What needs to be considered when doing Inferential Statistics?

A
  • If we calculate a statistic from a sample, will it exactly represent the population parameter (population value) we are interested in? (Sampling Error)
4
Q

What is a sampling error?

A
  • The difference between a sample statistic and the population parameter it estimates, which arises because the sample taken does not perfectly represent the population
5
Q

What needs to be considered if the calculated statistics from an Inferential statistics study do not exactly represent the population parameter we are interested in?

A

If not then:
- Will the sample statistic underestimate or overestimate the population parameter?
- How large will any error be?
- Is it likely that the error will be small enough that the sample statistic will be useful?
We need to know something about the possible range of errors, and the likely size of errors.

6
Q

A soft drink manufacturer sells one of its popular flavours in a 600 ml bottle. The fill is normally distributed with a mean of 600 ml and a standard deviation of 10 ml. What is the probability that any one bottle will contain less than 598 ml, i.e. P(X < 598)?

A
Given Information:
- μ = 600 ml, σ = 10 ml
- Normally distributed
- We know each bottle is different, and we can work out the probability of getting a given amount of fill.
- X is a random variable representing the bottle fill
- P(X < 598) = P(Z < (598 - 600)/10)
= P(Z < -0.2)
= 0.4207
- This example is in the google doc
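
The calculation above can be checked with a minimal Python sketch. It uses only the standard library's `statistics.NormalDist`; the variable names are illustrative and not part of the original card.

```python
from statistics import NormalDist

mu = 600.0     # mean fill in ml (given in the question)
sigma = 10.0   # standard deviation of fill in ml (given in the question)

# Standardise by hand: z = (x - mu) / sigma
z = (598.0 - mu) / sigma

# P(Z < z) from the standard normal CDF
p_from_z = NormalDist().cdf(z)

# The same probability computed directly on the fill distribution
p_direct = NormalDist(mu, sigma).cdf(598.0)

print(f"z = {z:.1f}, P(X < 598) = {p_from_z:.4f}")
```

Both routes give the same probability, which is why standardising to Z and reading a table works.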
7
Q

How does Inferential Statistics work when using larger samples?

A
  • (Using the bottle example) Whilst we might ask questions about one bottle, we would never test the process using only one bottle.
  • First you would select a sample
  • Then calculate the sample mean fill
  • For example, if you took three samples of n = 25, each would give a different mean, so the sample mean is itself a “random variable”.
  • Each sample has a different error
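
The idea that each sample gives a different mean and a different error can be illustrated with a small simulation. This is only a sketch: it assumes the bottle-fill process from the earlier card (mean 600 ml, standard deviation 10 ml), and the seed and function name are arbitrary choices made for the illustration.

```python
import random
from statistics import mean

random.seed(1)  # arbitrary seed, only so the run is repeatable

MU, SIGMA, N = 600.0, 10.0, 25  # bottle-fill process and sample size from the cards

def sample_mean_fill():
    """Draw one sample of N simulated fills and return its mean."""
    fills = [random.gauss(MU, SIGMA) for _ in range(N)]
    return mean(fills)

# Three samples, three different sample means
sample_means = [sample_mean_fill() for _ in range(3)]

# Sampling error of each sample: sample mean minus the population mean
sampling_errors = [m - MU for m in sample_means]

for m, e in zip(sample_means, sampling_errors):
    print(f"sample mean = {m:.2f} ml, sampling error = {e:+.2f} ml")
```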
8
Q

What is a sampling distribution?

A
  • A sampling distribution is the distribution of all the possible values a sample statistic can take, spread around the population parameter of interest
  • The sampling distribution also takes account of the distribution of possible sampling errors
9
Q

What three points are important to understand about Sampling Distributions?

A
  • Every sample statistic calculated is a random variable
  • Every random variable will have a distribution
  • If we can define the distribution then we can use it to answer questions such as that posed by the bottling process example.
10
Q

How do you develop a Sampling Distribution?

A
  • Assume a population of size N = 4; the random variable X is the age of individuals; values of X: 18, 20, 22, 24 (years)
  • First, find the population mean age and standard deviation
  • Next, list every possible sample of two observations drawn with replacement. This gives 16 possible samples (4 × 4) and therefore 16 sample means
  • Then find the sampling distribution of all sample means by calculating its mean (the 16 sample means added together and divided by 16) and its standard deviation
  • The sample means can then be graphed to display the distribution
  • Example on google doc
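
The steps above can be sketched in Python by enumerating every possible sample. The sample size of 2 is an assumption inferred from the 16 possible samples (4 × 4); the variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

ages = [18, 20, 22, 24]      # the whole population, N = 4

mu = mean(ages)              # population mean
sigma = pstdev(ages)         # population standard deviation (divides by N)

# Every ordered sample of size 2 drawn with replacement: 4 * 4 = 16 samples
samples = list(product(ages, repeat=2))
sample_means = [mean(s) for s in samples]

mean_of_means = mean(sample_means)   # centre of the sampling distribution
sd_of_means = pstdev(sample_means)   # spread of the sampling distribution

print(f"population: mu = {mu}, sigma = {sigma:.3f}")
print(f"{len(samples)} samples: mean of means = {mean_of_means}, "
      f"sd of means = {sd_of_means:.3f}")
print(f"sigma / sqrt(2) = {sigma / sqrt(2):.3f}")
```

The mean of the 16 sample means matches the population mean, and their standard deviation matches the population standard deviation divided by the square root of the sample size.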
11
Q

What is the Standard Error of the Mean used for?

A
  • Different samples of the same size from the same population will yield different sample means
  • A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean
  • Example in google doc
12
Q

What is assumed when using the Standard Error of the Mean?

A
  • That sampling is done with replacement, or without replacement from a large or infinite population
  • Note that the standard error of the mean decreases as the sample size increases
13
Q

What is an easier way of finding the standard deviation of the sample means?

A
  • Divide the standard deviation of the original population by the square root of the number of observations in the sample (σ/√n)
  • Google doc
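
A short sketch of this shortcut, which also shows the standard error shrinking as the sample size grows. The value σ = 10 ml is borrowed from the bottle example; the sample sizes and the function name are illustrative assumptions.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)

sigma = 10.0  # population standard deviation from the bottle example (ml)

for n in (1, 4, 25, 100):  # illustrative sample sizes
    print(f"n = {n:>3}: standard error = {standard_error(sigma, n):.2f} ml")
```

Quadrupling the sample size halves the standard error, because n sits under a square root.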
14
Q

What happens when the population is normal?

A
  • If the population distribution is normal, the sampling distribution of the mean is also normal, with the same mean as the population
  • As n increases, the standard deviation of the sample means (the standard error) decreases
  • In other words, if a population is normal with mean μ and standard deviation σ, the sampling distribution of X-bar is also normally distributed, with mean μ and standard deviation σ/√n
15
Q

How would you find the Z value for the Sampling Distribution of the mean? Also, study the example for how to find the Z value for the sampling distribution of the mean.

A

https://docs.google.com/document/d/1r_ttbYs-4jXdkBbVGPH9vk1swjRRmRJUWdllcJXdaAI/edit?usp=sharing
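
The linked example is not reproduced here. As a hedged illustration, the Z value for a sample mean is usually computed as Z = (X-bar - μ) / (σ/√n), i.e. the sample mean is standardised using the standard error rather than σ. The sketch below reuses the bottle figures (μ = 600, σ = 10) with an assumed sample size of n = 25; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_for_sample_mean(x_bar, mu, sigma, n):
    """Standardise a sample mean using the standard error sigma / sqrt(n)."""
    return (x_bar - mu) / (sigma / sqrt(n))

mu, sigma, n = 600.0, 10.0, 25   # bottle example; n = 25 is an assumption

z = z_for_sample_mean(598.0, mu, sigma, n)
p = NormalDist().cdf(z)          # P(X-bar < 598)

print(f"z = {z:.2f}, P(X-bar < 598) = {p:.4f}")
```

Compared with a single bottle (z = -0.2), the same 2 ml shortfall is a full standard error away for a sample of 25, so it is much less likely.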

16
Q

What happens if the population is not Normal?

A
  • If the population distribution is not normal, the sampling distribution of the mean still becomes approximately normal as n increases.
17
Q

What are the sampling distribution properties?

A
  • The mean of the sample means equals the population mean
  • The standard deviation of the sample means equals the population standard deviation divided by the square root of the number of observations in the sample (σ/√n)
  • Images in the google doc
18
Q

What does the “Central Limit Theorem” state?

A
  • Regardless of the shape of individual values in the population distribution, as long as the sample size is large enough the sampling distribution of X-bar will be approximately normally distributed with the sampling distribution properties
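
A small simulation can make the theorem concrete. This is a sketch under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a strongly skewed shape chosen only for illustration), with n = 30 and 5000 repeated samples; the seed is arbitrary.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)   # arbitrary seed, only so the run is repeatable

N = 30           # sample size
REPEATS = 5000   # number of repeated samples (illustrative)

# Exponential population with rate 1: mean 1, standard deviation 1, right-skewed
sample_means = [
    mean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(REPEATS)
]

print(f"mean of sample means = {mean(sample_means):.3f} (population mean is 1)")
print(f"sd of sample means   = {pstdev(sample_means):.3f} "
      f"(sigma / sqrt(n) = {1 / sqrt(N):.3f})")
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the individual values are far from normal.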
19
Q

When do you apply the Central Limit Theorem?

A
  • If the Population is not Normal
20
Q

If the sampling distribution becomes more normal as the value of n increases, how many observations is enough to create a normal sampling distribution?

A
  • For most population distributions, n ≥ 30 will give a sampling distribution that is nearly normal
  • For fairly symmetric population distributions, n ≥ 5 is sufficient
  • For normal population distributions, the sampling distribution of the mean is always normally distributed
21
Q

Study the example for when the population is not normal.

A
  • Google doc
22
Q

How does the Estimation Process work?

A
  • If a population has an unknown mean, a sample from that population is used to estimate it
  • This is done by finding the sample mean of a randomly selected sample
  • Once the sample mean has been found, the population mean can be estimated
  • For example, for a population with an unknown mean, you would first randomly select a sample and find its mean. Let's say the sample mean is 50. You might then state that you are 95% confident that the population mean is between 40 and 60
23
Q

What is a point estimate?

A
  • A point estimate is the value of a single sample statistic
24
Q

What is a confidence interval?

A
  • A confidence interval provides a range of values constructed around the point estimate
25
Q

Study the diagram of the point estimate and confidence interval

A
  • google doc
26
Q

Why is a Confidence Interval Estimate beneficial?

A
  • An interval gives a range of values
  • Takes into consideration variation in sample statistics from sample to sample
  • Based on observations from 1 sample
  • Gives information about closeness to unknown population parameters
  • Stated in terms of level of confidence
  • Can never be 100% confident
27
Q

What is the general formula for all confidence intervals?

A

point estimate +/- (critical value)*(standard error)

28
Q

What are the common confidence levels?

A
  • Common confidence levels = 90%, 95% or 99%

- Also written (1-α) = 0.90, 0.95 or 0.99

29
Q

Why does the confidence level have a relative frequency interpretation?

A
  • In the long run, 90%, 95% or 99% of all the confidence intervals that can be constructed (in repeated samples) will contain the unknown true parameter
  • For example, if we were to randomly select 100 samples and use the results of each sample to construct 95% confidence intervals, approximately 95 out of 100 would contain the population mean
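
The long-run interpretation can be sketched with a simulation. Everything here is an assumption made for illustration: a normal population with μ = 600 and σ = 10 (the bottle figures), samples of n = 25, σ treated as known, and 1000 repeated samples with an arbitrary seed.

```python
import random
from math import sqrt
from statistics import mean

random.seed(42)  # arbitrary seed, only so the run is repeatable

MU, SIGMA, N = 600.0, 10.0, 25   # assumed "true" population and sample size
Z_95 = 1.96                      # critical value for 95% confidence
REPEATS = 1000                   # number of repeated samples (illustrative)

def interval_contains_mu():
    """Build one 95% interval from a fresh sample; report whether it covers MU."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = mean(sample)
    margin = Z_95 * SIGMA / sqrt(N)
    return x_bar - margin <= MU <= x_bar + margin

hits = sum(interval_contains_mu() for _ in range(REPEATS))
coverage = hits / REPEATS

print(f"{hits} of {REPEATS} intervals contain the true mean ({coverage:.1%})")
```

The proportion of intervals that cover the true mean should land close to 95%, though any single interval either contains it or does not.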
30
Q

What must you assume when looking for the Confidence Interval for μ, but the σ is known?

A

Assumptions:

  • Population standard deviation σ is known
  • Population is normally distributed
  • If population is not normal, use Central Limit Theorem
31
Q

Study the equation for finding the confidence interval estimate when the St. Dev is known.

A
  • Google Doc
32
Q

How would you find the Critical Value Z?

A

By using the Z value table

- Example in google doc

33
Q

What are the Z values for the confidence levels of 80%, 90%, 95% and 98%?

A
  • 80% = 1.28
  • 90% = 1.645
  • 95% = 1.96
  • 98% = 2.33
34
Q

What are the Z values for the confidence levels of 99%, 99.8% and 99.9%?

A
  • 99% = 2.576
  • 99.8% = 3.09
  • 99.9% = 3.29
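
The critical values on this card and the previous one can be recomputed instead of read from a table. A minimal sketch using the standard library's `NormalDist.inv_cdf`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical Z value: leaves alpha/2 in each tail."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%} confidence: z = {z_critical(level):.3f}")
```

Printed tables round these values, so small differences in the last digit between sources are expected.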
35
Q

A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms. Determine a 95% confidence interval for the true mean resistance of the population

A
  • X-bar +/- Z*(σ/Square root of n)
    = 2.20 +/- 1.96 (0.35/square root of 11)
    = 2.20 +/- 0.2068
    = 1.9932 < μ <2.4068
  • We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms
  • Although the true mean may or may not be in this interval, 95% of intervals formed in this manner (in repeated samples) will contain the true mean
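
The worked answer above can be reproduced with a short sketch of the interval for a known σ. The function name and signature are illustrative assumptions; the critical value comes from the standard library's `NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Interval x_bar +/- z * sigma / sqrt(n), for a known population sigma."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Circuit example: n = 11, X-bar = 2.20 ohms, sigma = 0.35 ohms
lower, upper = z_confidence_interval(2.20, 0.35, 11)

print(f"95% CI: {lower:.4f} < mu < {upper:.4f} ohms")
```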
36
Q

What happens when we are trying to find the confidence interval for the mean but the St. Dev is unknown?

A
  • If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S
  • This introduces extra uncertainty, since S is variable from sample to sample
  • So we use the t distribution instead of the normal distribution
37
Q

Study the equation for when we are trying to find the confidence interval for the mean but the St. Dev is unknown.

A
  • Google doc
38
Q

What does df stand for?

A
  • In statistics, the degrees of freedom (DF) indicate the number of independent values that can vary in an analysis without breaking any constraints.
39
Q

What are the shapes of t-distributions?

A
  • t-distributions are bell shaped and symmetric, but have ‘fatter’ tails than the normal
  • Note that: t → Z as n increases
  • Example on google doc
40
Q

How do you use a t table?

A
  • First, find the df value by subtracting 1 from the sample size (n)
  • Next, find α/2, because that is the area in each tail
  • The row for df and the column for α/2 lead you to a t value in the body of the t table, which is your answer
41
Q

Study the example of how to use a t-table

A
  • Google doc
42
Q

A random sample of n = 25 has X-bar = 50 & S = 8. Form a 95% confidence interval for μ.

A
d.f = n-1 = 24
α/2 = 0.05/2 = 0.025
therefore, t = 2.0639 (from the t-Table)
- Hence, the confidence interval is: 50 +/- (2.0639)*8/square root of 25
= 46.698 < μ < 53.302
- This example on google doc
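
The same arithmetic as a minimal Python sketch. The t critical value is taken from the table as on the card, since the Python standard library has no t-distribution quantile function; the function name is an illustrative assumption.

```python
from math import sqrt

def t_confidence_interval(x_bar, s, n, t_crit):
    """Interval x_bar +/- t * s / sqrt(n), using the sample standard deviation s."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

n = 25
df = n - 1        # degrees of freedom = 24
t_crit = 2.0639   # from the t table: df = 24, upper-tail area 0.025

lower, upper = t_confidence_interval(50.0, 8.0, n, t_crit)

print(f"df = {df}, 95% CI: {lower:.3f} < mu < {upper:.3f}")
```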