Module 7, Confidence Intervals Flashcards
Confidence Level:
how likely our parameter is to lie within that interval
Margin of Error:
estimate of the maximum amount of difference we think is possible between our statistic and its corresponding parameter
Why is a confidence interval inferential?
An inferential stat because sample stats are used to estimate the location of population parameters (GENERALIZING)
Point vs. interval estimate
Point Estimate: single statistic that is used as our best estimate of a corresponding parameter (the value in a population from which you drew that estimate), GREATER PRECISION, LESS ACCURACY (very specific single stat)
Have extensive precision: give us a specific figure
DISADVANTAGE OF POINT ESTIMATE: low accuracy (accuracy = freedom from error, i.e. how little our estimate differs from the parameter of interest’s true value)
Interval Estimate: provides us with a RANGE of values within which the population parameter is likely to fall; GREATER ACCURACY, LESS PRECISION, this range is more accurate but not precise because of how big the range can be
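A minimal sketch of the contrast between the two kinds of estimate. The simulated sample of 100 friend counts and the large-sample formula (statistic ± z* × s/√n) are assumptions for illustration, not something taken from the cards:

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

rng = random.Random(7)  # fixed seed so the sketch is repeatable
# Hypothetical sample: number of friends reported by 100 respondents.
sample = [rng.gauss(220, 60) for _ in range(100)]

# Point estimate: one specific figure (precise, but almost surely a bit off).
point_estimate = mean(sample)

# Interval estimate: a range around the point estimate (less precise, more accurate).
z_star = NormalDist().inv_cdf(0.975)  # critical value for a 95% confidence level
margin = z_star * stdev(sample) / sqrt(len(sample))
lower, upper = point_estimate - margin, point_estimate + margin

print(f"point estimate:    {point_estimate:.1f}")
print(f"interval estimate: {lower:.1f} to {upper:.1f}")
```

The point estimate always sits in the middle of the interval; the interval trades the single specific figure for a range that is more likely to capture the parameter.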
Sampling Theory and level of Confidence
Sampling Theory: a well-drawn probability sample, where every element has a known probability of being selected, normally involving a random mechanism, will yield results that closely resemble those we would get if we measured the ENTIRE population
Law of Large Numbers:
states that with a sufficiently large sample size, sample statistics (usually means) will tend to approximate population parameters very closely (THEY SHOULD BE VERY SIMILAR)
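The Law of Large Numbers can be watched in a small simulation. This is only a sketch: the fair six-sided die (population mean 3.5) and the helper name `sample_mean` are assumptions chosen for illustration:

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable
POPULATION_MEAN = 3.5    # true mean of a fair six-sided die

def sample_mean(n):
    """Mean of n simulated die rolls (hypothetical helper)."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

for n in (10, 100, 10_000, 100_000):
    m = sample_mean(n)
    gap = abs(m - POPULATION_MEAN)
    print(f"n={n:>7}: sample mean={m:.3f}, gap from parameter={gap:.3f}")
```

The gap between statistic and parameter bounces around for small n and tends to shrink toward zero as n grows.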
Confidence Level:
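specifies the probability that an interval estimate built this way from a random sample will contain the population parameter (a statement about the method over repeated samples, not about one interval that has already been calculated)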
95% Confidence Level:
if a large number of samples is collected and a confidence interval is created for each sample, approximately 95% of these intervals will contain the population mean
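The repeated-sampling meaning of a 95% confidence level can be checked by simulation. A rough sketch, assuming a hypothetical normal population (mean 50, SD 10), samples of 100, and the large-sample z interval:

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

rng = random.Random(0)   # fixed seed so the sketch is repeatable
MU, SIGMA = 50.0, 10.0   # hypothetical population parameters
N, TRIALS = 100, 2000    # sample size and number of repeated samples
z_star = NormalDist().inv_cdf(0.975)  # critical value for 95% confidence

def interval_contains_mu():
    """Draw one sample, build its 95% interval, report whether it contains MU."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    margin = z_star * stdev(sample) / sqrt(N)
    return abs(mean(sample) - MU) <= margin

coverage = sum(interval_contains_mu() for _ in range(TRIALS)) / TRIALS
print(f"share of intervals containing the population mean: {coverage:.3f}")
```

The share should come out near 0.95, not exactly 0.95, because the simulation itself is a finite sample of intervals.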
Confidence interval vs. confidence level
Confidence Interval: range of values within which we estimate that a population parameter will fall (LOWER BOUND AND UPPER BOUND OF AN INTERVAL WITHIN WHICH WE ESTIMATE A POPULATION PARAMETER WILL FALL, E.G. MEAN NUMBER OF FRIENDS IS BETWEEN 200 AND 240)
Confidence Level: specifies the PROBABILITY that the population parameter will lie within that specified range of values
What does a higher level of confidence mean?
HIGHER LEVEL OF CONFIDENCE = WIDER THE CONFIDENCE INTERVAL
LOWER LEVELS OF CONFIDENCE: HAVE A SMALLER INTERVAL BECAUSE THE TRADE OFF FOR HAVING A SMALLER AND MORE PRECISE RANGE OF VALUES IN WHICH WE THINK OUR PARAMETER LIES IS THAT WE CAN BE LESS SURE THAT WE WOULD CAPTURE IT IN SUBSEQUENT SAMPLES
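The trade-off between confidence level and interval width can be made concrete. A sketch assuming a hypothetical sample standard deviation of 60, n = 100, and the large-sample z interval:

```python
from math import sqrt
from statistics import NormalDist

s, n = 60.0, 100  # hypothetical sample standard deviation and sample size
standard_error = s / sqrt(n)

widths = {}
for level in (0.90, 0.95, 0.99):
    # Two-sided critical value: leave (1 - level) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    widths[level] = 2 * z_star * standard_error
    print(f"{level:.0%} confidence: z*={z_star:.3f}, interval width={widths[level]:.1f}")
```

The critical values are roughly 1.645, 1.960 and 2.576, so with the same data the 99% interval is the widest and the 90% interval the narrowest.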
Sampling error vs. margin of error
Sampling Error: we can reasonably assume that there will always be at least some gap between our sample statistics and their respective population parameters
Margin of Error: estimate of the AMOUNT OF DIFFERENCE that we think is possible between our statistic and its corresponding parameter
Sampling theory and sample size
SAMPLING THEORY: the larger the sample size, the LESS sampling error we expect there to be = smaller margin of error (to know if this is true, we would need to know the population parameter)
CLT
states that with repeated samples of sufficient size, the sampling distribution of the sample mean will become approximately normal, and the mean of all the sample means will approximate the mean of the population
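A quick simulation sketch of the CLT. It assumes a hypothetical, strongly right-skewed population (exponential with mean 1 and SD 1) and checks two things: the mean of the sample means, and whether about 95% of sample means fall within 1.96 standard errors of the population mean, as they would under a normal curve:

```python
import random
from math import sqrt

rng = random.Random(1)   # fixed seed so the sketch is repeatable
POP_MEAN = 1.0           # exponential(rate=1): mean 1, SD 1, clearly not normal
N, SAMPLES = 50, 5000    # size of each sample, number of repeated samples

# One sample mean per repeated sample.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(N)) / N for _ in range(SAMPLES)
]

grand_mean = sum(sample_means) / SAMPLES
standard_error = 1.0 / sqrt(N)  # population SD / sqrt(n)
within = sum(
    abs(m - POP_MEAN) <= 1.96 * standard_error for m in sample_means
) / SAMPLES

print(f"mean of sample means: {grand_mean:.3f} (population mean {POP_MEAN})")
print(f"share within 1.96 standard errors: {within:.3f} (normal curve: 0.95)")
```

Even though individual observations are skewed, the sample means behave much like draws from a normal distribution.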
CLT and increasing sample size
INCREASING SAMPLE SIZE: HAS THE SAME EFFECT AS REPEATING SAMPLES OVER AND OVER AGAIN, IT REDUCES THE MARGIN OF ERROR
AS SAMPLE SIZE INCREASES, THE MARGIN OF ERROR VALUE BECOMES SMALLER AND THE CONFIDENCE INTERVAL BECOMES MORE NARROW
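How quickly the margin of error shrinks can be sketched with the common large-sample formula, margin = z* × s/√n. The formula, the hypothetical s = 60, and the helper name `margin_of_error` are assumptions, not part of the cards:

```python
from math import sqrt
from statistics import NormalDist

z_star = NormalDist().inv_cdf(0.975)  # 95% confidence level
s = 60.0                              # hypothetical sample standard deviation

def margin_of_error(n):
    """Large-sample margin of error for a mean (hypothetical helper)."""
    return z_star * s / sqrt(n)

for n in (100, 400, 1600, 6400):
    m = margin_of_error(n)
    print(f"n={n:>5}: margin of error={m:.2f}, interval width={2 * m:.2f}")
```

Because n sits under a square root, the sample size has to be quadrupled to cut the margin of error in half.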
Two assumptions when using confidence intervals
First assumption: you have used simple random sampling
Second assumption: we have a normal probability distribution; this is crucial because confidence intervals rely on the Central Limit Theorem in order to make an interval estimate
When to use t-distribution?
Use it when the sample size is small and the normal distribution cannot be relied on; the t-distribution was developed for this case, from the frequency distribution of standard deviations of samples drawn from a normal population
Properties of T-distribution
Unimodal and symmetrical about the mean
Tend to be flatter and with thicker tails than the normal distribution
Identified by degrees of freedom associated with them
Mean of 0 and SD greater than 1
Value of the t-statistic ranges from negative to positive infinity
Shape is dependent on sample size
AS SAMPLE SIZE INCREASES: THE SHAPE APPROACHES THAT OF THE STANDARD NORMAL DISTRIBUTION
When df (DEGREES OF FREEDOM)=infinity, the T-distribution exactly matches the standard normal distribution
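The flatter peak and thicker tails can be checked numerically. This sketch uses the standard density formula for Student's t, written out by hand because the Python standard library has no t-distribution; the helper name `t_pdf` is an assumption:

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    # Work on the log scale so large df values do not overflow the gamma function.
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))

normal = NormalDist()  # standard normal: mean 0, SD 1
for df in (2, 5, 30, 1000):
    print(f"df={df:>4}: height at 0={t_pdf(0, df):.4f}, height at 3={t_pdf(3, df):.4f}")
print(f"normal : height at 0={normal.pdf(0):.4f}, height at 3={normal.pdf(3):.4f}")
```

For small df the curve is lower in the middle and higher out in the tails than the normal curve; as df grows, both heights move toward the normal values.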
Degrees of Freedom:
the number of values in the final calculation that are free to vary; for a single sample mean, equal to the sample size minus 1
Degrees of freedom are generally equal to the number of observations we used to estimate the parameter minus the number of intermediate parameters we need to estimate
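One way to see why df = n - 1: once the sample mean has been estimated, the deviations from it must sum to zero, so the last deviation is fixed by the others. A small sketch with five hypothetical observations:

```python
from statistics import mean, variance

sample = [4, 8, 6, 5, 7]  # hypothetical observations
n = len(sample)
x_bar = mean(sample)      # one intermediate parameter estimated from the data

deviations = [x - x_bar for x in sample]
# Deviations from the sample mean always sum to 0, so the last one is
# fully determined by the other n - 1: only n - 1 values are free to vary.
last_implied = -sum(deviations[:-1])

df = n - 1  # observations minus the one estimated intermediate parameter
manual_variance = sum(d * d for d in deviations) / df

print(f"last deviation: {deviations[-1]}, implied by the others: {last_implied}")
print(f"variance with df={df}: {manual_variance}")
print(f"statistics.variance: {variance(sample)}")
```

`statistics.variance` also divides by n - 1, which is why the manual figure matches it.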
Effect of sample size on t-distribution
As sample size increases, the t-distribution BECOMES MORE NORMAL
When n is equal to or greater than 30, the sampling distribution so closely approximates the normal distribution that we could use the normal distribution instead
^^THIS IS BECAUSE OF THE CENTRAL LIMIT THEOREM: AS SAMPLE SIZE INCREASES DISTRIBUTION BECOMES MORE NORMAL
When are t vs. z stats used?
T-statistics are used for comparisons when n < 30, and z-statistics can be used when n is greater than or equal to 30
Standardized Test Statistic:
enables you to take a score from a sample and transform it into a standardized form very similar to a z-score
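A sketch of one common standardized test statistic, the one-sample form (sample mean - hypothesized mean) / (s/√n). The cards do not give a formula, so this form, the hypothetical data, and the name `standardized_statistic` are assumptions:

```python
from math import sqrt
from statistics import mean, stdev

def standardized_statistic(sample, mu_0):
    """(sample mean - hypothesized mean) / estimated standard error."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses n - 1 in the denominator
    return (mean(sample) - mu_0) / standard_error

sample = [4, 8, 6, 5, 7]  # hypothetical observations
t_stat = standardized_statistic(sample, mu_0=5)
print(f"standardized statistic: {t_stat:.3f}")
```

Like a z-score, the result says how many standard errors the sample mean lies from the hypothesized value; with a small sample like this one it would be compared against a t-distribution with n - 1 degrees of freedom.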
Why is a smaller confidence interval better?
more precise; a wide confidence interval suggests that the sample does not provide a precise representation of the population mean, whereas a narrow confidence interval demonstrates a greater degree of precision.
Effects of n increasing on Confidence level
for an interval of a fixed width, confidence level increases as n increases; at a fixed confidence level, the interval gets narrower instead
Basic principles of confidence intervals
As n increases, so does the confidence level we can attach to an interval of a given width
As n increases, confidence interval becomes more precise and becomes narrower
The narrower the CI, the better and more precise
As confidence level increases, confidence interval widens and becomes less precise
As n increases, the maximum possible distance our stat is from the population parameter decreases, which means the confidence interval becomes narrower and more precise
C
area under the standard normal curve between the critical values
What effect does increasing degrees of freedom have on the t-distribution?
after 29df, the t distribution is very close to the standard normal distribution
GREATER DF=CLOSER TO NORMAL
Relationship between df and t
Inverse; as the critical value of t (for a given confidence level) increases, degrees of freedom decrease
As the critical value of t decreases, degrees of freedom increase
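Reading "t" here as the critical value for a fixed confidence level (an assumption about what the card means), the inverse relationship can be reproduced numerically. The standard library has no t-distribution, so this sketch integrates the t density with Simpson's rule and finds the critical value by bisection; all helper names are hypothetical:

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))

def central_area(t, df, steps=1000):
    """Area under the t density between -t and +t (Simpson's rule, even steps)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # doubled because the density is symmetric about 0

def critical_t(level, df):
    """Bisection search for the t value that encloses `level` of the area."""
    lo, hi = 0.0, 100.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if central_area(mid, df) < level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (2, 5, 10, 29, 1000):
    print(f"df={df:>4}: critical t for 95% confidence = {critical_t(0.95, df):.3f}")
print(f"standard normal z for 95% confidence = {NormalDist().inv_cdf(0.975):.3f}")
```

The values should land close to a printed t-table (about 2.571 for df = 5 and 2.045 for df = 29) and drift down toward the normal value of about 1.960 as df grows.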