Review Material Flashcards
Sample mean
X-bar
x̄ = Σx / n
Σ = sum of
n = sample size of variable x
Population mean
Mu (μ) = Σx / N
What is the median and how do you find it?
Median is the middle value
- Sort data
- Find the middle value
- If the number of observations is even, average the middle two values
What is the mode?
The most frequent value
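A minimal sketch of the mean, median, and mode cards above, using Python's standard `statistics` module. The data values are made up for illustration.

```python
from statistics import mean, median, multimode

# Hypothetical sample of n = 6 observations (illustrative only)
data = [2, 3, 3, 5, 7, 10]

x_bar = mean(data)        # sum of x divided by n
middle = median(data)     # sorts the data; with an even n, averages the two middle values
modes = multimode(data)   # list of the most frequent value(s)

print("mean:", x_bar)
print("median:", middle)
print("mode(s):", modes)
```

`multimode` returns a list because a data set can have more than one most frequent value.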
What are some facts about standard deviation?
Population standard deviation = sigma (σ) = square root of the variance
In a normal distribution:
About 68% of data within 1 SD of mean
About 95% of data within 2 SD of mean
About 99.7% of data within 3 SD of mean
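The 68 / 95 / 99.7 percentages can be checked against the standard normal curve. This is a sketch using `statistics.NormalDist`; the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, "SD:", round(share_within(k), 4))
```

The exact values are close to, but not exactly, the rounded percentages on the card.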
What is the equation for the square root of the variance?
s = sqrt[ Σ(x − x̄)² / (n − 1) ]
x̄ = sample mean, x = variable, n = sample size
This calculation arrives at the sample standard deviation (s)
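The formula above can be worked step by step. This is a sketch with made-up data; `statistics.stdev` is shown only as a cross-check, since it also divides by n − 1.

```python
from math import sqrt
from statistics import stdev

# Hypothetical sample (illustrative only)
data = [4, 8, 6, 5, 7]
n = len(data)
x_bar = sum(data) / n                                 # sample mean: 30 / 5 = 6

squared_devs = sum((x - x_bar) ** 2 for x in data)    # 4 + 4 + 0 + 1 + 1 = 10
s = sqrt(squared_devs / (n - 1))                      # sqrt(10 / 4)

print("s by hand:", s)
print("s from statistics.stdev:", stdev(data))
```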
Scatter diagram
Relationship between two variables
Box plot
Graphical display based on quartiles
Histogram
Frequency for each class of measured data
Ha or H1 in a one-tailed alternative
Alternative hypothesis
- A one-tailed alternative states a direction
- Right tail: μ > number
- Left tail: μ < number
Two tailed alternative
- population mean is not equal to a specified number (μ ≠ number)
- a test is two-tailed when no direction is specified in the alternative hypothesis
When do you reject the null hypothesis, Ho (H-naught)?
- absolute value of test statistic > critical value
- reject Ho if | z-value | > critical z
- reject Ho if | t-value | > critical t
- reject Ho if p-value < significance level (the inequality is reversed: a small p-value leads to rejection)
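The two rejection rules above can be written as small functions. This is only a sketch: the function names and the example numbers are made up, and the absolute-value rule is taken directly from the card.

```python
def reject_by_critical_value(test_stat, critical_value):
    """Reject Ho when the absolute test statistic exceeds the critical value."""
    return abs(test_stat) > critical_value

def reject_by_p_value(p_value, alpha):
    """Reject Ho when the p-value is below the significance level."""
    return p_value < alpha

# Hypothetical numbers: z = -2.3 against a critical z of 1.96, and p = 0.08 at alpha = 0.05
print(reject_by_critical_value(-2.3, 1.96))  # statistic is beyond the critical value
print(reject_by_p_value(0.08, 0.05))         # p-value is not below alpha
```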
Test statistic
Used when testing the population mean with a large sample and a known population standard deviation. The test statistic is given by:
z = (x̄ − μ) / (σ / sqrt(n))
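A short worked example of the z statistic above. The function name `z_statistic` and the numbers are assumptions chosen so the arithmetic is easy to follow.

```python
from math import sqrt

def z_statistic(x_bar, mu0, sigma, n):
    """z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    return (x_bar - mu0) / (sigma / sqrt(n))

# Hypothetical test: sample mean 52, hypothesized mean 50, sigma 8, n = 64
# Standard error = 8 / sqrt(64) = 1, so z = (52 - 50) / 1
z = z_statistic(x_bar=52, mu0=50, sigma=8, n=64)
print(z)
```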
Type I error
Alpha (α)
P(type I error) = significance level = probability that you reject a true null hypothesis
Type II error
Beta = β = P(type II error) = probability you do not reject the null hypothesis, given that Ho is false
Confidence interval (CI)
A range of values within which the population parameter is expected to occur.
Factors in determining a CI:
- The sample size
- The variability in the population, usually estimated by the SD
- The desired level of confidence
Use normal distribution (z table) if population standard deviation (sigma) is known and either:
- Normal population
- Sample size > 30
Equation not shown here; see white book
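The card does not give the equation. The sketch below assumes the usual textbook form, x̄ ± z · σ / sqrt(n), which may be written differently in the white book. The function name and numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population SD (sigma) is known."""
    # Critical z leaves (1 - confidence) / 2 in each tail
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 100, sigma 15, n = 36 (n > 30, so the z table applies)
low, high = z_interval(x_bar=100, sigma=15, n=36)
print(round(low, 2), round(high, 2))
```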
CI: mean, t-distribution
Use for a normal population when the population standard deviation is NOT known
- if given sample standard deviation (s), use t-table assuming normal population
- if one population, n-1 degrees of freedom
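A sketch of the t-based interval, assuming the usual form x̄ ± t · s / sqrt(n). Python's standard library has no t-distribution, so the critical value is passed in from a t-table. The function name and numbers are made up for illustration.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean using the sample SD (s) and a t-table value."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: n = 16, so df = n - 1 = 15.
# 2.131 is the t-table value for 95% confidence (two-sided) with 15 df.
low, high = t_interval(x_bar=50, s=8, n=16, t_crit=2.131)
print(round(low, 3), round(high, 3))
```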
CI : proportion
- Use if success or failure
- normal approximation to binomial ok if (n)(pi) > 5 and (n)(1-pi) > 5, where n= sample size, pi = population proportion
Equation in white book
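The card points to the white book for the equation. The sketch below assumes the usual textbook form, p ± z · sqrt(p(1 − p) / n), and applies the card's rule of thumb with the sample proportion p standing in for pi (an assumption, since pi is unknown in practice).

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Confidence interval for a proportion using the normal approximation."""
    p = successes / n
    # Rule of thumb from the card, checked with p in place of pi (assumption)
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("normal approximation to the binomial is not appropriate")
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical survey: 60 successes out of n = 100
low, high = proportion_interval(60, 100)
print(round(low, 3), round(high, 3))
```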
Wide confidence interval if:
- Small sample size
- Large standard deviation
- High confidence level
* If you want a narrow interval, you need a large sample size, a small standard deviation, or a low confidence level
What is a simple linear regression?
One independent variable, one dependent variable
Y=mx+b
Y - dependent variable
X- independent variable
What is the coefficient of determination?
R^2 = % of total variation in y that can be explained by variation in x
-measure of how close the linear regression line fits the points on a scatter diagram
R^2=1 perfect linear relationship
R^2=0 no linear relationship
R is?
Correlation coefficient
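A sketch tying together the regression line, R^2, and R cards above. The cards do not say how the line is fitted, so ordinary least squares is assumed here. The function names and data are made up for illustration.

```python
from math import copysign, sqrt

def fit_line(xs, ys):
    """Least-squares slope (m) and intercept (b) for Y = mx + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    b = y_bar - m * x_bar
    return m, b

def r_squared(xs, ys):
    """Share of the total variation in y explained by the fitted line."""
    m, b = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
    ss_tot = sum((y - y_bar) ** 2 for y in ys)                    # total variation
    return 1 - ss_res / ss_tot

# Hypothetical data (illustrative only)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = fit_line(xs, ys)
r2 = r_squared(xs, ys)
r = copysign(sqrt(r2), m)   # in simple linear regression, r takes the sign of the slope

print("slope:", m, "intercept:", b)
print("R^2:", r2, "r:", r)
```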
What is an expected value?
E(x) = sum of [x * p(x)] over all possible values of x
-is a weighted average, also a long run average
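A small worked example of the expected value formula, using a made-up distribution (the number of heads in two fair coin tosses).

```python
# Hypothetical probability distribution: value x -> p(x)
dist = {0: 0.25, 1: 0.50, 2: 0.25}

# Probabilities must sum to 1 for this to be a valid distribution
total_probability = sum(dist.values())

# Weighted average: each x is weighted by its probability
expected = sum(x * p for x, p in dist.items())   # 0(0.25) + 1(0.50) + 2(0.25)

print(total_probability, expected)
```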
Characteristics of binomial distribution
- can result in one of two outcomes
- is discrete (integer values): 0, 1, 2, ..., n
- random variable (x) is the number of successes in n trials
- each trial is a success or a failure
- independent trials
- constant probability of success on each trial
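The characteristics above lead to the standard binomial probability formula, P(x) = C(n, x) · p^x · (1 − p)^(n − x), which is not written on the card. This sketch uses made-up values of n and p.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials, each with probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Hypothetical case: n = 4 trials, constant success probability p = 0.5
probs = [binomial_pmf(x, 4, 0.5) for x in range(5)]   # x = 0, 1, 2, 3, 4

for x, prob in enumerate(probs):
    print(x, prob)
```

The probabilities over x = 0, 1, ..., n add up to 1, as they must for a discrete distribution.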
Characteristics of normal distribution
Continuous, bell shaped, symmetric
- mean = median= mode
- cumulative probability under normal curve: use Z table if you know the pop. mean (mu) and pop. standard deviation
- sample mean: use Z table if you know the pop. standard deviation and either the population is normal or n > 30
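Both Z-table uses above can be sketched with `statistics.NormalDist` in place of a printed table. The population values (mean 100, SD 15) and the sample size are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical population: mu = 100, sigma = 15
population = NormalDist(mu=100, sigma=15)

# Single observation: P(X < 115), i.e. z = (115 - 100) / 15 = 1
p_below_115 = population.cdf(115)

# Sample mean with n = 36: its SD is sigma / sqrt(n) = 15 / 6 = 2.5
sample_mean_dist = NormalDist(mu=100, sigma=15 / sqrt(36))

# P(sample mean < 105), i.e. z = (105 - 100) / 2.5 = 2
p_mean_below_105 = sample_mean_dist.cdf(105)

print(round(p_below_115, 4), round(p_mean_below_105, 4))
```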
Characteristics of t-distribution
- continuous, mound shaped, symmetric
- applications similar to normal
- more spread out than normal
- use t if normal population but pop. standard deviation is NOT known
- degrees of freedom = df = n-1 if estimating the mean of one population
What is P-value?
The probability of getting a sample statistic as extreme as (or more extreme than) the one observed in your sample, given that the null hypothesis is true.
How to use the p-value…
Reject Ho if p-value is less than significance level
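A sketch of the p-value rule for a two-tailed z test. The function name, the z-value, and the 0.05 significance level are assumptions for illustration. A one-tailed test would use only one tail's area.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Probability of a z-value at least as extreme as the one observed, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical test statistic and significance level
p = two_tailed_p_value(2.0)
alpha = 0.05
reject = p < alpha   # reject Ho when the p-value is below the significance level

print(round(p, 4), reject)
```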
When there is no variation—
There is certainty, exact prediction, standard deviation =0
Variation =0
High variation means that…
Uncertainty, unpredictable, high deviation
What is standard error of the mean?
The standard deviation of the sample mean = standard deviation / square root of n
- as n increases, standard error decreases
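A quick illustration of how the standard error shrinks as n grows. The standard deviation of 10 and the sample sizes are made up.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard deviation of the sample mean: sd / sqrt(n)."""
    return sd / sqrt(n)

# Hypothetical standard deviation of 10; quadrupling n halves the standard error
for n in (25, 100, 400):
    print(n, standard_error(10, n))
```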
What is a sampling distribution?
A distribution of sample means
- expected value of the sample mean = population mean, but an individual sample mean could be smaller or larger than the population mean
- the sample mean is a random variable (the population mean is a constant parameter)
Central limit theorem (CLT)
- for large samples (n > 30), the sampling distribution of the sample mean is approximately normal; use the z table when the population standard deviation is known
- CLT applies even if the original population is skewed
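A small simulation can make the CLT concrete. This sketch draws repeated samples from a right-skewed population (an exponential distribution with mean 1 and SD 1, chosen only for illustration) and looks at the resulting sample means. It checks the center, the spread, and the share within 1 SD, not full normality.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from a skewed population (exponential, mean 1, SD 1)."""
    return mean([random.expovariate(1.0) for _ in range(n)])

n = 40                                              # sample size, above the n > 30 guideline
sample_means = [sample_mean(n) for _ in range(2000)]

center = mean(sample_means)                         # should be near the population mean, 1
spread = stdev(sample_means)                        # should be near sigma / sqrt(n)
within_1_sd = mean([abs(m - center) <= spread for m in sample_means])  # near 0.68 if roughly normal

print("mean of sample means:", round(center, 3))
print("SD of sample means:", round(spread, 3), "vs sigma/sqrt(n):", round(1 / sqrt(n), 3))
print("share within 1 SD:", round(within_1_sd, 3))
```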