sampling distribution and estimation Flashcards
describe the sample mean ?
it is an estimator of the population mean
it is itself a random variable, because its value changes from one sample drawn from the population to the next
name the different types of distribution ?
- normal
- binomial
- chi squared
- f distribution
- t distribution
describe a normally distributed sample
- sample size of n
- sample mean xbar = ( sum of x ) / n
- variance of the sample mean = sigma^2 / n
- population mean is mu
- z transformation: z = ( sample mean - popmean ) / standard error
- standard error = root( sigma^2 / n )
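A minimal Python sketch of the quantities above. The data values, the "known" sigma of 2 and the population mean of 12 are made up for illustration, and the function names are hypothetical helpers, not library calls.

```python
import math

def sample_mean(xs):
    # xbar = (sum of x) / n
    return sum(xs) / len(xs)

def standard_error(sigma, n):
    # standard error of the sample mean = root(sigma^2 / n)
    return math.sqrt(sigma ** 2 / n)

def z_transform(xbar, mu, sigma, n):
    # z = (sample mean - population mean) / standard error
    return (xbar - mu) / standard_error(sigma, n)

data = [12.0, 15.0, 11.0, 14.0]      # made-up sample, n = 4
xbar = sample_mean(data)             # 13.0
se = standard_error(2.0, len(data))  # sigma = 2 assumed known, so se = 1.0
z = z_transform(xbar, 12.0, 2.0, len(data))  # (13 - 12) / 1 = 1.0
```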
what is central limit theorem ?
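the theorem that, whatever the distribution of the population (as long as its variance is finite), the sampling distribution of the sample mean gets closer to normal as n grows. as a rule of thumb, once n > 25 you can analyse the sample mean with the methods you would use on normally distributed data.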
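One way to see the theorem at work is a small simulation. This is only a sketch: the exponential population, the sample size of 30, the 2000 repetitions and the seed are all arbitrary choices for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 30       # size of each sample (above the n > 25 rule of thumb)
reps = 2000  # number of repeated samples

# expovariate(1.0) draws from a skewed (exponential) population
# with mean 1 and variance 1, so it is clearly not normal
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)  # should be close to 1
sd_of_means = statistics.stdev(sample_means)    # should be close to root(1 / n)
theoretical_se = math.sqrt(1.0 / n)
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the population itself is heavily skewed.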
what is ?
a. an unbiased estimator ?
b. a biased estimator ?
a. an estimator whose expected value equals the true unknown parameter ( e.g. E(xbar) = mu for the mean )
b. an estimator whose expected value doesn't equal the true value
(its sampling distribution is not centred on the true parameter)
for a normally distributed sample, how do you estimate the population variance ?
the sample variance is an unbiased estimator of the population variance, so we calculate this:
s^2 = sum of ( xi - sampmean )^2 / ( n - 1 ), and E(s^2) = sigma^2
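A short Python sketch of the n - 1 divisor, checked against the standard library's `statistics.variance`, which uses the same divisor. The data are made up.

```python
import statistics

def sample_variance(xs):
    # s^2 = sum of (xi - xbar)^2 / (n - 1)
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample, xbar = 5
s2 = sample_variance(data)           # 32 / 7
s2_stdlib = statistics.variance(data)  # same n - 1 divisor
```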
what is high versus low precision ?
high precision = low variance
low precision = high variance
the narrower ( more tightly concentrated ) the sampling distribution, the lower the variance and the more preferred the estimator
do we prefer an estimator to be precise or unbiased ?
generally we prefer an unbiased estimator
for a sample with either n > 25 or a normal distribution, how do you construct a confidence interval ?
first find the critical values of the standard normal ( z ) for the confidence level you need
then transform these into critical values of the sample mean
cL = sample mean - critval . sigma / root(n)
cU = sample mean + critval . sigma / root(n)
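The two limits can be sketched in Python with the standard library's `NormalDist` in place of the Z table. The function name and the input numbers are illustrative assumptions, and sigma is treated as known.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(xbar, sigma, n, level=0.95):
    # critical value of the standard normal for a two-sided interval
    alpha = 1 - level
    z_cv = NormalDist().inv_cdf(1 - alpha / 2)
    # transform it into critical values of the sample mean
    half_width = z_cv * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# made-up inputs: xbar = 50, sigma = 10, n = 25, 95% interval
c_lower, c_upper = mean_confidence_interval(50.0, 10.0, 25)
# roughly (46.08, 53.92)
```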
for a sample with either n > 25 or a normal distribution, how do you estimate a population proportion ?
- sample size n
- sample proportion p
- number of successes r
- population proportion = pi
we use the binomial because there are only two possible outcomes; the normal approximation to it needs a large sample, therefore the sample size must be > 25
p = r / n, E(p) = pi, variance of p = pi (1 - pi) / n
cU/L = p +/- zcv . standard error, where standard error = root( p (1 - p) / n )
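A small Python sketch of the interval for a proportion. The counts (40 successes out of 100) are made up and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(r, n, level=0.95):
    p = r / n                          # sample proportion
    se = math.sqrt(p * (1 - p) / n)    # estimated standard error of p
    z_cv = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return p - z_cv * se, p + z_cv * se

# made-up data: r = 40 successes in n = 100 trials
p_lower, p_upper = proportion_confidence_interval(40, 100)
# roughly (0.304, 0.496)
```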
what is the t distribution ?
it is very similar to the normal distribution but it has one parameter, v - the degrees of freedom
how do you calculate confidence intervals or anything with t distribution ?
you calculate them the same way as with the normal distribution, but with critical values from the t table.
v = n - 1
t = ( sample mean - population mean ) / root( sample variance / n )
how do you conduct a hypothesis test ?
1) formulate the null and alternative hypotheses
2) choose the level of significance of the test
3) look up the correct critical values and set the rejection region
4) calculate the test statistic
5) compare the test statistic to the rejection region and decide whether or not to reject the null hypothesis
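The five steps can be sketched for a two-sided z test of a mean. This assumes the population standard deviation is known; the numbers and the function name are made up for illustration.

```python
import math
from statistics import NormalDist

def two_sided_z_test(xbar, mu0, sigma, n, alpha=0.05):
    # steps 1 and 2: H0: mu = mu0, H1: mu != mu0, significance level alpha
    # step 3: critical value; rejection region is |z| > z_cv
    z_cv = NormalDist().inv_cdf(1 - alpha / 2)
    # step 4: test statistic
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # step 5: compare with the rejection region
    reject = abs(z) > z_cv
    return z, z_cv, reject

# made-up inputs: xbar = 106, mu0 = 100, sigma = 15, n = 36
z, z_cv, reject = two_sided_z_test(106.0, 100.0, 15.0, 36)
# z = 6 / 2.5 = 2.4, which is beyond 1.96, so H0 is rejected
```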
how do you calculate the test statistic for
a. std normal
b. t dist
a. z = ( sample mean - population mean ) / root( population variance / n )
b. t = ( sample mean - population mean ) / root( sample variance / n )
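A sketch of the t statistic computed from raw data. The sample and the hypothesised mean are made up. The standard library has no t quantile function, so the critical value would still come from a t table with v = n - 1.

```python
import math
import statistics

def t_statistic(xs, mu0):
    n = len(xs)
    xbar = statistics.fmean(xs)
    s2 = statistics.variance(xs)      # sample variance, n - 1 divisor
    t = (xbar - mu0) / math.sqrt(s2 / n)
    return t, n - 1                   # statistic and degrees of freedom

data = [10.0, 12.0, 14.0, 16.0, 18.0]  # made-up sample: xbar = 14, s^2 = 10
t, v = t_statistic(data, 12.0)
# t = (14 - 12) / root(10 / 5) = 2 / root(2), with v = 4
```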
what is type II error probability ?
the probability that we fail to reject the null hypothesis even when it is false ( this probability is called beta )
when the true mean lies above the hypothesised mean, it is equal to the area under the H1 is true distribution, to the left of the upper cv of the H0 is true distribution
what is the power of a test ?
the probability of rejecting the null when it is false = 1 - beta
the more powerful the test the better, so how do we increase the power ?
- increase the sample size ( less overlap between the two distributions, so the type II error area is smaller )
- avoid situations where the effect size is small ( effect size is the size of the difference between the actual and hypothesised means )
- smaller sample variance ( when possible but we normally don’t get the choice in econ )
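A sketch of how power responds to sample size, for an upper-tailed z test with known sigma. All numbers (hypothesised mean 100, actual mean 105, sigma 15) and the function name are assumptions for illustration.

```python
import math
from statistics import NormalDist

def power_upper_tailed_z(mu0, mu1, sigma, n, alpha=0.05):
    se = sigma / math.sqrt(n)
    # upper critical value of the sample mean when H0 is true
    cv_xbar = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta: area under the "H1 is true" distribution to the left of that cv
    beta = NormalDist(mu1, se).cdf(cv_xbar)
    return 1 - beta

power_small_n = power_upper_tailed_z(100.0, 105.0, 15.0, 36)   # about 0.64
power_large_n = power_upper_tailed_z(100.0, 105.0, 15.0, 100)  # above 0.9
```

Raising n from 36 to 100 shrinks the standard error, so the two distributions overlap less and the power goes up.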
what does statistically significant mean and does it mean economically important
it means that we reject the null hypothesis
no.
the difference can be small, but if the sample size is large enough the difference would still be statistically significant and yet it would not be economically important.
how do you test a population proportion ?
always a Z test; we rely on the normal approximation to the binomial distribution.
therefore we MUST have n.p > 5 and n.(1 - p) > 5
choose the significance level
find the critical values and set the rejection region. cv = std normal ones from the Z table
test statistic Z* = ( p - pi0 ) / root( pi0 (1 - pi0) / n )
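A Python sketch of the proportion test. The data (56 successes in 100 trials) and the hypothesised proportion of 0.5 are made up, and the size check here is applied to the hypothesised proportion, which is one common way of reading the n.p > 5 rule.

```python
import math
from statistics import NormalDist

def proportion_z_test(r, n, pi0, alpha=0.05):
    # normal approximation check, using the hypothesised proportion
    assert n * pi0 > 5 and n * (1 - pi0) > 5, "sample too small for a Z test"
    p = r / n
    z = (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)
    z_cv = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return z, abs(z) > z_cv

# made-up data: 56 successes out of 100, H0: pi = 0.5
z_star, reject = proportion_z_test(56, 100, 0.5)
# z = 0.06 / 0.05 = 1.2, inside +/- 1.96, so H0 is not rejected
```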
how do you test the difference of two population means ?
same principle applies as before.
we only do the calculations this way when we know what the population variances are.
set up the hypotheses; the null is always that there is no difference between the population means.
choose the significance level
find the cv ( on the Z table ) and set the rejection region
calculate the test statistic and compare
test stat Z = ( ( sample mean1 - sample mean2 ) - ( popmean1 - popmean2 ) ) / root( popvar1 / n1 + popvar2 / n2 )
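A sketch of the two-sample test statistic with known population variances. The inputs are made up, and the hypothesised difference defaults to 0, matching a null of no difference.

```python
import math

def two_sample_z(xbar1, xbar2, var1, var2, n1, n2, hyp_diff=0.0):
    # Z = ((xbar1 - xbar2) - (mu1 - mu2)) / root(var1 / n1 + var2 / n2)
    se = math.sqrt(var1 / n1 + var2 / n2)
    return ((xbar1 - xbar2) - hyp_diff) / se

# made-up inputs: means 52 and 50, variances 16 and 9, sizes 64 and 36
z_diff = two_sample_z(52.0, 50.0, 16.0, 9.0, 64, 36)
# se = root(0.25 + 0.25), so z = 2 / root(0.5), about 2.83
```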
what is the chi squared distribution ?
it is the distribution of the sum of squares of k independent standard normal random variables.
where does the X^2 distribution lie ?
only in the positive domain
how many parameters does the X^2 distribution have ?
one, K degrees of freedom.
how do you calculate the…… for the X^2 distribution ?
a. sample variance
b. expected value ( mean )
c. variance
a. s^2 = the sum of ( Xi - samplemean )^2 / ( n - 1 )
b. k
c. 2k
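The mean of k and variance of 2k can be checked by simulation. This sketch assumes the summed variables are independent standard normals; k = 4, the number of draws and the seed are arbitrary.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

k = 4         # degrees of freedom
reps = 20000  # number of simulated X^2 values

def chi_squared_draw(k):
    # sum of squares of k independent standard normal draws
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))

draws = [chi_squared_draw(k) for _ in range(reps)]
sim_mean = statistics.fmean(draws)    # should be close to k
sim_var = statistics.variance(draws)  # should be close to 2k
```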
as n approaches what value does the X^2 distribution tend to normal, by the central limit theorem ?
infinity
how do we calculate a confidence interval for a variance using the X^2 distribution ?
we can’t use a regular CI because the interval is not symmetric, as variance can’t be lower than 0
we transform s^2 so it has an X^2 distribution with n - 1 degrees of freedom:
(n-1) . s^2 / sigma^2
then we find the critical values in the X^2 table
upper cv - the value with one tail's % ( alpha / 2 ) above it
lower cv - the value with 1 - one tail's % ( 1 - alpha / 2 ) above it
actual interval limits for sigma^2
lower limit = (n-1) . s^2 / upper cv
upper limit = (n-1) . s^2 / lower cv
this whole procedure only holds if the population is normally distributed
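but if the sample size is > 100 the normal distribution is a very good approximation to X^2, so we use the normal cvs and transform them:
X^2 cv = 0.5 . ( root( 2v - 1 ) +/- zcv )^2 ( + for the upper cv, - for the lower cv )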
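A sketch of the large-sample approach: approximate X^2 critical values built from normal critical values, then used to form an interval for the variance. The inputs (s^2 = 25, n = 121) and the function names are made up. Dividing by the larger critical value gives the smaller limit.

```python
import math
from statistics import NormalDist

def approx_chi2_cvs(v, level=0.95):
    # X^2 cv is roughly 0.5 * (root(2v - 1) +/- z_cv)^2 for large v
    z_cv = NormalDist().inv_cdf(1 - (1 - level) / 2)
    root = math.sqrt(2 * v - 1)
    lower_cv = 0.5 * (root - z_cv) ** 2
    upper_cv = 0.5 * (root + z_cv) ** 2
    return lower_cv, upper_cv

def variance_interval(s2, n, level=0.95):
    v = n - 1
    lower_cv, upper_cv = approx_chi2_cvs(v, level)
    # larger cv in the denominator gives the lower limit
    return v * s2 / upper_cv, v * s2 / lower_cv

lower_cv, upper_cv = approx_chi2_cvs(120)             # about 91.12 and 151.72
var_lower, var_upper = variance_interval(25.0, 121)   # about 19.8 and 32.9
```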
how many tails do X^2 hypothesis tests have ?
always only look at one, whether the test is two tailed or not.
how do you calculate a test statistic for a hypothesis test in a X^2 distribution ?
X^2 = sum of ( ( O - E )^2 / E ), where O is the observed and E the expected frequency in each category
all E for categories must be at least 5; if they aren’t, you must combine categories until they are.
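A sketch of the statistic for a made-up example: 120 rolls of a die, with 20 expected in each category if the die is fair. The check on expected frequencies uses a threshold of 5, the usual rule of thumb.

```python
def chi_squared_statistic(observed, expected):
    # rule of thumb: every expected frequency should be at least 5
    assert all(e >= 5 for e in expected), "combine categories first"
    # X^2 = sum of (O - E)^2 / E over the categories
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 22, 20, 25, 15, 20]  # made-up counts from 120 rolls
expected = [20.0] * 6                # fair die: 120 / 6 per face
chi2_stat = chi_squared_statistic(observed, expected)
# (4 + 4 + 0 + 25 + 25 + 0) / 20 = 2.9
```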
how do you conduct a test for independence ?
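1) set up the hypotheses
2) choose the significance level
3) find the critical values
4) calculate the test statistic = sum of ( ( O - E )^2 / E )
5) make a decision
need v ( degrees of freedom ) for finding the cvs
v = (r-1) . (c-1), where r is the number of rows and c the number of columns in the table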
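A sketch of the calculation for a contingency table, with each expected count taken as ( row total x column total ) / grand total. The 2 x 2 table is made up and the function name is hypothetical.

```python
def chi2_independence(table):
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected count if rows and columns are independent
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    v = (len(table) - 1) * (len(table[0]) - 1)  # degrees of freedom
    return stat, v

table = [[30, 20],
         [20, 30]]  # made-up observed counts
stat, v = chi2_independence(table)
# every expected count is 25, so stat = 4 * (25 / 25) = 4.0 with v = 1
```

With v = 1 the 5% critical value is 3.84, so a statistic of 4.0 falls in the rejection region.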
what is the F distribution ?
the ratio of two independent X^2 distributed random variables, each scaled by its degrees of freedom.
F = (x1/v1) / (x2/v2)
under the null hypothesis the population variances are the same, and as the two populations are normally distributed
F = s1^2 / s2^2
how do you conduct a variance ratio test ?
set up hypotheses - the null is always that the pop vars are =
choose the significance level
find the critical value in the F table; we are only interested in the upper one ( put the larger sample variance on top so that F >= 1 )
calculate the test stat - F = s1^2 / s2^2
decision
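A sketch of the variance ratio statistic for two made-up samples. Putting the larger sample variance on top is a common convention assumed here, and the critical value would still be looked up in an F table with the two degrees of freedom returned.

```python
import statistics

def variance_ratio(sample_a, sample_b):
    var_a = statistics.variance(sample_a)  # n - 1 divisor
    var_b = statistics.variance(sample_b)
    # larger variance on top, so F >= 1 and only the upper cv is needed
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

sample_1 = [10.0, 12.0, 14.0, 16.0, 18.0]  # made-up, s^2 = 10
sample_2 = [11.0, 12.0, 13.0, 14.0, 15.0]  # made-up, s^2 = 2.5
f_stat, v1, v2 = variance_ratio(sample_1, sample_2)
# F = 10 / 2.5 = 4.0 with (4, 4) degrees of freedom
```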
how do you calculate the expected value of a cell in the table for a chi squared test ?
( row total x column total ) / grand total