Week 2 | Desirable properties of point estimators, parametric and nonparametric techniques, and the assumption of normality Flashcards

Question 1

Q

Four properties make point estimators easier to work with and are possessed by ‘good’ point estimators.
What is the first property?

Answer

A

Linearity: Beta-hat is said to be a linear estimator of beta if it is a linear function of the sample observations.

E.g. the sample mean X-bar is a linear estimator of the population mean mu.
The sample variance s^2 is a quadratic function of the sample observations X_i, so it is a nonlinear estimator of the population variance.

Question 2

Q

What is the second property that makes point estimators easier to work with and is possessed by ‘good’ point estimators?
If the sampling distribution of beta_1-hat is centered around beta while that of beta_2-hat is not, which one is the unbiased estimator?

Answer

A

Unbiasedness: Beta-hat is said to be an unbiased estimator of beta if E(beta-hat) = beta,

i.e. if the expected value of beta-hat is equal to beta, so that the sampling distribution of beta-hat is centered around beta.

When the sampling distribution of beta_1-hat is centered around beta while the sampling distribution of beta_2-hat is not:
beta_1-hat is an unbiased estimator,
whereas beta_2-hat is a biased estimator.
On average, beta_1-hat estimates beta more accurately than beta_2-hat.
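A small simulation can make the idea of bias concrete. The sketch below is only an illustration, not part of the original card: it assumes a normal population with variance 4 and compares two estimators of the population variance, one dividing the sum of squared deviations by n - 1 and one dividing by n. The constants SIGMA, N and REPS are arbitrary choices.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

SIGMA = 2.0          # assumed true standard deviation, so the true variance is 4.0
N, REPS = 5, 20000   # small samples make the bias easy to see

total_n_minus_1 = 0.0
total_n = 0.0
for _ in range(REPS):
    sample = [random.gauss(0.0, SIGMA) for _ in range(N)]
    mean = sum(sample) / N
    ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations
    total_n_minus_1 += ss / (N - 1)  # divisor n - 1: unbiased for the variance
    total_n += ss / N                # divisor n: biased downwards

# Averages over many samples approximate the expected values of the estimators.
mean_unbiased = total_n_minus_1 / REPS
mean_biased = total_n / REPS

print(f"divisor n-1: {mean_unbiased:.3f}, divisor n: {mean_biased:.3f}")
```

The first average should land close to the true variance, while the second should settle near (n - 1)/n times the true variance, which is what it means for its sampling distribution not to be centered around the parameter.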

Question 3

Q

What is the third property that makes point estimators easier to work with and is possessed by good point estimators? What does it say about the variances?

Answer

A

Efficiency: Beta-hat is an efficient estimator of beta within some well-defined class of estimators
if its variance is smaller than, or at least not greater than, that of any other estimator of beta in the same class.

Question 4

Q

What does BLUE stand for, in relation to the population mean?

Answer

A

BLUE stands for
Best 
Linear 
Unbiased 
Estimator 
of the population mean
‘Best’ means that X-bar has the smallest variance in the class of linear unbiased estimators of mu; hence it is an efficient estimator.
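To see what "smallest variance in the class of linear unbiased estimators" looks like in practice, the following sketch compares X-bar with another linear unbiased estimator of mu, a weighted average with unequal weights. The population (normal, mu = 10, sigma = 3), the sample size and the weights are all assumptions made up for the illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA, N, REPS = 10.0, 3.0, 4, 20000
WEIGHTS = [0.4, 0.3, 0.2, 0.1]  # hypothetical weights; they sum to 1, so the estimator is unbiased

xbar_values = []
weighted_values = []
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar_values.append(sum(sample) / N)  # equal weights 1/n
    weighted_values.append(sum(w * x for w, x in zip(WEIGHTS, sample)))

# Both estimators are linear in the observations and centered on mu,
# but their sampling variances differ.
var_xbar = statistics.variance(xbar_values)
var_weighted = statistics.variance(weighted_values)

print(f"Var(X-bar) ~ {var_xbar:.3f}, Var(weighted) ~ {var_weighted:.3f}")
```

For independent observations with common variance sigma^2, a linear estimator with weights w_i has variance sigma^2 times the sum of w_i^2, and among weights that sum to 1 this is smallest when every weight equals 1/n. That is why X-bar should show the smaller variance here.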

Question 5

Q

What is the fourth property that makes point estimators easier to work with and is possessed by good point estimators?

Answer

A

Consistency: Beta-hat is called a consistent estimator of beta if its sampling distribution collapses into a vertical straight line at the point beta as the sample size n goes to infinity.

That is, as the sample size increases, the sampling distribution becomes narrower and more tightly concentrated around beta.

If beta-hat is an unbiased estimator, then consistency requires the variance of its sampling distribution to go to zero as n increases.
For example, X-bar is a consistent estimator of mu.

However, if beta-hat is a biased estimator, then consistency requires both its variance and its bias to go to zero as n increases.
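The narrowing of the sampling distribution of X-bar can be checked with a quick simulation. This is a minimal sketch under assumed settings (normal population with mu = 5 and sigma = 2, three arbitrary sample sizes); the helper name variance_of_sample_mean is made up for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def variance_of_sample_mean(n, reps=1000, mu=5.0, sigma=2.0):
    """Simulate `reps` samples of size n and return the variance of their means."""
    means = []
    for _ in range(reps):
        sample = [random.gauss(mu, sigma) for _ in range(n)]
        means.append(sum(sample) / n)
    return statistics.variance(means)

# The theoretical variance of X-bar is sigma^2 / n, so it should shrink as n grows.
variances = {n: variance_of_sample_mean(n) for n in (5, 50, 500)}

for n, v in variances.items():
    print(f"n = {n:3d}: simulated Var(X-bar) = {v:.4f}, sigma^2/n = {4.0 / n:.4f}")
```

Because X-bar is unbiased, a variance that heads towards zero is all that consistency requires in this case.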

Question 6

Q

What are parametric techniques concerned with?

Answer

A

Parametric techniques:

a) are concerned with population parameters, and
b) are based on certain assumptions about the sampled population or about the sampling distribution of some point estimator.

Question 7

Q

What do parametric techniques assume? (the requirements)

Answer

A

i. The sample has been randomly selected
ii. The variable of interest is quantitative and continuous
iii. The variable is measured on an interval or ratio scale

Question 8

Q

What is a non parametric test?

Answer

A

Nonparametric procedures are either not concerned with any population parameter, or are
based on relatively weaker assumptions than their parametric counterparts and hence require less information about the sampled population.

Question 9

Q

What are some ways we can check normality?
Name the graphs for the first method,
the 4 techniques for the second (and the conditions that indicate normality or not),
and the 3 methods for the last.

Answer

A

i) Visually
Using a histogram or a Q-Q plot
If the histogram is skewed: not normal
If the points on the Q-Q plot deviate markedly from the straight line: not normal

ii) Sample statistics:
1. Mean and 2. median
If mean > median: right skewed
If mean < median: left skewed

3. Skewness (SK)
SK > 0: right skewed
SK < 0: left skewed
SK = 0: symmetric

4. Kurtosis (K)
A distribution whose tails are relatively long, and which thus has more outliers, is called leptokurtic (a thin, peaked graph with long tails; ‘lepto’ means thin, fine).

A distribution whose tails are relatively short, and which thus has fewer outliers, is called platykurtic (‘platus’ means broad, flat).
K = 3: mesokurtic, as for a normal distribution
K > 3: leptokurtic
K < 3: platykurtic

iii) Formal hypothesis tests for normality
Shapiro-Wilk test
H_0: the data come from a normally distributed population
H_A: the data come from a non-normally distributed population
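The sample-statistics checks can be computed by hand. The sketch below is an assumption-laden illustration: it uses the plain moment definitions (divisor n) for skewness and kurtosis, so that K is about 3 for normal data, whereas statistical packages often apply small-sample corrections or report K - 3. The function name shape_statistics and the two data sets are made up for the example.

```python
import statistics

def shape_statistics(data):
    """Return mean, median, moment skewness and moment kurtosis of a sample."""
    n = len(data)
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return {
        "mean": mean,
        "median": statistics.median(data),
        "skewness": m3 / m2 ** 1.5,  # 0 for symmetric data, > 0 if right skewed
        "kurtosis": m4 / m2 ** 2,    # about 3 for normal data
    }

symmetric = shape_statistics([1, 2, 3, 4, 5])
right_skewed = shape_statistics([1, 1, 2, 2, 3, 10])

print(symmetric)
print(right_skewed)
```

In the second data set the single large value pulls the mean above the median and makes the skewness positive, matching the rules on the card.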

Question 10

Q

What are the two shortcomings of the Shapiro-Wilk test?

Answer

A

i) At small sample sizes (n < 20), when the normality assumption can be crucial, it has little power to reject H_0 even if the population is in fact not normally distributed (Type II error).
ii) At large sample sizes (n > 100), when a violation of normality is far less critical in practice, it becomes too sensitive to the slightest signs of non-normality in the sample and often rejects H_0 even when the departure from normality is negligible.

Question 11

Q

What does it mean if the SK value in the R printout is positive?

Answer

A

The sample distribution of diff is skewed to the right.

If skew.2SE > 1, then the distribution of diff is unlikely to be normal.
K-hat - 3 = kurtosis (excess kurtosis)
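As a rough illustration of the skew.2SE rule, the sketch below assumes that skew.2SE means the sample skewness divided by twice its standard error, with the standard error taken from the commonly used formula sqrt(6n(n - 1) / ((n - 2)(n + 1)(n + 3))). The exact skewness formula used by the R function behind the printout may differ slightly, so treat the numbers as indicative only. The function name skew_over_2se and the data are hypothetical.

```python
import math

def skew_over_2se(data):
    """Skewness divided by twice its approximate standard error (needs n >= 3)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    skew = m3 / m2 ** 1.5  # moment-based skewness (an assumption, see lead-in)
    se = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew / (2 * se)

ratio_symmetric = skew_over_2se([1, 2, 3, 4, 5])
ratio_right_skewed = skew_over_2se([1, 1, 2, 2, 3, 10])

print(ratio_symmetric, ratio_right_skewed)
```

A ratio above 1 corresponds to a skewness more than two standard errors away from zero, which is why it is read as evidence against normality.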

Question 12

Q

For quantitative data, there are two especially useful and popular measures of central location. What advantages does each of these measures have?

Answer

A

Mean advantages:

It is a comprehensive measure because it is computed from all available data points, whereas the median is based on at most 2 data points.
The mean is used far more extensively in inferential statistics than the median.

Median advantages:

The median depends only on the middle value(s), so it is robust to outliers, whereas the mean is unduly influenced by outliers.
The median exists even if the measurement scale is ordinal, but the mean does not.

Question 13

Q

When should one use a nonparametric test?

Answer

A

When the mean does not exist, or is not an ideal measure of the population's central location because of outliers.

When the t-test becomes inappropriate because the normality assumption is violated.

Question 14

Q

What two alternative nonparametric tests are there?

Name their requirements and hypotheses.

Answer

A

(One sample) Sign test for the median (eta)

i. The data are a random sample of independent observations
ii. The variable of interest is qualitative or quantitative
iii. The measurement scale is at least ordinal

It does not assume anything about the distribution of the sampled population.

Hypotheses:
H_0: eta = eta_0 vs H_A: eta < eta_0, eta > eta_0, or eta not equal to eta_0
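The sign test reduces to a binomial calculation: under H_0, each observation that differs from the hypothesised median is equally likely to fall above or below it. The sketch below is a minimal hand-rolled version for illustration, not a reference implementation; the function name sign_test_p_value, its alternative argument and the data are all hypothetical.

```python
from math import comb

def sign_test_p_value(data, eta0, alternative="greater"):
    """Exact sign test p-value for H_0: median = eta0."""
    # Observations equal to eta0 carry no sign and are dropped.
    plus = sum(1 for x in data if x > eta0)
    minus = sum(1 for x in data if x < eta0)
    m = plus + minus

    def upper_tail(k):
        # P(S >= k) when S ~ Binomial(m, 0.5)
        return sum(comb(m, i) for i in range(k, m + 1)) / 2 ** m

    if alternative == "greater":  # H_A: median > eta0, many plus signs
        return upper_tail(plus)
    if alternative == "less":     # H_A: median < eta0, many minus signs
        return upper_tail(minus)
    # Two-sided: double the smaller tail, capped at 1.
    return min(1.0, 2 * upper_tail(max(plus, minus)))

sample = [12, 14, 9, 15, 11, 13, 16, 10, 17, 18]
p_greater = sign_test_p_value(sample, eta0=10, alternative="greater")
print(p_greater)
```

In this made-up sample, 8 of the 9 non-tied values lie above 10, so the upper-tail probability is (9 + 1)/512; only the signs matter, not how far each value is from 10.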

(One sample) Wilcoxon signed ranks test for the median (eta)
Also known as the Wilcoxon signed rank sum test. The sign test is based entirely on the signs of the deviations from eta_0.
The Wilcoxon signed ranks test is a more sensitive and powerful alternative because it also takes the magnitudes of the deviations into consideration.

i. The data are a random sample of independent observations
ii. The variable of interest is quantitative and continuous
iii. The measurement scale is interval or ratio
iv. The distribution of the sampled population is symmetric (so mu = eta)

Hypotheses:
e.g. H_0: eta = 10 vs H_A: eta > 10
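To show how the magnitudes of the deviations enter the Wilcoxon test, here is a minimal sketch that ranks the absolute deviations from the hypothesised median, sums the ranks of the positive ones, and gets an exact upper-tail p-value by enumerating every possible assignment of signs. The enumeration is only practical for small samples, and the function name signed_rank_p_value and the data are hypothetical.

```python
from itertools import product

def signed_rank_p_value(data, eta0):
    """Return (T+, exact p-value) for H_0: median = eta0 vs H_A: median > eta0."""
    diffs = [x - eta0 for x in data if x != eta0]  # drop zero deviations
    abs_sorted = sorted(abs(d) for d in diffs)

    def average_rank(value):
        # Tied absolute deviations share the average of their ranks.
        first = abs_sorted.index(value) + 1
        count = abs_sorted.count(value)
        return first + (count - 1) / 2

    ranks = [average_rank(abs(d)) for d in diffs]
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Under H_0 each deviation is equally likely to be positive or negative,
    # so all 2^m sign patterns are equally likely.
    m = len(diffs)
    as_extreme = 0
    for signs in product((0, 1), repeat=m):
        t = sum(r for r, s in zip(ranks, signs) if s)
        if t >= t_plus:
            as_extreme += 1
    return t_plus, as_extreme / 2 ** m

t_plus, p_value = signed_rank_p_value([11.5, 13, 9, 14.5, 12, 16], eta0=10)
print(t_plus, p_value)
```

With these made-up values the only negative deviation also has the smallest magnitude, so T+ is 20 out of a possible 21 and just 2 of the 64 sign patterns are at least as extreme.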

Question 15

Q

Under what conditions would you reject the null hypothesis for the sign test and the Wilcoxon signed rank sum test?

Answer

A

Sign test:
p-value < alpha: reject the null hypothesis

Wilcoxon signed rank sum test:
p-value < alpha: reject the null hypothesis

(15 cards)