Sampling theory Flashcards

1
Q

What are 3 tasks in scientific research for which statistics is useful?

A
  1. Estimating parameters: population parameters, or model fitting/system identification, where the parameters of a model are estimated from the results of a series of experiments.
  2. Experimental design: how to minimize measurement error due to bias and inaccuracy. Comparative experiments: how to design experiments to measure the comparative performance of different individuals. Factorial experiments: the design of experiments where the variable of interest depends on several different factors.
  3. Quality control. Acceptance sampling: monitoring the quality of items by testing small samples. Process control: in the simplest view, keeping a continuous process at a specified level.
2
Q

What is sampling theory, and what is it useful for?

A

It describes the relationship between a sample and the population that the sample represents.

Samples are used to estimate unknown population parameters, using suitable statistics and only the information contained in the sample data.

It is also useful for comparing two populations by comparing samples from those populations. This is done using hypothesis testing, where one asks whether the differences between two samples are likely to represent differences in the underlying populations.

3
Q

How is a sample characterized in sampling theory?

A

A sample of size n is modeled as n random variables X1, X2, …, Xn with a joint probability density/mass function. The sample we actually observe is just one of the many possible samples, each of which has a certain probability of occurring.

Calculating this probability is generally too complicated unless we assume that the sampling is independent.

4
Q

When is sampling considered independent?

A

Sampling is said to be independent if the random variables X1, X2, …, Xn are independent and identically distributed (i.i.d.).

5
Q

What are the three types of random sampling?

A
  1. Sampling from an infinite population (the population remains the same after any individual is drawn from it). In this case the random variables representing the sample are i.i.d.
  2. Sampling with replacement from a finite population (the population remains the same after any individual is drawn from it). In this case the random variables representing the sample are also i.i.d.
  3. Sampling without replacement from a finite population. The population changes after each draw, so the random variables are not independent. Remember formula.
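
The difference between drawing with and without replacement can be checked exactly on a toy example. The sketch below is only an illustration: the three-element population and the helper name `covariance_of_first_two_draws` are assumptions, not part of the course material. It enumerates every equally likely ordered pair of draws and computes the covariance between the first and second draw.

```python
from fractions import Fraction
from itertools import permutations, product


def covariance_of_first_two_draws(pairs):
    """Exact Cov(X1, X2) when every ordered pair in `pairs` is equally likely."""
    pairs = list(pairs)
    count = len(pairs)
    mean_x1 = Fraction(sum(a for a, _ in pairs), count)
    mean_x2 = Fraction(sum(b for _, b in pairs), count)
    mean_product = Fraction(sum(a * b for a, b in pairs), count)
    return mean_product - mean_x1 * mean_x2


population = [1, 2, 3]  # hypothetical toy population

# With replacement: all 9 ordered pairs, the same value may be drawn twice.
cov_with = covariance_of_first_two_draws(product(population, repeat=2))

# Without replacement: the 6 ordered pairs of distinct individuals.
cov_without = covariance_of_first_two_draws(permutations(population, 2))

print(cov_with)     # 0
print(cov_without)  # -1/3
```

With replacement the two draws are uncorrelated (and in fact independent). Without replacement they are negatively correlated, because drawing a large value first leaves only smaller values for the second draw.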
6
Q

When can sampling without replacement from a finite population be modeled as sampling from an infinite population?

A

When the population is large enough (typically at least a few hundred individuals, and preferably thousands), sampling without replacement can be modeled to a very good approximation by independent random variables.

7
Q

When can sampled data be analyzed using time series analysis but not by the analytical methods covered in this course?

A

When there is sequential correlation between values sampled at different times.

8
Q
A

The sample is modeled as a collection of random variables, from which statistics such as the sample mean are computed. Because the sample mean takes a different value for each possible sample from the population, it can itself be modeled as a random variable x̄ with its own probability distribution, called the sampling distribution. This model is appropriate since we expect the sample mean to depend on the data values in our given sample.

9
Q

Do we know the parameters of the underlying probability model we are sampling from?

A

No, which is why we want to estimate the population parameters using statistics and their sampling distributions.

10
Q

What is a sample statistic S?

A

It is a function of the random variables Xi of the sample, calculated from a sample of data taken from a larger population. Since it varies from sample to sample, it has its own distribution.

Sample statistics provide information about the characteristics of the sample, which can then be used to make inferences or draw conclusions about the population as a whole. A statistic is used as an estimate of a population parameter p, which is usually the mean, the variance, or a proportion.

11
Q

What is an unbiased (representative) estimator? When is the sample statistic S considered an unbiased estimate? How do you calculate the bias?

A

An estimator that on average gives the correct value.

S is an unbiased estimate of p if its expected value E(S) is equal to p.

The bias of S is E(S) - p.

Low bias alone is not enough: we also want the estimator to have small variance, so that the probability of an estimate falling far from p is small.
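
The definition bias = E(S) - p can be evaluated exactly for a small case. The following sketch is an illustration under assumptions: the toy population, the sample size, and the choice of the 1/n variance estimator as S are all hypothetical examples, not taken from the cards. It enumerates every equally likely sample drawn with replacement and averages the statistic over them.

```python
from fractions import Fraction
from itertools import product


def variance_with_divisor_n(sample):
    """Variance statistic that divides by n (not n - 1)."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / n


population = [1, 2, 3]  # hypothetical toy population
n = 2                   # hypothetical sample size

# True population variance: the parameter p being estimated.
pop_mean = Fraction(sum(population), len(population))
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

# E(S): average of the statistic over all equally likely samples
# drawn with replacement.
samples = list(product(population, repeat=n))
expected_s = sum(variance_with_divisor_n(s) for s in samples) / len(samples)

bias = expected_s - pop_var
print(pop_var)     # 2/3
print(expected_s)  # 1/3
print(bias)        # -1/3
```

Here the statistic underestimates the population variance on average, so it is biased. This is the reason the usual sample variance divides by n - 1 instead of n.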

12
Q

What is an empirical average? Why do we calculate it?

A

It is the mean value calculated directly from the observed data in a sample, that is, the arithmetic average of the values in the sample.

We calculate it because E(S) cannot be computed directly from a single sample: it requires knowledge of the underlying population parameters, which are usually unknown.

13
Q

What is the sample mean, the variance of the sample mean and the standard deviation of the sample mean?

A
  • x̄ is the sample mean, x̄ = (1/n) ∑ᵢ Xᵢ, and is an unbiased estimator of the population mean since E(x̄) = E(X) = µ.
  • The standard deviation of the sample mean is σ/√n. (The factor √n in the denominator implies that the precision of our estimate is greater with a larger sample, but to halve the standard deviation of the estimate we need four times as much data.)
  • The variance of the sample mean x̄ is Var(x̄) = n Var(X)/n^2 = σ^2/n.
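
A minimal sketch of these formulas, assuming a made-up data set and a population standard deviation σ that is treated as known purely for illustration (in practice σ is usually estimated from the sample). The helper name `standard_error` is hypothetical.

```python
import math
import statistics


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data
sigma = 2.0  # assumed known population standard deviation

x_bar = statistics.fmean(sample)        # sample mean (1/n) * sum(X_i)
se_n = standard_error(sigma, len(sample))
se_4n = standard_error(sigma, 4 * len(sample))

print(x_bar)         # 5.0
print(se_n, se_4n)   # the second value is half the first
```

Quadrupling the sample size halves the standard deviation of the sample mean, which is the "four times as much data" rule stated above.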
14
Q

When do you have an MVUE (Minimum-Variance Unbiased Estimator)?

A

When the population has a normal distribution then the sample mean is the MVUE of the population mean.

15
Q

What does the Central Limit Theorem state? What are the 2 rules of thumb?

A

When the sample is large enough, the sampling distribution of the sample mean is approximately normal with mean µ and variance Var(x̄) = σ^2/n, provided the population has finite variance.

x̄ ∼ N(μ, σ/√n) (the second argument is the standard deviation)

B(n, p) ≈ N(np, √(np(1-p))) for np > 5, n(1-p) > 5, n > 30

P(λ) ≈ N(λ, √λ) for λ > 5
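
The binomial rule of thumb can be checked numerically. This is a sketch under stated assumptions: the values n = 100, p = 0.3 and k = 35 are arbitrary examples that satisfy the conditions above, and the continuity correction (evaluating the normal CDF at k + 0.5) is a common refinement that the card itself does not mention.

```python
import math


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1)
    )


def normal_cdf(x, mu, sigma):
    """P(Y <= x) for Y ~ N(mu, sigma), with sigma the standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


n, p, k = 100, 0.3, 35               # hypothetical example values
mu = n * p                           # np
sigma = math.sqrt(n * p * (1 - p))   # sqrt(np(1-p))

exact = binomial_cdf(k, n, p)
approx = normal_cdf(k + 0.5, mu, sigma)  # continuity correction (assumption)

print(exact, approx)
```

The two probabilities should agree to about two decimal places here, since np = 30 and n(1-p) = 70 are both well above 5.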

16
Q

What is the law of large numbers?

A

If X is a random variable with finite mean µ and finite variance, then as the sample size n → ∞ the sample mean x̄ converges to µ.

If the population mean does not exist (for example, for a Cauchy distribution), the sample mean will not converge but will keep fluctuating.
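
A small simulation illustrates the convergence. This is only a sketch: the fair six-sided die (mean 3.5), the seed, and the number of rolls are arbitrary assumptions chosen for the example.

```python
import random


def running_means(values):
    """Return the sample mean after each additional observation."""
    total = 0.0
    means = []
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


rng = random.Random(0)  # fixed seed so the run is reproducible
rolls = [rng.randint(1, 6) for _ in range(100_000)]  # fair die, mean 3.5
means = running_means(rolls)

for n in (10, 1_000, 100_000):
    print(n, means[n - 1])
```

The printed sample means should move closer to 3.5 as n grows, although any single run can wander at small n.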

17
Q

What is convolution (definition)? What is the distribution of Z = X + Y for two independent random variables X and Y with probability mass/density functions fX and fY?

A

It is the integral of the product of the two functions after one is reflected about the y-axis and shifted. The integral is evaluated for all values of the shift, producing the convolution function. The choice of which function is reflected and shifted does not change the result. Graphically, it expresses how the 'shape' of one function is modified by the other.

fZ = fX * fY (* is the convolution product)

Discrete case:
fZ(z) = ∑ᵤ fX(u) fY(z-u)
= ∑ᵥ fX(z-v) fY(v)

Continuous case:
fZ(z) = ∫ fX(u) fY(z-u) du
= ∫ fY(v) fX(z-v) dv

THE DISTRIBUTION OF A SAMPLE MEAN IS A HORRIBLE CONVOLUTION
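
The discrete sum over u of fX(u) fY(z-u) can be written out directly. The sketch below is an illustration: representing a mass function as a dictionary and the helper name `convolve_pmfs` are assumptions, and the two fair dice are just a convenient example of independent variables.

```python
from fractions import Fraction


def convolve_pmfs(f_x, f_y):
    """PMF of Z = X + Y for independent X and Y given as {value: probability}."""
    f_z = {}
    for u, prob_u in f_x.items():
        for v, prob_v in f_y.items():
            # The pair (u, v) contributes f_X(u) * f_Y(z - u) to z = u + v.
            f_z[u + v] = f_z.get(u + v, 0) + prob_u * prob_v
    return f_z


die = {face: Fraction(1, 6) for face in range(1, 7)}  # fair die
two_dice = convolve_pmfs(die, die)

print(two_dice[7])             # 1/6
print(sum(two_dice.values()))  # 1
```

Applying `convolve_pmfs` repeatedly gives the distribution of a sum of n draws, which shows why the exact distribution of a sample mean quickly becomes unwieldy.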

18
Q

What is the formula for the sample variance S^2?

A

s^2 = 1/(n-1) * ∑ᵢ (xᵢ - x̄)^2

s^2 = n/(n-1) * ((1/n) ∑ᵢ xᵢ^2 - x̄^2), i.e. n/(n-1) times (the mean of the squares minus the square of the mean)
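
Both forms of the formula can be checked against each other. This is a minimal sketch with made-up data; the function names are hypothetical, and `statistics.variance` from the Python standard library is used only as an independent reference.

```python
import math
import statistics


def sample_variance(data):
    """s^2 = 1/(n-1) * sum of squared deviations from the sample mean."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_variance_shortcut(data):
    """s^2 = n/(n-1) * (mean of squares - square of mean)."""
    n = len(data)
    mean = sum(data) / n
    mean_of_squares = sum(x * x for x in data) / n
    return n / (n - 1) * (mean_of_squares - mean**2)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data

s2 = sample_variance(data)
s2_shortcut = sample_variance_shortcut(data)

print(s2, s2_shortcut, statistics.variance(data))
print(math.isclose(s2, s2_shortcut))  # True
```

The shortcut form needs only running sums of x and x^2, but it can lose precision in floating point when the mean is large compared with the spread of the data.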

19
Q

What is a proportion?

A

Suppose we want to estimate the probability of an event A from a sample of size n, where a is the number of data points that belong to A. Then the sample proportion p̂ = a/n is an unbiased estimate of p = P(A).
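
The unbiasedness of p̂ can be verified exactly for a particular case. The sketch below assumes independent sampling, so that the count a follows a binomial distribution B(n, p); the values n = 10 and p = 3/10 and the helper name are arbitrary choices for illustration.

```python
from fractions import Fraction
from math import comb


def expected_sample_proportion(n, p):
    """E(a/n) when the count a follows B(n, p), computed exactly."""
    return sum(
        Fraction(a, n) * comb(n, a) * p**a * (1 - p) ** (n - a)
        for a in range(n + 1)
    )


p = Fraction(3, 10)  # hypothetical true probability P(A)
expected = expected_sample_proportion(10, p)

print(expected)       # 3/10
print(expected == p)  # True
```

Because the expected value of p̂ equals p exactly, the bias E(p̂) - p is zero.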