(10) Sampling and Estimation Flashcards

Question 1

Q

LOS 11. a: Define simple random sampling and a sampling distribution.

Answer

A

Simple random sampling is a method of selecting a sample in such a way that each item or person in the population being studied has the same probability of being included in the sample. In practice, items can be chosen with a random number generator. Selecting every k-th element is systematic sampling, a related method that gives an approximately random sample.
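
As a minimal sketch of the random-number-generator approach, the snippet below draws a sample without replacement so that every item has the same chance of selection. The population of 100 item IDs, the helper name simple_random_sample, and the seed are all assumptions made for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement; each item is equally likely to be picked."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    return rng.sample(list(population), n)

# Hypothetical population: item IDs 1..100
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```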

Question 2

Q

LOS 11. a: Define simple random sampling and a sampling distribution.

Answer

A

A sampling distribution is the distribution of all values that a sample statistic can take on when computed from samples of identical size randomly drawn from the same population.

Question 3

Q

LOS 11. b: Explain sampling error.

Answer

A

Sampling error is the difference between a sample statistic and its corresponding population parameter (e.g., the sample mean minus the population mean).
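
A small worked example may help. The numbers below are made up for illustration; the calculation is simply the sample mean minus the population mean.

```python
import statistics

# Hypothetical population and one sample drawn from it
population = [2, 4, 6, 8, 10]
sample = [2, 4, 6]

population_mean = statistics.fmean(population)  # 6.0
sample_mean = statistics.fmean(sample)          # 4.0

# Sampling error of the mean = sample statistic - population parameter
sampling_error = sample_mean - population_mean
print(sampling_error)  # -2.0
```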

Question 4

Q

LOS 11. c: Distinguish between simple random and stratified random sampling.

Also, what are the steps to create a stratified random sampling?

Answer

A

Stratified random sampling involves randomly selecting samples proportionally from subgroups that are formed based on one or more distinguishing characteristics, so that the sample will have the same distribution of these characteristics as the overall population.

Stratified random sampling reduces sampling error

Step 1: The population is divided into sub-populations (strata) based on one or more distinguishing characteristics.

Step 2: Simple random samples are drawn from each stratum in proportion to its size relative to the population.
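
The two steps can be sketched roughly as follows. The bond universe, the sector labels, and the function name stratified_sample are hypothetical, and rounding the per-stratum counts is a simplification that can make the total differ slightly from the requested size when proportions are not whole numbers.

```python
import random
from collections import defaultdict

def stratified_sample(items, stratum_of, total_n, seed=None):
    """Proportional stratified sampling: group items, then sample each group."""
    rng = random.Random(seed)
    # Step 1: divide the population into strata
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    # Step 2: simple random sample from each stratum, sized by its population share
    sample = []
    for members in strata.values():
        k = round(total_n * len(members) / len(items))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical universe: 60 government bonds and 40 corporate bonds
bonds = [(i, "gov") for i in range(60)] + [(i, "corp") for i in range(60, 100)]
picked = stratified_sample(bonds, lambda bond: bond[1], total_n=10, seed=1)
gov_count = sum(1 for bond in picked if bond[1] == "gov")
print(len(picked), gov_count)  # 10 picked, 6 of them government
```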

Question 5

Q

LOS 11. d: Distinguish between time-series and cross-sectional data.

Answer

A

Time-series data consist of observations taken at specific and equally spaced points in time, for a single observational unit.

Example of time-series data: daily stock prices of ABC

Cross-sectional data consist of observations taken at a single point in time, across many observational units.

Example of cross-sectional data: free cash flow/debt ratio for U.S. industrials

Question 6

Q

LOS 11. e: Explain the central limit theorem and its importance.

Answer

A

The central limit theorem states that for a population with a mean µ and a finite variance σ², the sampling distribution of the sample mean for all possible samples of size n (for n >= 30) will be approximately normally distributed with a mean equal to µ and a variance equal to σ²/n.
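
A quick simulation can make the theorem concrete. This is only a sketch: the population is assumed to be exponential with µ = 1 and σ² = 1 (clearly non-normal), and the sample size, number of trials, and seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)
n, trials = 30, 5000  # sample size and number of repeated samples (assumed values)

# Each entry is the mean of one sample of size n from a skewed population
means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(trials)
]

mean_of_means = statistics.fmean(means)     # should be close to mu = 1
var_of_means = statistics.pvariance(means)  # should be close to sigma^2 / n = 1/30
print(mean_of_means, var_of_means, 1 / n)
```

Even though individual observations are strongly skewed, the sample means cluster around µ with a variance near σ²/n.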

Question 7

Q

LOS 11. f: Calculate and interpret the standard error of the sample mean.

Answer

A

The standard error of the sample mean is the standard deviation of the distribution of the sample means and is calculated as:

σ_X̄ = σ/√n, where σ, the population standard deviation, is known

s_X̄ = s/√n, where s, the sample standard deviation, is used because the population standard deviation is unknown.

As n increases, the standard error (SE) decreases.
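
A short numeric sketch of the calculation, using an assumed standard deviation of 20% and two assumed sample sizes, shows how the standard error shrinks as n grows:

```python
import math

def standard_error(sd, n):
    """Standard error of the sample mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

# Assumed inputs: standard deviation of 0.20, sample sizes of 100 and 400
se_100 = standard_error(0.20, 100)  # 0.20 / 10 = 0.02
se_400 = standard_error(0.20, 400)  # 0.20 / 20 = 0.01
print(se_100, se_400)
```

Quadrupling the sample size halves the standard error, because n enters through its square root.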

Question 8

Q

LOS 11. g: Identify and describe desirable properties of an estimator.

Answer

A

Desirable statistical properties of an estimator include:

Unbiasedness (the sign of the estimation error is random; the expected value of the estimator equals the parameter being estimated),
Efficiency (the variance of its sampling distribution is smaller than that of any other unbiased estimator),
Consistency (the accuracy of the estimate increases as the sample size increases, because the standard error falls).

Question 9

Q

LOS 11. h: Distinguish between a point estimate and a confidence interval estimate of a population parameter.

Answer

A

Point estimates are single value estimates of population parameters. An estimator is a formula used to compute a point estimate.

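A confidence interval is built around the point estimate: sample mean ± (reliability factor × standard error), where the reliability factor is z_(α/2) when the sampling distribution is normal.

z_(α/2) = 1.65 for a 90% CI (more precisely 1.645); 1.96 for a 95% CI; 2.58 for a 99% CI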
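
The formula can be sketched in a few lines. The reliability factors are the rounded values from this card; the sample mean of 5%, standard deviation of 20%, and sample size of 100 are assumed inputs for illustration.

```python
import math
from statistics import NormalDist

# Rounded reliability factors from the card, keyed by confidence level
Z = {0.90: 1.65, 0.95: 1.96, 0.99: 2.58}

def z_confidence_interval(sample_mean, sd, n, level=0.95):
    """Return (low, high) = sample mean -/+ reliability factor x standard error."""
    standard_error = sd / math.sqrt(n)
    margin = Z[level] * standard_error
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(0.05, 0.20, 100, level=0.95)
print(low, high)  # roughly 0.0108 and 0.0892

# Cross-check: the 95% factor is the 97.5th percentile of the standard normal
z_95 = NormalDist().inv_cdf(1 - 0.05 / 2)
print(z_95)
```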

Question 10

Q

LOS 11. h: Distinguish between a point estimate and a confidence interval estimate of a population parameter.

Answer

A

A confidence interval is a range of values within which we can assert, with probability 1 − α (the degree of confidence), that the population parameter will lie.

Question 11

Q

LOS 11. h: Distinguish between a point estimate and a confidence interval estimate of a population parameter. The reliability factor.

Answer

A

The reliability factor is a number that depends on the sampling distribution of the point estimate and the probability that the point estimate falls within the confidence interval.

Question 12

Q

LOS 11. i: Describe properties of Student’s t-distribution and calculate and interpret its degrees of freedom.

Answer

A

Use the t-distribution to construct confidence intervals when the sample is small (n < 30) and drawn from a normally distributed population with unknown variance.

Defined by a single parameter => degrees of freedom = n - 1

Lower peak than normal, fatter tails

Degrees of freedom for the t-distribution are equal to n - 1. Student’s t-distribution is closer to the normal distribution when df is greater, and confidence intervals are narrower when df is greater.

Question 13

Q

LOS 11. j: Calculate and interpret a confidence interval for a population mean, given a normal distribution with 1) a known population variance, 2) an unknown population variance, or 3) an unknown variance and a large sample size.

Answer

A

For a normally distributed population, a confidence interval for its mean can be constructed using a z-statistic when the variance is known, and a t-statistic when the variance is unknown. The z-statistic is acceptable in the case of a normal population with an unknown variance if the sample size is large (30+).
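
The rule above can be written as a small decision helper. This is only a sketch of the three cases named on this card for a normally distributed population; the function name and return labels are made up for illustration.

```python
def reliability_statistic(variance_known: bool, n: int) -> str:
    """Pick the test statistic for a CI on the mean of a normal population."""
    if variance_known:
        return "z"                 # known population variance
    if n >= 30:
        return "t (z acceptable)"  # unknown variance, large sample
    return "t"                     # unknown variance, small sample

print(reliability_statistic(True, 15))   # z
print(reliability_statistic(False, 15))  # t
print(reliability_statistic(False, 50))  # t (z acceptable)
```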

Question 14

Q

LOS 11. j: Calculate and interpret a confidence interval for a population mean, given a normal distribution with 1) a known population variance, 2) an unknown population variance, or 3) an unknown variance and a large sample size. Chart.

Question 15

Q

LOS 11. k: Describe the issues regarding selection of the appropriate sample size, data-mining bias, sample selection bias, survivorship bias, look-ahead bias, and time-period bias.

Answer

A

Increasing the sample size will generally improve parameter estimates and narrow confidence intervals. The cost of more data must be weighed against these benefits, and adding data that is not generated by the same distribution will not necessarily improve accuracy or narrow confidence intervals.

Question 16

Q

LOS 11. k: Describe the issues regarding selection of the appropriate sample size, data-mining bias, sample selection bias, survivorship bias, look-ahead bias, and time-period bias.

Answer

A

Data-mining bias: the practice of searching the same data set over and over until a statistically significant pattern turns up. The search is typically not motivated by a theory (hypothesis) and is a result of data narrowing, so the significant relationships it finds may have occurred by chance.

Question 17

Q

LOS 11. k: Describe the issues regarding selection of the appropriate sample size, data-mining bias, sample selection bias, survivorship bias, look-ahead bias, and time-period bias.

Answer

A

Sample selection bias: the exclusion of certain data or variables because they are unavailable, which makes the sample non-random.

Question 18

Q

LOS 11. k: Describe the issues regarding selection of the appropriate sample size, data-mining bias, sample selection bias, survivorship bias, look-ahead bias, and time-period bias.

Answer

A

Survivorship bias: using only surviving mutual funds, hedge funds, etc.
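
A toy illustration of the effect, using entirely made-up fund returns: averaging only the funds that still exist overstates the performance of the full group.

```python
import statistics

# Hypothetical funds: (name, annual return, still operating?)
funds = [
    ("A", 0.08, True),
    ("B", 0.05, True),
    ("C", -0.12, False),  # closed during the period
    ("D", -0.20, False),  # closed during the period
    ("E", 0.10, True),
]

all_mean = statistics.fmean(ret for _, ret, _ in funds)
survivor_mean = statistics.fmean(ret for _, ret, alive in funds if alive)

# The survivor-only average is higher because failed funds drop out of the sample
print(round(all_mean, 4), round(survivor_mean, 4))
```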

Question 19

Q

LOS 11. k: Describe the issues regarding selection of the appropriate sample size, data-mining bias, sample selection bias, survivorship bias, look-ahead bias, and time-period bias.

Answer

A

Look-ahead bias: basing the test at a point in time on data not available at that time.

It arises from a mismatch between the timing of observations among variables (e.g., stock prices/returns vs. accounting data).

Question 20

Q

LOS 11. k: Describe the issues regarding selection of the appropriate sample size, data-mining bias, sample selection bias, survivorship bias, look-ahead bias, and time-period bias.