Basic stats Flashcards

0
Q
  1. What is a random variable (RV)?
A

An RV assigns a numerical value to each outcome of a probability experiment. It is usually written “X”. The value of an RV is determined by chance (random), so it is subject to uncertainty until the experiment is carried out.

1
Q
  1. What is the law of large numbers?
A

As an experiment is repeated over and over, the empirical probability of an event approaches the theoretical (actual) probability of the event.
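
A minimal simulation sketch of this idea, assuming a fair six-sided die and the event "roll a 6" (both chosen only for illustration). The function name and the fixed seed are hypothetical; the point is that the gap between the empirical and theoretical probability tends to shrink as the number of rolls grows.

```python
import random

def empirical_probability(num_rolls, seed=0):
    """Fraction of simulated fair-die rolls that land on 6."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    hits = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 6)
    return hits / num_rolls

theoretical = 1 / 6
for n in (10, 1_000, 100_000):
    estimate = empirical_probability(n)
    # Print the number of rolls, the estimate, and its distance from 1/6.
    print(n, round(estimate, 4), round(abs(estimate - theoretical), 4))
```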

2
Q
  1. What is the “expected value” of a random variable?
A

The long-run mean of the random variable: the average value it would take over an infinite number of repetitions of the experiment (samples). For a discrete RV: E(X) = Σ xP(x), the sum of every value x multiplied by that value's probability P(x).
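
As a worked sketch of Σ xP(x), assume one roll of a fair six-sided die (a hypothetical example, not from the card). The helper name `expected_value` is illustrative; exact fractions are used so the result is not blurred by rounding.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expected_value(pmf):
    """Sum of x * P(x) over every value x the discrete RV can take."""
    assert sum(pmf.values()) == 1, "probabilities must sum to 1"
    return sum(x * p for x, p in pmf.items())

mean = expected_value(pmf)
print(mean, float(mean))  # 7/2 3.5
```

Note that 3.5 is not a value the die can actually show; the expected value is a long-run average, not a likely single outcome.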

3
Q
  1. Name three discrete probability functions.
A
  1. Poisson
  2. Binomial
  3. Bernoulli
4
Q
  1. Name four continuous probability distribution functions.
A
  1. Normal (Gaussian)
  2. Exponential
  3. Gamma
  4. Uniform
5
Q
  1. What is the defining way in which probability distributions differ for discrete and continuous variables?
A

For a discrete RV, the probability distribution assigns a probability to each possible value of X. It contains information about ALL the statistical properties of X: once the probability distribution is known, the expectation of any function of X can be calculated (expected value, variance, standard deviation).

For a continuous RV “X” there is an infinite continuum of possible values for x, so the probability of X being exactly equal to a particular value is zero: Pr{X = x} = 0. Instead, the probability distribution of a continuous variable is defined by the probability of the RV being less than or equal to a particular value, that is, by the cumulative distribution function F(x) = Pr{X ≤ x}.
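
A short sketch of how interval probabilities come from Pr{X ≤ x} for a continuous RV, using `NormalDist` from Python's standard `statistics` module. The normal shape and the parameters (mean 100, SD 15) are assumptions made only for illustration, and `interval_probability` is a hypothetical helper name.

```python
from statistics import NormalDist

X = NormalDist(mu=100, sigma=15)  # hypothetical parameters

def interval_probability(dist, a, b):
    """Pr{a < X <= b} = F(b) - F(a), where F(x) = Pr{X <= x}."""
    return dist.cdf(b) - dist.cdf(a)

# Probability of landing within one SD of the mean.
p_within_one_sd = interval_probability(X, 85, 115)

# A zero-width interval has probability zero, matching Pr{X = x} = 0.
p_point = interval_probability(X, 100, 100)

print(round(p_within_one_sd, 4), p_point)
```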

6
Q
  1. What are the five properties of a normal distribution (curve)?
A
  1. Mean = median = mode
  2. Bell-shaped and symmetric about the mean.
  3. Total area under the curve is equal to one.
  4. Curve approaches, but never touches, the x-axis as it extends further and further away from the mean.
  5. μ - σ and μ + σ are the inflection points.
7
Q
  1. What is a standard normal distribution?
A

Normal distribution with mean of 0 and standard deviation of 1.
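
One way to see why the standard normal is useful, sketched with Python's standard `statistics.NormalDist`: any normal value x can be rescaled to z = (x - μ)/σ, and the probability below x equals the standard normal probability below z. The distribution parameters and the value of x here are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()                # defaults to mean 0 and standard deviation 1
X = NormalDist(mu=70, sigma=8)  # hypothetical non-standard normal

x = 82
z = (x - X.mean) / X.stdev      # number of SDs that x lies above the mean

# Both CDF values describe the same probability.
print(z, round(X.cdf(x), 4), round(Z.cdf(z), 4))
```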

8
Q
  1. What is the “Central Limit Theorem”?
A
  1. If samples of size n, where n >= 30 (a common rule of thumb), are drawn from any population with a mean μ and a finite standard deviation σ, then the sampling distribution of sample means approximates a normal distribution. The greater the sample size, the better the approximation.
  2. If the population itself is normally distributed, the sampling distribution of sample means is normally distributed for ANY sample size n.

mean: μ(x-bar) = μ
variance: σ^2(x-bar) = σ^2/n
s. d.: σ(x-bar) = σ/n^0.5 (also known as “standard error of the mean”)
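
A simulation sketch of these formulas, assuming a strongly skewed exponential population with mean 2 (for which the SD is also 2). The seed, the sample size n = 40, and the number of samples are arbitrary choices for illustration. The mean of the sample means should land near μ and their SD near σ/n^0.5.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed (an assumption) for repeatability
mu = sigma = 2.0         # an exponential population with mean 2 also has SD 2
n = 40                   # sample size, chosen to satisfy n >= 30
num_samples = 5_000

# Draw many samples of size n and record each sample mean.
sample_means = [
    statistics.fmean(rng.expovariate(1 / mu) for _ in range(n))
    for _ in range(num_samples)
]

observed_mean = statistics.fmean(sample_means)
observed_sd = statistics.stdev(sample_means)
predicted_sd = sigma / n ** 0.5  # standard error of the mean

print(round(observed_mean, 3), round(observed_sd, 3), round(predicted_sd, 3))
```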

9
Q
  1. Validity of an estimation method is increased if a sample statistic is ________ and _________. Define each.
A
  1. Unbiased: Statistic doesn’t over- or underestimate the population parameter. The mean of all possible sample means of the same size equals the population mean. As a result, x-bar is an unbiased estimator of μ.
  2. Has low variability: The statistic varies little from sample to sample. For the sample mean, the standard error σ/n^0.5 decreases as n increases, so x-bar becomes less variable.
10
Q
  1. What is the standard error of the mean (SEM) and what is it measuring?
A

SEM: The standard deviation of the sampling distribution of the sample means: σ(x-bar) = σ/n^0.5
SEM quantifies the precision of the mean; it is a measure of how far your sample mean is likely to be from the true population mean. SEM is expressed in the same units as the data.
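
A small sketch of the calculation on hypothetical measurements. One assumption to flag: the population σ is rarely known in practice, so the sample standard deviation s stands in for it here, which gives an estimated SEM of s/n^0.5.

```python
import statistics

# Hypothetical measurements; the SEM comes out in these same units.
data = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3, 4.2, 3.7]

n = len(data)
sample_sd = statistics.stdev(data)  # s, the sample (n - 1) standard deviation
sem = sample_sd / n ** 0.5          # estimated SEM: s / sqrt(n)

print(round(statistics.fmean(data), 3), round(sample_sd, 3), round(sem, 3))
```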

11
Q
  1. What are the differences and similarities of the standard deviation (SD) and the standard error of the mean (SEM)?
A
  1. SD: quantifies scatter - how much values vary from one another.
  2. SEM: quantifies how precisely the true mean of the population is known; it takes into account both the SD and the sample size.
  3. Both SD and SEM are in the same units: the units of the data.
  4. For any sample size n > 1, the SEM is smaller than the SD, because SEM = SD/n^0.5.
  5. The SEM gets smaller as the sample size increases (precision increases).
  6. The SD does not change predictably as the sample size increases.
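
The last two points can be sketched with a simulation, assuming a normal population with mean 50 and SD 10 (hypothetical values) and a fixed seed. The sample SD should hover near 10 at every sample size, while the estimated SEM keeps shrinking as n grows.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed (an assumption) for repeatability

def sd_and_sem(n):
    """Sample SD and estimated SEM for n draws from a normal(50, 10) population."""
    sample = [rng.gauss(50, 10) for _ in range(n)]
    sd = statistics.stdev(sample)
    return sd, sd / n ** 0.5

results = {n: sd_and_sem(n) for n in (10, 100, 1_000, 10_000)}
for n, (sd, sem) in results.items():
    # Print the sample size, the sample SD, and the estimated SEM.
    print(n, round(sd, 2), round(sem, 3))
```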
12
Q
  1. Describe the t-distribution and when it is used.
A

There are many t-distributions; the particular form of the t-distribution is determined by its degrees of freedom (df = n - 1, the sample size minus one). The more degrees of freedom, the closer a t-distribution is to the standard normal distribution; with infinite df, the t-distribution is the same as the standard normal distribution.

t = [x-bar - μ] / [s / n^0.5], where x-bar is the sample mean, μ is the population mean, s is the standard deviation of the sample, n is the sample size.

The t-distribution is used when the SD of the population is unknown and has to be estimated by the sample SD s, which matters most when sample sizes are small. The t-distribution can be used with any statistic having a bell-shaped distribution (approximately normal).
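
A minimal sketch of the t formula above on a hypothetical sample, with 5.0 as the assumed population mean μ. The function name `t_statistic` is illustrative. Python's standard library does not include the t-distribution itself, so turning t into a probability would still need a t-table or a statistics package.

```python
import statistics

def t_statistic(sample, mu0):
    """Return (t, df) with t = (x_bar - mu0) / (s / sqrt(n)) and df = n - 1."""
    n = len(sample)
    x_bar = statistics.fmean(sample)  # sample mean
    s = statistics.stdev(sample)      # sample standard deviation
    return (x_bar - mu0) / (s / n ** 0.5), n - 1

sample = [5.1, 4.9, 5.6, 5.3, 4.7, 5.4]  # hypothetical measurements
t, df = t_statistic(sample, mu0=5.0)
print(round(t, 3), df)
```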
