Week 7 - Probability and Sampling Distribution Flashcards

1
Q

Define probability as Long-Run Relative Frequency

A

It is the proportion of times that a given outcome would occur in a very long sequence of observations or repeated trials
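The long-run idea can be illustrated with a small simulation. This is only a sketch: the fair coin, the seed, and the function name `relative_frequency` are assumptions made for the example, not part of the course material.

```python
import random

def relative_frequency(n_trials, seed=0):
    """Proportion of heads in n_trials simulated tosses of a fair coin."""
    rng = random.Random(seed)
    # rng.random() < 0.5 counts as "heads"
    heads = sum(rng.random() < 0.5 for _ in range(n_trials))
    return heads / n_trials

# The proportion should settle near 0.5 as the number of tosses grows
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

With few tosses the proportion can be far from 0.5; with many tosses it should sit close to it.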

2
Q

What is a random variable?

A

A variable whose value, or range of values, is determined by chance

e.g. in a coin toss, whether we observe heads or tails is a random variable

3
Q

(1) Suppose Y is assigned a certain value. For each case or observation, the value of Y is fixed; however, the value of Y varies across cases or observations. Y may be called “a random variable.” True or false? Correct it if it is false and explain why.

A

FALSE: in this example, Y has already been observed in every case.

Once it is observed, it is no longer a random variable; it's just a regular variable, because its value is fixed in each observation.

Y is a random variable when its value comes from randomly sampling from the population, so that the value is determined by chance.

4
Q

What is a probability distribution?

A

It specifies how probability is distributed over the possible values of a random variable

e.g. if we roll a fair die, Y = 1, 2, 3, 4, 5, or 6

A probability of 1/6 for each value of Y is the probability distribution
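As a quick check of the die example, the distribution can be written out as a table of values and probabilities. This is a minimal sketch assuming a fair six-sided die; the variable names are made up for the example.

```python
from fractions import Fraction

# Probability distribution of a fair die: each value of Y has probability 1/6
die = {y: Fraction(1, 6) for y in range(1, 7)}

# The probabilities over all possible values must add up to 1
total = sum(die.values())

# E(Y): each value weighted by its probability
mean = sum(y * p for y, p in die.items())

print(total, mean)
```

Exact fractions are used so that the total comes out as exactly 1 rather than a rounded decimal.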

5
Q

What is standard deviation?

A

A parameter indicating the variability of the values of a random variable Y. It is denoted by σ (sigma) and is the square root of the variance, σ²

6
Q

Name the 3 popular probability distributions

A
  • Normal
  • Bernoulli
  • Student’s t distribution
7
Q

What are the characteristics of a normal distribution?

A

It is the probability distribution of a continuous random variable which is symmetric, bell shaped, and characterized by its mean μ and standard deviation σ

8
Q

What are the characteristics of the Bernoulli distribution?

A

It is the probability distribution for a binary random variable: Y = 0 or 1

E(Y), or the mean, denoted by μ, is the probability of Y = 1
The variance (σ²) is given by μ × (1 - μ)

e.g. if 20% of people chose disagree and 80% chose agree (with agree coded as Y = 1), the graph would be in terms of proportions, so μ = 0.80 and the variance is 0.80 × 0.20 = 0.16

*** Note that a Bernoulli distribution with μ closer to 0 or 1 has LOWER variance, because the outcome is more predictable

The μ with the highest variability in a Bernoulli distribution is 0.50 (variance 0.25), because it means a 50/50 chance of drawing Y = 1 or 0
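The formula μ × (1 - μ) can be tabulated to see where the variance peaks. This is a small sketch; the function name `bernoulli_variance` and the chosen values of μ are assumptions for illustration.

```python
def bernoulli_variance(mu):
    """Variance of a Bernoulli random variable with P(Y = 1) = mu."""
    return mu * (1 - mu)

# Variance is largest at mu = 0.5 and shrinks toward 0 as mu nears 0 or 1
for mu in (0.1, 0.2, 0.5, 0.8, 0.9):
    print(mu, round(bernoulli_variance(mu), 2))
```

The values are symmetric around 0.5: μ = 0.2 and μ = 0.8 give the same variance.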

9
Q

What is the central limit theorem?

A

When the sample size N is large, the sampling distribution of y-bar can be approximated by the normal distribution

10
Q

What is a sampling distribution?

A

Summary statistics, such as sample means, computed from different random samples are different from each other. Hence, we can consider the distribution of summary statistics across repeated sampling. A sampling distribution is the name used to describe this distribution of summary statistics over repeated sampling.

11
Q

What is the difference between variance and standard deviation?

A

Variance = how far the numbers are from the average, measured as the average squared distance (in squared units).

Standard Deviation = same idea, but back in the original units (not squared).

So the standard deviation carries the same information as the variance: it is the square root of the variance
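A short numeric check of the relationship, using Python's standard `statistics` module. The data values are made up for the example, and the population versions of both formulas (dividing by N) are assumed.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

# Population variance: average squared distance from the mean (squared units)
variance = statistics.pvariance(values)

# Population standard deviation: back in the original units
sd = statistics.pstdev(values)

# The standard deviation is the square root of the variance
print(variance, sd, math.isclose(sd, math.sqrt(variance)))
```

Here the squared distances from the mean add up to 32, so the variance is 32 / 8 = 4 and the standard deviation is 2.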

12
Q

Standard error of the sampling distribution of sample means equals the population standard deviation divided by the number of observations in a sample. True or false? Correct it if it is false and explain why.

A

FALSE: when we calculate SE, it's always σ (the population standard deviation) divided by the SQUARE ROOT of N, where N is the number of observations in a sample
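The square-root rule can be checked by simulation. This is a sketch under assumed settings: a normal population with mean 50 and σ = 10, samples of N = 25, and 5000 repeated samples; none of these numbers come from the course material.

```python
import math
import random
import statistics

rng = random.Random(42)
sigma, n, n_samples = 10.0, 25, 5000  # assumed values for illustration

# Draw many random samples of size n and record each sample mean (y-bar)
sample_means = [
    statistics.fmean([rng.gauss(50.0, sigma) for _ in range(n)])
    for _ in range(n_samples)
]

# Spread of the sample means across repeated samples
observed_se = statistics.stdev(sample_means)

# sigma divided by the SQUARE ROOT of N: 10 / 5 = 2.0
theoretical_se = sigma / math.sqrt(n)

print(observed_se, theoretical_se)
```

The observed spread of the sample means should land close to σ / √N = 2.0, not σ / N = 0.4.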

13
Q

The sampling distribution of sample means (y-bar) can be approximated by a normal distribution as long as we take a random sample from the population and our sample includes a large number of observations. True or false? Correct it if it is false and explain why.

A

Answer: True

This is the Central Limit Theorem. The theorem applies regardless of the shape of the population distribution.

In the above figure, whatever the shape of the green distribution at the top, (the population distribution), the shape of the purple distribution at the bottom (the sampling distribution of y-bar) can be approximated by a normal distribution, as long as a random sample is taken and the number of observations in a sample is large.
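The claim can be tried out on a clearly non-normal population. This sketch assumes an exponential population (mean 1, standard deviation 1, strongly right-skewed), N = 50, and 4000 repeated samples. It then checks how many sample means fall within 1 and 2 standard errors of the population mean, which for a normal distribution would be roughly 68% and 95%.

```python
import math
import random
import statistics

rng = random.Random(7)
n, n_samples = 50, 4000  # assumed values for illustration

# Population: exponential with rate 1, so mean = 1 and sd = 1 (right-skewed)
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(n_samples)
]

se = 1.0 / math.sqrt(n)  # standard error of y-bar: sigma / sqrt(N)

# Share of sample means within 1 and 2 standard errors of the population mean
within_1se = sum(abs(m - 1.0) <= se for m in sample_means) / n_samples
within_2se = sum(abs(m - 1.0) <= 2 * se for m in sample_means) / n_samples

print(within_1se, within_2se)
```

Even though the population is skewed, the two shares should come out near the normal-distribution benchmarks.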

14
Q

What are the three characteristics of the sampling distribution of y-bar that still hold true regardless of the shape of the population distribution?

A

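The centre, the variability, and the shape of the sampling distribution: its mean equals the population mean μ, its standard error is σ divided by the square root of N, and, as long as N is large, its shape is approximately normal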
