final exam Flashcards
What is probability
a mathematical function of an event in a sample space, quantifying the likelihood of that event occurring in accordance with specific axiomatic rules.
What is a random experiment
a well-defined procedure or action that produces an (observable) outcome in the sample space.
What is the sample space
the set of all possible outcomes from a random experiment
What is an outcome
a result of a random experiment
What is an event of a random experiment
A subset of the sample space or a set of outcomes in the sample space.
0 ≤ P(E) ≤ 1
Probability of any event must lie between 0 and 1, inclusive
P(S) = 1
Probability that any of the outcomes in S occurs must be 1.
What properties must the sample space satisfy?
- The outcomes in a sample space must be "exhaustive."
- The outcomes in a sample space must be "mutually exclusive."
What does it mean for outcomes in S to be exhaustive?
- All possible outcomes must be listed.
- Each "trial" (or experiment) must result in one of these outcomes.
What does it mean for outcomes in S to be mutually exclusive?
- No two outcomes can occur at the same time (on the same "trial").
P(E1 ∪ E2 ∪ … ∪ En) = P(E1) + P(E2) + … + P(En)
For any sequence of mutually exclusive events E1, E2, …, En, P(E1 ∪ E2 ∪ … ∪ En) = the sum of the P(Ei).
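A minimal R sketch of the three probability rules above, assuming one roll of a fair six-sided die as the random experiment. The die, the events E1 and E2, and the helper `prob()` are illustrative assumptions, not part of the original cards.

```r
# Sample space for one roll of a fair six-sided die (illustrative assumption)
S <- 1:6
p <- rep(1 / 6, length(S))   # probability of each outcome
names(p) <- S

# Hypothetical helper: probability of an event given as a vector of outcomes
prob <- function(event) sum(p[as.character(event)])

E1 <- c(1, 2)   # event: roll a 1 or a 2
E2 <- c(5, 6)   # event: roll a 5 or a 6 (shares no outcomes with E1)

prob(E1)               # lies between 0 and 1
prob(S)                # the whole sample space has probability 1
prob(union(E1, E2))    # matches prob(E1) + prob(E2) for mutually exclusive events
```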
What does it mean for an experiment to be random?
An experiment whose outcome cannot be predicted, but the possible outcomes can be listed
Classical approach
P(E) = Number of possible outcomes in which E occurs / Total number of possible outcomes
(Assumes outcomes are equally likely, e.g., flipping a fair coin.)
Relative Frequency Approach
P(E) = Number of trials in which E occurs / Total number of trials
(Assigns probabilities on the basis of data.)
Subjective Approach
P(E) = your best guess
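To contrast the classical and relative frequency approaches, here is a small R sketch that assumes a fair coin. The seed and the number of simulated flips are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed, only so the simulation can be repeated

# Classical approach: 1 favorable outcome (heads) out of 2 equally likely outcomes
p_classical <- 1 / 2

# Relative frequency approach: simulate 10,000 flips and take the share of heads
flips <- sample(c("H", "T"), size = 10000, replace = TRUE)
p_relative <- mean(flips == "H")

p_classical
p_relative   # should land near 0.5, but usually not exactly on it
```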
What does P(A|B) ≠ P(B|A) mean?
P(A) ≠ P(B), because P(A|B) = P(B|A) · P(A) / P(B) (assuming P(A and B) > 0)
What is a random variable
A random variable (X) is a real-valued function that maps an event or a set of outcomes of a random experiment to a numerical value.
P(X = xi) ≥ 0 if xi ∈ S
For all x values/outcomes in the sample space, the probabilities must be nonnegative.
∑ P(X = xi) = 1, summed over all xi ∈ S
The sum of all probabilities of the possible x values in the sample space must be 1
P(X = x) = 0 for x ∉ S
x values not included in the sample space cannot occur.
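The three properties of a discrete random variable can be checked directly in R. This sketch assumes X is the number of heads in two flips of a fair coin; the helper `p_of()` is hypothetical.

```r
# X = number of heads in two fair coin flips (illustrative assumption)
x   <- c(0, 1, 2)
pmf <- c(0.25, 0.50, 0.25)

all(pmf >= 0)   # every probability is nonnegative
sum(pmf)        # the probabilities sum to 1

# Hypothetical helper: P(X = value), which is 0 for values X cannot take
p_of <- function(value) if (value %in% x) pmf[x == value] else 0

p_of(1)   # 0.5
p_of(3)   # 0
```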
What is a test of independence?
P(A|B) = P(A)
What is the test for mutually exclusive?
P(A and B) = 0
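Both tests can be tried on a small example. The sketch below assumes one roll of a fair die and uses the classical approach; the events A, B, and C are made up for illustration.

```r
S <- 1:6
# Classical approach: all six outcomes are assumed equally likely
prob <- function(event) length(event) / length(S)

A <- c(2, 4, 6)      # even roll
B <- c(1, 2, 3, 4)   # roll of 4 or less
C <- c(1, 3, 5)      # odd roll

# Independence test: does P(A|B) equal P(A)?
p_A_given_B <- prob(intersect(A, B)) / prob(B)
isTRUE(all.equal(p_A_given_B, prob(A)))   # TRUE, so A and B are independent

# Mutually exclusive test: is P(A and C) equal to 0?
prob(intersect(A, C)) == 0                # TRUE, so A and C are mutually exclusive
```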
What is a random variable?
A real-valued function of an event or a set of outcomes of a random experiment.
What's the difference between a continuous and a discrete random variable?
Continuous: can take on any real value within an interval.
Discrete: can take on a countable number of possible values.
Is f(x) a probability?
No, but the area under f(x) can be interpreted as one.
What are the properties of a continuous random variable?
1) f(x) ≥ 0
2) The area under f(x) can be interpreted as a probability
- e.g., P(100 < X < 120)
- f(x) itself is not a probability (P(X = x) = 0 for any x)
3) The total area under f(x) is 1
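A short R sketch of these properties, assuming X is normal with mean 100 and SD 15. Both values are made up so that P(100 < X < 120) has something to work with.

```r
mu <- 100     # assumed mean
sigma <- 15   # assumed standard deviation
f <- function(x) dnorm(x, mean = mu, sd = sigma)   # density f(x)

# P(100 < X < 120) as the area under f(x) between 100 and 120
area_integral <- integrate(f, lower = 100, upper = 120)$value
area_cdf <- pnorm(120, mu, sigma) - pnorm(100, mu, sigma)   # same area via pnorm

# Area under f(x) over mu +/- 10 SDs, which is effectively the total area
total_area <- integrate(f, lower = mu - 10 * sigma, upper = mu + 10 * sigma)$value

f(100)   # height of the curve at 100: a density value, not a probability
```

The two ways of computing the area should agree to several decimal places, and `total_area` should be essentially 1.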
What is the sampling distribution?
the probability distribution of a sample statistic for a given
sample size (N)
What is the benefit of using a sampling distribution?
allows us to make statistical inferences about population parameters
using a sample.
- to quantify the uncertainty or margin of error of our estimates.
- to form the basis for conducting statistical tests on population parameters.
What is random sampling?
a process of selecting a subset of cases from a
larger population at random.
- Each case in the population has a known probability of being selected to be part of the sample.
When can random sampling be called simple random sampling?
In simple random sampling (SRS), each case has an equal chance of being selected.
What are the advantages of using SRS
1) It increases the likelihood that the sample is representative of the population.
- What if the sample size is extremely large?
2) It allows us to establish the probability distribution of a sample statistic.
Why can we treat a sample statistic as a random variable in random sampling?
The data from the sample go through a function to output a numerical value of the sample statistic, and depending on the sample the value will be different, so it will have a probability distribution.
What is the CLT?
When a sample is collected through random sampling and N is sufficiently large, the sample mean X̄ approximately follows a normal distribution regardless of the shape of the population distribution. As N grows, X̄ also tends to get closer and closer to μ.
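The CLT can be seen in a quick simulation. This sketch assumes a strongly skewed population (exponential with rate 1, so μ = 1 and σ = 1); the seed, N, and the number of repetitions are arbitrary.

```r
set.seed(42)   # arbitrary seed for repeatability
N <- 50        # size of each sample
reps <- 5000   # number of samples drawn

# Draw many random samples and keep the mean of each one
sample_means <- replicate(reps, mean(rexp(N, rate = 1)))

mean(sample_means)   # should be close to the population mean, 1
sd(sample_means)     # should be close to sigma / sqrt(N) = 1 / sqrt(50)

# hist(sample_means) # optional: roughly bell-shaped even though the population is skewed
```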
What is the difference between point and interval estimators
A point estimator gives a single value as the estimate of a population parameter; an interval estimator gives a range of values within which the true population parameter is expected to fall with a stated level of confidence.
What is bias?
systematic error inherent in the estimate itself
What is sampling error?
Random error that occurs because the estimate is based on a sample. Unavoidable when not evaluating the population as a whole.
When is the estimator unbiased?
If the expected value of the estimator is equal to the true value of the parameter
If X is normally distributed, then the sample mean X̄ follows a normal distribution even if the sample size is less than 30
True
What can make the CI wider?
Reducing α (equivalently, increasing the confidence level), reducing N, or a larger population SD
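A rough R sketch of how each factor changes the margin of error for a z-based CI. The function `moe()` and all input values are hypothetical.

```r
# Hypothetical helper: margin of error for a z-based confidence interval
moe <- function(conf_level, sigma, N) {
  alpha <- 1 - conf_level
  qnorm(1 - alpha / 2) * sigma / sqrt(N)
}

moe(0.95, sigma = 15, N = 36)   # baseline
moe(0.99, sigma = 15, N = 36)   # higher confidence level (smaller alpha): wider
moe(0.95, sigma = 15, N = 16)   # smaller N: wider
moe(0.95, sigma = 30, N = 36)   # larger population SD: wider
```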
What is SE in a t-test?
Sx / sqrt(N)
What is SE in a z-test?
σ / sqrt(N)
What is MOE for a 95% CI in a z-test?
1.96(SE)
How do you find the CI?
(X̄ - MOE, X̄ + MOE)
How do you find Z?
z = (X̄ - μ) / (σ / sqrt(N))
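The z-test pieces above fit together as follows. All numbers are hypothetical, chosen so the arithmetic is easy to follow.

```r
# Hypothetical inputs
x_bar <- 105   # sample mean
mu0   <- 100   # population mean under H0
sigma <- 15    # known population SD
N     <- 36    # sample size

se  <- sigma / sqrt(N)              # standard error: 15 / 6 = 2.5
moe <- 1.96 * se                    # margin of error for a 95% CI: 4.9
ci  <- c(x_bar - moe, x_bar + moe)  # 95% CI: (100.1, 109.9)
z   <- (x_bar - mu0) / se           # z statistic: 5 / 2.5 = 2

se; moe; ci; z
```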
What is the α level?
the chosen threshold for saying that the probability of a
test-statistic at least as extreme as the observed one is small enough to reject H0.
What should you include in the statistical test?
1) type of test, 2) research hypothesis, 3) description of procedure w/ summary stats (sample mean, pop mean, pop sd), 4) reason for decision, 5) statistical conclusion, 6) conclusion/implication, 7) relevant stats (z/t & p)
What is Null Hypothesis Significance Testing?
a statistical procedure that
determines whether there is enough evidence to support a research hypothesis
about population parameter(s)
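When do you use pnorm(z, lower.tail = FALSE)?
When you need the area to the right of z, e.g., a one-tailed test where Ha is μ > μ0
When do you use 2 * pnorm(z, lower.tail = FALSE)?
When Ha is nondirectional (two-tailed) and z is positive; for a negative z, use 2 * pnorm(z)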
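What does it mean that H0 and Ha must be mutually exclusive and exhaustive?
Exactly one of them is true: H0 and Ha cannot both be true (mutually exclusive), and together they cover every possible value of the parameter (exhaustive). Either H0 is true or H0 is false.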
What is the p value
The p-value is the chance of observing a result (e.g., a mean) as extreme as, or more extreme than, your result (e.g., your value of X̄), assuming that H0 is true.
- A low p-value is evidence against H0, but it is not the probability of H0.
When do you use α/2?
When you are finding the critical z or t value w/ qnorm or qt and the test is two-tailed
The critical z value(s) for a 1-sample z test when the alternative hypothesis is that μ > μ0 and the significance level is .01.
qnorm(.01, lower.tail = FALSE)
The critical z value(s) for a 1-sample z test when the alternative hypothesis is that μ ≠ μ0 and the significance level is .01.
> qnorm(0.005)
[1] -2.575829
qnorm(0.005, lower.tail = FALSE)
[1] 2.575829
The p-value for a 1-sample z test when the alternative hypothesis is that μ > μ0 and the observed z score is 1.40.
> pnorm(1.40, lower.tail = FALSE)
The p-value for a 1-sample z test when the alternative hypothesis is that μ ≠ μ0 and the observed z score is -2.20.
2 * pnorm(-2.20)
How do you find p for a two tailed test if you already have p for a one tailed test?
p * 2
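The one-tailed and two-tailed p-value rules can be gathered into one helper. This is only a sketch: the function name `p_value_z()` and its `alternative` labels are hypothetical, chosen to mirror the cards above.

```r
# Hypothetical helper: p-value for a z statistic under each form of Ha
p_value_z <- function(z, alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    greater   = pnorm(z, lower.tail = FALSE),          # Ha: mu > mu0
    less      = pnorm(z),                              # Ha: mu < mu0
    two.sided = 2 * pnorm(abs(z), lower.tail = FALSE)  # Ha: mu != mu0
  )
}

p_value_z(1.40, "greater")      # same as pnorm(1.40, lower.tail = FALSE)
p_value_z(-2.20, "two.sided")   # same as 2 * pnorm(-2.20)
```

Taking `abs(z)` in the two-sided branch means the same line works whether the observed z is positive or negative.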
The critical t value(s) for a 1-sample t test when the alternative hypothesis is that μ < μ0, the sample size is 52, and the significance level is .05. You may use R.
round(qt(.05, df = 51), 3)
The critical t value(s) for a 1-sample t test when the alternative hypothesis is that μ ≠ μ0, the sample size is 80, and α = .01.
round(qt(.005, df = 79), 3)
round(qt(.005, df = 79, lower.tail = FALSE), 3)
The p-value for a 1-sample t test when the alternative hypothesis is that μ < μ0, the sample size is 39, and the observed t score is -2.55.
round(pt(-2.55, df = 38), 3)
The p-value for a 1-sample t test when the alternative hypothesis is that μ ≠ μ0, the sample size is 75, and the observed t score is 1.82.
pt(1.82, df = 74, lower.tail = FALSE) *2
How do you find a 95% CI for a t-test?
[X̄ - MOE, X̄ + MOE]
What is the MOE for a t-test?
t*(SE), where t* = qt(.975, df = N - 1) for a 95% CI (about 2(SE) when N is reasonably large)
What is SE for t?
Sx / sqrt(N)
How do you find t?
t = (X̄ - μ) / (Sx / sqrt(N))
R formula for finding p for a one-tailed t-test:
pt(t, df = N-1) when Ha is μ < μ0; pt(t, df = N-1, lower.tail = FALSE) when Ha is μ > μ0
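The t-test cards can be worked by hand in R and then compared with the built-in `t.test()`. The data vector `scores` and the null value `mu0` are made up for illustration.

```r
scores <- c(98, 104, 110, 95, 102, 107, 99, 112, 101, 105)  # hypothetical sample
mu0 <- 100                                                   # hypothetical H0 value

N      <- length(scores)
x_bar  <- mean(scores)
se     <- sd(scores) / sqrt(N)       # SE = Sx / sqrt(N)
t_stat <- (x_bar - mu0) / se         # t = (X-bar - mu) / SE

# Two-tailed p-value and 95% CI, using df = N - 1
p_two  <- 2 * pt(abs(t_stat), df = N - 1, lower.tail = FALSE)
t_crit <- qt(0.975, df = N - 1)
ci     <- c(x_bar - t_crit * se, x_bar + t_crit * se)

# Built-in one-sample t-test for comparison
check <- t.test(scores, mu = mu0)
check$statistic; check$p.value; check$conf.int
```

The hand-computed `t_stat`, `p_two`, and `ci` should match the values reported by `t.test()`.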