Probability and CLT Flashcards
What is the difference between descriptive and inferential statistics?
Descriptive statistics: A description of some collected data (sample); e.g. the average age of ..
Inferential statistics: What are the properties of a population? The observed data is assumed to be sampled from a larger population. Thus we need population estimates and hypothesis testing (we have uncertainty).
Why does inferential statistics need probabilities?
“If we know what a ‘random’ distribution looks like, we can tell random variation from non-random variation. Specific individual cases are unpredictable, but they follow predictable laws in the aggregate. Once we learn to identify this ‘pattern of chance,’ we can confidently distinguish it from patterns that are not due to random phenomena.”
What does P(A) indicate?
Probability of A: the proportion of elements of some set that satisfy condition A. Mathematically, probability measures the size of a set relative to the sample space Ω.
e.g. the probability of rolling an even number with a fair die = 3/6 = 0.5
How would you calculate the probability of selecting a male student (80 males) in a psychology lecture hall of 400 students?
P(selected student = male):
P(X = male) = N_male / N_total (size of the set / size of the sample space)
= 80/400
= 0.2
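The same proportion can be checked in R. This is only a small sketch; the variable names are made up for illustration, and the last lines revisit the even-number die example from the previous card.

```r
# Counts taken from the card above
n_male  <- 80
n_total <- 400
p_male  <- n_male / n_total      # proportion of males in the lecture hall
p_male

# Fair six-sided die: proportion of outcomes in the sample space that are even
omega  <- 1:6
p_even <- mean(omega %% 2 == 0)
p_even
```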
What are the two basic rules when calculating with probabilities?
Sum rule and product rule
What two basic concepts are there in probability theory?
Dependent vs independent probabilities and conditional probabilities
Explain the sum rule
P(A or B) = P(A) + P(B)
The probability that at least one of several events occurs is the sum of the probabilities of the individual events.
This holds only if the events are mutually exclusive (they cannot happen at the same time).
Example with a die:
P(X = 1 or X = 2) = P(X = 1) + P(X = 2)
= 1/6 + 1/6
= 2/6 = 1/3
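The die example can be verified by enumerating the sample space in R. A minimal sketch, assuming a fair six-sided die; the variable names are illustrative only.

```r
omega <- 1:6                                # sample space of a fair die
p_1 <- mean(omega == 1)                     # P(X = 1)
p_2 <- mean(omega == 2)                     # P(X = 2)
p_1_or_2 <- mean(omega == 1 | omega == 2)   # P(X = 1 or X = 2), counted directly

# The two events are mutually exclusive, so the sum rule applies
all.equal(p_1_or_2, p_1 + p_2)
```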
What can you do if the events are not mutually exclusive?
Use a more general sum rule:
P(A or B) = P(A) + P(B) − P(A and B)
P(M or SP) = P(M) + P(SP) − P(M and SP)
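A small R sketch of the general sum rule. The two events used here (an even roll, and a roll greater than 3) are assumptions chosen for illustration; they are not the M and SP events of the card.

```r
omega <- 1:6                 # sample space of a fair die
a <- omega %% 2 == 0         # event A: even number (2, 4, 6)
b <- omega > 3               # event B: greater than 3 (4, 5, 6)

p_a       <- mean(a)
p_b       <- mean(b)
p_a_and_b <- mean(a & b)     # overlap: 4 and 6
p_a_or_b  <- mean(a | b)     # counted directly: 2, 4, 5, 6

# Subtracting the overlap avoids counting 4 and 6 twice
all.equal(p_a_or_b, p_a + p_b - p_a_and_b)
```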
How do we calculate P (A and B)? (3)
The product rule:
P(A and B) = P(A) * P(B) (if A and B are independent)
P(A and B) = P(A) * P(B | A) (if A and B are dependent)
P(A and B) = P(B) * P(A | B) (if A and B are dependent)
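Both forms of the product rule can be illustrated in R. The dice and card scenarios below are assumed examples, not taken from the cards.

```r
# Independent events: two fair dice, both show a six
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)          # 36 equally likely outcomes
p_both_six <- mean(rolls$die1 == 6 & rolls$die2 == 6)
all.equal(p_both_six, (1/6) * (1/6))

# Dependent events: drawing two aces without replacement from 52 cards
p_first_ace  <- 4 / 52
p_second_ace <- 3 / 51                                # P(B | A): one ace is already gone
p_two_aces   <- p_first_ace * p_second_ace
all.equal(p_two_aces, choose(4, 2) / choose(52, 2))   # same result by counting pairs
```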
When are two events independent?
When one event cannot influence the outcome of the other.
Mathematically when are two events independent?
Is P(Y = passedMath) = P(Y = passedMath | X = M)? Or is P(X = M) = P(X = M | Y = passedMath)?
If so, X = M and Y = passedMath are independent.
In other words, A and B are independent if the probability of A given B is the same as the probability of A: P(A | B) = P(A).
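The independence check can be sketched in R with a small table of counts. The counts below are hypothetical (they are not from the cards) and were chosen so that the two events come out independent.

```r
# Hypothetical counts: rows = sex, columns = math result
counts <- matrix(c(40, 40,
                   160, 160),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(sex = c("M", "F"), math = c("passed", "failed")))
n <- sum(counts)

p_pass         <- sum(counts[, "passed"]) / n                 # P(Y = passedMath)
p_pass_given_m <- counts["M", "passed"] / sum(counts["M", ])  # P(Y = passedMath | X = M)
p_m            <- sum(counts["M", ]) / n                      # P(X = M)
p_m_and_pass   <- counts["M", "passed"] / n                   # P(X = M and Y = passedMath)

all.equal(p_pass, p_pass_given_m)        # equal, so the events are independent
all.equal(p_m_and_pass, p_m * p_pass)    # and the simple product rule holds
```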
In probability theory, what is conditional probability?
A measure of the probability of an event given that (by assumption, presumption, assertion or evidence) another event has occurred.
What does a probability distribution show?
It gives the probability of occurrence of each possible outcome of an experiment.
What kind of mathematical function can we use to describe the expected number of heads and tails when we flip a coin n times?
Binomial distribution (two terms): P(k successes) = choose(n, k) * p^k * (1 - p)^(n - k), where choose(n, k) = n! / (k! (n - k)!) is the binomial coefficient.
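The binomial formula can be compared with R's built-in dbinom. A minimal sketch; the values n = 10 flips and k = 4 heads are assumed purely for illustration.

```r
n <- 10    # number of coin flips (assumed for the example)
k <- 4     # number of heads we ask about (assumed)
p <- 0.5   # probability of heads on a fair coin

by_hand  <- choose(n, k) * p^k * (1 - p)^(n - k)   # the formula written out
built_in <- dbinom(k, size = n, prob = p)          # R's binomial probability

all.equal(by_hand, built_in)
```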
What is meant by a Xhosa exam?
An exam in which you are given a word in the African language Xhosa and two possible translations. Ideally you are purely guessing, so each answer is correct with probability 0.5 and the number of correct answers follows a binomial distribution.
What is the probability of getting exactly 01011 in a Xhosa exam?
0.5 x 0.5 x 0.5 x 0.5 x 0.5 = 0.5^5 = 0.03125
You take a Xhosa exam with 5 questions. The sum score is 3. In how many possible ways can you get that sum score?
What function in R would help with this?
choose(5, 3)  # 10
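The count can also be confirmed by brute force, listing every possible answer pattern. A short sketch; the object names are illustrative.

```r
# All 2^5 = 32 answer patterns (0 = wrong, 1 = correct) for 5 questions
patterns <- expand.grid(rep(list(0:1), 5))

# Number of patterns whose sum score is exactly 3
n_ways <- sum(rowSums(patterns) == 3)
n_ways

n_ways == choose(5, 3)
```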
How do we calculate the probability of getting a sum score of 3?
You could count all the possible answer series that result in 3 and divide by the number of all possible series, but there is an easier way using the binomial density function dbinom in R:
p <- 0.5
dbinom(3, size = 5, prob = p)
How do we calculate the probability of getting a sum score of 3 or less?
You add the probabilities of getting a sum score of 0 to 3 or, equivalently, subtract the probabilities of 4 and 5 from 1, e.g. in R:
p <- 0.5
1 - (choose(5, 5) * p^5 * (1-p)^0 + choose(5, 4) * p^4 * (1-p)^1) # 1 - (P(4) + P(5))
# or simply pbinom(3, 5, .5)
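The three routes to P(sum score ≤ 3) can be compared side by side. A minimal sketch, assuming pure guessing (p = 0.5) on 5 items.

```r
p <- 0.5

direct     <- sum(dbinom(0:3, size = 5, prob = p))       # P(0) + P(1) + P(2) + P(3)
complement <- 1 - sum(dbinom(4:5, size = 5, prob = p))   # 1 - (P(4) + P(5))
cumulative <- pbinom(3, size = 5, prob = p)              # built-in cumulative probability

c(direct = direct, complement = complement, cumulative = cumulative)
```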
How can you find in R the lowest sum score whose cumulative probability reaches 30%, i.e. the 30% quantile?
qbinom(.3, 5, .5)
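What qbinom returns is easier to see next to the cumulative probabilities. The sketch below assumes the same 5-item exam with p = 0.5 and looks up the first score whose cumulative probability is at least 0.3.

```r
scores <- 0:5
cdf <- pbinom(scores, size = 5, prob = 0.5)    # P(X <= score) for each score
data.frame(score = scores, cumulative = cdf)

# Smallest score whose cumulative probability is at least 0.3
first_reaching_30 <- scores[which(cdf >= 0.3)[1]]
first_reaching_30 == qbinom(0.3, size = 5, prob = 0.5)
```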
Statistical inference makes propositions about a population, using data drawn from that population with some form of sampling.
How can we calculate how likely our observed data is? (2)
- Sampling distributions
- Central limit theorem
How do we quantify the expected variation of our estimation procedure?
Confidence intervals (next cue cards)
The sum scores on 10 Xhosa items have, for the population of 1000 psychology students, a binomial distribution.
What would n, p and the mean sum score μ be?
n = 10 (items), p = .5 (probability of a correct answer), μ = n * p = 5 (mean sum score)
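The mean can be checked in R against the binomial probabilities themselves. The sketch also computes the standard deviation, sqrt(n * p * (1 - p)), which is a standard property of the binomial distribution rather than something stated on the card.

```r
n <- 10
p <- 0.5

mu    <- n * p                      # mean sum score
sigma <- sqrt(n * p * (1 - p))      # standard deviation of the sum score

# Check both against the definition, using the full probability table
k     <- 0:n
probs <- dbinom(k, size = n, prob = p)
mu_check    <- sum(k * probs)
sigma_check <- sqrt(sum((k - mu_check)^2 * probs))

c(mu = mu, mu_check = mu_check, sigma = sigma, sigma_check = sigma_check)
```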
What is meant by a sample, population and sampling distribution?
A population distribution is the distribution of a variable in the entire population. A sample distribution is the distribution of a variable in a sample of the population. A sampling distribution is the distribution of a sample statistic (e.g the sample mean)
For example, in the previous example the population mean was 5. However, in a sample of 20 people, the mean of the sample distribution might be 4. If 100 samples were taken from the population, the means of all these samples would form a sampling distribution.
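The three kinds of distribution can be simulated in R. This is a sketch under the assumption that every student guesses on 10 items (p = 0.5); the seed is arbitrary.

```r
set.seed(1)   # arbitrary seed, only for reproducibility

# One sample of 20 students: its scores form a sample distribution
one_sample <- rbinom(20, size = 10, prob = 0.5)
mean(one_sample)

# 100 samples of 20 students each: their means form a sampling distribution
sample_means <- replicate(100, mean(rbinom(20, size = 10, prob = 0.5)))
summary(sample_means)
```

The individual sample means vary from sample to sample, but they cluster around the population mean of 5.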
Imagine that the mean Xhosa score of psychology students is 5, and that in a sample of 25 PM students a mean of 6 is found.
How do we know how (un)likely this sample mean is in relation to the population of psychology students? (In R)
Simulate the data in R:
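# Sketch: 10000 simulated samples (an assumed number) of 25 students, 10 items each, p = .5
means <- replicate(10000, mean(rbinom(25, size = 10, prob = 0.5)))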
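How can you adapt this code so that it gives the probability of an outcome or higher?
set.seed(1)
means <- replicate(10000, mean(rbinom(25, size = 10, prob = 0.5)))  # 10000 replications is an assumed value
mean(means >= 6)  # proportion of simulated sample means of 6 or higher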
Why is the area of a sampling distribution difficult to calculate?
Your sample mean must come from the same distribution, but which one is that?
You can’t simply keep taking that many samples (expensive and time-consuming).
What solution is there to these problems?
Central limit theorem: Provided the sample size is large enough, the sampling distribution of the sample mean will be close enough to normal irrespective of what the population distribution looks like
Describe 2 examples which demonstrate the robustness of the central limit theorem
Take a binomial distribution with two peaks, at x = 0 and x = 1 (B(n = 1, p = .5)). If you take a large number of samples from it, in which each outcome is either 0 or 1, the sample means have an approximately normal sampling distribution centred around 0.5.
A gamma distribution (right-skewed, e.g. RT) will also produce an approximately normal sampling distribution when the means of multiple samples are collected.
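Both examples can be simulated in R. In this sketch the sample size (50), the number of samples (5000) and the gamma parameters (shape = 2, rate = 1) are assumptions chosen for illustration.

```r
set.seed(1)        # arbitrary seed
n_samples <- 5000  # number of simulated samples (assumed)
n <- 50            # observations per sample (assumed)

# Example 1: B(n = 1, p = .5), outcomes are only 0 or 1
bern_means <- replicate(n_samples, mean(rbinom(n, size = 1, prob = 0.5)))

# Example 2: right-skewed gamma distribution (mean 2, sd sqrt(2) for these parameters)
gamma_means <- replicate(n_samples, mean(rgamma(n, shape = 2, rate = 1)))

# Centre and spread of each sampling distribution, next to the population sd / sqrt(n)
c(mean(bern_means),  sd(bern_means),  0.5 / sqrt(n))
c(mean(gamma_means), sd(gamma_means), sqrt(2) / sqrt(n))
```

Calling hist(bern_means) or hist(gamma_means) afterwards shows the roughly bell-shaped result in both cases.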
What do we need in order to use the CLT to determine how unlikely a mean of 6 is in a sampling distribution?
We need to know the mean and standard deviation of the sampling distribution
In the Xhosa example, how do we determine the mean and std of the sampling distribution?
Assume the mean Xhosa sum score of psychology students is 5, based on probability (n * p = 10 * .5).
The central limit theorem also tells us the standard deviation of the sampling distribution: the standard deviation divided by the square root of the sample size, s / sqrt(n).
What name is given to the standard deviation of the sampling distribution?
The standard error (SE)
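For the Xhosa example the standard error can be worked out in R. This sketch assumes the standard deviation is taken from the binomial model, sqrt(n * p * (1 - p)), instead of being estimated from a sample.

```r
n_items    <- 10    # items on the exam
p          <- 0.5   # probability of a correct guess
n_students <- 25    # sample size

sigma <- sqrt(n_items * p * (1 - p))   # sd of individual sum scores (binomial model)
se    <- sigma / sqrt(n_students)      # standard error of the sample mean

c(sigma = sigma, se = se)
```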
What do we do after we get these figures?
Once we know the exact distribution under our null hypothesis, we can calculate the area under the curve and thus how likely a mean at least as extreme as ours is (the p-value).
That is, using the data drawn from the population with some kind of sampling, we can make a proposition about the population.
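The area under the curve for the Xhosa example can be computed with pnorm. A minimal sketch, assuming a normal sampling distribution with mean 5 and a standard error derived from the binomial model; it gives the one-sided probability of a sample mean of 6 or higher.

```r
mu            <- 5                                  # mean under the null hypothesis
se            <- sqrt(10 * 0.5 * 0.5) / sqrt(25)    # standard error for n = 25
observed_mean <- 6

z <- (observed_mean - mu) / se                      # distance from mu in standard errors

# Upper-tail area: probability of a sample mean of 6 or higher
p_value <- pnorm(observed_mean, mean = mu, sd = se, lower.tail = FALSE)

c(z = z, p_value = p_value)
```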
Why can the sampling distribution only be used for the mean?
No reason; it can also be used for the median, the standard deviation, etc.
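Sampling distributions of other statistics can be simulated in the same way as for the mean. The sketch below assumes samples of 25 students on the 10-item exam and 1000 simulated samples; both numbers are chosen for illustration.

```r
set.seed(1)   # arbitrary seed

# Sampling distribution of the median
sample_medians <- replicate(1000, median(rbinom(25, size = 10, prob = 0.5)))
table(sample_medians)

# Sampling distribution of the standard deviation
sample_sds <- replicate(1000, sd(rbinom(25, size = 10, prob = 0.5)))
c(mean = mean(sample_sds), sd = sd(sample_sds))
```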