Probability Basics Flashcards
What is Independent
each event is not affected by other events
Probability of an event happening =
(Number of ways it can happen) / (Total number of outcomes)
What is Dependent
also called “Conditional”, where an event is affected by other events
P(A and B) = P(A) × P(B | A)
“Probability of event A and event B equals
the probability of event A times the probability of event B given event A”
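A small worked case may help fix the rule for dependent events. The card-drawing scenario and the variable names below are illustrative assumptions, not part of the original card.

```python
from fractions import Fraction

# Hypothetical example: draw two cards from a standard 52-card deck
# without putting the first one back.
p_a = Fraction(4, 52)          # P(A): the first card is an ace
p_b_given_a = Fraction(3, 51)  # P(B | A): the second is an ace, given the first was

# P(A and B) = P(A) * P(B | A)
p_a_and_b = p_a * p_b_given_a
print(p_a_and_b)
```

Exact fractions are used here so the result is not blurred by floating-point rounding.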
What is Mutually Exclusive
events can’t happen at the same time
Permutations with Repetition
n^r
where n is the number of things to choose from,
and we choose r of them,
repetition is allowed,
and order matters.
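As a quick check of n^r, the sketch below counts the codes of a hypothetical 3-digit lock with digits 0 to 9 (the lock is an assumed example) and compares the formula with a brute-force enumeration.

```python
from itertools import product

n, r = 10, 3  # assumed example: 10 digits to choose from, 3 positions

by_formula = n ** r
# product(..., repeat=r) lists every ordered selection with repetition allowed
by_enumeration = sum(1 for _ in product(range(n), repeat=r))

print(by_formula, by_enumeration)
```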
Combination
When the order doesn’t matter
Permutation
When the order does matter
Permutations without Repetition
n! / (n − r)!
where n is the number of things to choose from,
and we choose r of them,
no repetitions,
order matters.
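The formula can be tried out directly in Python. The values n = 16 and r = 3 are an assumed example (say, picking 3 of 16 pool balls in order); math.perm requires Python 3.8 or later.

```python
import math

n, r = 16, 3  # assumed example values

by_formula = math.factorial(n) // math.factorial(n - r)  # n! / (n - r)!
by_library = math.perm(n, r)                             # same count, built in

print(by_formula, by_library)
```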
Combinations without Repetition
n!/(r!(n-r)!)
It is often called "n choose r" where n is the number of things to choose from, and we choose r of them, no repetition, order doesn't matter.
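A minimal sketch of "n choose r", again with the assumed values n = 16 and r = 3. It compares the factorial formula with math.comb (Python 3.8 or later).

```python
import math

n, r = 16, 3  # assumed example values

# n! / (r! (n - r)!)
by_formula = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))
by_library = math.comb(n, r)

print(by_formula, by_library)
```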
Combinations with Repetition
(r+n-1)!/ (r!(n-1)!)
where n is the number of things to choose from,
and we choose r of them
repetition allowed,
order doesn’t matter.
This is the same as a combination without repetition with r + n − 1 things to choose from, that is, "r + n − 1 choose r".
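To see the formula at work, the sketch below uses an assumed example of choosing 3 scoops from 5 ice-cream flavors, and checks the formula against a direct enumeration.

```python
import math
from itertools import combinations_with_replacement

n, r = 5, 3  # assumed example: 5 flavors, 3 scoops

# (r + n - 1)! / (r! (n - 1)!)
by_formula = math.factorial(r + n - 1) // (math.factorial(r) * math.factorial(n - 1))
# Equivalent "r + n - 1 choose r" form
by_choose = math.comb(r + n - 1, r)
# Brute force: list every unordered selection with repetition allowed
by_enumeration = sum(1 for _ in combinations_with_replacement(range(n), r))

print(by_formula, by_choose, by_enumeration)
```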
Standard Deviation definition (not formula!!)
A measure of how spread out the values are. The formula is easy: it is the square root of the Variance. So now you ask, “What is the Variance?”
“What is the Variance?”
The average of the squared differences from the Mean.
Var(X) = Σx^2p − μ^2
To calculate the Variance:
square each value and multiply by its probability
sum them up and we get Σx^2p
then subtract the square of the Expected Value μ^2
μ = expected value = Σxp
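The steps above can be sketched for a fair six-sided die. The die is an assumed example; each face has probability 1/6.

```python
from fractions import Fraction

values = [1, 2, 3, 4, 5, 6]  # assumed example: faces of a fair die
p = Fraction(1, 6)           # probability of each face

mu = sum(x * p for x in values)           # expected value: sum of x*p
sum_x2p = sum(x * x * p for x in values)  # sum of x^2 * p
variance = sum_x2p - mu ** 2              # Var(X) = sum of x^2*p, minus mu^2

print(mu, sum_x2p, variance)
```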
Formulas for the “Population Standard Deviation” and the “Sample Standard Deviation”:
Population Standard Deviation
square root of [ (1/N) Σ of (x − μ)^2 ]
Sample Standard Deviation
square root of [ (1/(N−1)) Σ of (x − x̄)^2 ]
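The only difference between the two formulas is dividing by N or by N − 1. The sketch below shows that on a small made-up data set and compares the hand calculation with Python's statistics module.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values
n = len(data)
mean = sum(data) / n

# Sum of squared differences from the mean
squared_diffs = sum((x - mean) ** 2 for x in data)

population_sd = math.sqrt(squared_diffs / n)    # divide by N
sample_sd = math.sqrt(squared_diffs / (n - 1))  # divide by N - 1

print(population_sd, statistics.pstdev(data))
print(sample_sd, statistics.stdev(data))
```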
Weighted mean standard deviation
Square root of [Σx^2p − μ^2]
μ = expected value = Σxp
What is the “Bell Curve” or Normal Distribution?
The Normal Distribution has:
1. mean = median = mode
2. symmetry about the center
3. 50% of values less than the mean and 50% greater than the mean
4. Standard deviations:
68% of values are within 1 standard deviation of the mean
95% of values are within 2 standard deviations of the mean
99.7% of values are within 3 standard deviations of the mean
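The 68 / 95 / 99.7 figures can be checked numerically. This is a minimal sketch using statistics.NormalDist (Python 3.8 or later); the helper name share_within is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of values within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The printed shares are slightly more precise than the rounded percentages on the card.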
“Standard Score”, “sigma” or “z-score”.
The number of standard deviations from the mean
z = (x − μ) / σ
z is the “z-score” (Standard Score)
x is the value to be standardized
μ (“mu”) is the mean
σ (“sigma”) is the standard deviation
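A short sketch of the z-score formula. The function name z_score and the test-score numbers are assumptions made for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Assumed example: a score of 75 when the mean is 70 and the standard deviation is 5
z = z_score(75, 70, 5)
print(z)
```

A negative result would mean the value is below the mean.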
Correlation
When two sets of data are strongly linked together we say they have a High Correlation.
Correlation can have a value:
1 is a perfect positive correlation
0 is no correlation (the values don’t seem linked at all)
-1 is a perfect negative correlation
“Correlation Is Not Causation” - 4 reasons why
What it really means is that a correlation does not prove one thing causes the other:
One thing might cause the other
The other might cause the first to happen - simultaneous reverse dependence
They may be linked by a different thing - hidden third variable
Or it could be random chance - spurious
Pearson’s Correlation formula
Step 1: Find the mean of x, and the mean of y
Step 2: Subtract the mean of x from every x value (call them “a”), and subtract the mean of y from every y value (call them “b”)
Step 3: Calculate: ab, a^2 and b^2 for every value
Step 4: Sum up ab, sum up a^2 and sum up b^2
Step 5: Divide the sum of ab by the square root of [(sum of a^2) × (sum of b^2)]
r = (nΣxy − ΣxΣy) / square root of [ (nΣx^2 − (Σx)^2) × (nΣy^2 − (Σy)^2) ]
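The five steps and the single-line sum formula should give the same r. The sketch below runs both on a small made-up data set; the variable names are illustrative only.

```python
import math

xs = [1, 2, 3, 4, 5]  # made-up example data
ys = [2, 4, 5, 4, 5]
n = len(xs)

# Steps 1-2: subtract the means to get the "a" and "b" values
mean_x = sum(xs) / n
mean_y = sum(ys) / n
a = [x - mean_x for x in xs]
b = [y - mean_y for y in ys]

# Steps 3-4: sum up ab, a^2 and b^2
sum_ab = sum(ai * bi for ai, bi in zip(a, b))
sum_a2 = sum(ai ** 2 for ai in a)
sum_b2 = sum(bi ** 2 for bi in b)

# Step 5: divide
r_steps = sum_ab / math.sqrt(sum_a2 * sum_b2)

# Same value from the raw sums
sx, sy = sum(xs), sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))
sx2 = sum(x * x for x in xs)
sy2 = sum(y * y for y in ys)
r_sums = (n * sxy - sx * sy) / math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

print(round(r_steps, 4), round(r_sums, 4))
```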
Bayes Theorem
“AB AB AB” then remember to group it like: “AB = A * BA / B”
P(A|B) = P(A) * P(B|A) / P(B)
Which tells us: how often A happens given that B happens, written P(A|B),
When we know: how often B happens given that A happens, written P(B|A)
and how likely A is on its own, written P(A)
and how likely B is on its own, written P(B)
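Plugging numbers into the formula makes the grouping easier to remember. The three input probabilities below are made-up values chosen only for illustration.

```python
from fractions import Fraction

# Assumed example inputs
p_a = Fraction(1, 10)         # P(A): how likely A is on its own
p_b = Fraction(1, 5)          # P(B): how likely B is on its own
p_b_given_a = Fraction(1, 2)  # P(B | A): how often B happens given A

# Bayes' Theorem: P(A | B) = P(A) * P(B | A) / P(B)
p_a_given_b = p_a * p_b_given_a / p_b
print(p_a_given_b)
```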