Probability & Statistics Flashcards
What are probability + statistics about?
Both deal with analysing situations where there are lots of related events.
What is probability?
Probability asks how likely it is that a certain event will happen.
What is statistics?
Statistics asks how to summarise all the events that happened.
What is the probability of an event happening?
The probability of an event happening is a real number in the range 0 (won’t happen) to 1 (will happen). If there are N trials + the event typically occurs M times, the probability of the event is M/N.
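As a rough illustration of M/N, the Python sketch below counts how often an event occurs in a list of trial outcomes. The coin-toss data and the name relative_frequency are made up for this example.

```python
from fractions import Fraction

def relative_frequency(trials, event):
    """Return M/N: the fraction of the N trials in which `event` holds."""
    n = len(trials)
    m = sum(1 for outcome in trials if event(outcome))
    return Fraction(m, n)

# Hypothetical data: 10 coin tosses, "H" for heads and "T" for tails.
trials = ["H", "T", "H", "H", "T", "T", "H", "T", "H", "T"]
p_heads = relative_frequency(trials, lambda outcome: outcome == "H")
print(p_heads)  # 1/2
```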
How do you represent situations which won’t happen?
To represent the situation where A doesn’t happen, use Ā. In a Venn diagram, the area of the circle for A represents the probability of A happening. The total probability of all events is 1, so the probability of A not happening is p(Ā) = 1 – p(A).
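A minimal Python sketch of the complement rule, assuming A is "a fair six-sided die shows a 6" (an example chosen here, not taken from the cards):

```python
from fractions import Fraction

# Assumed example: A is "a fair six-sided die shows a 6".
p_a = Fraction(1, 6)
p_not_a = 1 - p_a  # p(Ā) = 1 - p(A)

print(p_not_a)        # 5/6
print(p_a + p_not_a)  # 1
```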
How do you calculate events which are equally likely to happen?
If there are N equally likely events which can happen during a trial, then the probability of any one of them is 1/N.
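The 1/N rule can be sketched in Python as follows. The helper equally_likely and the fair-die example are assumptions made for illustration.

```python
from fractions import Fraction

def equally_likely(outcomes):
    """Assign probability 1/N to each of N equally likely outcomes."""
    n = len(outcomes)
    return {outcome: Fraction(1, n) for outcome in outcomes}

# Assumed example: the six faces of a fair die.
die = equally_likely([1, 2, 3, 4, 5, 6])
print(die[3])             # 1/6
print(sum(die.values()))  # 1
```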
How do you calculate mutually exclusive events?
If A + B are mutually exclusive (can’t both happen), then p(both) = 0, and adding their probabilities gives the probability of at least 1 happening: p(A or B) = p(A) + p(B).
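One way to check the addition rule is to count outcomes directly. The sketch below assumes a single roll of a fair die, with A = "shows 1" and B = "shows 2"; the helper prob is hypothetical and only valid when all outcomes are equally likely.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]

def prob(event):
    """Probability of `event` by counting equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def a(o):
    return o == 1  # A: the die shows 1

def b(o):
    return o == 2  # B: the die shows 2

p_both = prob(lambda o: a(o) and b(o))
p_either = prob(lambda o: a(o) or b(o))

print(p_both)                         # 0
print(p_either == prob(a) + prob(b))  # True
```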
How do you calculate independent events?
If A + B are independent events, whether A occurs has no effect on whether B occurs. The probability that both occur is p(A) * p(B). The probability that at least one occurs is p(A) + p(B) – p(A) * p(B).
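Both formulas for independent events can be checked by enumerating outcomes. This sketch assumes a fair coin tossed together with a fair die, with A = "coin shows heads" and B = "die shows 6"; the example and the helper prob are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Assumed example: toss a fair coin and roll a fair die together.
outcomes = list(product(["H", "T"], [1, 2, 3, 4, 5, 6]))

def prob(event):
    """Probability of `event` by counting equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def a(o):
    return o[0] == "H"  # A: the coin shows heads

def b(o):
    return o[1] == 6    # B: the die shows 6

p_a, p_b = prob(a), prob(b)
p_both = prob(lambda o: a(o) and b(o))
p_either = prob(lambda o: a(o) or b(o))

print(p_both == p_a * p_b)                # True
print(p_either == p_a + p_b - p_a * p_b)  # True
print(p_both, p_either)                   # 1/12 7/12
```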
How do you test the probability of B if A already happened?
What about if A didn’t happen?
p(B | A)
p(B | Ā)
How do you test the probability of B if A already happened + A + B are mutually exclusive?
What about if they’re independent?
p(B | A) = 0. If A happened, B can’t.
p(B | A) = p(B).
What is the case if 2 events are dependent?
The first affects the second, i.e. p(B | A) ≠ p(B).
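The conditional cases can be illustrated by counting: among the outcomes where A holds, take the fraction where B also holds. The sketch below assumes one roll of a fair die with equally likely outcomes; the events and the helper cond_prob are chosen for this example only.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]

def cond_prob(b, a):
    """p(B | A): among outcomes where A holds, the fraction where B also holds."""
    given = [o for o in outcomes if a(o)]
    return Fraction(sum(1 for o in given if b(o)), len(given))

def prob(event):
    """Unconditional probability: condition on an event that always holds."""
    return cond_prob(event, lambda o: True)

def is_even(o):
    return o % 2 == 0

def is_odd(o):
    return o % 2 == 1

def over_three(o):
    return o > 3

def at_most_two(o):
    return o <= 2

def is_one(o):
    return o == 1

# Dependent: p(B | A) differs from p(B).
print(cond_prob(over_three, is_even), prob(over_three))    # 2/3 1/2
# Conditioning on A not happening (odd is the complement of even).
print(cond_prob(over_three, is_odd))                       # 1/3
# Independent: p(B | A) equals p(B).
print(cond_prob(at_most_two, is_even), prob(at_most_two))  # 1/3 1/3
# Mutually exclusive: p(B | A) is 0.
print(cond_prob(is_one, is_even))                          # 0
```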
What is a central measure?
Typical result.
What is the mean of a sequence of values?
If s is a sequence of values, the mean of s satisfies len(s) * mean(s) = sum of s(i) for i = 1 to len(s). The mean is affected by extreme values, even if there are only a few.
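A short Python sketch of the mean, using made-up data to show how one extreme value shifts the result:

```python
def mean(s):
    """Mean of a non-empty sequence s, so that len(s) * mean(s) == sum(s)."""
    return sum(s) / len(s)

values = [2, 3, 3, 4, 5]       # hypothetical data
with_outlier = values + [100]  # same data plus one extreme value

print(mean(values))        # 3.4
print(mean(with_outlier))  # 19.5
```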
How do you calculate the median?
The middle value. It is less affected by extreme values than the mean. At least half of the sequence elements are less than/equal to it + at least half are greater than/equal to it. The median M must satisfy (card({ i | i ∈ inds(s) · s(i) ≤ M }) ≥ len(s) / 2) ∧ (card({ i | i ∈ inds(s) · s(i) ≥ M }) ≥ len(s) / 2)
To find the median, sort the sequence into ascending order + choose the middle value. If the sequence has an even number of elements, there is no central element, so take the mean of the 2 central elements.
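The sort-and-pick procedure can be sketched in Python like this. The function assumes a non-empty sequence of numbers, and the sample data is made up.

```python
def median(s):
    """Median of a non-empty sequence: sort, then take the middle value(s)."""
    ordered = sorted(s)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd length: there is a single central element.
        return ordered[mid]
    # Even length: take the mean of the 2 central elements.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([5, 1, 3]))          # 3
print(median([4, 1, 3, 2]))       # 2.5
print(median([2, 3, 3, 4, 100]))  # 3
```

The last call shows that a single extreme value (100) does not move the median.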
How do we calculate the mode?
The most common value. The mode M must satisfy: ∀ i ∈ inds(s) · card({ x | x ∈ inds(s) · s(x) = s(i) }) ≤ card({ z | z ∈ inds(s) · s(z) = M })
To find the mode, sort s into ascending order + find which value occurs the greatest number of times.
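A possible Python sketch of the sort-then-count approach, using itertools.groupby to measure each run of equal values. Returning the smallest value when there is a tie is a choice made for this example, not something the cards specify.

```python
from itertools import groupby

def mode(s):
    """Return a most common value of s (the smallest one if there is a tie)."""
    best_value, best_count = None, 0
    # After sorting, equal values sit next to each other, so each group is one run.
    for value, run in groupby(sorted(s)):
        count = len(list(run))
        if count > best_count:
            best_value, best_count = value, count
    return best_value

print(mode([3, 1, 2, 3, 2, 3]))  # 3
print(mode([1, 2, 2, 1]))        # 1 (tie between 1 and 2)
```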