Probability And Statistics - Key Words Flashcards
What do we use probability and statistics to do?
1) Collect and organise data
2) Explore descriptive relationships (this will bring together two different variables)
3) Investigate causal relationships (to see if there is a significant link between an underlying cause and the produced results)
When is it appropriate to use statistics?
When there is 1) a large number of 2) similar processes or phenomena (i.e. when something is repeated many times).
What is a random experiment?
A random experiment is a process that leads to the occurrence of one and only one of several distinct possible results, and which can, in principle, be replicated.
This is an experiment because it can be replicated.
What is the outcome of an experiment?
This is one of the distinct possible results of an experiment.
When conducting a random experiment, we assume that we know all of its possible outcomes. This excludes exploratory experiments, which can lead to unexpected results.
What is a sample space?
A sample space is the collection of all the possible outcomes of an experiment. It is denoted by the Greek letter Omega (Ω).
What is the complement of an event?
This is the event that X does not occur: it contains every outcome in the sample space that is not in X. It is denoted by a tilde in front of the X (~X), or a bar on top of the X.
Remember that the complement of an event is also an event
What is the complement of the whole sample space?
This is called the null or empty set, but it is still an event.
What is an event?
An event is a collection of one or more outcomes, or the null set.
How can we combine events?
They can either be combined in union (meaning ‘or’), or they can be combined in intersection (meaning ‘and’).
The result of either combination is itself an event, which we can call event C.
This extends to more than two events: union and intersection are associative, so the events can be combined in any order.
What are mutually exclusive events?
If the intersection of two events is the null or empty set, then these events are mutually exclusive, also known as disjoint.
What are collectively exhaustive events?
Two or more events are collectively exhaustive if their union is the whole sample space.
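The event operations above can be sketched with Python sets. This is only an illustration: the die-roll sample space and the event names are assumptions chosen for the example, not part of the original cards.

```python
# Sample space for one roll of a six-sided die (an illustrative assumption).
omega = {1, 2, 3, 4, 5, 6}

even = {2, 4, 6}      # event: the roll is even
odd = {1, 3, 5}       # event: the roll is odd
at_most_2 = {1, 2}    # event: the roll is 1 or 2


def complement(event, sample_space):
    """Outcomes of the sample space that are not in the event."""
    return sample_space - event


union = even | at_most_2          # "even OR at most 2"
intersection = even & at_most_2   # "even AND at most 2"

# Disjoint events share no outcomes; exhaustive events cover omega.
mutually_exclusive = (even & odd) == set()
collectively_exhaustive = (even | odd) == omega

print(union, intersection)
print(complement(even, omega), complement(omega, omega))
print(mutually_exclusive, collectively_exhaustive)
```

Here the complement of the whole sample space comes out as the empty set, matching the card on the null event.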
What is the definition of probability?
This is the assignment of a number P(A) to each event A, which must abide by three conditions.
What 3 conditions must probability abide by?
1) The probability must be greater than or equal to zero for any event A.
2) The probability of the whole sample space must be 1
3) The probability of the union of mutually exclusive events will equal the sum of their individual probabilities.
As a result, all probabilities will lie between 0 and 1 (inclusive).
What are the three approaches to probability?
1) Classical probability approach
2) Empirical/relative frequency probability approach
3) Subjective probability approach
We must calculate and interpret the probability based on the context; different approaches will be necessary at different times.
What is the classical probability approach?
If a random experiment can result in n mutually exclusive and equally likely outcomes and n_a of these outcomes have attribute A, then the probability of A occurring is the fraction n_a/n.
However, the outcomes of a random experiment won’t always be equally likely, as shown by Raphael Weldon when he threw 12 dice over 26,000 times.
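As a rough sketch of the classical calculation n_a/n, the snippet below counts outcomes for two dice. It assumes the dice are fair (so all 36 ordered pairs are equally likely), and the event "the sum is 7" is just an illustrative choice.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (d1, d2) for two dice; assumed equally likely.
outcomes = list(product(range(1, 7), repeat=2))
n = len(outcomes)

# Outcomes with attribute A: the two dice sum to 7.
n_a = sum(1 for d1, d2 in outcomes if d1 + d2 == 7)

p_a = Fraction(n_a, n)
print(n_a, n, p_a)  # 6 36 1/6
```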
What is the empirical/relative frequency probability approach?
The probability of an event is the fraction of times that it has occurred in the past under the same experiment, if it has been repeated a large number of times.
However, not all experiments can be repeated (e.g. the probability that the UK will have left the EU by 2020). This leads to the subjective probability approach.
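For an experiment that can be repeated, the relative frequency idea can be sketched by simulation. The helper names (relative_frequency, roll_die) are hypothetical, and the simulated die is assumed to be fair; with a real die the estimate would come from recorded throws instead.

```python
import random


def relative_frequency(event, experiment, repetitions, seed=0):
    """Fraction of repetitions of the experiment in which the event occurred."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(repetitions) if event(experiment(rng)))
    return hits / repetitions


def roll_die(rng):
    """One repetition of the experiment: roll a (simulated) fair die."""
    return rng.randint(1, 6)


estimate = relative_frequency(lambda roll: roll == 6, roll_die, 100_000)
print(estimate)  # should be close to 1/6 for a large number of repetitions
```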
What is a subjective probability approach?
This is when the probability of an event is assigned by an individual on the basis of his or her beliefs and information. An individual with different beliefs or information may assign a different probability.
There is no restriction on where these beliefs could come from.
Why does applied scientific research impact our experiments?
We try to use scientific research to predict uncertain outcomes and hence reduce the element of randomness. We do this by trying to understand how an event has occurred.
How do we deal with complex experiments?
We can break them down into smaller experiments.
What does it mean to enumerate or list a complex experiment?
Let the outcomes be denoted by ordered pairs (D1, D2).
Then treat the complex experiment as a sequence of smaller experiments, taken in order.
What is the multiplication rule?
If the outcomes of a random experiment can be represented by an ordered n-tuple, with the first component any of K1 outcomes, the second any of K2 outcomes, etc., then the total number of possible outcomes will be K1 x K2 x … x Kn.
This applies to both sampling with and without replacement.
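A small sketch of the multiplication rule, listing the ordered tuples and comparing the count with K1 x K2 x … x Kn. The component experiments (a coin, a die, a card suit, and drawing from 5 labelled objects) are illustrative assumptions.

```python
from itertools import permutations, product

coin = ["H", "T"]                                 # K1 = 2
die = [1, 2, 3, 4, 5, 6]                          # K2 = 6
suit = ["clubs", "diamonds", "hearts", "spades"]  # K3 = 4

# Every outcome is an ordered 3-tuple (coin, die, suit).
outcomes = list(product(coin, die, suit))
print(len(outcomes), 2 * 6 * 4)  # 48 48

# Without replacement: drawing 2 of 5 objects in order gives K1 = 5, K2 = 4.
draws = list(permutations(range(5), 2))
print(len(draws), 5 * 4)  # 20 20
```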
What are permutations?
These are the outcomes when sampling r objects from a set of n different objects, without replacement and where the order they’re in matters. There will be n!/(n-r)! different outcomes.
Remember 0! = 1.
What are combinations?
Combinations occur when sampling r objects from a set of n different objects, without replacement and where the order doesn’t matter.
There will be n!/[r! (n-r)!] different outcomes, which is equal to the number of permutations divided by r!.
What formula should be used to work out the number of outcomes when order matters, and there is no replacement?
n!/(n-r)!,
where r is the number of objects sampled (e.g. the quantity of numbers drawn in the lottery) and n is the number of possible objects to choose from.
What formula should you use to see the number of possible outcomes when the order does matter and there’s replacement?
n^r, where n is the range of numbers you can select, and r is the quantity of numbers needed to be picked.
What formula should be used to see the number of outcomes when you have no replacement, and order doesn’t matter?
n! / [r! (n-r)!], these are called combinations.
What formula should be used when sampling r objects from n different objects, where there is replacement and order doesn’t matter?
(n + r - 1)!/[r! (n-1)!]
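The four counting formulas can be checked against brute-force enumeration. This is a sketch with n = 5 and r = 3 picked arbitrarily; math.perm and math.comb need Python 3.8 or later.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, factorial, perm

n, r = 5, 3
objects = range(n)

# Order matters, no replacement: n!/(n-r)!
ordered_no_repl = perm(n, r)
# Order matters, with replacement: n^r
ordered_repl = n ** r
# Order doesn't matter, no replacement: n!/[r! (n-r)!]
unordered_no_repl = comb(n, r)
# Order doesn't matter, with replacement: (n + r - 1)!/[r! (n-1)!]
unordered_repl = factorial(n + r - 1) // (factorial(r) * factorial(n - 1))

# Compare each formula with the number of outcomes actually listed.
print(ordered_no_repl, len(list(permutations(objects, r))))                   # 60 60
print(ordered_repl, len(list(product(objects, repeat=r))))                    # 125 125
print(unordered_no_repl, len(list(combinations(objects, r))))                 # 10 10
print(unordered_repl, len(list(combinations_with_replacement(objects, r))))   # 35 35
```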
What is conditional probability?
This is the probability of an event A happening, given that event B has already occurred.
Essentially, event B is the new sample space.
Key facts about a set of playing cards…
-52 cards
-13 denominations (2 to 10, Jack, Queen, King, Ace)
-4 suits: Diamonds and hearts are red, clubs and spades are black
What is the probability multiplication rule?
P(AnB) = P(B) x P(A|B) = P(A) x P(B|A).
Remember, this comes from P(A|B) = P(AnB)/P(B), where the denominator P(B) cannot equal zero.
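A sketch of conditional probability and the multiplication rule on a standard deck, treating every card as equally likely. The events (A = "king", B = "face card") and the helper prob are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))  # 52 (rank, suit) pairs


def prob(event, sample_space):
    """Classical probability: outcomes in the event over outcomes in the space."""
    return Fraction(len(event), len(sample_space))


king = {card for card in deck if card[0] == "K"}              # event A
face = {card for card in deck if card[0] in {"J", "Q", "K"}}  # event B

p_a_and_b = prob(king & face, deck)
p_a_given_b = prob(king & face, face)  # B acts as the new sample space
p_b_given_a = prob(king & face, king)  # A acts as the new sample space

print(p_a_given_b)                       # 1/3
print(p_a_and_b)                         # 1/13
print(prob(face, deck) * p_a_given_b)    # 1/13
print(prob(king, deck) * p_b_given_a)    # 1/13
```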
What is the probability addition rule?
P(AuB) = P(A) + P(B) - P(AnB)
What does it mean if two events are Independent?
Two events A and B are independent if P(AnB) = P(A) x P(B).
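The addition rule and the independence condition can both be checked on a deck of cards. This is a sketch with illustrative events (A = "ace", B = "heart"); the helper prob is hypothetical and assumes every card is equally likely.

```python
from fractions import Fraction
from itertools import product

ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))


def prob(event):
    """Classical probability of an event within the 52-card deck."""
    return Fraction(len(event), len(deck))


ace = {card for card in deck if card[0] == "A"}         # event A
heart = {card for card in deck if card[1] == "hearts"}  # event B

# Addition rule: P(AuB) = P(A) + P(B) - P(AnB)
p_union = prob(ace | heart)
p_by_rule = prob(ace) + prob(heart) - prob(ace & heart)
print(p_union, p_by_rule)  # 4/13 4/13

# Independence: P(AnB) = P(A) x P(B)
independent = prob(ace & heart) == prob(ace) * prob(heart)
print(independent)  # True
```

Subtracting P(AnB) stops the ace of hearts from being counted twice.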
What is the equation for Bayes’ theorem?
If events A1, A2, …, An are mutually exclusive and collectively exhaustive, and Ai is any one of these events:
P(Ai | B) = P(Ai)P(B|Ai) / [P(A1)P(B|A1) + P(A2)P(B|A2) + … + P(An)P(B|An)]
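A minimal sketch of the formula in code. The function name bayes and all the numbers are hypothetical: imagine three machines A1, A2, A3 producing 50%, 30% and 20% of items, with defect rates of 1%, 2% and 3%, and B the event "the item is defective".

```python
from fractions import Fraction


def bayes(priors, likelihoods):
    """Return P(Ai | B) for every i, given the P(Ai) and the P(B | Ai)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(Ai) x P(B|Ai)
    p_b = sum(joint)  # the denominator: total probability of B
    return [j / p_b for j in joint]


# Hypothetical inputs; the priors must sum to 1 (exhaustive, exclusive events).
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 100), Fraction(2, 100), Fraction(3, 100)]

posteriors = bayes(priors, likelihoods)
print(posteriors)  # [Fraction(5, 17), Fraction(6, 17), Fraction(6, 17)]
```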
What is a random variable?
A variable X will be a random variable (r.v.) if its values can be thought of as determined by the outcomes of a random experiment.
This means that the value of X will depend on which outcome we get.
What is a discrete random variable?
This is a random variable with values determined by a finite or a countably infinite number of events.
What is a continuous random variable?
A continuous random variable is a random variable with values determined by an infinite number of events which are not countable, for example drawing a random real number between 0 and 1.
What is a probability mass function?
If X is a discrete random variable, and S is the set of values determined by outcomes of the random experiment, then the function f(x) is a probability mass function if:
-f(x) >= 0 for all real numbers x
-summing f(x) over all the values that have a positive probability of occurrence gives 1.
There may also be values of x that aren’t determined by the outcomes of the random experiment (values not in S), and for these f(x) will be zero. When this occurs, we think of f(x) = 0 as implying that the value x has no probability of occurring.
How do probability mass functions (pmf) assign probabilities?
P(X=x) = f(x).
This means the probability that the random variable X takes the value x is equal to the pmf evaluated at x.
What is a probability density function (pdf)?
A probability density function is a function f(x) of a continuous random variable such that f(x) >= 0 for all real values of x, and the definite integral of f(x) from -infinity to infinity is 1.
How do cumulative distribution functions (cdf) work?
F(x) = P(X <= x)
As it is cumulative, F(x) adds up the probability of every value less than or equal to x, so it is always at least as high as the probability P(X = x) of the current value, and it never decreases as x increases.
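For a discrete random variable, the pmf and cdf can be sketched as below. The example assumes X is the score on a fair die; the function names f and F simply mirror the notation on the cards.

```python
from fractions import Fraction

# pmf of a fair die (an illustrative assumption): each face has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def f(x):
    """P(X = x); zero for any value the experiment cannot produce."""
    return pmf.get(x, Fraction(0))


def F(x):
    """P(X <= x): add up the pmf over every value not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


print(f(3), f(7))          # 1/6 0
print(F(0), F(3), F(6))    # 0 1/2 1
print(F(3.5))              # 1/2 (the cdf is flat between possible values)
```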
What is the expectation of random variables?
E(X) = sum of xi x f(xi)
In words, it is the weighted average of the values of X, where the weights are the corresponding values of the pmf. It is a measure of the central tendency of the random variable.
The probability that X equals its expected value could be zero; the expected value only describes the point around which the probability is distributed.
What is the variance?
V(X) = sum of (xi - mu)^2 x f(xi), where mu = E(X)
In words, the variance of a random variable is the weighted average of the squared deviations of its values from the mean, where the weights are the corresponding values of the pmf (their probabilities).
The standard deviation is the positive square root of the variance.
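A short sketch of the expectation, variance and standard deviation of a discrete random variable, again assuming X is the score on a fair die (an illustrative choice).

```python
from fractions import Fraction
from math import sqrt

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die (assumption)

# E(X): values weighted by their probabilities.
mean = sum(x * p for x, p in pmf.items())

# V(X): squared deviations from the mean, weighted by their probabilities.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Standard deviation: positive square root of the variance.
std_dev = sqrt(variance)

print(mean, variance)  # 7/2 35/12
print(std_dev)
```

Note that E(X) = 7/2 is not a value the die can show, so P(X = E(X)) = 0 here.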
What is the equation for the variance of a continuous random variable?
V(X) = the definite integral from -infinity to infinity of (x - mu)^2 x f(x), where mu = E(X).
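The continuous case can be approximated numerically. This is only a sketch: the pdf f(x) = 2x on [0, 1] is a made-up example, and integrate is a hypothetical helper using the midpoint rule. Because this pdf is zero outside [0, 1], integrating over [0, 1] is the same as integrating from -infinity to infinity.

```python
def integrate(func, a, b, n=10_000):
    """Approximate the definite integral of func over [a, b] (midpoint rule)."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def pdf(x):
    """Illustrative pdf: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0


total = integrate(pdf, 0, 1)                                    # about 1
mean = integrate(lambda x: x * pdf(x), 0, 1)                    # about 2/3
variance = integrate(lambda x: (x - mean) ** 2 * pdf(x), 0, 1)  # about 1/18

print(total, mean, variance)
```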
What is the median of a random variable?
The median of a random variable is the point m that splits the distribution of X into two parts which overlap at the median: P(X <= m) and P(X >= m). Each part will have a probability of at least 0.5.
If P(X = m) = 0, then both parts of the distribution have a probability of exactly 0.5. This will always be the case for a continuous random variable, and may (but not necessarily) apply to a discrete random variable.
-Unlike the mean, the median doesn’t necessarily have to be a unique value.
What is the difference between the mean and median?
The mean is the constant c that minimises E[(X - c)^2], and the median is the constant c that minimises E(|X - c|).
From this, we can see that the mean will be more affected by extremes, because it squares the prediction error, whereas the median only uses the absolute value of the error, so large errors are not magnified. This means when large anomalies are present, the median will be the more representative measure.
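The two minimisation properties can be checked numerically. This sketch assumes five equally likely values, one of them an extreme value, and searches a grid of candidate constants c; the data and the grid are illustrative.

```python
from statistics import mean, median

values = [1, 2, 3, 4, 100]  # equally likely values, with one extreme (assumption)


def mean_squared_error(c):
    """E[(X - c)^2] when each value is equally likely."""
    return sum((x - c) ** 2 for x in values) / len(values)


def mean_absolute_error(c):
    """E(|X - c|) when each value is equally likely."""
    return sum(abs(x - c) for x in values) / len(values)


candidates = [i / 2 for i in range(201)]  # 0, 0.5, 1, ..., 100

best_squared = min(candidates, key=mean_squared_error)
best_absolute = min(candidates, key=mean_absolute_error)

print(best_squared, mean(values))     # 22.0 22
print(best_absolute, median(values))  # 3.0 3
```

The single value of 100 drags the mean up to 22, while the median stays at 3.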
How can we get the expected value of a function of a random variable?
If we let a random variable Y = g(X) be a function of X, where X has a pmf or pdf f(x), then Y is also a random variable. Therefore, E(Y):
-If discrete: E(Y) = sum of g(x) x f(x).
-If continuous: E(Y) = the integral of g(x) x f(x).
We have seen an example of this before: the variance is E[g(X)] with g(x) = (x - mu)^2.
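A sketch of E[g(X)] for a discrete random variable. The helper name expectation is hypothetical, and X is assumed to be the score on a fair die.

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die (assumption)


def expectation(g, pmf):
    """E[g(X)]: sum of g(x) x f(x) over the values of X."""
    return sum(g(x) * p for x, p in pmf.items())


mu = expectation(lambda x: x, pmf)                   # E(X)
e_x_squared = expectation(lambda x: x ** 2, pmf)     # E(X^2)
variance = expectation(lambda x: (x - mu) ** 2, pmf)  # g(x) = (x - mu)^2

print(mu, e_x_squared, variance)  # 7/2 91/6 35/12
print(mu ** 2)                    # 49/4
```

In general E[g(X)] is not g(E(X)): here E(X^2) = 91/6, while E(X)^2 = 49/4.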
Why would a grade received in an exam be a random variable?
This is because the mark depends on random factors for each student that affect their ability to do well on the exam (e.g. sleep quality the night before, cognitive ability, time spent revising, etc.).
How can we check if there is a causal relationship between two random variables?
To do this, we can use random variable techniques, so we can see what would’ve happened if something else hadn’t happened.
To see if one variable is associated with another, we can use conditional probabilities, and see what the distribution of, say, grades in 2nd year maths would be, given you get a specific grade in 1st year maths.
What are the issues we still have when using conditional probabilities to see causal relationships between two random variables?
1) We cannot hope to easily summarise the relationship between the two marks.
2) We cannot hope to derive features of the relationship between these two variables to compare it to the relationship between other random variables.
What is joint relative frequency?
This is the joint probability of two random variables taking particular values at the same time, and this can be displayed in a table.
What is the joint probability mass function?
If you have two discrete random variables X and Y, then there will be a function f(x, y) such that:
1) f(x, y) >= 0 for all pairs of real numbers (x, y).
2) the double summation of f(x, y) over all the pairs of values of X and Y will equal 1. To see how to do double summation, look at lecture slides 4.
The probability assignment is P(X = x, Y = y) = f(x, y).
What is the joint probability density function?
For continuous random variables X and Y, this is a function f(x,y) such that:
-f(x,y) is non-negative for all pairs of real numbers (x,y)
-The double integral of f(x,y) over all x and y is equal to 1. Bear in mind, as this is a 2 variable function, the function will make a 3D graph. The probability will be represented by the volume under the p.d.f surface.
What is joint cumulative distribution?
F(x, y) = P(X <= x, Y <= y). We add up all the joint probabilities which satisfy this constraint for both variables, by looking at the joint frequency table.
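A sketch of the joint cdf computed from a joint table. The table below is entirely made up for illustration (X takes values 0 and 1, Y takes values 0, 1 and 2), and the function name joint_cdf is hypothetical.

```python
from fractions import Fraction

# Hypothetical joint pmf table: keys are (x, y) pairs, values are f(x, y).
joint_pmf = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10), (0, 2): Fraction(1, 10),
    (1, 0): Fraction(2, 10), (1, 1): Fraction(3, 10), (1, 2): Fraction(1, 10),
}


def joint_cdf(x, y):
    """F(x, y) = P(X <= x, Y <= y): sum every cell satisfying both constraints."""
    return sum(
        (p for (xi, yi), p in joint_pmf.items() if xi <= x and yi <= y),
        Fraction(0),
    )


print(joint_cdf(0, 1))  # 3/10
print(joint_cdf(1, 1))  # 4/5
print(joint_cdf(1, 2))  # 1
```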
What are marginal distributions of discrete random variables?
When we have a joint distribution, we can derive the probability mass functions of each of the random variables that are jointly distributed.
From the joint probability mass function f(x,y) we can derive the pmfs of X and Y, denoted f1(x) and f2(y), as they are an ordered pair, with X first and Y second.
How would we find f1(x) of a marginal distribution function?
1) For a given value of x, find f(x,y) for all the values of y such that f(x,y) is strictly positive.
2) Sum these strictly positive joint probabilities over y; this gives f1(x). Repeat for every value of x.
3) Once all the values of f1(x) are summed, this should equal 1; it may not if the decimals have been rounded early.
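The steps above can be sketched as follows. The joint table is made up for illustration, and the helper name marginal is hypothetical; exact fractions are used so that rounding does not stop the marginals summing to 1.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint pmf table: keys are (x, y) pairs, values are f(x, y).
joint_pmf = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10), (0, 2): Fraction(1, 10),
    (1, 0): Fraction(2, 10), (1, 1): Fraction(3, 10), (1, 2): Fraction(1, 10),
}


def marginal(joint, axis):
    """Sum the joint pmf over the other variable (axis 0 gives f1, axis 1 gives f2)."""
    result = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for pair, p in joint.items():
        if p > 0:  # only strictly positive cells contribute
            result[pair[axis]] += p
    return dict(result)


f1 = marginal(joint_pmf, axis=0)  # pmf of X
f2 = marginal(joint_pmf, axis=1)  # pmf of Y

print(f1)  # {0: Fraction(2, 5), 1: Fraction(3, 5)}
print(f2)  # {0: Fraction(3, 10), 1: Fraction(1, 2), 2: Fraction(1, 5)}
print(sum(f1.values()), sum(f2.values()))  # 1 1
```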
What is the joint probability mass function?
If X and Y are discrete random variables, then it is the function f(x, y) such that:
-f(x,y) >= 0 for all pairs of real numbers (x, y).
-the double summation of the joint probabilities f(x,y) over all pairs (x, y) adds to one. To see how to do this, refer to lecture notes.