new Flashcards by Caelen Burand

Associative Law for Union and Intersection

(EUF)UG = EU(FUG)

(E or F) or G = E or (F or G)

(EF)G = E(FG)

(E and F) and G = E and (F and G)

Chebyshev's inequality

If we want to know how much of a dataset lies within x̄ ± ks, where x̄ = sample mean, s = sample standard deviation and k = some number, then

% min = 100(1 - 1/k²)

This requires k > 1; for k ≤ 1 the bound is 0% or less, so it tells us nothing
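
A minimal sketch of the bound, assuming we only need the guaranteed minimum percentage for a chosen k (the function name is made up for illustration):

```python
def chebyshev_min_percent(k):
    """Minimum percentage of data within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the formula gives 0 or a negative number: no information.
        return 0.0
    return 100 * (1 - 1 / k**2)


within_2s = chebyshev_min_percent(2)  # at least 75% of the data
within_3s = chebyshev_min_percent(3)  # at least about 88.9% of the data
```

The bound holds for any dataset, which is why it is much looser than what a bell-shaped dataset would actually show.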

Commutative law for union/intersection

E U F = F U E

Event E or F = Event F or E

EF=FE

Event E and F = Event F and E

Complement

If we have an event, E, within the sample space, S, then E^c = complement of E and includes every outcome in S that is not in E

Correlation Coefficient

r = [Σ (x_i - x̄)(y_i - ȳ)] / [(n-1) s_x s_y] = [Σ (x_i - x̄)(y_i - ȳ)] / [Σ (x_i - x̄)² Σ (y_i - ȳ)²]^.5

This says that if we have a paired dataset with pairs (x_i, y_i), sample means x̄ and ȳ, and sample standard deviations s_x and s_y, then this statistic indicates how closely the pairs follow a straight line y = mx + b. r always lies between -1 and 1
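
A short sketch of the calculation, using the sum-of-squares form of the formula. The function name and the data are made up for illustration:

```python
import math


def correlation(xs, ys):
    """Sample correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: square root of (sum of squared x deviations)
    # times (sum of squared y deviations).
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)                            # roughly 0.77
r_line = correlation([1, 2, 3], [2, 4, 6])         # points exactly on a line
```

The (n-1) factors cancel between numerator and denominator, which is why the second form of the formula needs no n at all.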

Cumulative Frequency

This shows, for each bin, the running total of the frequencies of that bin and all bins before it.

Plots of cumulative frequency are also called ogives
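
As a small illustration, the running total can be built with the standard library. The bins and counts here are invented:

```python
from itertools import accumulate

bins = ["0-9", "10-19", "20-29", "30-39"]   # hypothetical bin labels
freqs = [3, 7, 5, 2]                        # hypothetical counts per bin

# Each entry is the count in that bin plus all earlier bins.
cum_freqs = list(accumulate(freqs))
table = list(zip(bins, freqs, cum_freqs))
```

The last cumulative value always equals the total number of data points.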

De Morgan's Laws

(EUF)^C = E^CF^C

Neither E nor F occurs = E does not occur and F does not occur

(EF)^C = E^C U F^C

E and F do not both occur = E does not occur or F does not occur
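
Both laws can be checked on a concrete sample space with Python sets. This is only a sketch; the die-roll events are chosen arbitrarily:

```python
S = set(range(1, 7))   # sample space: one roll of a die
E = {2, 4, 6}          # hypothetical event: roll is even
F = {4, 5, 6}          # hypothetical event: roll is at least 4


def complement(A):
    """Everything in the sample space S that is not in A."""
    return S - A


# (E U F)^c = E^c F^c   (| is union, & is intersection)
law_1_holds = complement(E | F) == complement(E) & complement(F)

# (EF)^c = E^c U F^c
law_2_holds = complement(E & F) == complement(E) | complement(F)
```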

Gini Coefficient

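The Gini coefficient (G) is twice the area between the line of perfect equality, L(p) = p, and the Lorenz curve. That area has a maximum value of .5 and a minimum value of 0, so G ranges from 0 (perfect equality) to 1 (maximum inequality)

G = 1 - 2B where B = area under the Lorenz curve, L(p)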
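
A rough sketch of G = 1 - 2B, assuming the Lorenz curve is only known at a few points and B is approximated with the trapezoid rule. The points are invented:

```python
def gini_from_lorenz(p, L):
    """Approximate G = 1 - 2B, with B the trapezoid-rule area under L(p)."""
    B = sum((p[i + 1] - p[i]) * (L[i] + L[i + 1]) / 2
            for i in range(len(p) - 1))
    return 1 - 2 * B


p = [0.0, 0.25, 0.5, 0.75, 1.0]          # cumulative share of population
L_unequal = [0.0, 0.05, 0.2, 0.5, 1.0]   # hypothetical cumulative income share
L_equal = p                              # perfect equality: L(p) = p

g_unequal = gini_from_lorenz(p, L_unequal)  # about 0.375
g_equal = gini_from_lorenz(p, L_equal)      # about 0
```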

How to use the weighted probability of E?

If tasked with finding P(F|E), where E is the second (observed) event and F is the earlier event

Then P(F|E) = P(FE)/P(E)

where P(FE) = P(F)P(E|F) and P(E) = P(F)P(E|F) + P(F^c)P(E|F^c)
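
The steps above can be sketched as a small function. The numbers in the example (a rare condition F and a test result E) are invented purely for illustration:

```python
def posterior(p_f, p_e_given_f, p_e_given_not_f):
    """P(F|E) = P(FE) / P(E), with P(E) expanded as a weighted average."""
    p_fe = p_f * p_e_given_f                   # P(FE) = P(F)P(E|F)
    p_e = p_fe + (1 - p_f) * p_e_given_not_f   # P(E) = P(F)P(E|F) + P(F^c)P(E|F^c)
    return p_fe / p_e


# Hypothetical inputs: P(F) = 0.01, P(E|F) = 0.95, P(E|F^c) = 0.05
p_f_given_e = posterior(0.01, 0.95, 0.05)   # about 0.16
```

Even with a large P(E|F), the answer stays small here because F itself is rare.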

Independent events

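If P(E|F) = P(E) then E and F are independent: knowing that F occurred does not change the probability of E. Equivalently, P(EF) = P(E)P(F)

and, for three independent events, P(EFG) = P(E)P(F)P(G)

Independence typically arises when the events come from separate processes that do not influence each other, such as events in distinct, unconnected locales.

Ex: Covid cases in Mexico and Greenland on some day; two balls drawn from a jar with replacement (if the first ball is not put back, the second draw depends on the first and the draws are not independent)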
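
One way to test independence is to check P(EF) = P(E)P(F) by counting outcomes. The sketch below assumes two fair dice; the events are chosen for illustration:

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely (first die, second die) pairs.
S = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event given as a True/False test on an outcome."""
    return Fraction(sum(1 for o in S if event(o)), len(S))


def first_even(o):     # E: first die is even
    return o[0] % 2 == 0


def sum_is_seven(o):   # F: the dice sum to 7
    return o[0] + o[1] == 7


p_e = prob(first_even)
p_f = prob(sum_is_seven)
p_ef = prob(lambda o: first_even(o) and sum_is_seven(o))
independent = (p_ef == p_e * p_f)
```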

Lorenz Curve

This is a cumulative curve showing the income distribution: L(p) is the share of total income held by the lowest-earning fraction p of the population

Mean

x̄ = Σx/n = Σ(v*f)/n

where v = bin value, f = frequency of that value, and n = Σf = total number of data points
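
A quick check that the two forms agree, using invented values and frequencies:

```python
values = [1, 2, 3]   # hypothetical bin values v
freqs = [2, 3, 5]    # hypothetical frequencies f

n = sum(freqs)
mean_from_freqs = sum(v * f for v, f in zip(values, freqs)) / n

# Expand the frequency table back into raw data and average directly.
raw = [v for v, f in zip(values, freqs) for _ in range(f)]
mean_from_raw = sum(raw) / len(raw)
```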

Mean vs. Median when to use

Generally the mean gives a better description of the dataset as a whole. The median should be used when probabilities are involved and/or the value is being used to understand the ordering of a group.

Ex: Housing. The mean income would be best for determining what the average person in an area can spend on a home, but if we want to design housing that 50% of the population could afford (P(Affordable) = .5), then the median is more useful.

Median

This is the middle value of a sample when the data are arranged from least to greatest

If n is odd then the median is the value at position (n+1)/2

If n is even then the median is the average of the values at positions n/2 and (n/2)+1
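
The two cases can be sketched as follows. Positions on the card count from 1, while Python lists count from 0, hence the shifts by one:

```python
def median(data):
    """Median of a sample using the odd/even position rules."""
    s = sorted(data)
    n = len(s)
    if n % 2 == 1:
        # Odd n: value at position (n+1)/2, i.e. index (n+1)/2 - 1.
        return s[(n + 1) // 2 - 1]
    # Even n: average of positions n/2 and n/2 + 1 (indices n/2 - 1 and n/2).
    return (s[n // 2 - 1] + s[n // 2]) / 2


m_odd = median([5, 1, 3])       # sorted: 1, 3, 5
m_even = median([4, 1, 3, 2])   # sorted: 1, 2, 3, 4
```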

Number of unique groups in a set

If we have a collection of n = # of objects and we want to know how many unique combinations of size r can be made when order does not matter

= n!/[(n-r)!r!]

This says that from n objects we can choose r elements in this many distinct ways. If the outcomes of an experiment are such selections, this gives the number of possible outcomes in the sample space.

If we want the probability that a selection is made up of particular subsets of the group, and all selections are equally likely, it is given by (# of combinations from subset 1)*(# of combinations from subset 2)/(total # of combinations)
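
A worked example of the ratio of counts, assuming a made-up scenario: a committee of 5 is drawn at random from 6 men and 9 women, and we want the chance of exactly 3 men and 2 women.

```python
from math import comb

n_men, n_women, size = 6, 9, 5   # hypothetical group and committee size

ways_men = comb(n_men, 3)                # choose 3 of the 6 men
ways_women = comb(n_women, 2)            # choose 2 of the 9 women
total_ways = comb(n_men + n_women, size) # choose any 5 of the 15 people

p_3_men_2_women = ways_men * ways_women / total_ways
```

`math.comb(n, r)` computes n!/[(n-r)!r!] directly, so no factorials need to be written out.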

P(E) = ? as a weighted average

P(E) = P(E|F)P(F) + P(E|F^c)(1-P(F))

P(E) is the weighted average of the probability of E given that F has occurred and the probability of E given that F has not occurred, with weights P(F) and 1 - P(F)

This is useful when the chance of E depends on whether F has occurred.

P(E) Expansion using complements

P(E) = P(EF) + P(EF^C)

Probability of E = Prob of E and F + Probability of E and not F

P(E|F) = ?

P(E|F) = P(EF)/P(F)

Probability of E occurring given F has occurred = the probability E and F occur divided by the probability F occurs

P(E|F^c) = ? (expand)

P(E|F^C) = P(EF^C)/P(F^c)

This says that the probability of E occurring given that F does NOT occur equals the probability of E occurring and F not occurring divided by the probability F does not occur.

Permutations

A permutation is a specific ordered arrangement of a set of objects. The total number of permutations of a set of things is equal to n!, where n is the total number of things in the set
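
For a small set the arrangements can simply be listed and counted. A brief sketch with three arbitrary items:

```python
from itertools import permutations
from math import factorial

items = ["a", "b", "c"]                    # hypothetical set of n = 3 objects
arrangements = list(permutations(items))   # every ordering of the items

count_matches = len(arrangements) == factorial(len(items))
```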

r meaning

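If the slope relating y and x is < 0 then r < 0, and if the slope is > 0 then r > 0. The absolute value of r indicates the linearity of the relationship: the closer |r| is to 1, the more nearly the points fall on a straight line

If r is for (x, y) where w = a + bx and z = c + dy, with b and d having the same sign, then

r(x,y) = r(w,z)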
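
A numerical check of this property, with invented data and constants. It also shows the caveat that if b and d have opposite signs, r changes sign:

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient for paired data."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

ws = [10 + 3 * x for x in xs]        # w = a + bx with a = 10, b = 3
zs = [-1 + 0.5 * y for y in ys]      # z = c + dy with c = -1, d = 0.5
ws_flipped = [10 - 3 * x for x in xs]  # b = -3: opposite sign to d

r_xy = pearson_r(xs, ys)
r_wz = pearson_r(ws, zs)                  # same as r_xy
r_flipped = pearson_r(ws_flipped, zs)     # equals -r_xy
```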

Sample 100p percentile

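The data value such that at least 100p% of the data are less than or equal to it and at least 100(1 - p)% of the data are greater than or equal to it. The data point itself is counted on both sides

p = proportion as a decimal (0 < p < 1)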

Sample Space

S = sample space = the set of all possible outcomes of some experiment. The outcomes can be discrete numerical values or nominal categories. Any subset of the sample space is an event

Ex: an experiment observing the gender of a child

S = {g,b} and E = {g}, F = {b}

Sample spaces with equally likely outcomes

This refers to a sample space where each outcome has an equal probability of occurring, i.e. no outcome is weighted more than another

In this scenario, with N outcomes in total, each single outcome has probability p = 1/N, and P(E) = (number of outcomes in E)/N

Counting the outcomes: if the outcomes are the orderings of n objects, there are n! of them. If experiment 1 has m possible outcomes and, for each of these, experiment 2 has n possible outcomes, then together there are m*n possible outcomes

Three axioms of Probability

1: 0 ≤ P(E) ≤ 1

2: P(S) = 1

3: P(U_iⁿ E_i) = Σ_iⁿ P(E_i) = P(E₁) + P(E₂) + ... + P(E_n)

Axiom 3 assumes that the events E₁, ..., E_n are mutually exclusive (no two of them can occur together). It says that, if this is true, the probability that one of the events occurs equals the sum of the individual events' probabilities.

Union

E U F = E or F, which means that any outcome in event E, in event F, or in both is included.

When is conditional probability particularly useful?

It is used when there is limited information within a problem (you are attempting to derive the probability of an event based on other events), or when it is the easiest way to find the probability of a cause or input to an event given new information (backwards reasoning)

P(E or E^c) = ?

P(E or E^c) = P(S) = 1

P(E or F) if E and F are not mutually exclusive

P(E or F) = P(E) + P(F) - P(EF)
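
The formula can be confirmed by counting outcomes. A small sketch assuming one roll of a fair die, with arbitrarily chosen overlapping events:

```python
from fractions import Fraction

S = set(range(1, 7))   # sample space: one roll of a fair die
E = {2, 4, 6}          # hypothetical event: roll is even
F = {4, 5, 6}          # hypothetical event: roll is at least 4


def prob(A):
    """P(A) when every outcome in S is equally likely."""
    return Fraction(len(A), len(S))


p_union_direct = prob(E | F)                        # count E or F directly
p_union_formula = prob(E) + prob(F) - prob(E & F)   # subtract the overlap EF
```

Without the subtraction, the outcomes 4 and 6 would be counted twice.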

Odds of an event

The odds of an event occurring are the ratio of the probability that the event occurs to the probability that it does not occur: odds = P(E)/P(E^c)
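
A one-line sketch of the conversion, using P(E^c) = 1 - P(E). The function name is made up:

```python
def odds(p_e):
    """Odds of E: P(E) / P(E^c), where P(E^c) = 1 - P(E)."""
    return p_e / (1 - p_e)


odds_three_quarters = odds(0.75)   # 3.0, i.e. odds of 3 to 1
odds_half = odds(0.5)              # 1.0, i.e. even odds
```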

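Cumulative distribution function (mass function)

F(x) = P(X ≤ x). This is the probability of a random variable being equal to or less than a given value; for a continuous variable it is the integral of the density function from the lower bound to that value.

Density function

This is a function, f(x), which describes how likely a continuous random variable is to fall near a specified value: the probability of landing in an interval is the integral of f over that interval (the probability of equalling any single value exactly is 0). The integral of the density function from the lower bound up to x is the cumulative distribution function above.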

If given a joint density function and asked to find P(x>y) what is the procedure?

P(x>y) = ∫ ∫₀^x f(x,y) dy dx, with the outer integral over x from -∞ to +∞. This says that the probability of y being less than x is found by integrating the joint density over the region y < x: the inner integral runs over y from its lower bound (0 here) up to y = x, and the outer integral runs over all values of x
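
A numerical sketch of this double integral using the midpoint rule. The density f(x, y) = x + y on the unit square is an assumed example (it integrates to 1 there), and the function name and grid size are arbitrary:

```python
def prob_x_greater_y(f, x_lo, x_hi, y_lo, n=400):
    """Midpoint-rule estimate of the integral of f over the region y_lo <= y < x."""
    dx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx     # midpoint of the i-th x slice
        if x <= y_lo:
            continue                  # no y values below x in this slice
        dy = (x - y_lo) / n
        # Inner integral: y runs from its lower bound up to x.
        inner = sum(f(x, y_lo + (j + 0.5) * dy) for j in range(n)) * dy
        total += inner * dx
    return total


def f(x, y):
    """Hypothetical joint density on 0 <= x <= 1, 0 <= y <= 1."""
    return x + y


p_x_gt_y = prob_x_greater_y(f, 0.0, 1.0, 0.0)   # close to 0.5
```

Because this particular density is symmetric in x and y, the exact answer is 0.5, which gives a handy check on the bounds.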

Procedure for finding the joint probability function for discrete events?

1. ID variables

2. ID if independent (if independent then the probabilities can be separately evaluated and then multiplied using "and" statements)

3. Logic through the first few probabilities

4. Try to identify a pattern that can be used for combination/permutation notation

5. If possible create an equation. If not, work through each probability.

How to use combinations to find the discrete chance of an event occurring?

When all selections are equally likely, the probability of an event occurring is the number of combinations that meet the criterion divided by the total number of possible combinations. Count the ways to choose the required items from each group in the scenario (using combinations), multiply these counts together, and divide by the number of ways to make the whole selection from all available items. The result is the number of outcomes that meet the criterion over the total number of outcomes, i.e. the probability.

Random Variable

A random variable does not describe the outcome of an experiment itself but a quantity of interest determined by that outcome. It is a numerical value which is derived from the result of an experiment.

Expectation formula

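E[X] = Σ x*P(X=x) for a discrete variable, or ∫ x*f(x) dx for a continuous variable with density f. This is the weighted average of the variable's values, with each value weighted by its probability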
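
For a discrete variable the weighted average is a short sum. A sketch assuming a fair six-sided die as the example:

```python
from fractions import Fraction


def expectation(pmf):
    """E[X] for a discrete variable given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


# Hypothetical example: each face of a fair die has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean_roll = expectation(die)   # Fraction(7, 2), i.e. 3.5
```

Note that 3.5 is not a value the die can show: the expectation is an average, not a prediction of a single roll.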

new Flashcards

(38 cards)