CRP 109 stats Lecture 2 Flashcards
z Score
-The number of standard deviations that a given value x is above or
below the mean.
-Round z scores to two decimal places.
-It is expressed as numbers with no units of measurement.
-If an individual data value is less than the mean, its corresponding z
score is negative.
-Units have now been converted to “standard deviations away from the
mean” and can thus be compared
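A minimal sketch of the calculation, assuming the usual formula z = (x − mean) / standard deviation. The function name and the example numbers are hypothetical, chosen only for illustration.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    # Rounded to two decimal places, as the card recommends.
    return round((x - mean) / std_dev, 2)


# Hypothetical data: mean 70, standard deviation 10.
print(z_score(85, 70, 10))  # 1.5  (above the mean, so positive)
print(z_score(62, 70, 10))  # -0.8 (below the mean, so negative)
```

Because both results are in "standard deviations away from the mean", they can be compared directly even if the original values came from different scales.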
Random Variable
A variable, typically represented by x, that has a single
numerical value, determined by chance, for each outcome of a
procedure
Discrete Random Variable
Has a collection of values that is finite or
countable (even theoretically)
Continuous Random Variable
Has infinitely many values, and the collection of values
is not countable.
Probability Distribution
gives the probability for each
value of the random variable
-We use 0+ to represent a probability value that is
positive but very small. Rounding to 0 would be
misleading because it would incorrectly suggest that the
event is impossible
Probability Distribution Requirements
-There is a numerical (not categorical) random variable x, and its
number values are associated with corresponding probabilities
-sum of P(x) = 1
-P(x) is between 0 and 1 inclusive for all values of x
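The three requirements above can be checked mechanically. This is a sketch only: the function name, the dictionary layout (value x mapped to P(x)), and the rounding tolerance are assumptions, not part of the lecture.

```python
import math


def is_probability_distribution(table):
    """table maps each value x of the random variable to P(x)."""
    # Requirement 1: x is numerical, not categorical.
    values_numeric = all(isinstance(x, (int, float)) for x in table)
    # Requirement 3: every P(x) is between 0 and 1 inclusive.
    each_in_range = all(0 <= p <= 1 for p in table.values())
    # Requirement 2: the probabilities sum to 1 (small tolerance for rounding).
    sums_to_one = math.isclose(sum(table.values()), 1.0, abs_tol=1e-9)
    return values_numeric and each_in_range and sums_to_one


# Number of heads in two tosses of a fair coin.
print(is_probability_distribution({0: 0.25, 1: 0.5, 2: 0.25}))  # True
# Probabilities sum to 1.1, so this is not a probability distribution.
print(is_probability_distribution({0: 0.5, 1: 0.6}))  # False
```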
Probability Histogram
-vertical scale shows probabilities instead of relative frequencies based on actual sample results.
-The areas of the rectangles are the same as the probabilities from the
corresponding probability distribution table
-probability distribution can also be in the form of a formula
Expected Value (E)
-theoretical mean value of the outcomes for infinitely many trials
-Does not need to be a whole number
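A short sketch of the calculation, assuming the standard formula E = sum of x · P(x) over all values of x. The function name is hypothetical, and a fair six-sided die is used only as an illustration; exact fractions avoid rounding noise.

```python
from fractions import Fraction


def expected_value(table):
    """table maps each value x to P(x); returns the sum of x * P(x)."""
    return sum(x * p for x, p in table.items())


# Fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expected_value(die))  # 7/2
```

The result, 7/2 = 3.5, is not a whole number and is not even a value the die can show, which is consistent with E being a theoretical long-run mean.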
Bernoulli Trial
-A Bernoulli trial is an experiment with only two possible outcomes:
success or failure
Binomial probability distribution
Results from a procedure whose outcomes belong to two categories and that meets these requirements:
1. The procedure has a fixed number of Bernoulli trials. One Bernoulli
trial is a single observation.
2. The trials must be independent, meaning that the outcome of any
individual trial does not affect the probabilities in the other trials.
3. Each trial must have all outcomes classified into exactly two categories,
commonly referred to as success and failure.
4. The probability of a success remains the same in all trials
Binomial probability distribution notation
-S (success) and F (failure)
p = probability of a success in one of the n trials
q = probability of a failure in one of the n trials = 1 − p
n = fixed number of Bernoulli trials
x = specific number of successes in n trials
P(x) = probability of getting exactly x successes among
the n trials
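With this notation, P(x) can be computed from the standard binomial formula P(x) = C(n, x) · p^x · q^(n − x), where C(n, x) counts the ways to choose which x of the n trials are successes. The formula is not stated on the card, and the function name below is hypothetical; treat this as a sketch.

```python
from math import comb


def binomial_probability(n, x, p):
    """Probability of exactly x successes among n independent Bernoulli trials."""
    q = 1 - p  # probability of a failure in one trial
    return comb(n, x) * p**x * q ** (n - x)


# Exactly 2 heads in 5 tosses of a fair coin (n = 5, x = 2, p = 0.5).
print(binomial_probability(5, 2, 0.5))  # 0.3125
```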
Sampling With/Without Replacement
-The binomial distribution will be applicable in cases where we sample
with replacement.
-If we sample from a small finite population without replacement, the
binomial distribution should not be used because the events are not
independent
Hypergeometric Distribution
If sampling is done without replacement and the outcomes belong to one of two types (success/failure), we can use the hypergeometric
distribution
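A sketch of the hypergeometric calculation, assuming the standard formula P(x) = C(K, x) · C(N − K, n − x) / C(N, n). The symbols N (population size), K (successes in the population), and n (sample size) are assumptions introduced here; the card does not define notation for this distribution.

```python
from math import comb


def hypergeometric_probability(N, K, n, x):
    """Probability of exactly x successes when drawing n items without
    replacement from a population of N items that contains K successes."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)


# Hypothetical example: 10 items, 4 of them successes, draw 3 without replacement.
print(hypergeometric_probability(10, 4, 3, 1))  # 0.5
```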
Poisson probability distribution
discrete probability distribution
that applies to occurrences of some event over a specified interval
1. The random variable x is the number of occurrences of an event in
some interval.
2. The occurrences must be random.
3. The occurrences must be independent of each other.
4. The occurrences must be uniformly distributed over the interval being
used
-determined only by the mean μ.
-The possible values of x have no upper limit
μ = mean number of occurrences of the event over the interval
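A sketch of the Poisson calculation, assuming the standard formula P(x) = μ^x · e^(−μ) / x!. The formula is not stated on the card, and the function name and the value μ = 2 are hypothetical.

```python
from math import exp, factorial


def poisson_probability(x, mu):
    """Probability of exactly x occurrences in an interval with mean mu."""
    return mu**x * exp(-mu) / factorial(x)


# Hypothetical interval with a mean of 2 occurrences.
for x in range(5):
    print(x, round(poisson_probability(x, 2), 4))
```

The mean μ is the only input besides x, and x can be any non-negative whole number, which matches the two notes above.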
Poisson Distribution as Approximation to Binomial
Requirements:
1. n ≥ 100
2. np ≤ 10
If both requirements are met, use the Poisson distribution with mean μ = np
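A sketch comparing the two distributions when both requirements hold. It assumes the standard binomial and Poisson formulas; the function names and the example values n = 100, p = 0.02 are hypothetical.

```python
from math import comb, exp, factorial


def binomial_probability(n, x, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_approximation(n, x, p):
    """Poisson approximation to the binomial, using mu = n * p."""
    # Both requirements from the card must hold before approximating.
    if n < 100 or n * p > 10:
        raise ValueError("requires n >= 100 and np <= 10")
    mu = n * p
    return mu**x * exp(-mu) / factorial(x)


# Hypothetical example: n = 100 trials, p = 0.02, so mu = np = 2.
exact = binomial_probability(100, 2, 0.02)
approx = poisson_approximation(100, 2, 0.02)
print(round(exact, 4), round(approx, 4))
```

The two printed values should be close but not identical, since the Poisson result is only an approximation.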