Probability Theory And Probability Distributions Flashcards
Review: what do histograms show vs. bar charts
For categorical variables, bar charts show what proportion of the sample has each value
For quantitative variables, histograms show what proportion of the sample falls within each range of values (bin)
Probability distribution
Applies theory of probability to describe behavior of the random variable
Random variables and types
Variable whose value is an outcome of a random phenomenon.
A continuous random variable can take on any value within a specified interval, e.g. weight
A discrete random variable can assume only a finite or countable number of outcomes, e.g. marital status (single, married, divorced)
Probability
The probability of a value of a random variable is the proportion of times that value occurs in the population
What does the probability distribution of a discrete random variable specify
It specifies all possible outcomes of the random variable along with the probability that each will occur, e.g. flipping a coin
For continuous random variables
The probability distribution allows us to determine the probabilities associated with specified ranges of values
Things to note
A probability distribution can be thought of as a bar chart or histogram of the entire population
Rules of probability
- Probability must be between 0 & 1
- Sum of probabilities for all possible mutually exclusive events must equal 1
- P(not E) = 1 - P(E)
- Additive rule of probability for mutually exclusive events: P(A or B) = P(A) + P(B); in general, P(A or B) = P(A) + P(B) - P(A & B)
- Multiplicative rule for independent events: P(A & B) = P(A) * P(B)
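The rules above can be checked on a small example. This is a minimal sketch assuming one roll of a fair six-sided die; the die, the event names, and the helper prob are illustrative and not part of the original cards.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die, each face with probability 1/6.
outcomes = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event):
    """Probability of an event given as a set of die faces (hypothetical helper)."""
    return sum(outcomes[face] for face in event)

even = {2, 4, 6}
low = {1, 2}
one = {1}

# All mutually exclusive outcomes together have probability 1.
total = sum(outcomes.values())

# Complement rule: P(not E) = 1 - P(E).
not_even = 1 - prob(even)

# Mutually exclusive events share no faces, so their probabilities simply add.
one_or_even = prob(one) + prob(even)

# Overlapping events: subtract the overlap so it is not counted twice.
low_or_even = prob(low) + prob(even) - prob(low & even)

# Multiplicative rule for two independent rolls: P(six and six) = P(six) * P(six).
two_sixes = prob({6}) * prob({6})

print(total, not_even, one_or_even, low_or_even, two_sixes)
```

Exact fractions are used here so the sums come out without floating-point rounding.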
Discrete probability distribution of X
Lists all possible values of X and their probabilities
Continuous probability distribution of X
Summarizes all possible values of X and their probabilities using a probability density curve
Only use a density curve if it provides a good fit to our data
Density curves can take a variety of shapes
Properties of probability density curves
Vertical scale shows probability density, not probability: it is never negative but can exceed 1
Area under the curve is equal to 1 (because the area represents the sum of probabilities for all possible values of the variable)
For a continuous random variable, we do not talk about the probability of an individual value but rather the probability that the value lies within an interval
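Interval probabilities can be read off as areas under the density curve. The sketch below assumes a normal density for weight with a made-up mean of 70 kg and sd of 10 kg, and uses the standard library's NormalDist; the numbers and the helper interval_prob are illustrative only.

```python
from statistics import NormalDist

# Hypothetical population: weight in kg, assumed normal with mean 70 and sd 10.
weight = NormalDist(mu=70, sigma=10)

def interval_prob(dist, low, high):
    """Area under the density curve between low and high (hypothetical helper)."""
    return dist.cdf(high) - dist.cdf(low)

# Probability that weight lies within one sd of the mean.
p_60_80 = interval_prob(weight, 60, 80)

# An interval of zero width has zero area, so a single exact value has probability 0.
p_exact = interval_prob(weight, 70, 70)

print(round(p_60_80, 4), p_exact)
```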
Normal distribution
Model for continuous data; the population mean and population sd are needed to define a unique normal distribution
Properties of normal distribution
Takes values between positive infinity and negative infinity
Area under curve equals 1
Symmetric around the mean: mean = median = mode
Examples: blood pressure, age, Hb
What does X ~ N(mu, sigma^2) mean
X is normally distributed with mean mu and variance sigma^2 (the sd squared)
How do the sd and mean affect the shape of the graph
Changing the sd changes the spread and height: a larger sd gives a wider, flatter curve
Changing the mean shifts the graph to the left or right
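The effect of the mean and sd can be seen by comparing the peak height of a few normal curves. This is a sketch with arbitrary parameter values chosen for illustration, using the standard library's NormalDist.

```python
from statistics import NormalDist

# Illustrative parameter choices, not taken from the cards.
narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)     # same mean, larger sd
shifted = NormalDist(mu=3, sigma=1)  # same sd, larger mean

# The peak of a normal curve sits at its mean.
peak_narrow = narrow.pdf(narrow.mean)
peak_wide = wide.pdf(wide.mean)
peak_shifted = shifted.pdf(shifted.mean)

# Doubling the sd halves the peak height (the curve gets wider and flatter).
# Changing the mean moves the peak but leaves its height unchanged.
print(peak_narrow, peak_wide, peak_shifted)
```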
Standard normal distribution
Normal distribution with population mean mu = 0 and standard deviation sigma = 1
How is standardizing of data done?
For each X value we calculate a new Z value, known as the Z score
Formula for Z score, what is Z score
Z = (X - mu) / sigma. Measures the number of sd that data point X is from the mean mu
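A Z score is simple arithmetic once mu and sigma are known. The sketch below assumes a made-up population (systolic blood pressure with mean 120 and sd 10); the function name z_score and the numbers are illustrative.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical population: mean 120, sd 10.
z_high = z_score(135, 120, 10)  # 1.5 sd above the mean
z_mean = z_score(120, 120, 10)  # a value equal to the mean has Z = 0
z_low = z_score(110, 120, 10)   # 1 sd below the mean gives a negative Z

print(z_high, z_mean, z_low)
```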
Binomial distribution
Probability distribution for a binomial experiment, used with discrete data. Gives the probability of getting k successes in a fixed number of trials n
Bernoulli trial
Experiment with only 2 possible outcomes; the probability of success, p, is constant from trial to trial, and trials are independent
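What is the probability of getting exactly 2 heads after tossing a fair coin 5 times
10 x (1/2)^5 = 10/32 = 0.3125, where 10 is the number of ways to choose which 2 of the 5 tosses are heads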
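The coin-toss card can be worked through with the binomial formula P(k successes) = C(n, k) * p^k * (1 - p)^(n - k). This is a minimal sketch assuming a fair coin (p = 0.5) and exactly 2 heads; the function name binomial_pmf is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Number of ways to choose which 2 of the 5 tosses are heads.
ways = comb(5, 2)

# Assumption: the coin is fair, so p = 0.5.
p_two_heads = binomial_pmf(2, 5, 0.5)

# The probabilities for 0 through 5 heads cover every outcome, so they sum to 1.
total = sum(binomial_pmf(k, 5, 0.5) for k in range(6))

print(ways, p_two_heads, total)
```

The count of arrangements (10) and the probability (0.3125) are different quantities: the count becomes a probability only after multiplying by the chance of any one arrangement, 0.5^5.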
Assumptions for binomial distribution
Number of observations is fixed
There are only 2 outcomes
Each observation is independent
Probability of success p is constant from trial to trial