Lesson 3: Normal Distribution Flashcards
Distribution
- histogram of sample space (all possible sample values)
- X-axis: all possible values in sample space
- Y-axis: frequency of each sample value
Standardized values
z value
- Compute Probabilities
- Compare two different distributions
- We compute standardized values
- z = 2 : Data value is 2 standard deviations above the mean
- z = -1.6 : Data value is 1.6 standard deviations below the mean
z = (data value (y) - mean (μ)) / standard deviation (σ)
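As a minimal sketch of the "compare two different distributions" use, the Python snippet below applies the z formula to scores on two different scales. The first score uses a mean of 500 and sd of 100; the second test, its score, and its mean and sd are hypothetical values chosen only for illustration, and the function name z_score is likewise an assumption.

```python
def z_score(y, mu, sigma):
    """Standardize a data value: z = (y - mu) / sigma."""
    return (y - mu) / sigma

# Score of 600 on a test with mean 500 and sd 100
z_first = z_score(600, 500, 100)

# Hypothetical second test: score of 30, assumed mean 21 and sd 5
z_second = z_score(30, 21, 5)

print(z_first)   # 1.0
print(z_second)  # 1.8
```

Because both values are now in units of standard deviations, they can be compared directly: the second score sits further above its own mean than the first.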
p-value
- Represents the area under the normal distribution curve to the left of the z-value
- z = 1.0: 0.15% + 2.35% + 13.50% + 34.00% + 34.00% = 84% (Empirical Rule approximation; more precisely, p-value = 0.8413)
- Total area under the normal distribution curve = 1
p-value calculation for specific range
example
Probability of a student scoring between 450 and 600 on the SAT
mean = 500, sd = 100
z = (600 - 500)/100 = 1.0, p-value = 0.8413
z = (450 - 500)/100 = -0.50, p-value = 0.3085
0.8413 - 0.3085 = 0.5328 or 53.28%
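The SAT example can be checked with a short Python sketch. It assumes Python's standard-library statistics.NormalDist as a stand-in for a z-table lookup; the variable names are illustrative.

```python
from statistics import NormalDist

# SAT scores modeled as normal with mean 500 and sd 100
sat = NormalDist(mu=500, sigma=100)

p_high = sat.cdf(600)        # left area below 600 (z = 1.0)
p_low = sat.cdf(450)         # left area below 450 (z = -0.5)
p_between = p_high - p_low   # area between 450 and 600

print(round(p_high, 4), round(p_low, 4), round(p_between, 4))
# 0.8413 0.3085 0.5328
```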
Standard normal curve
(μ = 0, σ = 1)
- Symmetric about its mean μ = 0 (σ = 1)
- Mean = Median = Mode
- Single peak at z = 0
- Inflection points at -1 and +1
- Area under the curve = 1
- Area to the left (of mean μ = 0) = Area to the right = ½
- Follows the Empirical Rule
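A few of these properties can be verified numerically. The sketch below assumes Python's standard-library statistics.NormalDist; the helper name area_within is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Half of the area lies to the left of the mean
print(std_normal.cdf(0))  # 0.5

# Empirical Rule: roughly 68%, 95%, and 99.7% within 1, 2, and 3 sd
for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```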
Normalizing Data
Computing z-values
Excel and R
- Excel: =STANDARDIZE(data value, mean, sd)
- R: scale(data, center = mean, scale = sd)
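For comparison, a rough Python counterpart of standardizing a whole data set is sketched below. The function name standardize and the sample values are assumptions for illustration; when no mean or sd is supplied it falls back to the sample mean and sample sd, which is also what R's scale() does by default.

```python
from statistics import mean, stdev

def standardize(data, center=None, spread=None):
    """Return z-values for data; defaults to the sample mean and sample sd."""
    if center is None:
        center = mean(data)
    if spread is None:
        spread = stdev(data)
    return [(y - center) / spread for y in data]

scores = [450, 500, 600]  # illustrative values

# Standardize against a known mean of 500 and sd of 100
z = standardize(scores, center=500, spread=100)
print(z)  # [-0.5, 0.0, 1.0]

# Standardize against the sample's own mean and sd
z_default = standardize(scores)
```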
Computing p-values
p(z) = P(x < z)
Excel and R
"Left" Area (Probability) under Standard Normal Curve
- Excel: =NORMSDIST(z-value) (standard normal distribution)
- R: pnorm(z-value) (cumulative distribution function, CDF)
"Right" Area (Probability) under Standard Normal Curve
- Excel: =1-NORMSDIST(z-value)
- R: 1 - pnorm(z-value)
"In Between" Area (Probability) under Standard Normal Curve
- Excel: =NORMSDIST(high z) - NORMSDIST(low z)
- R: pnorm(high z) - pnorm(low z)
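The three area calculations can also be sketched in Python, assuming the standard-library statistics.NormalDist, whose cdf() method plays the role of NORMSDIST and pnorm. The z-values used are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # defaults to mean 0, sd 1

z_low, z_high = -0.5, 1.0  # illustrative z-values

left = std_normal.cdf(z_high)                             # area to the left of z_high
right = 1 - std_normal.cdf(z_high)                        # area to the right of z_high
between = std_normal.cdf(z_high) - std_normal.cdf(z_low)  # area between the two

print(round(left, 4), round(right, 4), round(between, 4))
# 0.8413 0.1587 0.5328
```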
Converting p-value ("left" area) to z-value
Excel and R
- Excel: =NORMSINV(p-value)
- R: qnorm(p-value) (quantile function)
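The reverse lookup can be sketched in Python as well, assuming the standard-library statistics.NormalDist, whose inv_cdf() method corresponds to NORMSINV and qnorm.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, sd 1

# Left area of 0.8413 maps back to a z-value of about 1.0
z = std_normal.inv_cdf(0.8413)
print(round(z, 2))  # 1.0

# Round trip: converting z back to a left area recovers the p-value
p = std_normal.cdf(z)
```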
Relative Frequency
(Histogram)
- Individual frequency / total frequencies
- Histogram y-axis = Density
- Total area of the relative frequencies = 1.0 (100%)
- aka Probability Density Function (PDF)
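A small Python sketch of the relative frequency calculation follows. The data values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from collections import Counter

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]  # hypothetical sample

counts = Counter(data)        # individual frequencies
total = sum(counts.values())  # total of all frequencies

# Relative frequency = individual frequency / total of all frequencies
rel_freq = {value: count / total for value, count in sorted(counts.items())}

print(rel_freq)  # {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.2, 5: 0.2}
```

The relative frequencies add up to 1.0, apart from floating-point rounding.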
Uniform Distribution Function
- When the probability of every possible event in the sample space is the same
- e.g., the 6 sides of a die have equal probability
- R: runif(n = ?) generates n random values from a uniform distribution
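A rough Python illustration, assuming the standard-library random module as a counterpart to runif. The seed and sample size are arbitrary choices made so the run is repeatable.

```python
import random

rng = random.Random(1)  # arbitrary fixed seed for repeatability
n = 6000

# Simulated die rolls: each of the 6 sides should appear about 1/6 of the time
rolls = [rng.randint(1, 6) for _ in range(n)]
rel_freq = {side: rolls.count(side) / n for side in range(1, 7)}
print(rel_freq)

# Continuous case, similar to runif(n) with its default range: values in [0, 1)
draws = [rng.random() for _ in range(n)]
```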
Continuous Distributions
types (12)
- Uniform
- Normal
- Chi Square
- Fisher's F
- Student's t
- Gamma
- Exponential
- Beta
- Cauchy
- Lognormal
- Logistic
- Weibull
Discrete Distributions
types (5)
- Binomial
- Poisson
- Hypergeometric
- Negative Binomial
- Wilcoxon
Normal Distribution
properties
- Symmetric about its mean μ
- Mean = Median = Mode
- Single peak at x = μ
- Inflection points at μ - σ and μ + σ
- Area under the curve = 1
- Area to the left (of mean μ) = Area to the right = ½
- Follows the Empirical Rule
Density Distribution
R function
dnorm()
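As a hedged Python counterpart to dnorm(), the sketch below uses the pdf() method of the standard-library statistics.NormalDist to evaluate the height of the standard normal curve.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, sd 1

# Height of the curve at its single peak, z = 0
peak = std_normal.pdf(0)
print(round(peak, 4))  # 0.3989

# Symmetry about the mean: equal heights at -1 and +1
symmetric = math.isclose(std_normal.pdf(-1), std_normal.pdf(1))
print(symmetric)  # True
```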
Random Numbers Distribution
- R: rnorm(n= ?, mean= ?, sd= ?)
- generates n normally distributed numbers with the specified mean and sd
- the histogram and Q-Q plot look clean (close to the ideal normal shape) if n is large enough
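A comparable sketch in Python, assuming the samples() method of the standard-library statistics.NormalDist as a stand-in for rnorm. The mean, sd, sample size, and seed are arbitrary illustration values.

```python
from statistics import NormalDist, mean, stdev

# Draw n = 10,000 values from a normal distribution with mean 500 and sd 100
sample = NormalDist(mu=500, sigma=100).samples(10_000, seed=42)

# With a large n, the sample mean and sd land close to the specified values
sample_mean = mean(sample)
sample_sd = stdev(sample)
print(round(sample_mean, 1), round(sample_sd, 1))
```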