Normal Distribution and Statistical Hypothesis Testing Flashcards
(45 cards)
A distribution of a continuous variable that is symmetric about the mean
Normal Distribution/ Probability Distribution Curve
Mean, median, and mode are approximately the same
Probability distribution curve
Normal (not skewed)/ Gaussian Curve
(Graphical distribution of probability
Data is spread normally)
Mean, median, and mode are not the same
Many outliers that are lower than the mean
Negatively skewed
(Mean < median < mode (mean and median are less than the mode)
Left tail is dragging the mean)
Many outliers that are higher than the mean
Mode < median < mean
Positively skewed
Data set that has extreme values larger than the mean
Right tail is dragging the mean
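The ordering of mean, median, and mode in the two skewed cards above can be checked on a small data set. The sketch below is only an illustration: the data values are made up, and `moment_skewness` is a hypothetical helper that computes the population moment coefficient of skewness (third central moment divided by the 1.5 power of the variance).

```python
from statistics import mean, median, mode

def moment_skewness(values):
    """Population moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mu) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Made-up data: mostly small values plus two large outliers (long right tail).
right_skewed = [1, 2, 2, 2, 3, 4, 5, 6, 15, 20]
# Mirror image of the same data: two small outliers (long left tail).
left_skewed = [-v for v in right_skewed]

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(name, "mode:", mode(data), "median:", median(data),
          "mean:", mean(data), "skewness:", round(moment_skewness(data), 2))
```

For the right-skewed list the mode (2) is below the median (3.5), which is below the mean (6), and the skewness is positive. The mirrored list reverses both the ordering and the sign.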
distortion
Asymmetry that deviates from the symmetrical bell-curve/ normal distribution
Imbalance in the distribution of data relative to the mean
Skewness
Values plotted will be -3 to +3
Extreme observations/ values, far from the other values observed
Outliers
Measures the peakedness or flatness of a data set
Kurtosis
negative kurtosis, flat/ low peak, short tail
Platykurtic
normal distribution (kurtosis)
Mesokurtic
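As a rough illustration of the kurtosis cards above, the sketch below computes excess kurtosis (fourth central moment divided by the squared variance, minus 3, so that a normal distribution scores 0). The helper name `excess_kurtosis` and both data sets are assumptions made for this example.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a normal curve)."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n  # variance
    m4 = sum((v - mu) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0

# Made-up data: evenly spread values give a flat shape with short tails.
flat = list(range(1, 11))
# Made-up data: most values at the center plus two far-out values.
peaked = [5] * 8 + [0, 10]

print("flat:", round(excess_kurtosis(flat), 3))      # negative: platykurtic
print("peaked:", round(excess_kurtosis(peaked), 3))  # positive: high peak, heavy tails
```

A negative result corresponds to the platykurtic card, a result near 0 to mesokurtic, and a positive result to a distribution with a higher peak and heavier tails (usually called leptokurtic).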
Characteristics of a Normal Distribution:
follows a consistent pattern
Earliest distribution to be well studied
First equation was derived by Abraham de Moivre
Bell shaped and symmetric about the mean
(Friedrich Gauss: work became well known
Symmetrical and mesokurtic)
Characteristics of a Normal Distribution:
Mean = median = mode
____: center of the curve
____: only 1, peak of curve
Mean
Mode
Characteristics of a Normal Distribution:
Total area under the curve (AUC) is ____ (or ____%)
1 (or 100%)
Characteristics of a Normal Distribution:
Has long tapering tails that extend infinitely in either direction but never touch the x-axis
____: a line that the curve gets closer and closer to but never reaches/ touches (here, the x-axis)
Asymptote
Characteristics of a Normal Distribution:
Completely determined by two parameters, its ___ (μ) and __ (𝜎, sigma)
____: location of the curve on the x-axis
____: spread of the curve
Mean
SD
(𝜎 increases, distribution becomes wider; low peak
𝜎 decreases, distribution becomes thinner; high peak)
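The effect of 𝜎 on the peak can be seen by evaluating the normal density at the mean for a few spreads. This is a small sketch using Python's standard-library `statistics.NormalDist`; the three 𝜎 values are arbitrary choices for illustration.

```python
from statistics import NormalDist

# Same mean, different spreads: the height of the curve at the mean
# is 1 / (sigma * sqrt(2 * pi)), so it drops as sigma grows.
for sigma in (0.5, 1.0, 2.0):
    dist = NormalDist(mu=0.0, sigma=sigma)
    print("sigma =", sigma, "peak height =", round(dist.pdf(0.0), 4))
```

Changing `mu` instead would slide the whole curve along the x-axis without changing its shape.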
Mean ± 1 SD covers about __% of the distribution
Mean ± 2 SD covers about __% of the distribution
Mean ± 3 SD covers about __% of the distribution
68
95
99.7
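The 68/95/99.7 figures are rounded areas under the normal curve within 1, 2, and 3 SD of the mean. A minimal sketch that recomputes them from the standard normal cumulative distribution function (the helper name `coverage_within` is an assumption for this example):

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

def coverage_within(k):
    """Area under the normal curve between mean - k*SD and mean + k*SD."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print("within", k, "SD:", round(100 * coverage_within(k), 2), "%")
```

The unrounded values are roughly 68.27%, 95.45%, and 99.73%.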
Importance of Normal Distribution:
Useful for explaining many ______ _______
Even if the distribution of the variable is not normal, it can often be transformed (log, square root, or another transformation) to approximate the normal distribution
Many measurements/ variables are approximately normally distributed
biological phenomena
Importance of Normal Distribution:
Plays an important role in ______ ______ because:
Many statistical tests assume normality of the distribution
The other important distributions (binomial, t-distribution) can be approximated by the normal, especially when the sample size is large enough
The sampling distribution of the mean is approximately normal if the sample size is large enough (central limit theorem)
statistical inference
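The central limit theorem point in the card above can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is exponential with mean 1 and SD 1 (strongly right-skewed), and the sample size of 50, the 2000 repetitions, and the seed are arbitrary choices.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)   # fixed seed so the run is repeatable
SAMPLE_SIZE = 50         # observations per sample
N_SAMPLES = 2000         # number of samples drawn

# Draw many samples from a skewed population and keep each sample's mean.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(N_SAMPLES)
]

print("mean of sample means:", round(mean(sample_means), 3))
print("SD of sample means:", round(stdev(sample_means), 3))
print("theoretical SD (sigma / sqrt(n)):", round(1.0 / SAMPLE_SIZE ** 0.5, 3))
```

Although the individual observations are far from normal, the sample means cluster around the population mean with a spread close to 𝜎/√n, and a histogram of `sample_means` would look roughly bell shaped.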
Mean =0, SD =1
Capital ‘Z’ is traditionally used to represent the standard normal random variable
Small letter ‘z’ is used to represent a particular value of Z
The Standard Normal Distribution
Areas under the standard normal are tabulated
Any value x from the normal distribution can be transformed into a standard normal value of z using the formula z = (x − μ) / 𝜎
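A short sketch of the transformation into a standard normal value, followed by a lookup of the area that a printed z-table would give. The numbers (mean 100, SD 15, observed value 130) are made up for illustration, and `z_score` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: how many SDs it lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Made-up example: population mean 100, SD 15, observed value 130.
z = z_score(130, 100, 15)
area_below = NormalDist().cdf(z)  # area under the standard normal to the left of z

print("z =", z)
print("area below z:", round(area_below, 4))
```

Here z is 2.0, and the area to the left of z is about 0.9772, matching the tabulated standard normal value.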
Statement about the value of a population parameter
Mean, median, mode, variance, SD, proportion, total
Statistical Hypothesis
Assertion or proposition about the relationship between 2 or more variables
Concerned with the parameters of a population
Statistical Hypothesis
Formulated as a result of years of observation and research
Method of making decisions using data, whether from a controlled experiment or an observational study (not controlled)
Statistical Hypothesis
Set of procedures that culminates in either rejection or non-rejection of the null hypothesis
Involves the comparison of two hypotheses: Null and Alternative
Statistical Hypothesis Testing
Hypothesis: prediction, educated guess of the results, should be testable
p < alpha: probability of obtaining the sample results if the null hypothesis is true is low, _____ null hypothesis
p > alpha: probability of obtaining the sample results if the null hypothesis is true is high, _____ null hypothesis
reject
do not reject
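The p-value rule in the card above amounts to a single comparison. A minimal sketch, where the function name `decide` and the default alpha of 0.05 are assumptions chosen for illustration:

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value rule: reject H0 only when p is below alpha."""
    if p_value < alpha:
        return "reject H0"
    return "do not reject H0"

print(decide(0.01))  # p below alpha
print(decide(0.20))  # p above alpha
```

Note the wording of the second outcome: the null hypothesis is "not rejected" rather than "accepted", as in the card.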
7 Steps of Hypothesis Testing
SSSDCMD
State the null (H0) & alternative (H1 or Ha) hypotheses
State the Level of Significance
Select the appropriate test statistic/ test criterion
Determine the Critical Region or Region of Rejection
Compute the test statistic
Make a Statistical Decision
Draw conclusion
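The seven steps above can be sketched for one simple case: a two-sided, one-sample z-test with a known population SD. Everything numeric here (hypothesized mean 120, SD 10, sample mean 124, n = 25, alpha = 0.05) is made up for illustration, and other situations would call for a different test statistic in step 3.

```python
from math import sqrt
from statistics import NormalDist

# Made-up inputs for illustration only.
mu0 = 120.0          # population mean claimed by H0
sigma = 10.0         # population SD, assumed known
sample_mean = 124.0  # observed sample mean
n = 25               # sample size

# Step 1: state the hypotheses. H0: mu = 120; H1: mu != 120 (two-sided).
# Step 2: state the level of significance.
alpha = 0.05
# Step 3: select the test statistic. Here: one-sample z, since sigma is known.
# Step 4: determine the critical region. Reject H0 if |z| > z_crit.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
# Step 5: compute the test statistic.
z = (sample_mean - mu0) / (sigma / sqrt(n))
# Step 6: make the statistical decision.
reject_h0 = abs(z) > z_crit
# Step 7: draw the conclusion in words.
if reject_h0:
    conclusion = "The sample mean differs significantly from 120."
else:
    conclusion = "There is not enough evidence that the mean differs from 120."

print("z =", round(z, 2), "critical value =", round(z_crit, 2))
print(conclusion)
```

With these numbers z is 2.0 and the critical value is about 1.96, so z falls in the region of rejection and H0 is rejected at the 5% level.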