Six Sigma Statistics and Graphical Presentation Flashcards
By Ron Crabtree
Sample
Subset of the overall population.
Make sure the sample is representative of the population.
Three most standard descriptive/characteristic statistics
Mean (arithmetic average)
Standard deviation
Variance
What are the symbols for the three most common data characteristics for both population parameters and sample statistics?
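MEAN
Population parameter: μ (mu)
Sample statistic: x̄ (x-bar)
STANDARD DEVIATION
Population parameter: σ (sigma)
Sample statistic: s
VARIANCE
Population parameter: σ² (sigma squared)
Sample statistic: s² (s squared)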
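The difference between the population and sample versions of these statistics can be sketched with Python's standard `statistics` module. The data values below are hypothetical and chosen only so the arithmetic is easy to follow; the population formulas divide by N, while the sample formulas divide by n - 1.

```python
import statistics

# Hypothetical measurements (illustrative values, not from the flashcards)
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Mean: same formula for mu and x-bar
mean = statistics.mean(data)

# Treating the data as the whole population (divide by N)
pop_variance = statistics.pvariance(data)   # sigma squared
pop_std = statistics.pstdev(data)           # sigma

# Treating the data as a sample from a larger population (divide by n - 1)
sample_variance = statistics.variance(data)  # s squared
sample_std = statistics.stdev(data)          # s

print("mean:", mean)
print("population variance / std dev:", pop_variance, pop_std)
print("sample variance / std dev:", sample_variance, sample_std)
```

Because of the n - 1 divisor, the sample standard deviation is always slightly larger than the population figure computed from the same numbers.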
Descriptive statistics
Used to describe the process itself
One of the most common tools: the histogram, which shows variation and centering
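As a rough illustration of how a histogram exposes centering and variation, the sketch below bins a handful of hypothetical integer measurements and prints a text bar per bin. The function name `histogram` and the data are assumptions made for this example.

```python
from collections import Counter

def histogram(data, bin_width):
    """Count values per bin; each bin is labeled by its lower edge."""
    counts = Counter((x // bin_width) * bin_width for x in data)
    return dict(sorted(counts.items()))

# Hypothetical measurements; integers are used to avoid float rounding at bin edges
measurements = [97, 98, 99, 99, 100, 100, 100, 101, 101, 103]

for lower_edge, count in histogram(measurements, 2).items():
    print(f"{lower_edge:>4} to {lower_edge + 1:<4} {'#' * count}")
```

The tallest bar shows where the process is centered, and the number of occupied bins gives a first impression of its spread.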
Inferential statistics
Making inferences about the population from your sample
It’s possible to learn meaningful information from as few as 30 measurements.
Compare descriptive vs. inferential statistics
DESCRIPTIVE
Approach: More inductive (induce information)
Goal: Summarize the data to make decisions
Tools/Techniques: Histograms, interrelationship diagrams, process maps, fishbone diagrams
Interpretations: Fairly straightforward, Not as difficult to create
INFERENTIAL
Approach: Deductive (deduce information)
Goal: Infer population characteristics to predict future outcomes
Tools/Techniques: More advanced/complex: chi-squared, binomial, and Poisson distributions, hypothesis testing, confidence intervals, correlation, regression analysis.
Interpretations: Complex
Normal distribution
Values cluster symmetrically around the mean: most are close to the average, and fewer fall farther out. Fully described by its mean and standard deviation, which allows for easy inference.
AKA The bell-shaped curve.
The 68-95-99.7 Percent Rule
± One St Dev: 68.27%
± Two St Dev: 95.45%
± Three St Dev: 99.73%
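The percentages in this rule can be checked directly from the normal distribution using the error function in Python's `math` module. This is a minimal sketch; the helper name `fraction_within` is made up for the example, and figures quoted from printed tables may differ in the last digit because of rounding.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"+/- {k} st dev: {fraction_within(k) * 100:.2f}%")
# prints 68.27%, 95.45%, 99.73%
```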
What are the basic tenets of the central limit theorem?
Basic tenets:
The sampling distribution of the mean approaches a normal distribution as the sample size increases.
n = 100, get a curve
n = 500, a peak appears
n = 1000, a normal distribution appears
As you increase samples, you get closer to a perfect bell curve
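Central limit theorem and sample size
n = sample size used for each sample mean
n = 4 can already give a near-normal sampling distribution when the population is roughly symmetric
n = 30 is a common rule of thumb for treating the sampling distribution of the mean as approximately normal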
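The effect of sample size on the sampling distribution of the mean can be sketched with a small simulation. The example below assumes a deliberately skewed population (exponential with mean 1), draws many samples of size n, and reports the spread and skewness of the resulting sample means. The function names, the number of trials, and the seed are choices made for this illustration only.

```python
import random
import statistics

def sample_means(n, trials=2000, seed=0):
    """Means of `trials` samples of size n from a skewed (exponential) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]

def skewness(values):
    """Skewness of the values; 0 for a perfectly symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

for n in (1, 4, 30):
    means = sample_means(n)
    print(f"n = {n:>2}: spread of sample means = {statistics.pstdev(means):.3f}, "
          f"skewness = {skewness(means):.3f}")
```

As n grows, the spread of the sample means shrinks toward sigma divided by the square root of n, and the skewness moves toward 0, which is the bell shape the theorem describes.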
Basic tenets of confidence interval
Used to state some level of confidence that the mean of your population falls within a certain range
- Collect data for sample
- Calculate mean and standard deviation of sample
- Then make inference
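The three steps above can be sketched as follows. This is a large-sample approximation that uses a normal (z) critical value; for small samples a t critical value would normally be used instead, which the standard library does not provide. The function name and the data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean (z-based)."""
    n = len(sample)
    mean = statistics.fmean(sample)           # step 2: sample mean
    s = statistics.stdev(sample)              # step 2: sample standard deviation
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * s / math.sqrt(n)             # step 3: margin of error
    return mean - margin, mean + margin

# Step 1: collect data (hypothetical measurements, 100 values)
sample = [48, 52] * 50

low, high = mean_confidence_interval(sample)
print(f"95% confidence interval for the mean: {low:.2f} to {high:.2f}")
```

Raising the confidence level widens the interval, and collecting more data narrows it.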
Hypothesis testing
Test a null hypothesis: a claim about a state of nature whose true outcome you do not know.
H-naught (H0): typically states that two values are equal, or that one is greater/less than or equal to the other.
H-sub-a (Ha): the alternative to the null hypothesis.
Use data to infer the true state of the population
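A minimal sketch of a hypothesis test, assuming a two-sided one-sample z-test of H0: population mean = mu0 against Ha: population mean ≠ mu0. The z-test is a large-sample approximation chosen here because it needs only the standard library; the function name, data, target value, and the 0.05 significance level are all assumptions for illustration.

```python
import math
import statistics
from statistics import NormalDist

def z_test(sample, mu0):
    """Two-sided one-sample z-test. Returns (z statistic, p-value)."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    z = (statistics.fmean(sample) - mu0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical measurements and a hypothetical target of 51
sample = [48, 52] * 50
alpha = 0.05  # assumed significance level

z, p_value = z_test(sample, mu0=51)
if p_value < alpha:
    print(f"z = {z:.2f}, p = {p_value:.4g}: reject H0")
else:
    print(f"z = {z:.2f}, p = {p_value:.4g}: fail to reject H0")
```

A small p-value means the sample would be unlikely if H0 were true, so the data favor the alternative hypothesis.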
Control chart
Typically plots data pulled at a consistent rate, often as pooled samples (subgroups)
X axis is the sequence of plotted points (ex. 20 points), e.g. pulling 5 parts every hour and plotting the mean of those
Center line (mean of process)
UCL (upper control limit)
LCL (lower control limit)
Used to infer information about the entire population (the process)
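The center line and control limits can be sketched for an X-bar chart built from subgroups of 5 parts. This example assumes the common X-bar/R method, in which the limits are the grand mean plus or minus A2 times the average subgroup range, with the tabulated constant A2 = 0.577 for subgroups of size 5. The flashcards do not say which method is intended, and the data below are hypothetical.

```python
import statistics

A2 = 0.577  # tabulated X-bar chart constant for subgroups of size 5

def xbar_limits(subgroups):
    """Return (LCL, center line, UCL) for an X-bar chart using the range method."""
    means = [statistics.fmean(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    center = statistics.fmean(means)   # grand mean of the subgroup means
    r_bar = statistics.fmean(ranges)   # average subgroup range
    return center - A2 * r_bar, center, center + A2 * r_bar

# Hypothetical data: 5 parts pulled every hour for 4 hours
subgroups = [
    [10.1, 9.9, 10.0, 10.2, 9.8],
    [10.0, 10.1, 9.9, 10.0, 10.0],
    [9.9, 10.2, 10.1, 9.8, 10.0],
    [10.0, 10.0, 10.1, 9.9, 10.0],
]

lcl, center, ucl = xbar_limits(subgroups)
for hour, group in enumerate(subgroups, start=1):
    mean = statistics.fmean(group)
    status = "in control" if lcl <= mean <= ucl else "OUT OF CONTROL"
    print(f"hour {hour}: mean = {mean:.3f} ({status})")
print(f"LCL = {lcl:.3f}, center = {center:.3f}, UCL = {ucl:.3f}")
```

In practice, limits are usually computed from 20 or more subgroups so that they reflect the process rather than a few hours of data.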
Measures of central tendency
Whether or not the center of your process falls close to your target. Looking at the centering of your process
Measures of dispersion
How much variation there is within your process
What performance does Six Sigma aim for?
On target performance
As little variation as possible