Non-Normal Data Flashcards
What is the Central Limit Theorem?
The Central Limit Theorem says that the means of repeated samples drawn from a non-normal distribution are themselves approximately normally distributed, whatever the shape of the original distribution. In practice, you pull many samples from the original data (about 30 data points per sample is the usual recommendation), take the mean of each sample, and use those means as your approximately normal distribution.
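A minimal sketch of the idea, assuming hypothetical right-skewed raw data drawn from an exponential distribution. The function names, seeds, and sample counts are illustrative choices, not part of the original cards.

```python
import random
import statistics


def sample_means(population, sample_size=30, n_samples=1000, seed=0):
    """Draw n_samples random samples (with replacement) and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


def skewness(values):
    """Moment-based skewness: close to 0 for a symmetric distribution."""
    values = list(values)
    mean = statistics.fmean(values)
    m2 = statistics.fmean((v - mean) ** 2 for v in values)
    m3 = statistics.fmean((v - mean) ** 3 for v in values)
    return m3 / m2 ** 1.5


# Hypothetical raw data: exponential with mean 2.0, strongly right-skewed.
rng = random.Random(42)
population = [rng.expovariate(0.5) for _ in range(20_000)]

means = sample_means(population)
print(f"skewness of raw data:     {skewness(population):.2f}")
print(f"skewness of sample means: {skewness(means):.2f}")
```

The sample means should be much less skewed than the raw data while staying centered on the same average.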
What are the reasons for non-normal data to exist?
To use SPC with a process, non-normal data must first be transformed into normal data. Some causes of non-normal data indicate a process that is out of control; others can occur in a process that is in control. Consider the reasons first, then what can be done with the non-normal data.
What do too many extreme points indicate?
This indicates a process that is out of control.
What do extreme points prevent?
They prevent you from predicting process performance, and you cannot use SPC until you can predict the process.
Overlap of multiple processes in the data?
This will often generate a distribution that is lumpy. A lump occurs at the center value for each of the processes in the data. The best approach is to stratify and separate the processes. However, you can also use the Central Limit Theorem to create a normal distribution.
Sorted Data
In this case, the process or system automatically sorts the data into a specific order, or the data at the extremes is automatically reworked so that it is closer to the central value. Move upstream in the data collection and use the original data points instead of the “reworked” data. If you cannot do that, the Central Limit Theorem can normalize this data.
Natural Limit
In this case, one of the tails of the bell-shaped curve is truncated because of an equipment limit or a natural limit in the process. You can transform data skewed by a physical limitation either through the Central Limit Theorem or through another transformation process.
Insufficient data discrimination
In this case, the data is only able to take on a few values such as on/off or true/false. The raw data can never form a normal bell-shaped curve. You may be able to improve the measurement system. Otherwise, use the Central Limit Theorem to transform the data.
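A rough sketch of why subgrouping helps here, assuming hypothetical on/off readings coded as 1 and 0 with a 20% "on" rate. The function name and the subgroup size of 30 are illustrative assumptions.

```python
import random
from statistics import fmean


def subgroup_proportions(flags, subgroup_size=30):
    """Split 0/1 readings into consecutive full subgroups and return each subgroup's mean."""
    full = len(flags) - len(flags) % subgroup_size  # drop a trailing partial subgroup
    return [
        fmean(flags[i:i + subgroup_size])
        for i in range(0, full, subgroup_size)
    ]


# Hypothetical raw data: 3000 on/off readings, "on" about 20% of the time.
rng = random.Random(1)
flags = [1 if rng.random() < 0.2 else 0 for _ in range(3000)]

props = subgroup_proportions(flags)
print("distinct raw values:     ", len(set(flags)))
print("distinct subgroup means: ", len(set(props)))
```

The raw readings can only be 0 or 1, but the subgroup means spread over many values around the overall "on" rate, which is what lets them approximate a bell-shaped curve.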
How many data points do you need when data is symmetrical?
5 data points when the data is symmetrical, and 30 data points when the data is not symmetrical.
What is the most popular way to transform data through transformation algorithms?
Box-Cox transformation
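A minimal sketch of the Box-Cox transformation, assuming strictly positive data. The grid search over lambda, the helper names, and the lognormal test data are illustrative assumptions; a statistics package would normally estimate lambda more precisely.

```python
import math
import random
import statistics


def box_cox(values, lam):
    """Box-Cox transform: (x**lam - 1) / lam, or ln(x) when lam is 0."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def log_likelihood(values, lam):
    """Profile log-likelihood of lam (up to a constant), assuming normal transformed data."""
    transformed = box_cox(values, lam)
    variance = statistics.pvariance(transformed)
    log_sum = sum(math.log(v) for v in values)
    return -0.5 * len(values) * math.log(variance) + (lam - 1.0) * log_sum


def best_lambda(values, candidates=None):
    """Pick the candidate lambda with the highest log-likelihood (coarse grid search)."""
    if candidates is None:
        candidates = [i / 10 for i in range(-20, 21)]  # -2.0 to 2.0 in steps of 0.1
    return max(candidates, key=lambda lam: log_likelihood(values, lam))


# Hypothetical right-skewed data: lognormal, so a lambda near 0 (a log transform) is expected.
rng = random.Random(7)
data = [rng.lognormvariate(0.0, 0.5) for _ in range(500)]

lam = best_lambda(data)
normalized = box_cox(data, lam)
print("chosen lambda:", lam)
```

A lambda of 1 leaves the shape of the data unchanged, 0.5 behaves like a square root, and 0 is a plain logarithm, so the chosen value also hints at how skewed the raw data was.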
When does SPC not add much value?
When the non-normality is due to extreme points. You must first get the process under control by eliminating the causes of those points.
If there are multiple processes present?
It is best to separate them and put each one under statistical control. Otherwise, it is difficult to know what to fix when SPC indicates a problem.
When do you use data points from the same time period or the same shift?
When creating the subsets or samples to support a Central Limit Theorem transformation, try to use logical subsets such as all the points in a shift. The key is that they are from the same time period.
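One way to build such logical subsets, sketched with hypothetical readings tagged by shift. The shift labels, the values, and the function name are made up for illustration.

```python
from collections import defaultdict
from statistics import fmean


def subgroup_means(readings):
    """Group (shift_label, value) pairs by shift and return the mean of each shift."""
    groups = defaultdict(list)
    for shift, value in readings:
        groups[shift].append(value)
    return {shift: fmean(values) for shift, values in groups.items()}


# Hypothetical measurements, each tagged with the shift it was taken in.
readings = [
    ("Mon-day", 10.2), ("Mon-day", 9.8), ("Mon-day", 10.0),
    ("Mon-night", 10.6), ("Mon-night", 10.4), ("Mon-night", 10.5),
    ("Tue-day", 9.9), ("Tue-day", 10.1), ("Tue-day", 10.3),
]

for shift, mean in subgroup_means(readings).items():
    print(f"{shift}: {mean:.2f}")
```

Each shift contributes one mean, so every plotted point summarizes data from a single time period rather than mixing periods together.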