Collecting Data & Baseline Statistics Flashcards
Check Sheet
- A data collection tool typically used to identify where and how often problems appear in a product or service.
- A blank form used for systematically collecting and organizing data in real time at the location where the data is generated.
- Both attribute and variable data can be recorded on a check sheet
- Data can be qualitative or quantitative
- For quantitative data, it is often called a tally sheet
- Data collected can be used as input for other quality tools such as histograms, bar graphs, and Pareto charts
When Would We Use a Check Sheet in a Six Sigma Project?
Check sheets are a proven method for business process improvement in many industries
- Use check sheets when data is collected manually
- Use check sheets to capture the shift, machine, and operator production data
- When collecting frequency data and identifying patterns of rejection, defect location, and defect causes
- To capture product- and process-related parameters to ensure quality
Different Types of Check Sheets
- Production process distribution check sheet
- Defect location check sheet
- Defect cause check sheet
- Defective items check sheet
- Check-up confirmation sheet (essentially a checklist)
Check Sheet Benefits
- To provide documentary evidence of data collection
- Help to create histograms, bar graphs, and Pareto charts
- Monitor process performance; also easy to use
- To provide a base for future reference
- Quantify defects by type, location, and use
- Helps to identify the frequency of the problem occurring
Checklist vs. Check Sheet
- A check sheet is one of the 7 QC process improvement tools, used for capturing and categorizing observations, while a checklist is a mistake-proofing aid to ensure all the important steps or actions have been taken, especially when checking or auditing process outputs.
- A check sheet is used to collect data, whereas a checklist is used to mark the completion of steps
Ensuring Data Accuracy & Integrity
- Data should not be removed from a set without an appropriate statistical test or logic
- Generally, data should be recorded in time sequence
- Unnecessary rounding should be avoided
- If rounding is done, it should be done late in the process
- Screen the data to remove entry errors
- Avoid emotional bias
- Record measurements of items that change over time as quickly as possible after manufacture and again after the stabilization period
- Each important classification or identification should be recorded alongside the data
- Example: Time, machine, operator, gage, lab, material, conditions, etc.
Coding Data
- Sometimes it is more effective to code data by adding, subtracting, multiplying or dividing by a factor
Types of Data Coding
- Substitution: e.g., replace measurements in 1/8ths of an inch with integer deviations from center (+1, -1, +2, etc.)
- Truncation: e.g., for a data set of 0.5541, 0.5542, 0.5547 you might just remove the common 0.554 portion and record 1, 2, 7.
Problems due to NOT Coding
- Practitioner tries to squeeze too many numbers on a form - poor usability and legibility
- Increased errors in data entry
- Insensitivity of analytics due to rounding
Effects of Coding Data
- Coding will affect the mean, to the extent that the mean must be uncoded for reporting purposes
- Coding and uncoding of the mean are exactly opposite operations
- Example: if data were coded by adding X, uncode the mean by subtracting X; if coded by multiplying by X, uncode by dividing by X
- The effect coding has on the standard deviation depends on how the data is coded: adding or subtracting a constant leaves the standard deviation unchanged, while multiplying or dividing by a factor scales it by that factor (see the sketch below)
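A minimal sketch in Python (the measurement values are made up for illustration) of coding a small data set, then uncoding the mean and standard deviation:

```python
import statistics

# Hypothetical raw measurements (e.g., thicknesses in inches)
raw = [0.5541, 0.5542, 0.5547, 0.5544, 0.5543]

# Code: subtract a constant, then scale up to small integers
coded = [(x - 0.5540) * 10000 for x in raw]          # -> 1, 2, 7, 4, 3

mean_coded = statistics.mean(coded)
stdev_coded = statistics.stdev(coded)

# Uncode the mean by reversing both operations (divide, then add back)
mean_uncoded = mean_coded / 10000 + 0.5540
# The subtracted constant does not affect the standard deviation;
# only the multiplication/division factor must be reversed
stdev_uncoded = stdev_coded / 10000

print(mean_uncoded, statistics.mean(raw))     # both ~0.55434
print(stdev_uncoded, statistics.stdev(raw))   # both ~0.00023
```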
68-95-99.7 Rule
- Also known as the Empirical Rule
- Used to remember the percentage of values that lie within a band around the mean in a normal distribution with a width of one, two, and three standard deviations respectively.
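A quick way to see the rule in action is to simulate a normal distribution and count the share of values within one, two, and three standard deviations of the mean (a sketch using only the Python standard library; exact percentages vary slightly from run to run):

```python
import random
import statistics

random.seed(0)
# Simulate a normally distributed process: mean 100, standard deviation 10
data = [random.gauss(100, 10) for _ in range(100_000)]

mean = statistics.mean(data)
sd = statistics.pstdev(data)

for k in (1, 2, 3):
    within = sum(1 for x in data if abs(x - mean) <= k * sd)
    print(f"within {k} sigma: {within / len(data):.1%}")
# Expect roughly 68.3%, 95.4%, and 99.7%
```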
Why would you want your baseline sigma to be 1, 2, or 3?
- Those levels are indicative of a poor process, and you would like your team to be able to show an improvement in the process at the end of the project
How to determine baseline project sigma for Discrete Data?
Process capability for discrete data is calculated through the number of defects per opportunity. The acceptable number to achieve Six Sigma is 3.4 Defects Per Million Opportunities (DPMO)
- Unit - the item produced, processed, or created
- Defect - anything that causes a failure (i.e., misses the customer's requirements)
- Opportunity - the number of critical-to-quality measures counted on each unit, i.e., the number of ways a unit can be defective. If there are 4 types of defects, this value is 4
- DPO = Defects / (Units * Opportunities)
- DPMO = [Defects / (Units * Opportunities)] * 1,000,000 = DPO * 1,000,000
- Yield = 1 - DPO (the ability of the process to produce defect-free units)
Determine if zero defects are needed or if there is partial credit.
- If the process is only considered correct when there are no defects at all (100% correct), use the DPMU calculation (defects per million units): DPMU = (Defects / Units) * 1,000,000
- If partial credit is received for meeting some of the requirements, use the DPMO calculation (defects per million opportunities): DPMO = [Defects / (Units * Opportunities)] * 1,000,000
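A small sketch of both calculations as Python functions (the function names are illustrative, not from any standard library):

```python
def dpmu(defects: int, units: int) -> float:
    """Defects per million units: use when a unit is simply right or wrong."""
    return defects / units * 1_000_000

def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities: use when partial credit applies."""
    return defects / (units * opportunities_per_unit) * 1_000_000

def process_yield(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Yield = 1 - DPO, the fraction of defect-free opportunities."""
    return 1 - defects / (units * opportunities_per_unit)
```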
Examples of Baseline Sigma for Discrete Data
XYZ is a commercial flight carrier operating 10,000 flights a day. There are three defect opportunities: late arrival, lost luggage, and poor in-flight experience. Assume 10,000 defects were identified. Calculate the process sigma level.
- Unit or sample size = 10,000 flights a day
- Defect types = 3 (could be late arrival, lost luggage, poor in-flight experience)
- Opportunities = 10,000 flights * 3 kinds of defect opportunities = 30,000
- Defects: 10,000 defects
- DPMO = [10,000 / (10,000 * 3)] * 1,000,000 = (1/3) * 1,000,000 ≈ 333,333 defects per million opportunities
- From a sigma conversion table, 333,333 DPMO translates to a process sigma between 1.9 and 1.95
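The same worked example in Python; converting DPMO to a sigma level uses the normal quantile function plus the conventional 1.5-sigma shift (scipy is an assumption here, chosen instead of reading a conversion chart):

```python
from scipy.stats import norm

units = 10_000        # flights per day
opportunities = 3     # defect opportunities per flight
defects = 10_000

dpmo = defects / (units * opportunities) * 1_000_000   # ~333,333
yield_ = 1 - dpmo / 1_000_000                           # ~0.667

# Short-term sigma level = Z-score of the yield plus the 1.5-sigma shift
sigma_level = norm.ppf(yield_) + 1.5
print(round(dpmo), round(sigma_level, 2))               # ~333333, ~1.93
```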
How to Determine Baseline Project Sigma for Continuous Data?
Process Capability is the determination of the adequacy of the process with respect to the customer needs.
- Compares the output of an in-control process to the specification limits.
- We can say the process is capable when almost all the measurements fall within the specification limits.
- Cp and Cpk are considered short-term potential capability measures for a process
Cpk is a measure to show how many standard deviations the specification limits are from the center of the process.
- Cplower = (Process Mean – LSL)/(3*Standard Deviation)
- Cpupper = (USL – Process Mean)/(3*Standard Deviation)
- Cpk is the smaller of Cpl and Cpu: Cpk = min(Cpl, Cpu)
The main purpose of Cpk is to determine how close a process is performing to its specification limits, considering the natural variability of the process. A larger Cpk is always better; it indicates a lower probability that any item will be outside the specification limits.
Process sigma = 3 * Cpk. Hence we generally want a Cpk of at least 1.33 (4 sigma) or higher to satisfy most customers.
Examples of Baseline Sigma for Continuous Data
- The specification for a rubber sheet is 5 ± 1 cm. An operator randomly recorded 4 subgroups of values every half hour across three shifts. The grand average is 4.7 and the short-term pooled standard deviation is 0.2. Calculate the process sigma level.
- USL = 6 cm
- LSL = 4 cm
- Standard deviation σ = 0.2 cm
- Process mean = 4.7 cm
- Cplower = (Process Mean – LSL)/(3*Standard Deviation) = (4.7 – 4)/(3*0.2) ≈ 1.17
- Cpupper = (USL – Process Mean)/(3*Standard Deviation) = (6 – 4.7)/(3*0.2) ≈ 2.17
- Cpk = min(Cpl, Cpu) ≈ 1.17
- Sigma = 3 * 1.17 ≈ 3.5
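A sketch of the rubber-sheet example in Python, following the Cpl/Cpu/Cpk formulas above:

```python
usl, lsl = 6.0, 4.0        # specification limits in cm
mean, sigma = 4.7, 0.2     # process mean and short-term standard deviation

cpl = (mean - lsl) / (3 * sigma)   # ~1.17
cpu = (usl - mean) / (3 * sigma)   # ~2.17
cpk = min(cpl, cpu)                # ~1.17

process_sigma = 3 * cpk            # ~3.5
print(round(cpl, 2), round(cpu, 2), round(cpk, 2), round(process_sigma, 1))
```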
What is a Measles Chart?
- Also known as Defect Location Check Sheet
- Visual tool for collecting data
- Commonly used to record location data and also shows the problems in a geographic area
- Use measles charts specifically to analyze the problem's location and density, not just to collect a count of the problems.
- Helps determine where the common defects on parts are located
- Example:
- A big red X on the engine area of a car image given to you by AAA to show you need to change your oil.