QP Stats Flashcards
When do you see statistics used as a QP and how are they useful to you?
Method validation
Process validation
Method transfer
Trending (environmental, finished product, stability, analysis (QC), process (production), KPIs)
Stability studies
Describe a T-test?
confidence levels (analysis, e.g. assay) of a mean – systematic errors
comparing an experimental mean with a known value – significance test
comparing means of two methods (e.g. method transfer, new method, checking for systematic errors) – significance test – need pooled std deviation
using the standard error of the mean (SEM) to determine 95% confidence limits - construct the equation for the 95% confidence level (obtain t value from table) and check the null hypothesis (the two means are equal, hence the methods give the same results) (NB: 95% confidence means that, if the study were repeated many times, about 95% of the intervals calculated this way would contain the true mean)
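A minimal sketch of the two-method comparison described above, using a pooled standard deviation. The function name and the assay figures are hypothetical, and the critical t value is taken from a standard table rather than calculated.

```python
import math
import statistics

def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom, pooled SD) for two independent samples."""
    n1, n2 = len(a), len(b)
    s1, s2 = statistics.stdev(a), statistics.stdev(b)
    df = n1 + n2 - 2
    # Pooled SD: weighted average of the two sample variances
    pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    diff = statistics.mean(a) - statistics.mean(b)
    t = diff / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    return t, df, pooled_sd

# Hypothetical assay results (% label claim) from two methods
method_a = [99.8, 100.2, 100.1, 99.9, 100.0]
method_b = [100.3, 100.5, 100.1, 100.4, 100.2]

t, df, pooled_sd = pooled_t_statistic(method_a, method_b)
T_CRIT_95_DF8 = 2.306  # two-sided 95% t value for 8 degrees of freedom (from tables)
reject_null = abs(t) > T_CRIT_95_DF8  # True means the means differ significantly
print(f"t = {t:.2f}, df = {df}, reject null hypothesis: {reject_null}")
```

If |t| exceeds the table value, the null hypothesis (equal means) is rejected at the 95% level. The sketch assumes roughly normal data with similar variances in both sets.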
What statistics do you use in comparing 2 sets of results from two separate laboratories?
T Test
Regression analysis
Model indicators
– Residual standard deviation
* Smaller the better
– Coefficient of determination (R²)
* Amount of variability explained by model; and
* Between 0 and 1, closest to 1 is best
– Lack-of-fit:
* How well the model fits the data
Confidence intervals
Typically 95% confidence limits are calculated and plotted over the scatter diagram
– Essential to show the uncertainty about the position of the line and for estimation purposes
– Help to identify potential outliers
* Based on the t-distribution
– Produces ‘bowed’ limits
– May be approximated by normal distribution values if sample size is large
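The model indicators and the 'bowed' confidence limits above can be illustrated with a short least-squares sketch. The function name, the data and the returned dictionary layout are assumptions for illustration; the t value comes from a table.

```python
import math

def fit_line(x, y, t_value):
    """Least-squares line with residual SD, R-squared and a confidence band."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    residual_sd = math.sqrt(ss_res / (n - 2))  # smaller is better
    r_squared = 1 - ss_res / ss_tot            # closer to 1 is better

    def half_width(x0):
        # Half-width of the confidence limits for the fitted line at x0;
        # it grows as x0 moves away from the mean of x, hence 'bowed' limits
        return t_value * residual_sd * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)

    return {"slope": slope, "intercept": intercept, "residual_sd": residual_sd,
            "r_squared": r_squared, "half_width": half_width}

# Hypothetical calibration data; t = 3.182 is the two-sided 95% value for 3 df
fit = fit_line([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1], t_value=3.182)
print(fit["slope"], fit["r_squared"], fit["half_width"](3), fit["half_width"](5))
```

The limits are narrowest at the mean of the x values and widen towards the ends of the range.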
What statistics would you expect to see as a QP? Focus on data evaluation in GMP (ICH Q8-11, Chapters 1 & 6, Annex 15)
Process control charts
Trending of test results to identify variation. Use this to identify cause of variation (fishbone) & improve process (pareto).
Shewhart (more useful for observing large/abrupt changes) or CUSUM (more sensitive to small/moderate changes) charts.
Use of control charts to set up in-house alert / action limits based on historical trends of > 30 data sets, typically using 2 sigma (alert) and 3 sigma (action).
Process capability – Cp and Cpk show how capable the process is; a capable process has a Cp/Cpk value of > 1.33.
Equivalence Testing – show data sets are equivalent or drugs are bioequivalent: T tests, ANOVA, f2 test for dissolution profiles (SUPAC: f2 > 50)
Stability Testing (ICH Q1)
Regression analysis to predict shelf life of product (least squares method) or ID trends which may go OOS before end of shelf life
Y axis amount of characteristic, X axis time. Determine 2 or 1-sided 95% confidence limit. Determine X value for specified Y amount = shelf life.
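A rough sketch of the shelf-life estimate described above: fit the regression line, then find the latest time point at which the one-sided 95% lower confidence limit is still within specification. The function name, the stability data and the specification limit are hypothetical, and the t value is read from a table.

```python
import math

def shelf_life_months(months, values, lower_spec, t_one_sided, max_months=60):
    """Latest whole month at which the lower confidence limit meets the spec."""
    n = len(months)
    x_bar, y_bar = sum(months) / n, sum(values) / n
    sxx = sum((x - x_bar) ** 2 for x in months)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(months, values))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(months, values))
    residual_sd = math.sqrt(ss_res / (n - 2))

    supported = None
    for month in range(max_months + 1):
        predicted = intercept + slope * month
        std_err = residual_sd * math.sqrt(1 / n + (month - x_bar) ** 2 / sxx)
        if predicted - t_one_sided * std_err < lower_spec:
            break  # lower confidence limit has crossed the specification
        supported = month
    return supported

# Hypothetical assay data (% label claim) at each stability time point
months = [0, 3, 6, 9, 12, 18]
assay = [100.1, 99.6, 99.2, 98.9, 98.3, 97.4]
# t = 2.132 is the one-sided 95% value for n - 2 = 4 degrees of freedom
estimate = shelf_life_months(months, assay, lower_spec=95.0, t_one_sided=2.132)
print(f"Supported shelf life: {estimate} months")
```

A one-sided limit is used here because assay only fails low; an attribute that can fail in either direction would need two-sided limits.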
QC Testing
Laboratory Investigations: Outlier testing – Dixon's or Grubbs' test
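As an illustration of outlier testing, a minimal Grubbs' test sketch follows. The function name and results are hypothetical; the critical value is the figure commonly tabulated for n = 10 at the two-sided 5% level and should be checked against your own reference table.

```python
import statistics

def grubbs_statistic(values):
    """Return (G, suspect value) for the point furthest from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    suspect = max(values, key=lambda v: abs(v - mean))
    return abs(suspect - mean) / sd, suspect

# Hypothetical assay results with one suspect value
results = [99.8, 100.1, 100.0, 99.9, 100.2, 100.0, 99.9, 100.1, 100.0, 102.5]
g, suspect = grubbs_statistic(results)
G_CRIT_N10 = 2.290  # assumed table value: n = 10, two-sided, 5% significance
is_outlier = g > G_CRIT_N10
print(f"G = {g:.3f} for {suspect}; statistical outlier: {is_outlier}")
```

A statistical outlier result supports, but does not replace, the laboratory investigation into an assignable cause.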
Analytical Method Validation
Method Transfer: Assay, CU, Impurities – T-test, Dissolution profile – f2 value (similarity factor: target 50-100)
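The f2 similarity factor mentioned above can be sketched as follows, assuming the usual formula f2 = 50 x log10(100 / sqrt(1 + mean squared difference)). The function name and dissolution profiles are hypothetical.

```python
import math

def f2_similarity(reference, test):
    """Similarity factor f2 for two dissolution profiles (% dissolved)."""
    if len(reference) != len(test):
        raise ValueError("profiles must use the same time points")
    n = len(reference)
    mean_sq_diff = sum((r - t) ** 2 for r, t in zip(reference, test)) / n
    return 50 * math.log10(100 / math.sqrt(1 + mean_sq_diff))

# Hypothetical mean % dissolved at four time points
reference_profile = [35, 60, 80, 92]
test_profile = [32, 56, 78, 91]
f2 = f2_similarity(reference_profile, test_profile)
print(f"f2 = {f2:.1f}")  # 50-100 indicates similar profiles
```

Identical profiles give f2 = 100, and an average difference of about 10% at every time point gives a value just under 50.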
Sampling of Starting/Packaging materials (Annex 8 – risk based)
Acceptable Quality Level (AQL) Sampling ISO 2859 - (Attribute sampling)
Assumes defects are randomly distributed
Sampling plan gives a high probability of accepting good product (lots at or better than the AQL) and of detecting defective lots; the AQL is expressed as a process average.
Case by case, based on sample size and inspection level. General Inspection Level II = standard. Can switch between normal/reduced/tightened.
Critical, major, minor defects need to be classified, each with accept/reject levels.
Confidence intervals
Confidence intervals = error bounds on estimates
95% confidence interval gives an interval in which you can say with 95% confidence that it contains the true value.
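A minimal sketch of a 95% confidence interval for a mean, built from the standard error of the mean and a t value from tables. The function name and assay data are hypothetical.

```python
import math
import statistics

def confidence_interval(values, t_value):
    """Return (lower, upper) confidence limits for the mean."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))  # standard error of the mean
    return mean - t_value * sem, mean + t_value * sem

# Hypothetical assay results; t = 2.776 is the two-sided 95% value for 4 df
assay = [99.8, 100.2, 100.1, 99.9, 100.0]
low, high = confidence_interval(assay, t_value=2.776)
print(f"95% CI for the mean: {low:.2f} to {high:.2f}")
```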
T tests
Used for comparing samples & data sets – Normally distributed data only!
Hypothesis testing: the use of statistics to decide whether the observed data are consistent with a given hypothesis.
Null hypothesis = there is no difference (e.g. the two means are equal)
Alternative hypothesis = null hypothesis is false
P = calculated probability of obtaining results at least as extreme as those observed, by chance alone, if the null hypothesis is true
Significance level is chosen (usually 0.05)
if P < 0.05, null hypothesis is rejected at 95% confidence (the difference is unlikely to be due to chance).
T-TEST: Test if 2 samples have the same mean
F-TEST: Check variability, compare std dev of 2 samples
ANOVA
3 or more data sets
Compares means for difference amongst groups by comparing sources of variation
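The comparison of sources of variation can be sketched as a one-way ANOVA F ratio (between-group mean square divided by within-group mean square). The function name and laboratory data are hypothetical, and the critical F value is taken from a table.

```python
def one_way_anova_f(groups):
    """Return (F, df between, df within) for a list of data sets."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation between the group means
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual results within each group
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical assay results from three laboratories
labs = [[99.0, 100.0, 101.0], [100.0, 101.0, 102.0], [103.0, 104.0, 105.0]]
f_ratio, df_b, df_w = one_way_anova_f(labs)
F_CRIT_2_6 = 5.14  # 5% critical value for (2, 6) degrees of freedom, from tables
means_differ = f_ratio > F_CRIT_2_6
print(f"F = {f_ratio:.2f} on ({df_b}, {df_w}) df; means differ: {means_differ}")
```

A significant F ratio says that at least one mean differs; it does not say which one.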
Visualising data
Symmetrical Data:
Mean and standard deviation
Skew Data:
Median and Semi Interquartile Range (SIR)
Examples:
Box Plots
Histograms
Pareto Charts
80:20 rule (20% of causes produce 80% of the problems)
Plot frequency chart and apply weighting based on risk – determine where to focus effort
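A small sketch of the Pareto calculation: sort causes by (optionally risk-weighted) frequency and accumulate the percentage. The function name, deviation categories, counts and the idea of a simple multiplicative weight are all assumptions for illustration.

```python
def pareto_table(counts, weights=None):
    """Return rows of (cause, score, cumulative %) sorted largest first."""
    weights = weights or {}
    # Score = frequency x risk weight (weight defaults to 1 if not given)
    scores = {cause: n * weights.get(cause, 1) for cause, n in counts.items()}
    total = sum(scores.values())
    rows, cumulative = [], 0
    for cause, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += score
        rows.append((cause, score, round(100 * cumulative / total, 1)))
    return rows

# Hypothetical deviation counts by category
deviations = {"Labelling": 42, "Documentation": 31, "Equipment": 12,
              "Training": 9, "Other": 6}
rows = pareto_table(deviations)
for cause, score, cum_pct in rows:
    print(f"{cause:<14}{score:>4}{cum_pct:>7}%")
```

Effort is then focused on the few categories at the top of the table that account for most of the cumulative total.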
Control charts
Used to look at two types of variation (ideally >30 data sets), track trends, picture of process, early warning, assess changes
Common cause (random) - IN CONTROL - inherent in a process, small cumulative effect that’s hard to reduce / remove. E.g. machine wear, material variation, punch / die tolerances, press speed
Special cause - OUT OF CONTROL - irregular occurrences, rare, larger impact, easier to correct E.g. machine breakdown, change of supplier, punch / die damage, incorrect set up of press
CUSUMS
plots cumulative sum of differences from target value - magnifies gradual change & easier to identify significant change. If in control get random noise around ‘0’.
Very sensitive to change
Quicker at detecting small changes (shorter average run length than a Shewhart chart)
Use V Mask to check for significant change
CUSUM useful for detection of small to moderate changes
Shewhart control chart more effective for detection of large or abrupt changes and irregular patterns.
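The CUSUM calculation itself is simple and can be sketched as below. The function name, target and data are hypothetical, and the V-mask check is not included.

```python
from itertools import accumulate

def cusum(values, target):
    """Cumulative sum of the differences between each result and the target."""
    return list(accumulate(v - target for v in values))

# Hypothetical results: on target for five points, then a small upward shift
results = [100.1, 99.9, 100.0, 100.1, 99.9, 100.5, 100.4, 100.6, 100.5, 100.5]
sums = cusum(results, target=100.0)
print([round(s, 1) for s in sums])
```

While the process is on target the sums hover around zero; after the shift they climb steadily, which makes a small sustained change much easier to see than on a plot of the individual results.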
Special cause rules – Shewhart chart
Rule 1 - any point beyond control limits
Rule 2 - Run of 7 points above or below the mean
Rule 3 - Run of 7 points increasing or decreasing
Rule 4 - Unusual pattern e.g. cyclic
Define in SOP which rules apply.
The more you apply, more likely to get false failures.
Type 1 Error: False positive signals
Type 2 Error: False negative signals
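Rules 1 to 3 above can be sketched as simple checks against limits set from historical data. The function name and data are hypothetical, and the limits here use the overall sample standard deviation as a simplification (a real individuals chart may estimate sigma differently, e.g. from moving ranges). Rule 4 needs visual review and is not coded.

```python
import statistics

def shewhart_signals(baseline, new_points, run_length=7):
    """Return a list of (index, rule number) signals for the new points."""
    centre = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    ucl, lcl = centre + 3 * sd, centre - 3 * sd  # action limits at 3 sigma
    signals = []
    for i, x in enumerate(new_points):
        if x > ucl or x < lcl:
            signals.append((i, 1))  # Rule 1: point beyond control limits
        if i + 1 >= run_length:
            window = new_points[i + 1 - run_length : i + 1]
            if all(p > centre for p in window) or all(p < centre for p in window):
                signals.append((i, 2))  # Rule 2: run on one side of the mean
            steps = [b - a for a, b in zip(window, window[1:])]
            if all(s > 0 for s in steps) or all(s < 0 for s in steps):
                signals.append((i, 3))  # Rule 3: run steadily rising or falling
    return signals

# Hypothetical historical data (30 results) and eight new results
baseline = [99.8, 100.2, 100.0, 99.9, 100.1] * 6
new_points = [100.1, 100.2, 100.1, 100.3, 100.2, 100.1, 100.2, 100.6]
signals = shewhart_signals(baseline, new_points)
print(signals)
```

Each extra rule adds sensitivity but also adds false alarms, which is why the applicable rules should be fixed in the SOP.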
Process capability
Looks at spread of results in relation to specification limits. Confirms process is capable of continually meeting specification - must eliminate ‘special cause’ variation first; if system is ‘out of control’, Cp and Cpk cannot be used to predict future performance.
Pp and Ppk used for Process Performance.
Used when process is too new to determine if under statistical control, or when process is out of control.
Ppk tells us how a process has performed in the past.
Can’t use it to predict the future because the process is not in a state of control.
Cp and Cpk used for Process Capability after a process has reached stability or statistical control
Cp: potential process capability - compares the specification width with the process spread (6 sigma); indicates capability of producing within specification if the process were centred.
Spec and std dev only.
Cpk: process capability index - takes into account both the spread of the data and how centred the process is (how far the process mean is from the nearer specification limit). Spec, mean and std dev.
Cpk tells us what a process is capable of doing in future, assuming it remains in a state of statistical control.
Can have in control (ref. control chart) but not capable (ref. Cp/Cpk) and capable but not in control – Target in control and capable
Cp or Cpk of < 1 = Process not capable
Cp or Cpk of 1 - 1.33 = Process marginal (±3 sigma)
Cp or Cpk of > 1.33 = Process is capable (±4 sigma)
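A minimal sketch of the capability calculation, assuming the usual formulas Cp = (USL - LSL) / 6 sigma and Cpk = min(USL - mean, mean - LSL) / 3 sigma. The function name, data and specification limits are hypothetical, and sigma is estimated here from the overall sample standard deviation as a simplification.

```python
import statistics

def capability(values, lsl, usl):
    """Return (Cp, Cpk) for results against lower/upper specification limits."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    cp = (usl - lsl) / (6 * sd)                    # spec width vs process spread
    cpk = min(usl - mean, mean - lsl) / (3 * sd)   # distance to the nearer limit
    return cp, cpk

# Hypothetical results centred near 100.0 against an off-centre specification
results = [99.8, 100.2, 100.0, 99.9, 100.1] * 6
cp, cpk = capability(results, lsl=99.5, usl=101.5)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

In this example Cp is well above 1.33 but Cpk falls in the marginal band, because the process mean sits close to the lower limit: the spread is fine but the process is not centred.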
Think of the walls of a garage – you have to fit your car in (customer specification limits). If you go past the limits, you will crash, and the customer is not happy!
If the process has a lot of variation, the process average is all over the place. Not good for parking a car. To give your parking process the best chance of success, you should work on reducing variation and centring the process.
If the car is too wide for the garage, nothing you do to centre the process will help. You have to change the dispersion of the process (make the car smaller.)
If car is smaller than the garage, it doesn’t matter if you park it exactly in the middle; it will fit and you have plenty of room on either side. That’s one of the reasons the six sigma philosophy focuses on removing variation in a process.
A process that is in control and has little variation can park the car easily within the garage and meet the customer spec.
Cpk tells you the relationship between the size of the car, the size of the garage and how far from the middle of the garage you parked the car.