Cpk NextGen
Process Capability using Normal distribution mixture
This procedure allows you to compute the $c_p$ and $c_{pk}$ process capability indices, utilizing an innovative method for estimating the center and quantiles of the process, including their uncertainty. The advantage of this approach is its consistency in comparing processes and their changes over time, without relying heavily on anecdotal evidence or having to categorize the “process type”.
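For reference, the classical normal-theory definitions that these indices generalize can be written down directly. The sketch below is a textbook baseline implementation, not the mixture-based estimator this package provides:

```python
import numpy as np

def classical_cp_cpk(data, lsl, usl):
    """Textbook process capability indices under a normality assumption.

    cp  = (USL - LSL) / (6 * sigma)              -- potential capability
    cpk = min(USL - mu, mu - LSL) / (3 * sigma)  -- actual capability
    """
    data = np.asarray(data, dtype=float)
    mu, sigma = data.mean(), data.std(ddof=1)
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)
    return cp, cpk
```

For a process perfectly centered between the tolerance limits, $c_p = c_{pk}$; any off-center shift lowers $c_{pk}$ below $c_p$.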
Usage
Install the library:

```shell
python -m pip install cpknextgen
```
Import the `evaluate` function, which is the main entry point:

```python
from cpknextgen import evaluate
```
The function takes a 1-dimensional list or NumPy array of process data, the tolerance limits (as mentioned above), and some other parameters for the Gaussian mixture; refer to the docstring for further details. It calculates point and interval estimates of the indices, as well as graphical results: the empirical CDF and a sample estimate of the CDF.
Methodology
The method leverages past performance data and experience. The calculation, as outlined in the figure below, can either include or exclude a prior, i.e., a dataset from past performance used as Bayesian-type information.
The algorithm is designed for continuous process observation: it estimates the indices, with uncertainty, at each point of the process, and it can predict what their final values will be once a given production period (e.g., a shift, a day, a week) is complete.
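The general idea of attaching uncertainty to a capability index can be illustrated with a simple percentile-bootstrap interval around the classical $c_{pk}$. This is only an illustration of interval estimation, not the package's mixture-based method:

```python
import numpy as np

def cpk_point(data, lsl, usl):
    """Classical point estimate of cpk for a 1-d sample."""
    mu, sigma = data.mean(), data.std(ddof=1)
    return min(usl - mu, mu - lsl) / (3 * sigma)

def cpk_bootstrap_interval(data, lsl, usl, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample the data with replacement,
    recompute cpk each time, and take the alpha/2 and 1-alpha/2 quantiles."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    stats = np.array([
        cpk_point(rng.choice(data, size=data.size, replace=True), lsl, usl)
        for _ in range(n_boot)
    ])
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])
```

As more of the production period is observed, such an interval narrows around the final value, which is the behavior the continuous-observation design aims for.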
Calculation Without Prior
Calculation without a prior is equivalent to estimating the indices on the prior dataset itself, and the resulting information can then be used as the prior when calculating the indices on another dataset. This mode is especially recommended for “closed” production periods, such as calculating the process capability for a recently concluded shift.
The data is often accompanied by varying amounts of contextual information, most notably the tolerance limits and the extreme limits. The extreme limits are dictated by physical restrictions or plausibility bounds and are not mandatory; any data outside them are treated as outliers and ignored. To calculate $c_{pk}$, at least one tolerance limit is necessary, and both tolerance limits are needed for a proper calculation of $c_p$. If none are provided, the algorithm only estimates the quantiles, giving the process center and width without a tolerance interval for comparison.
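The outlier rule above can be sketched as a simple filter against the extreme limits (an illustrative helper, not the package's internal function):

```python
import numpy as np

def drop_outside_extremes(data, lower_ext=None, upper_ext=None):
    """Treat observations beyond the extreme (plausibility) limits as
    outliers and drop them; either limit may be absent."""
    data = np.asarray(data, dtype=float)
    mask = np.ones(data.size, dtype=bool)
    if lower_ext is not None:
        mask &= data >= lower_ext
    if upper_ext is not None:
        mask &= data <= upper_ext
    return data[mask]
```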
Before distribution estimation, data transformation based on shape takes place. This involves the following steps:
- Logarithmic or logit transformations based on extreme limits, when they exist.
- Applying a Yeo-Johnson transformation.
- Scaling the tolerance interval to the interval $[-1, +1]$. In cases where one or both tolerance limits are missing, they are estimated as "tolerance quantiles" from the data.
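The steps above can be sketched as a small pipeline. This assumes SciPy is available for the Yeo-Johnson transform; the function name and the simplified one-sided log step are illustrative, not the package's actual implementation:

```python
import numpy as np
from scipy.stats import yeojohnson

def transform_and_scale(data, lower_tol, upper_tol, lower_ext=None):
    """Sketch of the shaping pipeline: optional log transform based on a
    lower extreme limit, then Yeo-Johnson, then linear scaling of the
    tolerance interval onto [-1, +1]."""
    data = np.asarray(data, dtype=float)
    tols = np.array([lower_tol, upper_tol], dtype=float)
    if lower_ext is not None:            # one-sided extreme limit -> log transform
        data = np.log(data - lower_ext)
        tols = np.log(tols - lower_ext)
    data, lmbda = yeojohnson(data)       # fit lambda on the data
    tols = yeojohnson(tols, lmbda=lmbda) # reuse the same lambda for the limits
    mid, half = tols.mean(), (tols[1] - tols[0]) / 2
    return (data - mid) / half           # tolerance interval maps to [-1, +1]
```

Because every transform is monotone, data points equal to the tolerance limits land exactly on $-1$ and $+1$ after scaling.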
Calculation With Prior (NOT IMPLEMENTED!)
The data transformation method is derived from the prior. The extent to which the prior is used in distribution estimation varies with the amount of information available at the time of estimation. With limited information, e.g., after the first hour of an 8-hour shift, there is a higher reliance on the past shape of the process from the prior. As the shift progresses, the indices are estimated increasingly from the ongoing production period's own data.
This balance is controlled by the "Basic sample size" and the "Process Length" parameters. Regardless of the size of the prior, the algorithm ensures the amount of information derived from it corresponds to these two parameters. Hence, it is advisable to use a "sufficiently large" prior dataset that includes all reasonable process variants.
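Since this mode is not yet implemented, the following is only a sketch of the general shrinkage idea behind such a balance: weight the prior's estimate against the current sample according to how much data has been observed. The parameter name `basic_sample_size` echoes the text; the precision-weighted formula itself is a standard Bayesian-style blend and an assumption, not the package's specification:

```python
def blend_with_prior(sample_mean, n_obs, prior_mean, basic_sample_size):
    """Shrinkage-style blend: with few observations the prior dominates;
    as n_obs grows, the estimate moves toward the current sample mean."""
    w = n_obs / (n_obs + basic_sample_size)
    return w * sample_mean + (1 - w) * prior_mean
```

With `n_obs = 0` the estimate is the prior; for `n_obs` much larger than `basic_sample_size` the prior's influence becomes negligible, which matches the described behavior over the course of a shift.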
Special Cases
There are two types of special cases that limit the calculation. In the first scenario, no calculation proceeds if there's only one data point or if all data points in the set have the same value. In the second scenario, the calculation proceeds, but it does not produce a prior that can be used for another dataset, e.g., when the lower limit/tolerance isn't given, and all data are above the upper tolerance. These special cases are currently under review, and we look forward to sharing updated methodologies to handle them in the future.
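The first class of special cases can be caught up front with a trivial validation step (an illustrative guard, not the package's actual check):

```python
def can_evaluate(data):
    """Reject the degenerate inputs described above: fewer than two
    data points, or a sample with zero spread (all values identical)."""
    data = list(data)
    if len(data) < 2:
        return False
    if all(x == data[0] for x in data):
        return False
    return True
```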
Conclusion
This novel method for computing process capability indices offers a more consistent and data-driven approach. Feedback and contributions are encouraged as we continue to refine and extend this methodology. Please refer to the figure above for a graphical representation of the process.
Project details
File details
Details for the file cpknextgen-1.1.0.tar.gz.
File metadata
- Download URL: cpknextgen-1.1.0.tar.gz
- Upload date:
- Size: 138.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ed28cff14a41392b565ef8eac7efd6e27df1a8e83b9c844bb2de3683441c4357 |
| MD5 | 581d228b5bdb5a3485579975c776c14e |
| BLAKE2b-256 | 96e91943ce51bab420fd53ca1b3750fa6f8f4c67649dadc3259f72bc67c11c71 |
File details
Details for the file cpknextgen-1.1.0-py3-none-any.whl.
File metadata
- Download URL: cpknextgen-1.1.0-py3-none-any.whl
- Upload date:
- Size: 22.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ea12b9c784d8e8f488cb349feb6bf16c256209b0d7ed82c95780ffead8f4bad3 |
| MD5 | 929ec8c7285248e13501fc2fc0df5724 |
| BLAKE2b-256 | e77d3b2f3496878eb59c48653bea8a5f019048850145a17b3b28cbd41608d077 |