Designed Experiments; Latent Variables (PCA, PLS, multivariate methods with missing data); Process Monitoring; Batch data analysis.

process-improve

Multivariate analysis, designed experiments, and process monitoring for Python. Built for chemometrics, manufacturing, and pharma data - the methods that scikit-learn skips.

New here? The architecture overview (source) is the map of the codebase - package layout, the estimator stack, and the MCP tool layer.

What it does

process-improve provides production-grade implementations of the methods practitioners actually use on real plant and lab data:

PCA with SVD and NIPALS, plus native missing-value handling via Trimmed Score Regression
PLS regression with a fully sklearn-compatible API, VIP scores, and cross-validated diagnostics
TPLS - PLS for T-shaped (multi-block) data structures
Outlier detection combining Hotelling's T² and SPE with an ESD-based test
Designed experiments - full-factorial, fractional-factorial, and response-surface designs, plus a multi-stage DOE strategy recommender
Process monitoring - Shewhart, CUSUM, and Holt-Winters control charts
Batch data analysis - alignment, feature extraction, and multivariate batch monitoring (MBPCA / MBPLS)
Interactive Plotly diagnostics bound directly to every fitted model

Outputs are pandas-native: scores, loadings, and predictions keep your row and column labels.
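
To make the missing-value idea from the list above concrete, here is a minimal NumPy sketch of a single NIPALS component that leaves missing cells out of each regression step. It shows the textbook NIPALS treatment of missing data, not the package's implementation (which the list says handles missing values via Trimmed Score Regression). The function name `nipals_first_component` and the toy data are assumptions made for illustration.

```python
import numpy as np

def nipals_first_component(X, max_iter=500, tol=1e-10):
    """One NIPALS component; NaN cells are left out of every regression."""
    X = np.asarray(X, dtype=float)
    observed = (~np.isnan(X)).astype(float)   # 1.0 where a value is present
    X0 = np.nan_to_num(X, nan=0.0)            # zeros add nothing to the sums
    t = X0[:, 0].copy()                       # starting guess: first column
    for _ in range(max_iter):
        # Regress each column on t, using only rows observed in that column.
        p = (X0.T @ t) / (observed.T @ t**2)
        p /= np.linalg.norm(p)
        # Regress each row on p, using only columns observed in that row.
        t_new = (X0 @ p) / (observed @ p**2)
        converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
        t = t_new
        if converged:
            break
    return t, p

# Rank-one, column-centred toy data with two cells knocked out.
t_true = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
p_true = np.array([2.0, 1.0, 2.0]) / 3.0      # unit-length loading vector
X = np.outer(t_true, p_true)
X[1, 1] = np.nan
X[3, 2] = np.nan

t_hat, p_hat = nipals_first_component(X)
print(p_hat)                                  # loading direction estimated from incomplete data
print(t_hat[1] * p_hat[1])                    # model-based estimate of the missing cell X[1, 1]
```

Because the toy matrix is exactly rank one, the loading is recovered despite the gaps. On real data the result is an approximation whose quality depends on how much is missing and where.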

Scale: the estimators are in-memory - they assume the (scaled) data matrix fits in RAM, plus a couple of working copies during fit. A float64 matrix needs about rows x cols x 8 bytes (e.g. 10M x 200 is ~16 GB). For the practical limits and guidance on larger-than-RAM data, see Scaling and memory (source).
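
The rule of thumb above is easy to check before loading data. The sketch below is plain arithmetic, not part of the package; the helper name `matrix_gigabytes` is hypothetical, and treating "a couple of working copies" as two extra copies is an assumption.

```python
def matrix_gigabytes(rows, cols, itemsize=8, copies=1):
    """Approximate size in GB (10**9 bytes) of `copies` dense matrices."""
    return rows * cols * itemsize * copies / 1e9

# The example from the text: 10M rows x 200 float64 columns.
print(matrix_gigabytes(10_000_000, 200))             # 16.0

# Assumed worst case during fit: the matrix plus two working copies.
print(matrix_gigabytes(10_000_000, 200, copies=3))   # 48.0
```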

It is the companion package to the online textbook Process Improvement using Data, and powers the statistical engine behind factori.al.

Why not scikit-learn?

scikit-learn answers "what fits the data?" - process-improve answers "is this batch normal, which variable went off, and how confident am I in the prediction?" The two libraries are designed to be used together; process-improve follows sklearn conventions (fit, predict, score, the _ suffix on fitted attributes) and drops into existing pipelines.

Capability	scikit-learn	process-improve
PCA, PLS with sklearn-style API	✓	✓
Missing-data fitting (NIPALS / TSR)	-	✓
Hotelling's T² + SPE outlier limits	-	✓
Variable-level score contributions	-	✓
Cross-validated coefficient confidence intervals	-	✓
Multi-block models (TPLS)	-	✓
Designed experiments (DoE)	-	✓
Control charts (Shewhart / CUSUM / Holt-Winters)	-	✓
Batch process monitoring (MBPCA / MBPLS)	-	✓
Plotly diagnostics built in	-	✓
Labeled `DataFrame` outputs	partial	✓
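
To illustrate one row of the table, the following is a standard two-sided tabular CUSUM in plain Python. It does not use process-improve and makes no claim about how the package implements its charts. The function name `tabular_cusum`, its parameters, and the sample readings are all hypothetical.

```python
def tabular_cusum(values, target, k, h):
    """Two-sided tabular CUSUM.

    k is the allowance (slack) and h the decision interval, both in the
    units of the data. Returns the upper sums, lower sums, and alarm indices.
    """
    hi = lo = 0.0
    upper, lower, alarms = [], [], []
    for i, x in enumerate(values):
        hi = max(0.0, hi + (x - target - k))   # accumulates upward drift
        lo = max(0.0, lo + (target - x - k))   # accumulates downward drift
        upper.append(hi)
        lower.append(lo)
        if hi > h or lo > h:
            alarms.append(i)
    return upper, lower, alarms

# Four on-target readings followed by a sustained shift of about +1 unit.
readings = [10.0, 10.2, 9.9, 10.1, 11.0, 11.2, 10.9, 11.1]
upper, lower, alarms = tabular_cusum(readings, target=10.0, k=0.5, h=2.0)
print(alarms)   # indices where either cumulative sum exceeds h
```

A Shewhart chart would judge each reading on its own; the CUSUM accumulates small deviations, which is why it flags a modest sustained shift after a few samples.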

Installation

pip install process-improve                    # core (numpy, pandas, scipy, scikit-learn, statsmodels, patsy, pydantic, pyyaml, tqdm)
pip install 'process-improve[plotting]'        # adds matplotlib, plotly, seaborn, ridgeplot
pip install 'process-improve[expt]'            # adds pyDOE3 (designed experiments / DOE)
pip install 'process-improve[batch]'           # adds openpyxl, scikit-image (batch process data IO)
pip install 'process-improve[mcp]'             # adds the MCP server runtime
pip install 'process-improve[fast]'            # adds numba (JIT speedups for batch alignment)
pip install 'process-improve[all]'             # everything above (the full dependency set installed by default before 1.24.11)

Requires Python 3.10 or newer. The core install pulls in numpy, pandas, scipy, scikit-learn, statsmodels, patsy, pydantic, pyyaml, and tqdm. Heavier optional surfaces (plotting, designed experiments, batch IO, MCP server, numba JIT) live in extras so a caller who only needs, say, detect_multivariate_outliers does not have to install Plotly or numba.
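
Because the extras are optional, calling code may want to check whether one is present before relying on it. The sketch below uses only the standard library; the helper name `has_module` and the messages are hypothetical, and this is not a mechanism the package is known to expose.

```python
from importlib.util import find_spec

def has_module(name):
    """True if the top-level module `name` can be imported here."""
    return find_spec(name) is not None

# Hypothetical guard: only take the plotting path when Plotly is installed.
if has_module("plotly"):
    print("plotting extra available")
else:
    print("install 'process-improve[plotting]' to enable Plotly figures")
```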

Quick start

PCA - Principal Component Analysis

import pandas as pd
from process_improve.multivariate.methods import PCA, MCUVScaler

X = pd.read_csv("your_data.csv", index_col=0)
X_scaled = MCUVScaler().fit_transform(X)

pca = PCA(n_components=3).fit(X_scaled)
print(pca.r2_cumulative_)         # cumulative R² per component
pca.score_plot()                  # interactive Plotly figure

# Flag outliers using combined T² and SPE limits at 95% confidence
outliers = pca.detect_outliers(conf_level=0.95)

# Which variables drove the first observation off?
contrib = pca.score_contributions(pca.scores_.iloc[0].values)
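
The combined T² and SPE check above rests on two per-row statistics. The following NumPy sketch shows how they are conventionally computed from an SVD-based PCA. It is not the package's code, it does not reproduce the confidence limits behind detect_outliers, and the random data and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # mean-centre, unit variance

A = 2                                              # number of components kept
U, s, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt[:A].T                                       # loadings (K x A)
T = X @ P                                          # scores   (N x A)

# Hotelling's T2: squared scores, each scaled by that component's variance.
t2 = ((T**2) / T.var(axis=0, ddof=1)).sum(axis=1)

# SPE: squared residual left after projecting each row onto the model plane.
E = X - T @ P.T
spe = (E**2).sum(axis=1)

print(t2[:3])
print(spe[:3])
```

T² measures how far a row sits from the centre within the model plane, while SPE measures how far it sits off that plane, which is why the two are usually examined together.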

PLS - Projection to Latent Structures

from process_improve.multivariate.methods import PLS, MCUVScaler

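# X and Y are training DataFrames with matching rows; X_new holds new rows
# with the same columns as X (none of them are defined in this snippet).
# Scale X and Y separately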
scaler_x = MCUVScaler().fit(X)
scaler_y = MCUVScaler().fit(Y)
X_s, Y_s = scaler_x.transform(X), scaler_y.transform(Y)

pls = PLS(n_components=3).fit(X_s, Y_s)
print(pls.beta_coefficients_)     # regression coefficients (K x M)
print(pls.r2_cumulative_)         # cumulative R² for Y
print(pls.vip())                  # VIP scores per X variable

# Predict new observations (sklearn-compatible: returns just y_hat)
y_pred = pls.predict(scaler_x.transform(X_new))

# Predict with full per-row diagnostics (scores, T², SPE, plus y_hat)
result = pls.diagnose(scaler_x.transform(X_new))
result.y_hat                      # point predictions
result.spe                        # squared prediction error
result.hotellings_t2              # Hotelling's T² for new observations

# Cross-validated component selection
cv_select = PLS.select_n_components(X_s, Y_s, max_components=6)
print(cv_select.n_components)     # recommended number of components
print(cv_select.rmsecv)           # RMSECV per component count

# Cross-validation with beta-coefficient confidence intervals
cv = pls.cross_validate(X_s, Y_s, cv="loo")
print(cv.beta_ci_lower, cv.beta_ci_upper)   # 95% CI for each beta
print(cv.significant)                       # betas significantly != 0
print(cv.q_squared)                         # cross-validated R² (Q²)
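
The two cross-validation summaries printed above, RMSECV and Q², have simple conventional definitions. The sketch below computes them from observed values and cross-validated predictions in plain Python. It assumes the common definitions (RMSECV as the root mean squared cross-validated error, Q² as 1 - PRESS/TSS); the package may differ in detail. The helper name `rmsecv_and_q2` and the numbers are made up.

```python
from math import sqrt

def rmsecv_and_q2(y_true, y_cv_pred):
    """RMSECV and Q2 from held-out predictions (one prediction per row)."""
    n = len(y_true)
    press = sum((y - yh) ** 2 for y, yh in zip(y_true, y_cv_pred))
    y_bar = sum(y_true) / n
    tss = sum((y - y_bar) ** 2 for y in y_true)
    return sqrt(press / n), 1.0 - press / tss

y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_cv = [1.5, 2.0, 3.0, 4.0, 4.5]      # hypothetical leave-one-out predictions
rmsecv, q2 = rmsecv_and_q2(y_obs, y_cv)
print(rmsecv, q2)
```

Unlike R², Q² is computed from predictions for rows the model did not see, so it can fall or even go negative as components are added.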

DOE - multi-stage experimental strategy

from process_improve.experiments.factor import Factor, Response
from process_improve.experiments.strategy import recommend_strategy

factors = [
    Factor(name="Temperature", low=25, high=40, units="degC"),
    Factor(name="pH", low=5.0, high=7.5),
    Factor(name="Glucose", low=10, high=50, units="g/L"),
]
strategy = recommend_strategy(
    factors=factors,
    responses=[Response(name="Yield", goal="maximize", units="g/L")],
    budget=40,
    domain="fermentation",
)
for s in strategy["stages"]:
    print(s["stage_number"], s["design_type"], s["estimated_runs"])
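
For orientation, the simplest design such a strategy could contain is a two-level full factorial. The sketch below enumerates one for the three factors above using only the standard library. It does not call process-improve or pyDOE3, and the `levels` dictionary is an illustrative restatement of the Factor definitions.

```python
from itertools import product

# Low/high levels copied from the Factor definitions above.
levels = {"Temperature": (25, 40), "pH": (5.0, 7.5), "Glucose": (10, 50)}

runs = []
for coded in product((-1, 1), repeat=len(levels)):   # coded units: -1 low, +1 high
    run = {}
    for (name, (low, high)), c in zip(levels.items(), coded):
        run[name] = low if c == -1 else high
    runs.append(run)

print(len(runs))    # 2**3 = 8 runs
for run in runs:
    print(run)
```

In practice the run order would be randomized before experiments are carried out; the listing here is left in standard order for readability.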

Longer, fully-worked versions of each example live in the Quickstart guide and the process_improve/notebooks_examples/ folder.

New to designed experiments? The Applied DoE tutorial is an eight-module worked-solution series.

API design

PCA and PLS follow scikit-learn conventions: fit() returns self, and fitted attributes end with a trailing underscore (scores_, loadings_, spe_, hotellings_t2_, r2_cumulative_, ...). As in the quick start, predict() returns only y_hat, while diagnose() returns a result object with named fields (y_hat, spe, hotellings_t2, ...). Inputs are accepted as pandas.DataFrame, and index/column labels are preserved through fit and transform.
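
The conventions described above can be shown with a toy estimator. The class below is hypothetical and is not the package's MCUVScaler. It assumes that "MCUV" stands for mean-centring and unit-variance scaling, uses the sample standard deviation (ddof=1) as a further assumption, and works on NumPy arrays, so it does not demonstrate label preservation.

```python
import numpy as np

class ToyMCUVScaler:
    """Hypothetical mean-centre / unit-variance scaler showing the conventions."""

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        self.center_ = X.mean(axis=0)          # learned state ends in "_"
        self.scale_ = X.std(axis=0, ddof=1)
        return self                            # returning self allows chaining

    def transform(self, X):
        return (np.asarray(X, dtype=float) - self.center_) / self.scale_

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_scaled = ToyMCUVScaler().fit(X).transform(X)
print(X_scaled)
```

Because fit() returns the estimator itself, the fit-then-transform chain on the last line works the same way as the PCA(n_components=3).fit(X_scaled) call in the quick start.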

Documentation & learning resources

API reference & user guide: https://kgdunn.github.io/process-improve/
Applied DoE tutorial (8 modules): https://kgdunn.github.io/process-improve/applied_doe/index.html
Companion textbook: Process Improvement using Data
Hosted experiment-design tool: factori.al
Local docs build: cd docs && make html

Citing process-improve

If you use this package in academic work, please cite it:

@software{dunn_process_improve,
  author  = {Dunn, Kevin G.},
  title   = {{process-improve: Multivariate Analysis for Process Improvement}},
  year    = {2026},
  version = {v1.21.4},
  url     = {https://github.com/kgdunn/process-improve}
}

A CITATION.cff file is included, so GitHub renders a "Cite this repository" button in the sidebar.

Contributing

Bug reports, feature requests, and pull requests are welcome. See CONTRIBUTING.md for development setup, testing, and code style. Bugs and feature requests can be filed on the issue tracker.

License

MIT - see LICENSE for details.

Download files

Source distribution: process_improve-1.40.1.tar.gz (1.8 MB), uploaded Jun 11, 2026
Built distribution: process_improve-1.40.1-py3-none-any.whl (1.9 MB, Python 3), uploaded Jun 11, 2026

Both files were uploaded via twine/6.1.0 (CPython/3.13.12) using Trusted Publishing. The attestations name the publisher as publish.yml on kgdunn/process-improve, built from kgdunn/process-improve@a88bac292ca8979a991b3af68c11c6d1c46cb8bf (refs/heads/main) on a github-hosted runner, triggered by workflow_dispatch. Attestation values reflect the state when the release was signed and may no longer be current.

Hashes for process_improve-1.40.1.tar.gz
Algorithm	Hash digest
SHA256	`4deee27b580a1ec7c159487490b51ba433c9fbc4ac3e83220d519b9a893b5953`
MD5	`138bba7cc00cee912a5f9ff212c5c4df`
BLAKE2b-256	`6210dc2724d00fb01378ceeb7cfb405c8434535fbebc1edfaef69bcdc7eb9a1f`
Sigstore transparency entry: 1789704703 (integration time Jun 11, 2026)

Hashes for process_improve-1.40.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d4d00e55ce1cf03431d4da991e662471ffc27fd062ea60a4de1c549762cb8b7c`
MD5	`da48aa5c45f5bc3ff73fdff1b14bc415`
BLAKE2b-256	`8a2eaad0b2c84dfa05dda2a95537db6bd3265c65c6c866c4242c7c5fafd03a16`
Sigstore transparency entry: 1789704722 (integration time Jun 11, 2026)
