
Designed Experiments; Latent Variables (PCA, PLS, multivariate methods with missing data); Process Monitoring; Batch data analysis.


Process Improvement using Data

A Python package for multivariate data analysis, designed experiments, and process monitoring. Companion to the online textbook Process Improvement using Data.

Installation

pip install process-improve

Quick Start

PCA — Principal Component Analysis

import pandas as pd
from process_improve.multivariate.methods import PCA, MCUVScaler

# Load and scale your data
X = pd.read_csv("your_data.csv", index_col=0)
scaler = MCUVScaler().fit(X)
X_scaled = scaler.transform(X)

# Fit a PCA model
pca = PCA(n_components=3).fit(X_scaled)

# Inspect results
print(pca.scores_)  # Score matrix (N x A)
print(pca.loadings_)  # Loading matrix (K x A)
print(pca.r2_cumulative_)  # Cumulative R² per component

# Detect outliers
outliers = pca.detect_outliers(conf_level=0.95)

# Contribution analysis
contrib = pca.score_contributions(pca.scores_.iloc[0].values)

# Select number of components via cross-validation
result = PCA.select_n_components(X_scaled, max_components=10)
print(result.n_components)

# Built-in plots
pca.score_plot()
pca.spe_plot()
pca.t2_plot()
pca.loading_plot()
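
To make the quantities above concrete, here is a minimal NumPy sketch of what scaling and a PCA fit compute. It is not the package's implementation. It assumes MCUVScaler mean-centers each column and scales it to unit variance (as the name suggests), uses a sample standard deviation (ddof=1, an assumption), and takes the SVD route. The data and all variable names are made up for illustration.

```python
import numpy as np

# Hypothetical data: N = 20 observations, K = 4 variables
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
X[:, 1] += 2.0 * X[:, 0]  # make two columns correlated

# Mean-center and scale to unit variance (assumed behaviour of the scaler)
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# PCA with A components via the singular value decomposition
A = 2
U, s, Vt = np.linalg.svd(X_scaled, full_matrices=False)
loadings = Vt[:A].T           # K x A, orthonormal columns
scores = X_scaled @ loadings  # N x A

# Cumulative fraction of the total sum of squares explained
r2_cumulative = np.cumsum(s[:A] ** 2) / np.sum(s ** 2)

print(scores.shape, loadings.shape)
print(r2_cumulative)
```

The shapes match the comments in the Quick Start: scores are N x A and loadings are K x A.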

PLS — Projection to Latent Structures

from process_improve.multivariate.methods import PLS, MCUVScaler

# X (predictors) and Y (responses) are pandas DataFrames covering the same observations
# Scale X and Y separately
scaler_x = MCUVScaler().fit(X)
scaler_y = MCUVScaler().fit(Y)

# Fit a PLS model
pls = PLS(n_components=3).fit(scaler_x.transform(X), scaler_y.transform(Y))

# Inspect results
print(pls.scores_)  # X scores (N x A)
print(pls.beta_coefficients_)  # Regression coefficients (K x M)
print(pls.r2_cumulative_)  # Cumulative R² for Y

# Predict new observations (X_new is a DataFrame with the same columns as X)
result = pls.predict(scaler_x.transform(X_new))
print(result.y_hat)  # Predicted Y values
print(result.spe)  # SPE for new data
print(result.hotellings_t2)  # Hotelling's T² for new data

# Detect outliers and analyze contributions
outliers = pls.detect_outliers(conf_level=0.95)
contrib = pls.score_contributions(pls.scores_.iloc[0].values)
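
The SPE and Hotelling's T² diagnostics reported above can be sketched in a few lines of NumPy. This is a sketch under stated assumptions, not the package's code: it uses a PCA-style decomposition of X for simplicity (in PLS the same idea applies to the X-space scores), and it defines SPE as the sum of squared residuals per row, although some texts report its square root instead. The data and variable names are hypothetical.

```python
import numpy as np

# Hypothetical scaled data: N = 30 observations, K = 5 variables
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Latent variable model with A components (PCA via SVD here)
A = 2
_, _, Vt = np.linalg.svd(X_scaled, full_matrices=False)
P = Vt[:A].T      # loadings, K x A
T = X_scaled @ P  # scores, N x A

# Hotelling's T2: squared scores, each scaled by that component's variance,
# summed over the A components. It measures distance within the model plane.
score_var = T.var(axis=0, ddof=1)
hotellings_t2 = np.sum(T ** 2 / score_var, axis=1)

# SPE: squared distance from each observation to the model plane
E = X_scaled - T @ P.T
spe = np.sum(E ** 2, axis=1)

print(hotellings_t2[:3])
print(spe[:3])
```

A large T² flags an observation that is extreme within the model, while a large SPE flags one that the model does not describe well.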

Features

  • PCA with SVD, NIPALS, and missing data (TSR, trimmed score regression) algorithms
  • PLS regression with sklearn-compatible API
  • TPLS (Total PLS) for multi-block data
  • Missing data handling via TSR and NIPALS algorithms
  • Outlier detection combining Hotelling's T² and SPE with robust ESD test
  • Score contributions for variable-level diagnostics
  • Cross-validation for component selection (PRESS with Wold's criterion)
  • Interactive plots (Plotly) for scores, loadings, SPE, and T²
  • Designed experiments — full factorial, fractional factorial, response surface
  • Process monitoring — Shewhart, CUSUM, EWMA control charts
  • Batch data analysis — alignment, feature extraction, multivariate batch monitoring
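
As an illustration of the designed-experiments item in the list above, a two-level full factorial can be generated with the standard library alone. The helper `full_factorial` below is hypothetical and is not the package's API. It returns runs in coded units (-1 and +1) in standard order, where the first factor alternates fastest.

```python
from itertools import product


def full_factorial(names):
    """Two-level full factorial in coded units, in standard order.

    Hypothetical helper for illustration; not part of process-improve.
    """
    runs = []
    for levels in product((-1, +1), repeat=len(names)):
        # product() varies the last position fastest; reversing the tuple
        # makes the first factor alternate fastest (standard order).
        runs.append(dict(zip(names, reversed(levels))))
    return runs


design = full_factorial(["A", "B", "C"])
for run in design:
    print(run)
```

Three factors give 2³ = 8 runs, and each column is balanced between its low and high level.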

API Design

Both PCA and PLS follow sklearn conventions:

  • Fitted attributes end with _ (e.g., scores_, loadings_, spe_)
  • fit() returns self
  • predict() returns a Bunch object with named fields
  • score() is compatible with sklearn.model_selection.cross_val_score
  • Works with pandas.DataFrame inputs (preserves index and column names)
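
The conventions above can be illustrated with a toy estimator. `MeanCenterModel` is hypothetical and unrelated to the package's classes. `types.SimpleNamespace` stands in for sklearn's Bunch so that the sketch has no dependencies.

```python
from types import SimpleNamespace


class MeanCenterModel:
    """Hypothetical toy estimator that follows sklearn naming conventions."""

    def fit(self, X):
        n = len(X)
        # Fitted attributes end with an underscore
        self.mean_ = [sum(col) / n for col in zip(*X)]
        # fit() returns self, so calls can be chained
        return self

    def predict(self, X):
        residuals = [[x - m for x, m in zip(row, self.mean_)] for row in X]
        spe = [sum(r * r for r in row) for row in residuals]
        # Return an object with named fields (stand-in for a Bunch)
        return SimpleNamespace(residuals=residuals, spe=spe)


model = MeanCenterModel().fit([[1.0, 2.0], [3.0, 6.0]])
result = model.predict([[2.0, 4.0], [4.0, 4.0]])
print(model.mean_)
print(result.spe)
```

Because fit() returns the estimator itself, construction and fitting fit on one line, which is the pattern used in the Quick Start.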

Documentation

Build the documentation locally:

cd docs
make html

License

MIT License. See LICENSE for details.
