A Python package to facilitate building process data science solutions including process modeling, monitoring, fault diagnosis, etc.

PDStoolkit

Table of Contents

  1. Description
  2. Documentation and Tutorials
  3. Package Contents
  4. Installation
  5. Usage

Description

The PDStoolkit (Process Data Science Toolkit) package provides easy-to-use modules for quickly building data-based solutions for process systems, such as process monitoring, modeling, fault diagnosis, and system identification. Most current modules are wrappers around existing scikit-learn classes (PDS_CVA is written from scratch) and add several methods that facilitate a process data scientist's job. Details on these are provided in the Package Contents section. More modules relevant to process data science will be added over time.

Documentation and Tutorials

  • PDStoolkit_Manual.pdf (in the GitHub repository) gives a quick overview of the algorithms implemented in the package
  • Class documentation is provided in the 'docs' folder of the GitHub (Source Code) repository
  • Tutorials are provided in the 'tutorials' folder of the GitHub (Source Code) repository
  • The blog post (https://mlforpse.com/intro-to-pdstoolkit-python-package/) explains the motivation behind the development of the PDStoolkit package
  • Theoretical and conceptual details on specific algorithms can be found in our book (https://leanpub.com/machineLearningPSE)

Package Contents

The main modules in the package currently are:

  • PDS_PCA: Principal Component Analysis for Process Data Science

    • This class is a child of sklearn.decomposition.PCA class
    • The following additional methods are provided
      • computeMetrics: computes the monitoring indices (Q or SPE, T2) for the supplied data
      • computeThresholds: computes the thresholds / control limits for the monitoring indices from training data
      • draw_monitoring_charts: draws the monitoring charts for the training or test data
      • detect_abnormalities: detects if the observations are abnormal or normal samples
      • get_contributions: returns abnormality contributions for T2 and SPE for an observation sample
  • PDS_PLS: Partial Least Squares regression for Process Data Science

    • This class is a child of sklearn.cross_decomposition.PLSRegression class
    • The following additional methods are provided
      • computeMetrics: computes the monitoring indices (SPEx, SPEy, T2) for the supplied data
      • computeThresholds: computes the thresholds / control limits for the monitoring indices from training data
      • draw_monitoring_charts: draws the monitoring charts for the training or test data
      • detect_abnormalities: detects if the observations are abnormal or normal samples
  • PDS_DPCA: Dynamic Principal Component Analysis for Process Data Science

    • This class is a child of sklearn.decomposition.PCA class
    • The following additional methods are provided
      • computeMetrics: computes the monitoring indices (Q or SPE, T2) for the supplied data
      • computeThresholds: computes the thresholds / control limits for the monitoring indices from training data
      • draw_monitoring_charts: draws the monitoring charts for the training or test data
      • detect_abnormalities: detects if the observations are abnormal or normal samples
  • PDS_DPLS: Dynamic Partial Least Squares regression for Process Data Science

    • This class is a child of sklearn.cross_decomposition.PLSRegression class
    • The following additional methods are provided
      • computeMetrics: computes the monitoring indices (SPEx, SPEy, T2) for the supplied data
      • computeThresholds: computes the thresholds / control limits for the monitoring indices from training data
      • draw_monitoring_charts: draws the monitoring charts for the training or test data
      • detect_abnormalities: detects if the observations are abnormal or normal samples
  • PDS_CVA: Canonical Variate Analysis for Process Data Science

    • This class is written from scratch
    • The following monitoring methods are provided
      • computeMetrics: computes the monitoring indices (Ts2, Te2, Q) for the supplied data
      • computeThresholds: computes the thresholds / control limits for the monitoring indices from training data
      • draw_monitoring_charts: draws the monitoring charts for the training or test data
      • detect_abnormalities: detects if the observations are abnormal or normal samples
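
The dynamic variants (PDS_DPCA, PDS_DPLS) differ from their static counterparts in that the model is built on time-lagged data. The snippet below is a minimal NumPy sketch of the usual lag-augmentation step, in which each sample is stacked with its predecessors before PCA or PLS is applied. It is not taken from the package: `augment_with_lags` and `n_lags` are hypothetical names used only for illustration, and the way PDStoolkit handles lags internally may differ.

```python
import numpy as np

def augment_with_lags(X, n_lags):
    """Stack each sample with its n_lags predecessors.

    Row t of the result is [x(t), x(t-1), ..., x(t-n_lags)].
    Hypothetical helper for illustration; not part of PDStoolkit.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_vars = X.shape
    if not 0 <= n_lags < n_samples:
        raise ValueError("n_lags must satisfy 0 <= n_lags < number of samples")
    # Block k holds the data delayed by k steps; the first n_lags rows
    # are dropped because they have no complete history.
    blocks = [X[n_lags - k : n_samples - k] for k in range(n_lags + 1)]
    return np.hstack(blocks)

X = np.arange(10, dtype=float).reshape(5, 2)  # 5 samples, 2 variables
X_aug = augment_with_lags(X, n_lags=2)
print(X_aug.shape)  # (3, 6)
print(X_aug[0])     # [4. 5. 2. 3. 0. 1.]
```

With 2 lags, the 5 x 2 matrix becomes 3 x 6: two rows are lost to the missing history, and each remaining row carries the current sample followed by the two previous ones.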

Installation

Install from PyPI:

pip install PDStoolkit

Import modules

from PDStoolkit import PDS_PCA
from PDStoolkit import PDS_PLS

Usage

The following code builds a PCA-based process monitoring model using the PDS_PCA class and uses it for subsequent fault detection and fault diagnosis on test data. For details on the data and results, see the ProcessMonitoring_PCA notebook in the tutorials folder.

# imports
from PDStoolkit import PDS_PCA

# fit PDS_PCA model
pca = PDS_PCA()
pca.fit(data_train_normal, autoFindNLatents=True)

T2_train, SPE_train = pca.computeMetrics(data_train_normal, isTrainingData=True)
T2_CL, SPE_CL = pca.computeThresholds(method='statistical', alpha=0.01)
pca.draw_monitoring_charts(title='training data')

# fault detection and fault diagnosis on test data
pca.detect_abnormalities(data_test_normal, title='test data')
T2_contri, SPE_contri = pca.get_contributions(data_test_normal)
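
For readers who want to see what the monitoring indices in the example above represent, here is a minimal NumPy-only sketch of the conventional T2 and SPE (Q) definitions for a PCA model. It does not use PDStoolkit and is not the package's implementation: the synthetic data, the `monitoring_indices` helper, and the percentile-based control limits are all assumptions made for illustration, and the limits are not necessarily what `method='statistical'` computes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal operation" data: 6 correlated variables driven by 2 latent factors
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
X_train = latent @ mixing + 0.1 * rng.normal(size=(500, 6))

# Scale with training statistics, then obtain PCA loadings from the SVD
mean, std = X_train.mean(axis=0), X_train.std(axis=0, ddof=1)
Xs = (X_train - mean) / std
_, s, Vt = np.linalg.svd(Xs, full_matrices=False)
n_latents = 2
P = Vt[:n_latents].T                             # loadings, shape (6, 2)
score_var = s[:n_latents] ** 2 / (len(Xs) - 1)   # variance of each score

def monitoring_indices(X):
    """Return (T2, SPE) per row of X (hypothetical helper, not the package API)."""
    Z = (np.asarray(X, dtype=float) - mean) / std
    scores = Z @ P
    T2 = np.sum(scores ** 2 / score_var, axis=1)  # variation inside the model plane
    residual = Z - scores @ P.T
    SPE = np.sum(residual ** 2, axis=1)           # squared distance from the plane
    return T2, SPE

T2_train, SPE_train = monitoring_indices(X_train)

# Empirical control limits: the (1 - alpha) percentile of the training indices
alpha = 0.01
T2_limit = np.percentile(T2_train, 100 * (1 - alpha))
SPE_limit = np.percentile(SPE_train, 100 * (1 - alpha))

# A sample displaced along a direction the model does not capture:
# it breaks the correlation structure, so SPE reacts while T2 stays small.
z_fault = 3 * np.sqrt(SPE_limit) * Vt[-1]
x_fault = (mean + std * z_fault).reshape(1, -1)
T2_fault, SPE_fault = monitoring_indices(x_fault)
print(SPE_fault[0] > SPE_limit, T2_fault[0] > T2_limit)
```

T2 measures how far a sample lies from the center within the retained principal components, while SPE measures how far it lies from the subspace they span. This is why the two indices are tracked together: a fault can violate one limit without affecting the other.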

License

All code is provided under a BSD 3-clause license. See LICENSE file for more information.
