
pwml

pwml stands for Python Wrappers for Machine Learning



Requirements

  • Python >= 3.13
  • See pyproject.toml for the full dependency list

Installation

pip install pwml

Modules

classifiers - Hierarchical Classification

HierarchicalClassifierModel trains a tree of sklearn pipelines, one per node in the label hierarchy. Each node's classifier is selected and tuned independently via GridSearchCV. Inference cascades top-down through the tree.

Features:

  • Configurable text embedding via sentence-transformers (default: all-MiniLM-L6-v2 384-dim)
  • One-hot encoding for categorical features
  • Numeric normalization to [0, 1] with a configurable OOD policy (out_of_range='clip'/'warn'/'raise')
  • Platt calibration with per-class threshold optimization
  • Soft routing: descent stops when prediction confidence falls below a configurable threshold
  • Batch inference via predict_dataframe with pre-computed embeddings
  • Per-node inference latency profiling (profile=True)
  • Model versioning metadata embedded in saved artefacts
  • Evaluation and stratified cross-validation
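
Soft routing can be pictured with a toy cascade: at each node the most confident class is taken, and descent stops as soon as that confidence drops below the threshold. The sketch below is illustrative only; the node keys, class names, and probabilities are hypothetical and this is not the library's internal API.

```python
# Toy illustration of soft routing: walk down the hierarchy, stopping
# as soon as the winning class's confidence drops below the threshold.
def route(tree, node, min_confidence, path=None):
    path = path or []
    label, confidence = max(tree[node].items(), key=lambda kv: kv[1])
    path.append((label, confidence))
    if confidence < min_confidence:
        return path  # too uncertain: stop descending, keep the partial path
    child = f"{node}/{label}"
    if child not in tree:
        return path  # leaf level reached
    return route(tree, child, min_confidence, path)

# Hypothetical per-node class probabilities
tree = {
    "Division": {"Apparel": 0.91, "Footwear": 0.09},
    "Division/Apparel": {"Denim": 0.78, "Knitwear": 0.22},
}

print(route(tree, "Division", min_confidence=0.6))
# [('Apparel', 0.91), ('Denim', 0.78)]
print(route(tree, "Division", min_confidence=0.95))
# [('Apparel', 0.91)]
```

With a high threshold the second call returns only the Division-level prediction, mirroring how min_routing_confidence truncates the predicted path.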

Training

from pwml.classifiers import hierarchical as hc
from pwml.classifiers import features as fe

model = hc.HierarchicalClassifierModel(
    model_name='my_model',
    experiment_name='experiment_1',
    input_features=[
        fe.InputFeature(feature_name='Style',    feature_type='text'),
        fe.InputFeature(feature_name='Gender',   feature_type='text'),
        fe.InputFeature(feature_name='Brand',    feature_type='text'),
        fe.InputFeature(feature_name='Price',    feature_type='numeric'),
        fe.InputFeature(feature_name='Category', feature_type='category'),
    ],
    output_feature_hierarchy=fe.OutputFeature(
        feature_name='Division',
        child_feature=fe.OutputFeature(feature_name='Class')),
    text_model_name='all-MiniLM-L6-v2')  # or 'all-mpnet-base-v2' for higher quality

model.load_from_dataframe(data=df)
model.save_model(filepath='my_model.pwml')

The model trains n+1 classifiers where n is the number of distinct Division values: one classifier for the top-level Division prediction, and one per Division value for the Class prediction within that division.
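
For example, with three distinct Division values the model fits four classifiers. A minimal sketch of the count (the rows here are made up for illustration):

```python
# Count the classifiers trained for a two-level hierarchy:
# 1 for the top level + 1 per distinct top-level value.
rows = [
    {"Division": "Apparel", "Class": "Denim"},
    {"Division": "Apparel", "Class": "Knitwear"},
    {"Division": "Footwear", "Class": "Sneakers"},
    {"Division": "Accessories", "Class": "Belts"},
]
divisions = {row["Division"] for row in rows}
n_classifiers = 1 + len(divisions)
print(n_classifiers)  # 4: one Division model + one Class model per division
```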

Single-sample inference

model = hc.HierarchicalClassifierModel.load_model(filepath='my_model.pwml')

result = model.predict(
    input={'Style': 'slim fit jeans', 'Gender': 'men', 'Brand': 'Acme', 'Price': 49.99, 'Category': 'Bottoms'},
    min_routing_confidence=0.6)

# result is a list of dicts, one per hierarchy level:
# [{'feature_name': 'Division', 'value': 'Apparel', 'confidence': 0.91},
#  {'feature_name': 'Class',    'value': 'Denim',   'confidence': 0.78}]

Batch inference

predictions_df = model.predict_dataframe(data=df)
# Returns df with extra columns: Division_predicted, Division_confidence, Class_predicted, Class_confidence

# With per-node latency profiling
predictions_df, latency = model.predict_dataframe(data=df, profile=True)
# latency: {'Division': 0.0012, 'Division/Apparel': 0.0009, ...}
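
Since the latency dict's keys encode each node's path in the hierarchy, per-level totals can be derived by grouping on path depth. A small sketch with illustrative numbers:

```python
from collections import defaultdict

# Hypothetical per-node latencies keyed by hierarchy path (seconds)
latency = {
    "Division": 0.0012,
    "Division/Apparel": 0.0009,
    "Division/Footwear": 0.0011,
}

# Sum latency per hierarchy depth: depth 1 = top level, depth 2 = children
per_level = defaultdict(float)
for path, seconds in latency.items():
    per_level[path.count("/") + 1] += seconds

print(dict(per_level))  # total seconds spent at each level of the tree
```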

Evaluation and cross-validation

metrics, predictions_df = model.evaluate(data=df)

summary, per_fold = model.cross_validate(data=df, n_splits=5, search_n_jobs=4)
print(summary)  # {'Division': {'mean': 0.87, 'std': 0.02}, 'Class': {'mean': 0.74, 'std': 0.04}}

timeseries - Time Series Utilities

Data augmentation

from pwml.timeseries import dataaugmentationhelpers as dah

# Split data before calling prepare_data to avoid scaler leakage
train_df = df.iloc[:split]
test_df  = df.iloc[split:]

X_train, y_train, index, scaler_in, scaler_out, n_samples = dah.prepare_data(
    data=train_df,
    lags_in=[1, 7],
    cols_in=['feature_a', 'feature_b'],
    steps_in=14,
    cols_out=['target'],
    steps_out=7,
    augmentation_factor=3,
    noise_std=0.05)

# Pass pre-fit scalers for the test set to prevent leakage
X_test, y_test, _, _, _, _ = dah.prepare_data(
    data=test_df,
    lags_in=[1, 7],
    cols_in=['feature_a', 'feature_b'],
    steps_in=14,
    cols_out=['target'],
    steps_out=7,
    scaler_in=scaler_in,
    scaler_out=scaler_out)
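
The augmentation step can be pictured as jittering each training window with Gaussian noise. The numpy sketch below is a conceptual stand-in, assuming augmentation_factor multiplies the sample count and noise_std is the standard deviation of the added noise; it is not the prepare_data implementation.

```python
import numpy as np

def augment(X, y, factor=3, noise_std=0.05, seed=0):
    """Replicate each (window, target) pair `factor` times,
    adding Gaussian noise to the input copies only."""
    rng = np.random.default_rng(seed)
    X_copies = [X] + [
        X + rng.normal(0.0, noise_std, size=X.shape) for _ in range(factor - 1)
    ]
    return np.concatenate(X_copies), np.tile(y, (factor, 1))

X = np.random.rand(10, 14, 2)   # 10 windows, 14 timesteps, 2 features
y = np.random.rand(10, 7)       # 7-step-ahead targets
X_aug, y_aug = augment(X, y)
print(X_aug.shape, y_aug.shape)  # (30, 14, 2) (30, 7)
```

Note the targets are tiled unchanged: only the inputs are perturbed, so the model sees noisy variants of each window mapped to the same ground truth.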

Prophet helpers

from pwml.timeseries import prophethelpers as ph

# Summarise regressor coefficients for a fitted Prophet model
coefs_df = ph.regressor_coefficients(m)

# Plot regressor importance (beta coefficients)
ph.plot_regressors_importance(m, title='Regressor importance')

Visualization

from pwml.timeseries import visualizationhelpers as vh

vh.plot_time_series(
    title='Forecast',
    training=train_df,
    testing=test_df,
    prediction=forecast_df,
    confidence=forecast_df)

vh.plot_time_series_dist(data=residuals, title='Residual distribution')

vh.plot_seasonal_decomposition(data=series, period=52)

vh.plot_autocorrelation(data=series, lags=50)

utilities

  • graphichelpers - GraphicsStatics: matplotlib/seaborn style initialization, color/linestyle palette, style_plot
  • mssqlhelpers - execute(proc_name, conn_params, proc_params, commit=True): calls a stored procedure and returns a DataFrame
  • neptunehelpers - ExperimentTracker protocol and NeptuneExperimentManager for vendor-neutral experiment tracking (the Neptune adapter requires pip install pwml[neptune])
  • driftmonitor - DriftMonitor: computes PSI and Jensen-Shannon divergence between reference and live distributions; integrates with any ExperimentTracker
  • httphelpers - image download utilities
  • imagehelpers - PIL image helpers (resize, crop, batch conversion)
  • filehelpers - pickle serialization helpers
  • classificationhelpers - MulticlassClassifierOptimizer: Platt calibration with per-class threshold tuning
  • commonhelpers - miscellaneous utilities
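
The PSI metric itself is simple to state: bin both samples on the same edges, then sum (p - q) * ln(p / q) over the bins. Below is a minimal numpy sketch of the metric for intuition, not the DriftMonitor API:

```python
import numpy as np

def psi(reference, live, bins=10, eps=1e-6):
    """Population Stability Index between two samples.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 major shift."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    p, _ = np.histogram(reference, bins=edges)
    q, _ = np.histogram(live, bins=edges)
    p = p / p.sum() + eps  # smooth to avoid log(0) and division by zero
    q = q / q.sum() + eps
    return float(np.sum((p - q) * np.log(p / q)))

rng = np.random.default_rng(0)
ref = rng.normal(0, 1, 5000)
print(psi(ref, rng.normal(0, 1, 5000)))    # near 0: same distribution
print(psi(ref, rng.normal(0.5, 1, 5000)))  # clearly larger: mean has drifted
```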

examples - Runnable Examples

Model Hosting (examples/modelhosting.py)

A Flask REST API that serves one or more pre-trained HierarchicalClassifierModel instances.

python examples/modelhosting.py \
    --host 0.0.0.0 \
    --port 5000 \
    --models "v1/division|/path/to/model.pwml"

Each loaded model is exposed at /api/<model-id> (POST). For production, use a WSGI server such as gunicorn:

gunicorn -w 4 -b 0.0.0.0:5000 "modelhosting:Statics.g_app"

Streamlit Web App (examples/webapp/app.py)

An interactive demo app covering data exploration, batch predictions with confidence heatmaps, per-level accuracy charts, per-node latency profiling, and concept drift monitoring with PSI gauges.

pip install streamlit
streamlit run examples/webapp/app.py

Experiment tracking

from pwml.utilities import neptunehelpers as nh

with nh.NeptuneExperimentManager(
        log=True,
        project_name='workspace/project',
        experiment_name='run_001',
        experiment_params={'lr': 0.01, 'epochs': 100},
        experiment_tags=['baseline']) as em:

    em.set_experiment_property('dataset_version', 'v3')
    em.log_data(data=results_df, name='results')
    em.log_chart(figure=fig, name='loss_curve')

Requires neptune >= 1.0 (pip install pwml[neptune]). Set the NEPTUNE_API_TOKEN environment variable before running.


VS Code Tasks

The project includes pre-configured VS Code tasks (.vscode/tasks.json) for common development workflows. Run them via Terminal > Run Task.

  • Jupyter: Start Lab Server - starts a token-free JupyterLab server (port 8888)
  • Jupyter: Start Notebook Server - starts a classic Jupyter Notebook server via nbclassic (port 8889)
  • Streamlit: Start Demo App - launches the interactive pwml demo web application (port 8501)
  • Test: Run All Notebooks - runs all example notebooks as tests via nbmake
  • Test: Run Notebook (prompt) - runs a single notebook by name

When using the devcontainer, ports 8501, 8888, and 8889 are automatically forwarded to the host.
