
skrub with a Rust backend


Stratum

Stratum is an experimental fork of skrub with a Rust backend for compute-heavy operations, while keeping the high-level Python API intact.


Goals

  • Provide an opt-in Rust backend for performance-critical parts of skrub.
  • Preserve skrub’s Python API; developers can flip a flag to enable Rust.
  • Build cross-platform wheels (Windows / Linux / macOS) so users can install without a Rust toolchain.

Installation

For now, you need to build from source.

Requirements:


Usage

To enable the Rust backend and its related features, import stratum under the name skrub and turn on the rust_backend option:

Replace

import skrub
from skrub import ...

with

import stratum as skrub
from stratum import ...
skrub.set_config(rust_backend=True)
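
Because Stratum keeps skrub's Python API, a script can be written to fall back to plain skrub on machines where Stratum is not installed. The helper below is a minimal sketch of that idea; import_first_available is a hypothetical name and is not part of Stratum or skrub.

```python
import importlib


def import_first_available(*module_names):
    """Return the first module that imports successfully, trying names in order."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of these modules could be imported: {module_names}")


# Intended use (requires at least one of the two packages to be installed):
# skrub = import_first_available("stratum", "skrub")
```

When the fallback resolves to plain skrub, the rust_backend option is presumably not recognised, so a script using this pattern would want to call set_config(rust_backend=True) only when the imported module is stratum.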

Test Code

import pandas as pd
import stratum as skrub
from stratum import StringEncoder

# Route supported operations through the Rust backend.
skrub.set_config(rust_backend=True)
# Other Rust-related flags:
# skrub.set_config(debug_timing=True, num_threads=0)

# Encode a small string column, including a missing value.
s = pd.Series(["foo", "bar", None, "lorem ipsum dolor"])
enc = StringEncoder(vectorizer="hashing", analyzer="char", ngram_range=(3, 5), n_components=2)
Z = enc.fit_transform(s)
print(type(Z), Z.shape)
# Expect one output row per input value.
assert Z.shape[0] == len(s)

Repository Layout

stratum/
├─ pyproject.toml             # Python + Rust build config (maturin)
├─ stratum/
│  ├─ __init__.py             # Façade over skrub
│  ├─ config.py               # set_config/get_config + env sync
│  ├─ _rust_backend.py        # Python <-> Rust shim (re-exports native fns)
│  ├─ adapters/               # Public API (dispatch to Rust or fallback to skrub)
│  │  └─ string_encoder.py    # RustyStringEncoder (subclass)
│  └─ _rust_backend_native.*  # Compiled PyO3 extension (built)
└─ _rust/                     # Rust crate (PyO3 extension)
   ├─ Cargo.toml
   └─ src/lib.rs              # Defines #[pymodule] fn _rust_backend_native(...)
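
The layout describes two cooperating pieces: a config module that keeps settings in sync with the environment, and adapters that dispatch to Rust or fall back to skrub. The sketch below illustrates that pattern in pure Python. It is not Stratum's actual code: the STRATUM_ environment prefix, the char_ngrams function, and both stand-in implementations are assumptions made for illustration.

```python
import os

# Hypothetical defaults and env-var prefix; the real names in config.py may differ.
_DEFAULTS = {"rust_backend": False}
_ENV_PREFIX = "STRATUM_"  # assumed prefix, not taken from the project

_config = dict(_DEFAULTS)


def set_config(**options):
    """Update known options and mirror each one into the process environment."""
    for key, value in options.items():
        if key not in _DEFAULTS:
            raise KeyError(f"unknown option: {key}")
        _config[key] = value
        os.environ[_ENV_PREFIX + key.upper()] = str(int(value))


def get_config():
    """Return a copy so callers cannot mutate the live configuration."""
    return dict(_config)


def _python_char_ngrams(text, n):
    """Stand-in for the pure-Python (skrub) code path."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def _native_char_ngrams(text, n):
    """Stand-in for a compiled function; here it simply reuses the Python logic."""
    return _python_char_ngrams(text, n)


def char_ngrams(text, n, native_available=True):
    """Use the 'native' path only when it is both enabled and available."""
    if get_config()["rust_backend"] and native_available:
        return "rust", _native_char_ngrams(text, n)
    return "python", _python_char_ngrams(text, n)
```

Keeping the decision inside the adapter means calling code never branches on the backend itself; it only flips the flag through set_config.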

Developer Instructions

Local Dev Install (Editable)

maturin develop				# Debug mode
maturin develop --release	# Optimized dev build

Building Wheels

The commands below produce redistributable .whl files under dist/.

maturin build --release -o dist --interpreter python3.10 --compatibility linux		# Linux/macOS
maturin build --release -o dist		# Windows

Then install with:

pip install ./dist/stratum_ai-*.whl
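
After installing a wheel, it can be useful to confirm that the compiled extension was actually shipped with the package. The check below is a sketch using only the standard library; module_is_installed is a hypothetical helper, and the dotted name stratum._rust_backend_native is inferred from the repository layout rather than confirmed.

```python
import importlib.util


def module_is_installed(dotted_name):
    """Return True if the module can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # The parent package is missing, or the module is in a broken state.
        return False


# Intended use after installation (module path is an assumption):
# print(module_is_installed("stratum._rust_backend_native"))
```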

License

BSD-3-Clause (inherited from skrub).
