skrub with a Rust backend
Stratum
Stratum is an experimental fork of skrub with a Rust backend for compute-heavy operations, while keeping the high-level Python API intact.
Goals
- Provide an opt-in Rust backend for performance-critical parts of skrub.
- Preserve skrub’s Python API; developers can flip a flag to enable Rust.
- Build cross-platform wheels (Windows / Linux / macOS) so users can install without a Rust toolchain.
Installation
For now, you need to build from source.
Requirements:
- Python 3.10+
- Rust toolchain (nightly not required; stable is fine)
- maturin (pip install maturin)
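Before building from source, it can help to confirm that the requirements above are met. The following is a minimal sketch, not part of Stratum: the helper name check_build_prerequisites is hypothetical, and it assumes the Rust toolchain and maturin expose cargo and maturin executables on PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version taken from the requirements list


def check_build_prerequisites(min_python=MIN_PYTHON, tools=("cargo", "maturin")):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper for illustration; it only inspects the local
    interpreter version and looks up executables on PATH.
    """
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ is required")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


for problem in check_build_prerequisites():
    print("missing prerequisite:", problem)
```

An empty result does not guarantee a successful build; it only rules out the most common setup problems.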
Usage
To use the Rust backend and its related features, import stratum under the alias skrub, then enable the backend with set_config:
Replace
import skrub
from skrub import ...
with
import stratum as skrub
from stratum import ...
skrub.set_config(rust_backend=True)
Test Code
import pandas as pd
import stratum as skrub
from stratum import StringEncoder
# Opt in to the Rust backend.
skrub.set_config(rust_backend=True)
# skrub.set_config(debug_timing=True, num_threads=0)  # other Rust flags
# A small series that includes a missing value.
s = pd.Series(["foo", "bar", None, "lorem ipsum dolor"])
enc = StringEncoder(vectorizer="hashing", analyzer="char", ngram_range=(3, 5), n_components=2)
Z = enc.fit_transform(s)
# The encoder should return one output row per input row.
print(type(Z), Z.shape)
assert Z.shape[0] == len(s)
Repository Layout
stratum/
├─ pyproject.toml # Python + Rust build config (maturin)
├─ stratum/
│ ├─ __init__.py # Façade over skrub
│ ├─ config.py # set_config/get_config + env sync
│ ├─ _rust_backend.py # Python <-> Rust shim (re-exports native fns)
│ ├─ adapters/ # Public API (dispatch to Rust or fallback to skrub)
│ │ └─ string_encoder.py # RustyStringEncoder (subclass)
│ └─ _rust_backend_native.* # Compiled PyO3 extension (built)
└─ _rust/ # Rust crate (PyO3 extension)
   ├─ Cargo.toml
   └─ src/lib.rs # Defines #[pymodule] fn _rust_backend_native(...)
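The layout above describes a config module with environment sync and adapters that dispatch to Rust or fall back to skrub. The sketch below illustrates that general pattern in pure Python. It is not Stratum's actual code: the environment variable name STRATUM_RUST_BACKEND, the module name hypothetical_native_module, and the char_ngrams functions are all assumptions made for illustration.

```python
import os

# Everything below is hypothetical and only illustrates the pattern.
_ENV_VAR = "STRATUM_RUST_BACKEND"  # assumed environment variable name
_config = {"rust_backend": False}


def set_config(**options):
    """Update known options and mirror the backend flag into the environment."""
    unknown = set(options) - set(_config)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    _config.update(options)
    os.environ[_ENV_VAR] = "1" if _config["rust_backend"] else "0"


def get_config():
    """Return a copy so callers cannot mutate the global state."""
    return dict(_config)


def _load_native():
    """Return the compiled extension, or None when it cannot be imported."""
    try:
        import hypothetical_native_module as native  # placeholder name
    except ImportError:
        return None
    return native


def char_ngrams_python(text, n):
    """Pure-Python fallback: all contiguous character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def char_ngrams(text, n=3):
    """Use the native function when enabled and importable, else fall back."""
    if get_config()["rust_backend"]:
        native = _load_native()
        if native is not None:
            return native.char_ngrams(text, n)
    return char_ngrams_python(text, n)


set_config(rust_backend=True)
print(char_ngrams("lorem", 3))  # falls back here, since the placeholder module is absent
```

Keeping the fallback inside the adapter means callers get the same result whether or not the compiled extension is present, which matches the opt-in goal stated earlier.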
Developer Instructions
Local Dev Install (Editable)
maturin develop # Debug mode
maturin develop --release # Optimized dev build
Building Wheels
The commands below produce redistributable .whl files under dist/.
maturin build --release -o dist --interpreter python3.10 --compatibility linux # Linux/macOS
maturin build --release -o dist # Windows
Then install with:
pip install ./dist/stratum-*.whl
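After installing a wheel, one way to confirm that the compiled extension shipped with it is to look the module up without loading it. This is a minimal sketch: the helper extension_available is hypothetical, and the default names stratum and _rust_backend_native are taken from the repository layout and may differ in a given build.

```python
import importlib.util


def extension_available(package="stratum", extension="_rust_backend_native"):
    """Return True if package.extension can be located.

    find_spec imports the parent package but does not load the
    extension module itself.
    """
    try:
        return importlib.util.find_spec(f"{package}.{extension}") is not None
    except ModuleNotFoundError:
        # Raised when the parent package is not installed at all
        return False


print("native extension found:", extension_available())
```

A False result suggests either that the package is not installed in the active environment or that the wheel was built without the native module.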
License
BSD-3-Clause (inherited from skrub).
File details
Details for the file stratum_ai-0.0.0.dev0.tar.gz.
File metadata
- Download URL: stratum_ai-0.0.0.dev0.tar.gz
- Upload date:
- Size: 30.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.9.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ffc9fad7f543b7f6eac5a0c3daa225013767f4d51db2b8f618261cf75811839c |
| MD5 | 97f0acbc393ba1c552b14efa67fbb1a6 |
| BLAKE2b-256 | 1eb97331a43f82f43c8d53c1e0af11be6b292192dcf7c7ade3528bc10cee0dc7 |
File details
Details for the file stratum_ai-0.0.0.dev0-cp310-abi3-win_amd64.whl.
File metadata
- Download URL: stratum_ai-0.0.0.dev0-cp310-abi3-win_amd64.whl
- Upload date:
- Size: 16.3 MB
- Tags: CPython 3.10+, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.9.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 66c4509ed61e501ef375b7d4b2fdae4cb2bc5417932ad9b14f10bf93eb687c29 |
| MD5 | fe478d043ccd499dbc7376ff3151376f |
| BLAKE2b-256 | 251e32f104789da92bb5635f4a8361f3233ee2a04efad1c6f10216e4b3d5cd94 |