Data augmentation library with Rust-accelerated operations

These details have not been verified by PyPI

Project description

Additory Rust Python Bindings

High-performance Rust implementations of Additory's data augmentation functions with Python bindings.

Performance: 2-5x faster than pure Python
Compatibility: 100% API compatible with Additory v0.1.1
Python Support: Python 3.8+ (abi3)

Features

✅ Zero-copy DataFrame transfer via Apache Arrow IPC
✅ Automatic backend detection (pandas/polars)
✅ Graceful fallback to pure Python if Rust unavailable
✅ Memory efficient with minimal overhead
✅ Type safe with Rust's ownership system
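
Automatic backend detection could work along the following lines. This is a minimal sketch, not Additory's actual code: `detect_backend` is a hypothetical helper that looks at the module in which the object's class is defined, so neither pandas nor polars has to be imported just to run the check.

```python
def detect_backend(df):
    """Return "pandas" or "polars" based on where the object's class lives.

    Hypothetical helper for illustration; not part of the Additory API.
    """
    # e.g. "pandas.core.frame" -> "pandas", "polars.dataframe.frame" -> "polars"
    root_module = type(df).__module__.split(".")[0]
    if root_module in ("pandas", "polars"):
        return root_module
    raise TypeError(f"Unsupported DataFrame type: {type(df).__name__}")
```

Checking the module name rather than calling `isinstance` keeps both libraries optional: a user with only polars installed never triggers a pandas import.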

Installation

From PyPI (when published)

pip install additory-rust

From Source

# Install build dependencies
pip install maturin

# Navigate to bindings directory
cd rust-core/additory-py

# Build and install in development mode
maturin develop --release

# Or build a wheel
maturin build --release

Quick Start

The Additory Python wrapper uses the Rust bindings automatically whenever they are available:

import polars as pl
from additory.functions.to import to

# Create sample data
orders = pl.DataFrame({
    "order_id": [1, 2, 3, 4],
    "product_id": [101, 102, 101, 103]
})

products = pl.DataFrame({
    "product_id": [101, 102, 103],
    "price": [10.0, 20.0, 15.0]
})

# Lookup operation (automatically uses Rust if available)
result = to(orders, from_df=products, bring="price", against="product_id")
print(result.df)

Output:

┌──────────┬────────────┬───────┐
│ order_id │ product_id │ price │
├──────────┼────────────┼───────┤
│ 1        │ 101        │ 10.0  │
│ 2        │ 102        │ 20.0  │
│ 3        │ 101        │ 10.0  │
│ 4        │ 103        │ 15.0  │
└──────────┴────────────┴───────┘

Supported Operations

Lookup

Join DataFrames and bring columns from reference data.

result = to(df, from_df=ref, bring=["col1", "col2"], against="id")
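
To make the lookup semantics concrete, here is a pure-Python sketch over lists of dicts. It is an illustration only, not the library's implementation. The `lookup` function is hypothetical, and two behaviors are assumptions: keys without a match yield `None`, and the first matching reference row wins.

```python
def lookup(rows, ref_rows, bring, against):
    """Left-join style lookup on lists of dicts (illustrative only).

    Assumptions: unmatched keys yield None for the brought columns,
    and the first matching reference row wins.
    """
    bring = [bring] if isinstance(bring, str) else list(bring)

    # Index the reference rows by key, keeping the first row per key
    index = {}
    for ref in ref_rows:
        index.setdefault(ref[against], ref)

    result = []
    for row in rows:
        match = index.get(row[against])
        extra = {col: (match[col] if match else None) for col in bring}
        result.append({**row, **extra})
    return result


orders = [
    {"order_id": 1, "product_id": 101},
    {"order_id": 2, "product_id": 102},
    {"order_id": 3, "product_id": 101},
    {"order_id": 4, "product_id": 103},
]
products = [
    {"product_id": 101, "price": 10.0},
    {"product_id": 102, "price": 20.0},
    {"product_id": 103, "price": 15.0},
]

enriched = lookup(orders, products, bring="price", against="product_id")
print([row["price"] for row in enriched])  # [10.0, 20.0, 10.0, 15.0]
```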

Merge

Combine DataFrames vertically or horizontally.

result = to(df1, from_df=df2, to="@merge", how="vertical")
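
The difference between the two directions can be sketched on column-oriented tables (dicts mapping a column name to a list of values). The `merge` function below is hypothetical, and its validation rules are assumptions, not documented Additory behavior. Only "vertical" appears in the call above; the name "horizontal" is assumed from the description.

```python
def merge(left, right, how="vertical"):
    """Combine two column-oriented tables (illustrative only).

    Assumptions: "vertical" appends rows and needs the same columns in
    the same order; "horizontal" places columns side by side and needs
    equal row counts and no shared column names.
    """
    if how == "vertical":
        if list(left) != list(right):
            raise ValueError("vertical merge needs identical columns")
        return {col: left[col] + right[col] for col in left}

    if how == "horizontal":
        left_rows = len(next(iter(left.values()), []))
        right_rows = len(next(iter(right.values()), []))
        if left_rows != right_rows:
            raise ValueError("horizontal merge needs equal row counts")
        overlap = set(left) & set(right)
        if overlap:
            raise ValueError(f"duplicate columns: {sorted(overlap)}")
        return {**left, **right}

    raise ValueError(f"unknown how: {how!r}")
```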

Sort

Sort DataFrame by specified columns.

result = to(df, to="@sort", by="column", descending=False)
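
As a reference point for what the call is expected to return, the same ordering can be written in pure Python over a list of row dicts. The `sort_rows` function is a hypothetical stand-in, and accepting a list of column names for `by` is an assumption.

```python
def sort_rows(rows, by, descending=False):
    """Return a new list of row dicts ordered by one or more columns."""
    columns = [by] if isinstance(by, str) else list(by)
    return sorted(
        rows,
        key=lambda row: tuple(row[col] for col in columns),
        reverse=descending,
    )
```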

Summarize

Group and aggregate data.

result = to(df, to="@summarize", against="category", 
            aggregations={"sales": "sum", "quantity": "mean"})
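
The group-and-aggregate step can be sketched in pure Python as follows. This is an illustration, not the library's code: `summarize` and the `AGGREGATORS` table are hypothetical, and the set of supported aggregation names beyond "sum" and "mean" is an assumption.

```python
from collections import defaultdict
from statistics import mean

# Assumed aggregation names; only "sum" and "mean" appear in the example above
AGGREGATORS = {"sum": sum, "mean": mean, "min": min, "max": max, "count": len}


def summarize(rows, against, aggregations):
    """Group row dicts by one column and aggregate the others."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[against]].append(row)

    result = []
    for key, members in groups.items():
        summary = {against: key}
        for column, how in aggregations.items():
            values = [member[column] for member in members]
            summary[column] = AGGREGATORS[how](values)
        result.append(summary)
    return result
```

Groups come back in first-seen order here; whether the Rust implementation preserves that order is not stated in this README.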

Performance Benchmarks

Operation	Rows	Rust Time	Python Time	Speedup
Lookup	1k	0.020s	0.045s	2.3x
Lookup	10k	0.003s	0.012s	4.0x
Lookup	100k	0.015s	0.055s	3.7x
Sort	10k	0.002s	0.008s	4.0x
Sort	100k	0.020s	0.080s	4.0x
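
Numbers like these can be reproduced with a small timing harness. The sketch below is not the project's `benchmark_rust_performance.py`; `best_time` and `speedup` are hypothetical helpers built on the standard library.

```python
import time


def best_time(func, *args, repeats=5, **kwargs):
    """Return the fastest wall-clock time in seconds over several runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(baseline_seconds, accelerated_seconds):
    """How many times faster the accelerated run is than the baseline."""
    return baseline_seconds / accelerated_seconds


# Speedup for the 10k-row Lookup entry in the table above
print(f"{speedup(0.012, 0.003):.1f}x")  # 4.0x
```

Taking the minimum over several runs reduces noise from other processes; timing the pure Python and Rust paths with the same `best_time` call keeps the comparison fair.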

Pandas Compatibility

Works seamlessly with pandas DataFrames:

import pandas as pd
from additory.functions.to import to

orders_pd = pd.DataFrame({
    "order_id": [1, 2, 3],
    "product_id": [101, 102, 101]
})

products_pd = pd.DataFrame({
    "product_id": [101, 102],
    "price": [10.0, 20.0]
})

# Automatic conversion and Rust acceleration
result = to(orders_pd, from_df=products_pd, bring="price", against="product_id")
# The result is also a pandas DataFrame

Checking Rust Availability

from additory.functions.to import RUST_AVAILABLE

if RUST_AVAILABLE:
    print("🦀 Rust acceleration enabled!")
    import additory_rust
    print(f"Version: {additory_rust.__version__}")
else:
    print("🐍 Using pure Python implementation")
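
A flag like `RUST_AVAILABLE` is commonly set by attempting the import once and remembering the outcome. The sketch below shows that pattern under stated assumptions: `load_optional` and `run` are hypothetical helpers, and the wrapper's real dispatch logic may differ.

```python
import importlib


def load_optional(module_name):
    """Return (module, True) if the import succeeds, else (None, False)."""
    try:
        return importlib.import_module(module_name), True
    except ImportError:
        return None, False


# Try the compiled extension once at import time
_rust, RUST_AVAILABLE = load_optional("additory_rust")


def run(rust_func_name, python_impl, *args):
    """Call the Rust function when present, otherwise the Python one."""
    if RUST_AVAILABLE and hasattr(_rust, rust_func_name):
        return getattr(_rust, rust_func_name)(*args)
    return python_impl(*args)
```

Because the fallback is chosen per call, a missing or partial extension degrades to the pure Python path instead of raising at import time.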

Direct Rust API (Advanced)

Advanced users can bypass the Python wrapper and call the Rust functions directly, passing DataFrames as Arrow IPC bytes:

import additory_rust
import polars as pl
import io

# Serialize a polars DataFrame to Arrow IPC bytes
def df_to_bytes(df):
    buffer = io.BytesIO()
    df.write_ipc(buffer)
    return buffer.getvalue()

# Deserialize Arrow IPC bytes back into a polars DataFrame
def bytes_to_df(data):
    buffer = io.BytesIO(data)
    return pl.read_ipc(buffer)

# Direct Rust call (reuses orders and products from Quick Start)
df_bytes = df_to_bytes(orders)
from_df_bytes = df_to_bytes(products)

result_bytes = additory_rust.to_lookup(
    df_bytes, from_df_bytes, ["price"], ["product_id"]
)

result_df = bytes_to_df(result_bytes)

Architecture

Python DataFrame (pandas/polars)
        ↓
Arrow IPC Serialization
        ↓
Rust Processing (zero-copy)
        ↓
Arrow IPC Deserialization
        ↓
Python DataFrame (original type)
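
The flow above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: JSON stands in for Arrow IPC so the example runs with the standard library alone, and `native_passthrough` is a hypothetical placeholder for a compiled function that takes and returns bytes.

```python
import json
from collections import OrderedDict


def serialize(table):
    """Stand-in for Arrow IPC serialization: table -> bytes."""
    return json.dumps(table).encode("utf-8")


def deserialize(data):
    """Stand-in for Arrow IPC deserialization: bytes -> plain dict."""
    return json.loads(data.decode("utf-8"))


def native_passthrough(data):
    """Placeholder for the compiled function: bytes in, bytes out."""
    return data


def call_native(table, native_func):
    """Serialize, process, deserialize, then restore the caller's type."""
    original_type = type(table)
    result_bytes = native_func(serialize(table))
    return original_type(deserialize(result_bytes))


table = OrderedDict(order_id=[1, 2], price=[10.0, 20.0])
result = call_native(table, native_passthrough)
print(type(result).__name__)  # OrderedDict
```

Recording the input type before serialization is what lets the last step hand back the same kind of object the caller passed in.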

Error Handling

All Rust errors are converted to appropriate Python exceptions:

try:
    result = to(df, from_df=ref, bring="invalid_col", against="id")
except ValueError as e:
    print(e)
    # ValueError: Bring columns not found in reference DataFrame: ['invalid_col'].
    # Available columns: ['id', 'price', 'name']
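
A check that produces a message of this shape might look like the following. `check_bring_columns` is a hypothetical validator written for illustration. In the real package the error originates in Rust and is converted to a Python exception.

```python
def check_bring_columns(bring, available):
    """Raise ValueError when a requested column is missing.

    Hypothetical validator; the message mirrors the one shown above.
    """
    bring = [bring] if isinstance(bring, str) else list(bring)
    missing = [col for col in bring if col not in available]
    if missing:
        raise ValueError(
            f"Bring columns not found in reference DataFrame: {missing}. "
            f"Available columns: {list(available)}"
        )
```

Listing the available columns in the message lets the caller spot a typo without inspecting the reference DataFrame separately.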

Development

Building

# Debug build
maturin develop

# Release build (optimized)
maturin develop --release

# Build wheel
maturin build --release

Testing

# Rust unit tests
cargo test

# Python integration tests
python test_phase4_integration.py

# Performance benchmarks
python benchmark_rust_performance.py

Documentation

# Generate Rust docs
cargo doc --open

# View API documentation
cat API_DOCUMENTATION.md

# View usage examples
cat USAGE_EXAMPLES.md

Platform Support

Platform	Architecture	Status	Wheel Size
Linux	x86_64	✅ Built	14MB
Linux	aarch64	📝 Documented	-
macOS	x86_64	📝 Documented	-
macOS	aarch64	📝 Documented	-
Windows	x86_64	📝 Documented	-

See MULTI_PLATFORM_BUILD_GUIDE.md for build instructions.

Troubleshooting

See TROUBLESHOOTING.md for common issues and solutions.

Documentation

API Documentation - Detailed function reference
Usage Examples - Real-world examples
Multi-Platform Build Guide - Building for different platforms
Troubleshooting Guide - Common issues and solutions

Contributing

Contributions welcome! Please ensure:

All tests pass (cargo test and Python tests)
Code is formatted (cargo fmt)
No clippy warnings (cargo clippy)
Documentation is updated

License

MIT License - see LICENSE file for details

Version

Current Version: 0.2.0
Last Updated: 2025-02-04
Python Support: 3.8+
Polars Version: 0.44+

Release history

Version	Released	Status
0.1.3a11	May 9, 2026	pre-release
0.1.3a10	Mar 12, 2026	pre-release
0.1.3a9	Mar 8, 2026	pre-release
0.1.3a8	Mar 3, 2026	pre-release
0.1.3a7	Feb 13, 2026	pre-release
0.1.3a6	Feb 13, 2026	pre-release
0.1.3a5	Feb 13, 2026	pre-release
0.1.3a4	Feb 11, 2026	pre-release
0.1.3a3	Feb 9, 2026	pre-release
0.1.3a2	Feb 9, 2026	pre-release
0.1.3a1	Feb 9, 2026	pre-release
0.1.2a1	Feb 5, 2026	pre-release
0.1.1a6	Feb 4, 2026	pre-release
0.1.1a5	Feb 4, 2026	pre-release (this version)
0.1.1a4	Feb 4, 2026	pre-release
0.1.1a3	Feb 4, 2026	pre-release
0.1.1a2	Feb 4, 2026	pre-release
0.1.1a1	Feb 4, 2026	pre-release
0.1.0a4	Jan 28, 2026	pre-release
0.1.0a3	Jan 27, 2026	pre-release
0.1.0a2	Jan 25, 2026	pre-release
0.1.0a1	Jan 25, 2026	pre-release

Download files

Source Distribution

additory-0.1.1a5.tar.gz (182.4 kB view details)

Uploaded Feb 4, 2026 Source

Built Distribution

additory-0.1.1a5-cp38-abi3-manylinux_2_34_x86_64.whl (14.9 MB view details)

Uploaded Feb 4, 2026 CPython 3.8+, manylinux: glibc 2.34+ x86-64

File details

Details for the file additory-0.1.1a5.tar.gz.

File metadata

Download URL: additory-0.1.1a5.tar.gz
Upload date: Feb 4, 2026
Size: 182.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for additory-0.1.1a5.tar.gz
Algorithm	Hash digest
SHA256	`3a85f016f41dded563128d4bc2ea59a7a1d337076dd9af0f924fea2d0fafd5e6`
MD5	`777cc0da70e031b541fadcba05e25e48`
BLAKE2b-256	`46680f9226fde3c156a934acd4fc91dfbe2c6fd4ded7b2b0ed2fafd05912c8f3`
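
A downloaded file can be checked against the SHA256 digest listed above with the standard library. The helper names `sha256_of` and `matches` are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True when the file's digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("additory-0.1.1a5.tar.gz", "3a85f016f41dded563128d4bc2ea59a7a1d337076dd9af0f924fea2d0fafd5e6")` should return True for an intact download of the source distribution.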

File details

Details for the file additory-0.1.1a5-cp38-abi3-manylinux_2_34_x86_64.whl.

File metadata

Download URL: additory-0.1.1a5-cp38-abi3-manylinux_2_34_x86_64.whl
Upload date: Feb 4, 2026
Size: 14.9 MB
Tags: CPython 3.8+, manylinux: glibc 2.34+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for additory-0.1.1a5-cp38-abi3-manylinux_2_34_x86_64.whl
Algorithm	Hash digest
SHA256	`ba2d59e7e65d6a24e6af586dbda36edeb0191809e77695bd4d07680e354eb8dd`
MD5	`fa800ac50c7caa714994d1d7f3a97e1f`
BLAKE2b-256	`fde4b08f9d4e5c3be7f5a43de3600a380bf376d8888b3c64b4b3838725e62fa2`
