High-performance Rust extensions for Axolotl - drop-in acceleration for existing installations

Fast-Axolotl

Highlights

  • Zero-config acceleration - Just import fast_axolotl before axolotl
  • 77x faster streaming - Rust-based data loading vs HuggingFace datasets, measured on a 50,000-row benchmark
  • Parallel hashing - Multi-threaded SHA256 for deduplication
  • Cross-platform - Linux, macOS, Windows with Python 3.10-3.12 (wheels are also published for CPython 3.13)

Quick Start

Install the package:

pip install fast-axolotl

Then import it before axolotl:

import fast_axolotl  # Auto-installs acceleration shim

# Now use axolotl normally - accelerations are active
import axolotl
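
The import order matters because a shim generally has to register its replacements before the target package is first imported. The sketch below illustrates that general mechanism using only the standard library. The module name `slowlib` and the functions `install_shim` and `fast_load` are hypothetical stand-ins; this is not fast_axolotl's actual implementation, which may patch individual functions instead.

```python
import sys
import types


def install_shim(module_name: str, replacements: dict) -> types.ModuleType:
    """Register a stand-in module so later imports of module_name resolve to it."""
    shim = types.ModuleType(module_name)
    for attr, value in replacements.items():
        setattr(shim, attr, value)
    # The import system consults sys.modules first, so the shim wins any later import.
    sys.modules[module_name] = shim
    return shim


def fast_load(path: str) -> str:
    # Hypothetical accelerated replacement for a slower load() function.
    return f"fast:{path}"


# "slowlib" is a made-up module name used only for this demonstration.
install_shim("slowlib", {"load": fast_load})

import slowlib  # resolved from sys.modules, not from disk

print(slowlib.load("data.parquet"))
```

If the target module had already been imported and its functions bound elsewhere, a late shim would not reach those references, which is why the shim import comes first.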

Benchmark Results

Tested on Linux x86_64, Python 3.11, 16 CPU cores:

Operation | Data Size | Rust time | Python time | Speedup
Streaming Data Loading | 50,000 rows | 0.009s | 0.724s | 77x
Parallel Hashing (SHA256) | 100,000 rows | 0.027s | 0.052s | 1.9x
Token Packing | 10,000 sequences | 0.079s | 0.033s | 0.4x*
Batch Padding | 10,000 sequences | 0.200s | 0.105s | 0.5x*

*Token packing and batch padding are slower than the Python baseline at these sizes (speedup below 1x) because FFI overhead dominates on small datasets. Performance gains are expected on the larger datasets typical of LLM training.

See BENCHMARK.md for detailed results.

Compatibility

All features tested and working:

Feature | Status
Rust Extension Loading | Tested
Module Shimming | Tested
Streaming (Parquet, JSON, CSV, Arrow) | Tested
Token Packing | Tested
Parallel Hashing | Tested
Batch Padding | Tested
Axolotl Integration | Tested

See COMPATIBILITY.md for full test results.

Features

1. Streaming Data Loading

Memory-efficient streaming for large datasets:

from fast_axolotl import streaming_dataset_reader

for batch in streaming_dataset_reader(
    "/path/to/large_dataset.parquet",
    dataset_type="parquet",
    batch_size=1000,
    num_threads=4
):
    process(batch)  # replace with your own per-batch handling

Supports: Parquet, Arrow, JSON, JSONL, CSV, Text (with ZSTD/Gzip compression)
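
To make the memory argument concrete: a streaming reader yields one bounded batch at a time instead of materializing the whole file. The pure-Python generator below sketches that idea for JSONL input. `iter_jsonl_batches` is a hypothetical helper written for illustration; it does not reflect how `streaming_dataset_reader` is implemented or what batch type it returns.

```python
import json
import os
import tempfile
from typing import Iterator


def iter_jsonl_batches(path: str, batch_size: int) -> Iterator[list[dict]]:
    """Yield lists of at most batch_size parsed rows, reading the file lazily."""
    batch: list[dict] = []
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:  # the file object iterates line by line
            line = line.strip()
            if not line:
                continue  # skip blank lines
            batch.append(json.loads(line))
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch  # final partial batch


# Small demonstration file with 5 rows.
with tempfile.NamedTemporaryFile(
    "w", suffix=".jsonl", delete=False, encoding="utf-8"
) as tmp:
    for i in range(5):
        tmp.write(json.dumps({"id": i}) + "\n")
    demo_path = tmp.name

batch_sizes = [len(b) for b in iter_jsonl_batches(demo_path, batch_size=2)]
os.remove(demo_path)
print(batch_sizes)  # [2, 2, 1]
```

Only one batch is held in memory at a time, so peak usage depends on `batch_size` rather than on the file size.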

2. Token Packing

Replace inefficient torch.cat() loops:

from fast_axolotl import pack_sequences

result = pack_sequences(
    sequences=[[1, 2, 3], [4, 5], [6, 7, 8, 9]],
    max_length=2048,
    pad_token_id=0,
    eos_token_id=2
)
# Returns: {'input_ids': [...], 'labels': [...], 'attention_mask': [...]}
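
For readers unfamiliar with packing, the following pure-Python reference sketches one common greedy scheme: append an EOS token to each sequence, fill a row until the next sequence no longer fits, then pad the row. This is an illustration under stated assumptions, not the library's algorithm. The function name `pack_sequences_reference`, the `-100` ignore index for padded labels, and the truncation of overlong sequences are all assumptions, and the real `pack_sequences` output layout may differ.

```python
def pack_sequences_reference(
    sequences, max_length, pad_token_id, eos_token_id, ignore_index=-100
):
    """Greedily pack token sequences into rows of exactly max_length tokens."""
    packed = {"input_ids": [], "labels": [], "attention_mask": []}
    current: list[int] = []

    def flush() -> None:
        # Emit the current row, padded on the right, then start a new one.
        if not current:
            return
        pad = max_length - len(current)
        packed["input_ids"].append(current + [pad_token_id] * pad)
        packed["labels"].append(current + [ignore_index] * pad)
        packed["attention_mask"].append([1] * len(current) + [0] * pad)
        current.clear()

    for seq in sequences:
        # Assumption: overlong sequences are truncated to max_length.
        tokens = (list(seq) + [eos_token_id])[:max_length]
        if len(current) + len(tokens) > max_length:
            flush()
        current.extend(tokens)
    flush()
    return packed


result = pack_sequences_reference(
    sequences=[[1, 2, 3], [4, 5], [6, 7, 8, 9]],
    max_length=8,
    pad_token_id=0,
    eos_token_id=2,
)
print(result["input_ids"])
# [[1, 2, 3, 2, 4, 5, 2, 0], [6, 7, 8, 9, 2, 0, 0, 0]]
```

With `max_length=8`, the first two sequences share a row and the third starts a new one, so three inputs become two fixed-length rows.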

3. Parallel Hashing

Multi-threaded SHA256 for deduplication:

from fast_axolotl import parallel_hash_rows, deduplicate_indices

hashes = parallel_hash_rows(rows, num_threads=0)  # rows: your dataset rows; 0 = auto

# Or get unique indices directly
unique_indices, new_hashes = deduplicate_indices(rows)
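
The semantics of hash-based deduplication can be sketched with the standard library alone. The version below hashes rows with SHA256 across a thread pool and keeps the first index of each distinct digest. The `_reference` function names, the assumption that rows are strings, and the shape of the second return value (a set of seen digests) are illustrative guesses rather than the documented behavior of `parallel_hash_rows` or `deduplicate_indices`.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hash_row(row: str) -> str:
    """Return the hex SHA256 digest of one row (assumed to be a string)."""
    return hashlib.sha256(row.encode("utf-8")).hexdigest()


def parallel_hash_rows_reference(rows, num_threads=0):
    # 0 means "let the executor choose a default worker count".
    workers = num_threads or None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(hash_row, rows))  # map preserves input order


def deduplicate_indices_reference(rows, seen=None):
    """Return indices of first occurrences plus the updated set of digests."""
    seen = set() if seen is None else set(seen)
    unique = []
    for index, digest in enumerate(parallel_hash_rows_reference(rows)):
        if digest not in seen:
            seen.add(digest)
            unique.append(index)
    return unique, seen


rows = ["a", "b", "a", "c"]
unique_indices, seen_hashes = deduplicate_indices_reference(rows)
print(unique_indices)  # [0, 1, 3]
```

In CPython, threads give little speedup for hashing many short strings because of the global interpreter lock, which is the kind of workload a native multi-threaded implementation is meant to help with.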

4. Batch Padding

Efficient sequence padding:

from fast_axolotl import pad_sequences

padded = pad_sequences(
    [[1, 2, 3], [4, 5]],
    target_length=8,
    pad_value=0,
    padding_side="right"
)
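
As a point of comparison, the padding operation itself is simple to express in pure Python. The sketch below mirrors the parameter names used above but is only a reference for the expected shape of the result. `pad_sequences_reference` is a hypothetical name, and raising an error for sequences longer than `target_length` is an assumption; the library may truncate instead.

```python
def pad_sequences_reference(sequences, target_length, pad_value=0, padding_side="right"):
    """Pad every sequence to target_length on the chosen side."""
    if padding_side not in ("left", "right"):
        raise ValueError(f"unknown padding_side: {padding_side!r}")
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > target_length:
            # Assumption: overlong input is an error rather than truncated.
            raise ValueError("sequence is longer than target_length")
        fill = [pad_value] * (target_length - len(seq))
        padded.append(seq + fill if padding_side == "right" else fill + seq)
    return padded


right = pad_sequences_reference([[1, 2, 3], [4, 5]], target_length=8)
left = pad_sequences_reference([[1, 2, 3], [4, 5]], target_length=8, padding_side="left")
print(right)  # [[1, 2, 3, 0, 0, 0, 0, 0], [4, 5, 0, 0, 0, 0, 0, 0]]
print(left)   # [[0, 0, 0, 0, 0, 1, 2, 3], [0, 0, 0, 0, 0, 0, 4, 5]]
```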

Installation

From PyPI

pip install fast-axolotl

From Source

git clone https://github.com/axolotl-ai-cloud/fast-axolotl
cd fast-axolotl

# Using uv (recommended)
uv pip install -e .

# Or with pip + maturin
pip install maturin
maturin develop --release

Documentation

Configuration

Enable features in your Axolotl config:

# Enable Rust streaming for large datasets
dataset_use_rust_streaming: true
sequence_len: 32768

# Deduplication uses parallel hashing automatically
dedupe: true

Development

git clone https://github.com/axolotl-ai-cloud/fast-axolotl
cd fast-axolotl

uv venv && source .venv/bin/activate
uv pip install -e ".[dev]"
maturin develop

# Run tests
pytest -v

# Run benchmarks
python scripts/benchmark.py

# Run compatibility tests
python scripts/compatibility_test.py

License

MIT

Download files

Download the file for your platform.

Source Distribution

  • fast_axolotl-0.1.0.tar.gz (177.7 kB): Source

Built Distributions

  • fast_axolotl-0.1.0-cp313-cp313-win_amd64.whl (3.6 MB): CPython 3.13, Windows x86-64
  • fast_axolotl-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB): CPython 3.13, manylinux: glibc 2.17+ x86-64
  • fast_axolotl-0.1.0-cp313-cp313-macosx_11_0_arm64.whl (3.3 MB): CPython 3.13, macOS 11.0+ ARM64
  • fast_axolotl-0.1.0-cp313-cp313-macosx_10_12_x86_64.whl (3.7 MB): CPython 3.13, macOS 10.12+ x86-64
  • fast_axolotl-0.1.0-cp312-cp312-win_amd64.whl (3.6 MB): CPython 3.12, Windows x86-64
  • fast_axolotl-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB): CPython 3.12, manylinux: glibc 2.17+ x86-64
  • fast_axolotl-0.1.0-cp312-cp312-macosx_11_0_arm64.whl (3.3 MB): CPython 3.12, macOS 11.0+ ARM64
  • fast_axolotl-0.1.0-cp312-cp312-macosx_10_12_x86_64.whl (3.7 MB): CPython 3.12, macOS 10.12+ x86-64
  • fast_axolotl-0.1.0-cp311-cp311-win_amd64.whl (3.6 MB): CPython 3.11, Windows x86-64
  • fast_axolotl-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB): CPython 3.11, manylinux: glibc 2.17+ x86-64
  • fast_axolotl-0.1.0-cp311-cp311-macosx_11_0_arm64.whl (3.3 MB): CPython 3.11, macOS 11.0+ ARM64
  • fast_axolotl-0.1.0-cp311-cp311-macosx_10_12_x86_64.whl (3.7 MB): CPython 3.11, macOS 10.12+ x86-64
  • fast_axolotl-0.1.0-cp310-cp310-win_amd64.whl (3.6 MB): CPython 3.10, Windows x86-64
  • fast_axolotl-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • fast_axolotl-0.1.0-cp310-cp310-macosx_11_0_arm64.whl (3.3 MB): CPython 3.10, macOS 11.0+ ARM64
  • fast_axolotl-0.1.0-cp310-cp310-macosx_10_12_x86_64.whl (3.7 MB): CPython 3.10, macOS 10.12+ x86-64

File details

Details for the file fast_axolotl-0.1.0.tar.gz.

File metadata

  • Download URL: fast_axolotl-0.1.0.tar.gz
  • Upload date:
  • Size: 177.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for fast_axolotl-0.1.0.tar.gz
Algorithm Hash digest
SHA256 73ecae00859469558a02747b44adbfe0288437c24eef47f49ac3238047a0a90f
MD5 1761d27fdc47a590c1bb1e063d957245
BLAKE2b-256 491112f6473ac98d64a28ffef7f9b72f4cbe2eadc18d7cd6397906b7f11b96f7

See more details on using hashes here.

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0.tar.gz:

Publisher: publish.yml on neul-labs/fast-axolotl

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file fast_axolotl-0.1.0-cp313-cp313-win_amd64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 ba54df850397a6ccc84dfb441057df62c28a5a0e2614fa91cff6af2d11ac5271
MD5 338af66d51ccd91f25bcdc9caed10e69
BLAKE2b-256 16307354af3e1ed1f49241c404b97284e5705458fc75ffbcfc7a7c33315cf39f

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp313-cp313-win_amd64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 cdd91827f764341c20dfa37efb77f151db3942e4d1ae5433a6881bcaaf51334b
MD5 2ed957130743f662118efd2f802309d1
BLAKE2b-256 016d5c72f6ebce48eb1ac81bb769d88e8b15a81027c8a5faf4e108c94dad5c77

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 b73dd34d392f7570af12ca5fbdb32bc811e4f46de6298a35105c4cafddd323a5
MD5 9c9dcf99fde39c60f11f3e0e39c2a9a6
BLAKE2b-256 7afa466a18032d75a089fe421e90e964e1bf8e0c7d91c4d70893bab0ef65680b

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp313-cp313-macosx_11_0_arm64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp313-cp313-macosx_10_12_x86_64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 ace2cc84b755cc6361d2ef9e2797fc8803b2d2e6dae59543331963346d549606
MD5 47138d55fb1971de3a90e94769742271
BLAKE2b-256 1075c88d90c8dbf3c6372ef0da7f8ee1fb1796571dec60f74a421661b401375e

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp313-cp313-macosx_10_12_x86_64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp312-cp312-win_amd64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 01db4aaedc274f3729655e088d4cad127f9237f3d4c6dd6a3ab19b13c26f01bd
MD5 ce3ffd6b4ad5533be21ac3bd33a4c37b
BLAKE2b-256 0893541ff636a091613f2cb1297256a65153272ab1c41840faa3319765cf57a1

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp312-cp312-win_amd64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 b95bf6c2db64b0cd3b08344fe5041a4d1823728a5b7987084e07179b94055988
MD5 79965a23e5e340bc9ca99222a1949f79
BLAKE2b-256 2b80de0eda5200b27da8ac9628e2b2152a5aac08585168e5316084f61c9712ad

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 05c87d43914795c72bd4c196e06d965806ec82bb78e1ce006febbb18c7d7e019
MD5 34fe4d946feb540dd8348b0692575e95
BLAKE2b-256 5f960fd9cad837ffb9757a8c673cd5b34400b46bc8bdc555fc619b6a454bd0af

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp312-cp312-macosx_10_12_x86_64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 4a9d338c10ca4ccbab92be2af3668936dcd7d483a3006ee8f70699fab7b83862
MD5 8da211a7f26bbc4c7029e06eb93475d1
BLAKE2b-256 56071603288c381c393a6042cca8ce8ae7ee93f6d38d0f80135e091516f96a1d

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp312-cp312-macosx_10_12_x86_64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp311-cp311-win_amd64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 a4cbec477e7c551fefa1c4b346acd95dde8fa22ef6c09904087d83a11db81f00
MD5 67bbd6433957d335aa62c72fff6d5e33
BLAKE2b-256 539916d4f1ac68017feac2cd768ae036a97874026fca0f6274e58cf3bb1eb383

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp311-cp311-win_amd64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 2081b0d0cee794a8234fe0df3a545eae8ea5fda61f6f47d213e851898c204bf4
MD5 8f7b0f39aa4ad51762d52d704e27bd42
BLAKE2b-256 79b1db48bcfefb8fc1f3981edfb415f83bafbb178f801f40a72a28f94bc19255

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a70428958867f09ee8918a076f23ef3471eccd0d61561d28965e153ab4a9391f
MD5 faa7eedbfe83826c55023487028d9850
BLAKE2b-256 ec09abca9632c09e92043f1bcf354306716cce8d99c2b86492f2dde31c6b0c20

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp311-cp311-macosx_10_12_x86_64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp311-cp311-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 4f0b4bf3e933a20dab8ff68158b2a2f36b1dd181474d9abef9ac99f5f37d69ae
MD5 b5fa47fd6e4fa792e064e3f781efc314
BLAKE2b-256 64381a3117ba305ec90975fe14476bbb68f6337b3f714632c0ca7b1402ecb7bf

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp311-cp311-macosx_10_12_x86_64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp310-cp310-win_amd64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 0acbc4a63d8b515121f84e89c46ae65c86596a6ba86814babc2a75493e26f7c4
MD5 eeb54900da283158bfd1234845902973
BLAKE2b-256 c55a7f7f57af4c91de440e031055b120be19de5918def90655acc77c965f9b13

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp310-cp310-win_amd64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 74f7676872d6e05e74eec2776460c0c56a7d99f28fa236791d901b6825ffc7bb
MD5 481ef81a806510ecbab41a5d97c23ed8
BLAKE2b-256 6e6042dded827eff7ac5d739fa175fc421a94e9873cfd993c75e8495870ba31e

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp310-cp310-macosx_11_0_arm64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 65980e69fb788eeb056b3d180cbc1cd81bb3a16c058c4429ca44fe0cf34584b2
MD5 831fadad7616f2815db441ed851422c7
BLAKE2b-256 dae07471ba5ea87c34e694900be97f0d10e75e8a310b7a94f29b3ea3b75cf0c6

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp310-cp310-macosx_11_0_arm64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl

File details

Details for the file fast_axolotl-0.1.0-cp310-cp310-macosx_10_12_x86_64.whl.

File hashes

Hashes for fast_axolotl-0.1.0-cp310-cp310-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 87878296764bacc39ee8e6930471e7aef6bd13b4e4f25b32479bfeef23622f40
MD5 66d55c57bf8be8cc798391e80acdd5ee
BLAKE2b-256 c2f3e7f8d98f165efbc23eea71464e8ce0e29f2428f1b059c81a9b902cd7a67d

Provenance

The following attestation bundles were made for fast_axolotl-0.1.0-cp310-cp310-macosx_10_12_x86_64.whl:

Publisher: publish.yml on neul-labs/fast-axolotl
