Project description

TrustformeRS Python

High-performance transformer library for Python, written in Rust. Drop-in replacement for Hugging Face Transformers with significant performance improvements.

Features

  • 🚀 10-100x faster than pure Python implementations
  • 🔄 Drop-in replacement for Hugging Face Transformers
  • 🦀 Written in Rust for memory safety and performance
  • 🔧 Zero-copy tensor operations with NumPy
  • 🤝 PyTorch interoperability (optional)
  • 📦 No external dependencies for core functionality

Installation

pip install trustformers

From source

# Install maturin (build tool for Rust Python extensions)
pip install maturin

# Clone the repository
git clone https://github.com/cool-japan/trustformers
cd trustformers/trustformers-py

# Build and install
maturin develop --release

Quick Start

Basic Usage

from trustformers import AutoModel, AutoTokenizer, pipeline

# Load model and tokenizer
model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Create a pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Run inference
results = classifier("This is a great library!")
print(results)  # HF-style output, e.g. a list of {"label": ..., "score": ...} dicts
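
Hugging Face pipelines also accept a list of inputs; assuming the drop-in compatibility extends to batching (a sketch, not confirmed above), batch inference would look like:

# Batch inference (hypothetical, following the HF pipeline convention)
results = classifier([
    "This is a great library!",
    "The documentation could be better.",
])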

Direct Model Usage

import numpy as np
from trustformers import BertModel, Tensor

# Create model
model = BertModel.from_pretrained("bert-base-uncased")

# Create input tensors: token IDs for "[CLS] this is a test [SEP]"
input_ids = Tensor(np.array([[101, 2023, 2003, 1037, 2742, 102]]))
attention_mask = Tensor(np.ones((1, 6)))

# Forward pass
outputs = model(input_ids, attention_mask)
print(outputs["last_hidden_state"].shape)  # (1, 6, 768) for bert-base-uncased

NumPy Integration

import numpy as np
from trustformers import Tensor

# Create tensor from NumPy array
np_array = np.random.randn(2, 3, 4).astype(np.float32)
tensor = Tensor(np_array)

# Convert back to NumPy
np_array_back = tensor.numpy()

# Tensor operations: assuming transpose() swaps the last two axes,
# this is a batched matmul, (2, 3, 4) @ (2, 4, 3) -> (2, 3, 3)
result = tensor.matmul(tensor.transpose())
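
Whether a particular conversion is actually zero-copy (as claimed in the feature list) can be checked with NumPy itself; a quick sketch, noting that the result may differ by path and dtype:

# True if tensor.numpy() returns a view over the original buffer
print(np.shares_memory(np_array, tensor.numpy()))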

PyTorch Interoperability

import torch
from trustformers import Tensor

# Convert from PyTorch
torch_tensor = torch.randn(2, 3, 4)
trust_tensor = Tensor.from_torch(torch_tensor)

# Convert to PyTorch
torch_tensor_back = trust_tensor.to_torch()
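
A quick sanity check on the round trip (a sketch; assumes the conversion preserves values and dtype exactly):

# Verify that from_torch/to_torch round-trips the values
assert torch.allclose(torch_tensor, torch_tensor_back)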

Supported Models

  • BERT and variants (RoBERTa, ALBERT, DistilBERT, ELECTRA, DeBERTa)
  • GPT-2 and variants (GPT-Neo, GPT-J; see the generation sketch after this list)
  • T5 (encoder-decoder)
  • LLaMA and Mistral
  • Vision Transformer (ViT)
  • CLIP (multimodal)
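
Every family above loads through the same Auto classes and pipeline API. For example, text generation with GPT-2 might look like the following; the "text-generation" task name and the "gpt2" model id follow Hugging Face conventions and are assumptions here:

from trustformers import AutoModel, AutoTokenizer, pipeline

# Load a generative model (model id assumed from the HF hub convention)
model = AutoModel.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Task name assumed from the HF pipeline convention
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator("Rust and Python make"))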

API Compatibility

TrustformeRS provides an API compatible with Hugging Face Transformers:

# Hugging Face Transformers
from transformers import AutoModel, AutoTokenizer

# TrustformeRS (drop-in replacement)
from trustformers import AutoModel, AutoTokenizer

Most code written for Hugging Face Transformers will work with minimal changes.

Performance

Benchmarks on common tasks:

Task                 Model       HF Transformers  TrustformeRS  Speedup
Text Classification  BERT-base   52 ms            3.2 ms        16.3x
Text Generation      GPT-2       124 ms           8.7 ms        14.3x
Question Answering   BERT-large  89 ms            5.4 ms        16.5x

Benchmarks run on Apple M1 Pro, batch size 1, sequence length 512
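
Since the table reports single-call latency, a comparison can be reproduced with a simple wall-clock harness (a sketch; it reuses the classifier pipeline from Quick Start, and the 512-word input only approximates a 512-token sequence):

import time

def bench(fn, warmup=3, iters=20):
    # Average wall-clock latency of fn() in milliseconds
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1000.0

text = ("word " * 512).strip()
print(f"TrustformeRS: {bench(lambda: classifier(text)):.1f} ms")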

Advanced Features

Custom Models

from trustformers import PreTrainedModel, Tensor
import numpy as np

class CustomModel(PreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        # Define your model architecture
    
    def forward(self, input_ids, attention_mask=None):
        # Implement forward pass
        pass
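
As a minimal concrete sketch, a custom model could do an embedding lookup and return an HF-style output dict matching the "last_hidden_state" key used elsewhere in this README. The config fields (vocab_size, hidden_size) and the exact PreTrainedModel contract are assumptions, not documented API:

import numpy as np
from trustformers import PreTrainedModel, Tensor

class TinyEmbeddingModel(PreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        # Hypothetical config fields; a random table stands in for learned weights
        self.embeddings = np.random.randn(
            config.vocab_size, config.hidden_size
        ).astype(np.float32)

    def forward(self, input_ids, attention_mask=None):
        # Look up an embedding for each token id: (batch, seq) -> (batch, seq, hidden)
        ids = input_ids.numpy().astype(np.int64)
        return {"last_hidden_state": Tensor(self.embeddings[ids])}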

Training (Coming Soon)

from trustformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)

trainer.train()

Development

Building from source

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black .
isort .

# Lint
ruff check .

Architecture

The library is organized into several components:

  • tensor.rs - Tensor operations and NumPy integration
  • models.rs - Model implementations (BERT, GPT-2, etc.)
  • tokenizers.rs - Tokenizer implementations
  • pipelines.rs - High-level pipeline API
  • auto.rs - Auto classes for model/tokenizer loading
  • training.rs - Training utilities (WIP)

License

Apache License 2.0

Contributing

Contributions are welcome! Please read our Contributing Guide for details.

Citation

If you use TrustformeRS in your research, please cite:

@software{trustformers2024,
  title = {TrustformeRS: High-Performance Transformers in Rust},
  year = {2024},
  url = {https://github.com/cool-japan/trustformers}
}

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

trustformers-0.1.0a1.tar.gz (4.5 MB)

Uploaded: Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

trustformers-0.1.0a1-cp38-abi3-manylinux_2_34_x86_64.whl (12.8 MB)

Uploaded: CPython 3.8+ (abi3), manylinux glibc 2.34+, x86-64

trustformers-0.1.0a1-cp38-abi3-macosx_11_0_arm64.whl (945.7 kB)

Uploaded: CPython 3.8+ (abi3), macOS 11.0+, ARM64

File details

Details for the file trustformers-0.1.0a1.tar.gz.

File metadata

  • Download URL: trustformers-0.1.0a1.tar.gz
  • Size: 4.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.9.1

File hashes

Hashes for trustformers-0.1.0a1.tar.gz

  • SHA256: 7da87359e0d8f2cab304930dc0893b550e6841eefc9174be2f58551fad75399b
  • MD5: 9d8f8b71a0cc6ae83209b2e7943e7764
  • BLAKE2b-256: 0dde96ff0e13c588125cbae1fb109ae40ef5f4bb54b5264e4e8490a6347901d0


File details

Details for the file trustformers-0.1.0a1-cp38-abi3-manylinux_2_34_x86_64.whl.

File hashes

Hashes for trustformers-0.1.0a1-cp38-abi3-manylinux_2_34_x86_64.whl

  • SHA256: 0fdcd2c1750194208bd2d1623c47a661d09b060257df613ef99a045e56e1dcec
  • MD5: 099f3b3313b0fd1acedaf69759d832a7
  • BLAKE2b-256: a41806b973f940900a41528fcb46b5e7022bcf77e199b8ae9faaff31fda27eaf


File details

Details for the file trustformers-0.1.0a1-cp38-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for trustformers-0.1.0a1-cp38-abi3-macosx_11_0_arm64.whl

  • SHA256: 81a3729d2c1334c2ae9a6c899b22646cf6ae9cc1bf46e7d44d2619c73ca7e61b
  • MD5: ea25f580819092fe3fff1be1a018da4d
  • BLAKE2b-256: 4eb6dc31ec6c8f954fefaa222ec098c0fec63bdd04aa2b34c318c139c35f69b9

