Python bindings for encoderfile.

🚀 Overview

Encoderfile packages transformer encoders—optionally with classification heads—into a single, self-contained executable. No Python runtime, no dependencies, no network calls. Just a fast, portable binary that runs anywhere.

While Llamafile focuses on generative models, Encoderfile is purpose-built for encoder architectures with optional classification heads. It supports embedding, sequence classification, and token classification models—covering most encoder-based NLP tasks, from text similarity to classification and tagging—all within one compact binary.

Under the hood, Encoderfile uses ONNX Runtime for inference, ensuring compatibility with a wide range of transformer architectures.

Why?

  • Smaller footprint: a single binary measured in tens-to-hundreds of megabytes, not gigabytes of runtime and packages
  • Compliance-friendly: deterministic, offline, security-boundary-safe
  • Integration-ready: drop into existing systems as a CLI, microservice, or API without refactoring your stack

Encoderfiles can run as:

  • REST API
  • gRPC microservice
  • CLI for batch processing
  • MCP server (Model Context Protocol)

Supported Architectures

Encoderfile supports the following Hugging Face model classes (and their ONNX-exported equivalents):

Task | Supported classes | Example models
Embeddings / Feature Extraction | AutoModel, AutoModelForMaskedLM | bert-base-uncased, distilbert-base-uncased
Sequence Classification | AutoModelForSequenceClassification | distilbert-base-uncased-finetuned-sst-2-english, roberta-large-mnli
Token Classification | AutoModelForTokenClassification | dslim/bert-base-NER, bert-base-cased-finetuned-conll03-english

  • ✅ All architectures must be encoder-only transformers — no decoders, no encoder–decoder hybrids (so no T5, no BART).
  • ⚙️ Models must have ONNX-exported weights (path/to/your/model/model.onnx).
  • 🧠 The ONNX graph input must include input_ids and optionally attention_mask.
  • 🚫 Models relying on generation heads (AutoModelForSeq2SeqLM, AutoModelForCausalLM, etc.) are not supported.
  • XLNet, Transformer XL, and derivative architectures are not yet supported.
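
The input requirement above can be checked before building. The sketch below is a minimal illustration, not part of Encoderfile: `check_graph_inputs` and `InputReport` are hypothetical names, and the function works on a plain list of input names so it runs without extra packages. Inputs other than `input_ids` and `attention_mask` are reported but not treated as errors, since the text above does not say they are rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputReport:
    """Result of checking ONNX graph input names (hypothetical helper type)."""
    missing_required: tuple
    has_attention_mask: bool
    other_inputs: tuple

    @property
    def ok(self):
        # Compatible with the stated rule when nothing required is missing.
        return not self.missing_required


def check_graph_inputs(input_names):
    """Check a list of graph input names against the rule stated above."""
    names = list(input_names)
    missing = tuple(n for n in ("input_ids",) if n not in names)
    other = tuple(n for n in names if n not in ("input_ids", "attention_mask"))
    return InputReport(missing, "attention_mask" in names, other)


# With onnxruntime installed, the names could be read like this:
#   import onnxruntime
#   session = onnxruntime.InferenceSession("path/to/your/model/model.onnx")
#   names = [i.name for i in session.get_inputs()]
report = check_graph_inputs(["input_ids", "attention_mask", "token_type_ids"])
print(report.ok, report.has_attention_mask, report.other_inputs)
```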

📦 Installation

Option 1: Download Pre-built CLI Tool (Recommended)

Download the encoderfile CLI tool to build your own model binaries:

curl -fsSL https://raw.githubusercontent.com/mozilla-ai/encoderfile/main/install.sh | sh

Note for Windows users: Pre-built binaries are not available for Windows. See our guide on building from source for instructions.

Move the binary to a location in your PATH:

# Linux/macOS
sudo mv encoderfile /usr/local/bin/

# Or add to your user bin
mkdir -p ~/.local/bin
mv encoderfile ~/.local/bin/

Option 2: Build CLI Tool from Source

See our guide on building from source for detailed instructions.

Quick build:

cargo build --bin encoderfile --release
./target/release/encoderfile --help

Option 3: Python Package

Install the Python library to build encoderfiles programmatically:

pip install encoderfile
# or with uv
uv add encoderfile

from encoderfile import EncoderfileBuilder, ModelType

builder = EncoderfileBuilder(
    name="sentiment-analyzer",
    model_type=ModelType.SequenceClassification,
    path="./sentiment-model",
)
builder.build()

The package also provides a CLI entry point:

uv run -m encoderfile build -f config.yml

See the Python Library docs for the full guide and API reference.

🚀 Quick Start

Step 1: Prepare Your Model

First, you need an ONNX-exported model. Export any supported Hugging Face model:

Note: ONNX export requires Python 3.13+.

# Install optimum for ONNX export
pip install "optimum[onnx]"

# Export a sentiment analysis model
optimum-cli export onnx \
  --model distilbert-base-uncased-finetuned-sst-2-english \
  --task text-classification \
  ./sentiment-model
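
When exporting several models from a script, it can help to build the same command as an argument list. This is a small sketch: `onnx_export_command` is a hypothetical helper, and it uses only the `optimum-cli` arguments shown above.

```python
import shlex


def onnx_export_command(model_id, task, output_dir):
    """Build the optimum-cli export command shown above as an argument list."""
    return [
        "optimum-cli", "export", "onnx",
        "--model", model_id,
        "--task", task,
        output_dir,
    ]


cmd = onnx_export_command(
    "distilbert-base-uncased-finetuned-sst-2-english",
    "text-classification",
    "./sentiment-model",
)
# Print the shell form for inspection; to execute it, pass the list to
# subprocess.run(cmd, check=True) in an environment where optimum is installed.
print(shlex.join(cmd))
```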

Step 2: Create Configuration File

Create sentiment-config.yml:

encoderfile:
  name: sentiment-analyzer
  path: ./sentiment-model
  model_type: sequence_classification
  output_path: ./build/sentiment-analyzer.encoderfile
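
If you build more than one model, the same file can be generated rather than written by hand. The sketch below assumes only the four keys shown above. `render_config` is a hypothetical helper, and it does no YAML quoting, so it is suitable only for simple values like the ones in this example.

```python
def render_config(name, path, model_type, output_path):
    """Render the config shown above as YAML text (plain scalars only)."""
    lines = [
        "encoderfile:",
        f"  name: {name}",
        f"  path: {path}",
        f"  model_type: {model_type}",
        f"  output_path: {output_path}",
    ]
    return "\n".join(lines) + "\n"


config_text = render_config(
    name="sentiment-analyzer",
    path="./sentiment-model",
    model_type="sequence_classification",
    output_path="./build/sentiment-analyzer.encoderfile",
)
# Write it out with pathlib.Path("sentiment-config.yml").write_text(config_text)
print(config_text, end="")
```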

Step 3: Build Your Encoderfile

Use the downloaded encoderfile CLI tool:

encoderfile build -f sentiment-config.yml

This creates a self-contained binary at ./build/sentiment-analyzer.encoderfile.

Step 4: Run Your Model

Start the server:

./build/sentiment-analyzer.encoderfile serve

The server will start on http://localhost:8080 by default.
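
In scripts that start the server and then immediately send requests, it is useful to wait until the port accepts connections. The sketch below is a generic TCP readiness check and not an Encoderfile feature. `wait_for_port` is a hypothetical helper, and it only confirms that something is listening on the port, not that the model is ready.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.2):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a listener is bound to the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example, after launching `./build/sentiment-analyzer.encoderfile serve`:
#   if not wait_for_port("localhost", 8080):
#       raise RuntimeError("server did not start listening in time")
```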

Making Predictions

Sentiment Analysis:

curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [
      "This is the cutest cat ever!",
      "Boring video, waste of time",
      "These cats are so funny!"
    ]
  }'

Response:

{
  "results": [
    {
      "logits": [0.00021549065, 0.9997845],
      "scores": [0.00021549074, 0.9997845],
      "predicted_index": 1,
      "predicted_label": "POSITIVE"
    },
    {
      "logits": [0.9998148, 0.00018516644],
      "scores": [0.9998148, 0.0001851664],
      "predicted_index": 0,
      "predicted_label": "NEGATIVE"
    },
    {
      "logits": [0.00014975034, 0.9998503],
      "scores": [0.00014975043, 0.9998503],
      "predicted_index": 1,
      "predicted_label": "POSITIVE"
    }
  ],
  "model_id": "sentiment-analyzer"
}
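
The same request can be sent from Python using only the standard library. This sketch assumes the `/predict` endpoint and the response fields shown above. `build_predict_request` and `labels_from_response` are hypothetical helper names, not part of the encoderfile package.

```python
import json
import urllib.request


def build_predict_request(texts, base_url="http://localhost:8080"):
    """Build a POST request to /predict with the JSON body shown above."""
    body = json.dumps({"inputs": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def labels_from_response(payload):
    """Extract predicted_label from each entry of a parsed response."""
    return [result["predicted_label"] for result in payload["results"]]


# With the server running:
#   request = build_predict_request(["This is the cutest cat ever!"])
#   with urllib.request.urlopen(request) as response:
#       print(labels_from_response(json.load(response)))
```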

Embeddings:

curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": ["Hello world"],
    "normalize": true
  }'
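
The request above does not spell out what `normalize` does. A common convention for embedding APIs is L2 normalization, and the sketch below illustrates that convention only; it is an assumption, not a description of Encoderfile's behavior. Under that assumption, the dot product of two normalized vectors equals their cosine similarity. `l2_normalize` and `dot` are hypothetical helpers.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product; for unit-length vectors this is the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


unit = l2_normalize([3.0, 4.0])
print(unit, dot(unit, unit))
```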

Token Classification (NER):

curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": ["Apple Inc. is located in Cupertino, California"]
  }'

🎯 Usage Modes

Mode | Command | Default
REST API | ./my-model.encoderfile serve | http://localhost:8080
gRPC | ./my-model.encoderfile serve | localhost:50051
CLI | ./my-model.encoderfile infer "text" | stdout
MCP Server | ./my-model.encoderfile mcp | -

Both HTTP and gRPC servers start by default. Use --disable-grpc or --disable-http to run only one.

See the CLI Reference for all server options, port configuration, and output formats.
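
When launching the server from a supervisor script, the `--disable-grpc` and `--disable-http` flags described above can be selected programmatically. This is a minimal sketch: `serve_command` is a hypothetical helper, and it uses only the `serve` subcommand and the two flags named in this section.

```python
def serve_command(binary, http=True, grpc=True):
    """Build the argument list for `serve`, disabling one protocol if asked."""
    if not (http or grpc):
        raise ValueError("at least one of http or grpc must stay enabled")
    cmd = [binary, "serve"]
    if not http:
        cmd.append("--disable-http")
    if not grpc:
        cmd.append("--disable-grpc")
    return cmd


# HTTP only: pass the result to subprocess.Popen(...) to start the server.
print(serve_command("./my-model.encoderfile", grpc=False))
```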

🛠️ Building Custom Encoderfiles

Once you have the encoderfile CLI tool installed, you can build binaries from any compatible Hugging Face model.

See our guide on building from source for detailed instructions including:

  • How to export models to ONNX format
  • Configuration file options
  • Advanced features (Lua transforms, custom paths, etc.)
  • Troubleshooting tips

Quick workflow:

  1. Export your model to ONNX: optimum-cli export onnx ...
  2. Create a config file: config.yml
  3. Build the binary: encoderfile build -f config.yml
  4. Deploy anywhere: ./build/my-model.encoderfile serve

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Development Setup

Make sure you have Just installed.

# Clone the repository
git clone https://github.com/mozilla-ai/encoderfile.git
cd encoderfile

# Set up development environment
just setup

# Run tests
just test

# Build documentation 
just docs

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
