
Provides a Triton Inference Server backend for running MLflow models

Project description

MLflow Backend for Triton Inference Server

mlflow-backend brings MLflow-managed models to NVIDIA Triton Inference Server without re-implementing serving code. The backend inspects the MLflow model metadata at load time, selects the right execution strategy, and exposes the model behind Triton's Python backend interface. This repository also ships helper tooling that automates creation of the artifacts Triton expects (Python backend stub, conda execution environment archive, and model config.pbtxt).

Highlights

  • Auto-detects MLflow model flavors, including generic pyfunc, Hugging Face transformers, and sentence_transformers, and parses their input/output schemas for use in Triton.
  • Produces Triton-ready configuration files, conda-pack execution environments, and Python backend stubs through the mlflow-backend-utils CLI.
  • Tested with Triton 25.08 and Python 3.12+ (the backend should also work across a range of Tritonserver versions), with tests covering unit, integration, and end-to-end packaging flows.

Local Installation

pip install mlflow_backend

The CLI is exposed as mlflow-backend-utils once the package is installed.

Installing inside Tritonserver Container

To use the backend inside Tritonserver's Python backend, the backend must be installed into the container image at /opt/tritonserver/backends/mlflow. This can be done in a number of ways:

  • Build a custom Docker image that includes the backend.
FROM nvcr.io/nvidia/tritonserver:25.08-py3
RUN git clone https://github.com/wwg-internal/mlflow_backend.git && \
   mv ./mlflow_backend/src/mlflow_backend /opt/tritonserver/backends/mlflow && \
   rm -rf ./mlflow_backend
  • Mount the backend code into the container at runtime (see example below).
mlflow_backend_path=$(python -c "import mlflow_backend; from pathlib import Path; print(Path(mlflow_backend.__file__).parent.absolute())")
docker run --rm -p8000:8000 -p8001:8001 -p8002:8002 \
  -v ./model-repo:/models \
  -v ${mlflow_backend_path}:/opt/tritonserver/backends/mlflow \
  nvcr.io/nvidia/tritonserver:25.08-py3 \
  tritonserver --model-repository=/models
  • Install the backend at container startup, for example using git:
git clone https://github.com/wwg-internal/mlflow_backend.git
mv ./mlflow_backend/src/mlflow_backend /opt/tritonserver/backends/mlflow
rm -rf ./mlflow_backend
tritonserver --model-repository=/models

General Usage: Serve an MLflow model with Triton

  1. Export your MLflow model – have a local copy of the MLflow model directory (contains MLmodel, artifacts, and optional Python environment files).
  2. Build the Python backend stub – required when your model specifies a different Python version than the tritonserver default (3.12).
    mlflow-backend-utils build-stub \
      --python-version 3.13 \
      --triton-version r25.08 \
      --output-path triton_python_backend_stub
    
    This defaults to using Docker (nvcr.io/nvidia/tritonserver:<version>-py3). A Kubernetes-based builder is also available for environments where Docker is not an option.
  3. Create a conda-pack execution environment (optional but recommended for non-system dependencies).
    mlflow-backend-utils build-env \
      --python-version 3.13 \
      --requirements path/to/requirements.txt \
      --output-path conda-pack.tar.gz
    
    The tool attempts the build inside a Docker container first, falling back to a local conda build where possible.
  4. Generate the Triton model configuration.
    mlflow-backend-utils build-config \
      --model-path path/to/mlflow_model \
      --conda-pack-path conda-pack.tar.gz \
      --default-max-batch-size 1024 > config.pbtxt
    
  5. Assemble the Triton model repository following the layout below:
    triton-repo/
      my_model/
        config.pbtxt
        triton_python_backend_stub
        conda-pack.tar.gz
        1/
          MLmodel
          python_env.yaml
          model artifacts ...
    
    Multiple versions can be added as folders (2/, 3/, …) as usual for Triton.
  6. Start Triton pointing at the repository.
    backend_path=$(python -c "import mlflow_backend; from pathlib import Path; print(Path(mlflow_backend.__file__).parent.absolute())")
    docker run --rm -p8000:8000 -p8001:8001 -p8002:8002 \
      -v ./triton-repo:/models \
      -v ${backend_path}:/opt/tritonserver/backends/mlflow \
      nvcr.io/nvidia/tritonserver:25.08-py3 \
      tritonserver --model-repository=/models
    

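The repository layout from step 5 can be assembled with a short script. This is an illustrative sketch, not part of the package: the `assemble_model_repo` helper and the `my_model` name are placeholders, and it assumes you have already produced config.pbtxt, the backend stub, and the conda-pack archive.

```python
import shutil
from pathlib import Path

def assemble_model_repo(repo_dir, model_name, config, stub, conda_pack,
                        mlflow_model_dir, version="1"):
    """Copy generated artifacts into the directory layout Triton expects."""
    model_dir = Path(repo_dir) / model_name
    version_dir = model_dir / version
    version_dir.mkdir(parents=True, exist_ok=True)
    # Model-level artifacts: Triton config, Python backend stub, packed conda env.
    shutil.copy(config, model_dir / "config.pbtxt")
    shutil.copy(stub, model_dir / "triton_python_backend_stub")
    shutil.copy(conda_pack, model_dir / "conda-pack.tar.gz")
    # The version directory holds the MLflow model itself
    # (MLmodel, python_env.yaml, model artifacts).
    shutil.copytree(mlflow_model_dir, version_dir, dirs_exist_ok=True)
    return model_dir
```

Additional versions (2/, 3/, …) can be added by calling the helper again with a different version string.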
Once Triton loads your model, send requests using the standard Triton HTTP/gRPC clients. The backend converts the incoming Triton tensors into the inputs your MLflow model expects (strings, numpy arrays, Pandas DataFrames, Torch tensors) and returns the results in Triton's format.
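Requests follow the KServe v2 inference protocol that Triton's HTTP endpoint implements. As a minimal illustration (the tensor name INPUT__0 is a placeholder, not fixed by this backend), the JSON body for a one-dimensional string input can be built like this:

```python
import json

def build_infer_payload(input_name, texts):
    """Build a KServe-v2 /v2/models/<name>/infer request body for a 1-D string input."""
    return json.dumps({
        "inputs": [
            {
                "name": input_name,
                "shape": [len(texts)],
                "datatype": "BYTES",  # Triton's datatype for variable-length strings
                "data": texts,        # JSON strings are accepted for BYTES tensors
            }
        ]
    })

payload = build_infer_payload("INPUT__0", ["hello", "world"])
```

POST this body to http://localhost:8000/v2/models/<model_name>/infer, or use the tritonclient Python package, which constructs the same structure for you.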

CLI Reference (mlflow-backend-utils)

  • build-env: Creates a conda-pack.tar.gz suitable for uploading alongside the model. Supports Linux x86/ARM targets, Docker-based or local builds, and custom package indexes.
  • build-stub: Compiles the Triton Python backend stub for the requested Python/Triton version pair. Supports Docker and Kubernetes builders and can publish the artifact to S3 from cluster jobs.
  • build-config: Reads the MLflow MLmodel metadata, infers signatures, and emits an editable config.pbtxt. Special handling is included for transformer models whose schemas are dynamic.

Run mlflow-backend-utils <command> --help for full option lists.
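For orientation, a config.pbtxt for a Python-backend model generally looks like the following. The tensor names, types, and dims here are illustrative placeholders; EXECUTION_ENV_PATH is Triton's standard Python-backend parameter for pointing at a conda-pack archive, and $$TRITON_MODEL_DIRECTORY resolves to the model's directory in the repository.

```
name: "my_model"
backend: "python"
max_batch_size: 1024
input [
  {
    name: "INPUT__0"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
parameters: {
  key: "EXECUTION_ENV_PATH"
  value: { string_value: "$$TRITON_MODEL_DIRECTORY/conda-pack.tar.gz" }
}
```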

Supported MLflow Flavors

  • Generic pyfunc models (numpy arrays, Pandas DataFrames/Series, dictionaries, and lists).
  • Hugging Face transformers saved through mlflow.transformers, including sequence classification, token classification, QA, text generation, and translation pipelines.
  • sentence_transformers models with GPU/CPU auto-detection.
  • PyTorch models logged via MLflow's pytorch flavor (loaded through pyfunc).

If your flavor is not listed, the backend defaults to the general pyfunc adapter and raises clear errors when an unsupported return type is encountered.
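The fallback behaviour can be pictured as a small dispatch over return types. This is an illustrative sketch of the idea only, not the backend's actual code:

```python
def to_triton_output(result):
    """Illustrative: map a pyfunc return value onto something a Triton
    response can hold, raising a clear error for anything unsupported."""
    if isinstance(result, dict):
        return {k: to_triton_output(v) for k, v in result.items()}
    if isinstance(result, (list, tuple)):
        return [to_triton_output(v) for v in result]
    if isinstance(result, (str, bytes, int, float, bool)):
        return result
    raise TypeError(
        f"Unsupported pyfunc return type: {type(result).__name__}; "
        "expected dict, list, str, bytes, or a numeric scalar"
    )
```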

Testing & Development

All tests can be run through nox (install with pip install nox):

# List all available nox sessions
nox -l

# Run unit tests
nox -s unit_tests

# Run integration tests
# Requires Docker, kind (https://kind.sigs.k8s.io/), and kubectl to be installed
kind create cluster --config ./tests/integration/manifests/kind.yaml
kubectl apply -f ./tests/integration/manifests/localstack.yaml
nox -s integration_tests

# Run e2e python tests
# These tests run the test models inside a python environment with a specified mlflow version installed
nox -s "e2e_python_tests(mlflow='2.22.1')"

# Run e2e triton tests
# These tests run the test models inside a tritonserver container with a specified python version
nox -s "e2e_triton_tests(python_version='3.13')"

Contributing

We are actively growing mlflow_backend and would love your help. Please read the contributing guide for details on workflows, Conventional Commit requirements, and the expectation that every feature or bug fix ships with tests covering the relevant edge cases. Issues and pull requests that outline Triton or MLflow version constraints are especially helpful while the project is young.

Download files

Download the file for your platform.

Source Distribution

mlflow_backend-1.1.0.tar.gz (21.2 kB)

Built Distribution


mlflow_backend-1.1.0-py3-none-any.whl (26.3 kB)

File details

Details for the file mlflow_backend-1.1.0.tar.gz.

File metadata

  • Download URL: mlflow_backend-1.1.0.tar.gz
  • Size: 21.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mlflow_backend-1.1.0.tar.gz:
  SHA256: 326d6989ed652d7d4d9512de96204dae026dded5d75ad045c21d1617211abba8
  MD5: 6bde19d02b819ea1f83013f8c4f626f4
  BLAKE2b-256: f0b161b6f98ab4d3bc89596e4cce19b80dcb33ad9ecaf0d74ec2abb3ce7f4e38


Provenance

The following attestation bundles were made for mlflow_backend-1.1.0.tar.gz:

Publisher: semantic_release.yaml on wwgrainger/mlflow_backend

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mlflow_backend-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: mlflow_backend-1.1.0-py3-none-any.whl
  • Size: 26.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mlflow_backend-1.1.0-py3-none-any.whl:
  SHA256: 21e20b32c1d9cd35b202a543a53a60598e32e30a256011c065501767edfeee62
  MD5: 1daeefbcd4a05b4c526966b0a14a7798
  BLAKE2b-256: 3aea18d11bc20086b4d4bcaded2499218bb72b56a6864a38603222dd6f4c0f33


Provenance

The following attestation bundles were made for mlflow_backend-1.1.0-py3-none-any.whl:

Publisher: semantic_release.yaml on wwgrainger/mlflow_backend

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
