
tritonserver-buildkit

A Python toolkit for building, deploying, and testing NVIDIA Triton Inference Server model repositories.

Overview

tsbk (Triton Server Build Kit) simplifies the process of creating and managing Triton Inference Server deployments. It provides both a declarative YAML-based configuration system and a programmatic Python SDK, making it easy to:

  • Build Triton-compatible model repositories from simple configurations
  • Deploy and run Triton servers in Docker containers
  • Test model deployments with built-in validation framework
  • Integrate with mlflow_backend to deploy models registered in MLflow

Whether you're developing ML models locally or deploying them to production, tsbk streamlines the entire workflow.

Features

  • Declarative Configuration: Define model repositories using simple YAML files
  • Programmatic SDK: Build model repositories programmatically using Python
  • Automatic Testing: Built-in test framework with input/output validation
  • MLflow Integration: Seamlessly load models from MLflow registry
  • S3 Support: Fetch model artifacts from S3-compatible storage
  • Multi-Backend Support: ONNX, TensorRT, Python, ensemble models, and more
  • Docker Integration: Automatic Triton server deployment in containers
  • Test Plan Serialization: Create reusable test plans for CI/CD pipelines
  • HTTP & gRPC Support: Test models via both protocols

Installation

Using pip

pip install tsbk

Requirements

  • Python 3.11+
  • Docker (for running Triton servers)
  • AWS credentials (for S3 model storage, optional)
  • Databricks credentials (for MLflow model access, optional)

Quick Start

CLI: YAML Configuration

  1. Create a model configuration (model-config.yaml):

name: quickstart
models:
  your-model-name:
    backend: onnxruntime
    versions:
      - artifact_uri: s3://your-bucket/model.onnx

  2. Build and run with a single command:

tsbk run model-config.yaml ./model-repo

This will:

  • Build the Triton model repository at ./model-repo
  • Launch Triton server in a Docker container
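The generated repository follows Triton's standard layout: one directory per model, numeric version subdirectories, and a config.pbtxt per model. A minimal sketch that validates this layout for a built repository (the helper name and the fake repo it builds are illustrative, not part of tsbk):

```python
from pathlib import Path
import tempfile

def validate_triton_repo(repo: Path) -> list[str]:
    """Check a model repository against Triton's expected layout."""
    problems = []
    for model_dir in repo.iterdir():
        if not model_dir.is_dir():
            continue
        if not (model_dir / "config.pbtxt").is_file():
            problems.append(f"{model_dir.name}: missing config.pbtxt")
        versions = [d for d in model_dir.iterdir() if d.is_dir() and d.name.isdigit()]
        if not versions:
            problems.append(f"{model_dir.name}: no numeric version directory")
    return problems

# Build a tiny stand-in repository to demonstrate the check.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    version_dir = repo / "your-model-name" / "1"
    version_dir.mkdir(parents=True)
    (version_dir / "model.onnx").touch()
    (repo / "your-model-name" / "config.pbtxt").write_text('backend: "onnxruntime"\n')
    print(validate_triton_repo(repo))  # → []
```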

SDK: Python API

import tsbk

# Define your model repository
repo = tsbk.TritonModelRepo(
    name='quickstart',
    path='./model-repo',
    models={
        'your-model-name': tsbk.TritonModel(
            backend='onnxruntime',
            versions=[
                tsbk.TritonModelVersion(
                    artifact_uri='models:/your-onnx-model/1'  # MLflow model URI or S3 URI supported
                )
            ]
        )
    }
)

# Build the repository
repo.build()

# Run Triton server
repo.run()
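After `repo.run()`, the server can be polled for readiness over Triton's standard HTTP health endpoint (`GET /v2/health/ready`, which returns 200 once models are loaded). A standard-library sketch; the base URL is Triton's default HTTP port, not something tsbk returns:

```python
import urllib.error
import urllib.request

def triton_is_ready(base_url: str = "http://localhost:8000", timeout: float = 1.0) -> bool:
    """Return True if Triton reports ready on /v2/health/ready."""
    try:
        with urllib.request.urlopen(f"{base_url}/v2/health/ready", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

Polling this in a loop with a short sleep is a simple way to block until the container is serving before sending requests.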

CLI Commands

tsbk build

Build a Triton model repository from a configuration file.

tsbk build model-config.yaml ./model-repo

tsbk run

Build and run a Triton server from a configuration file.

# Run with defaults
tsbk run model-config.yaml ./model-repo

Optionally, detach from the server so it runs in the background:

# Detach mode
tsbk run model-config.yaml ./model-repo --detach

Optionally run tests after starting the server:

# Run and test
tsbk run model-config.yaml ./model-repo --test

tsbk test

Test a running Triton server against your configuration. This requires test cases for your models, which can be specified in YAML or programmatically via the SDK. tsbk can also derive test data from MLflow models that define example_inputs.

# Test HTTP endpoint
tsbk test model-config.yaml ./model-repo --url http://localhost:8000

# Test gRPC endpoint
tsbk test model-config.yaml ./model-repo --url localhost:8001 --grpc
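HTTP tests go against Triton's KServe-v2 inference endpoint (`POST /v2/models/<name>/infer`). A sketch of the JSON request body that protocol expects; the tensor name, shape, and datatype here are placeholders, and the helper is illustrative rather than part of tsbk:

```python
import json

def build_infer_request(name: str, shape, datatype: str, data: list) -> dict:
    """Build a KServe v2 inference request body for a single input tensor."""
    return {
        "inputs": [
            {"name": name, "shape": list(shape), "datatype": datatype, "data": data}
        ]
    }

payload = build_infer_request("input__0", (1, 4), "FP32", [0.1, 0.2, 0.3, 0.4])
print(json.dumps(payload))
```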

tsbk create-test-plan

Create a serialized test plan for CI/CD pipelines.

tsbk create-test-plan model-config.yaml ./model-repo test-plan.msgpack

tsbk run-test-plan

Execute a serialized test plan against a running server.

tsbk run-test-plan test-plan.msgpack --url http://localhost:8000

Examples

Explore the examples/ directory for complete working examples.

Environment Variables

  • TSBK_DIR: Working directory for tsbk operations (default: ./.tsbk)
  • TSBK_S3_PREFIX: S3 prefix for temporary shared model artifacts
  • TSBK_K8S_SERVICE_ACCOUNT: Kubernetes service account used when running build jobs (default: default)
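Scripts that wrap tsbk can mirror these defaults when reading the environment; a trivial sketch using the defaults listed above:

```python
import os

# Fall back to tsbk's documented defaults when the variables are unset.
tsbk_dir = os.environ.get("TSBK_DIR", "./.tsbk")
service_account = os.environ.get("TSBK_K8S_SERVICE_ACCOUNT", "default")
print(tsbk_dir, service_account)
```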

Troubleshooting

Docker Issues

If Triton server fails to start:

  • Ensure Docker is running
  • Check port availability (8000, 8001, 8002)
  • Verify Docker has sufficient resources (memory, disk space)
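A quick way to confirm Triton's default ports (8000 HTTP, 8001 gRPC, 8002 metrics) are free before starting the container, sketched with the standard library:

```python
import socket

def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex((host, port)) != 0

for port in (8000, 8001, 8002):
    print(port, "free" if port_is_free(port) else "in use")
```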

Model Loading Errors

If models fail to load:

  • Verify artifact URIs are correct
  • Ensure model format matches backend (e.g., .onnx for onnxruntime)
  • Review Triton server logs: docker logs <container-id>

Test Failures

If tests fail unexpectedly:

  • Check tolerance settings (rtol, atol)
  • Verify input/output tensor shapes match model expectations
  • Review expected output values


Roadmap

Future enhancements we're considering:

  • Kubernetes deployment support
  • TensorRT model optimization utilities
  • Performance benchmarking tools
  • Direct Hugging Face model support

Suggestions and contributions are welcome!
