
tritonserver-buildkit

A powerful Python toolkit for building, deploying, and testing NVIDIA Triton Inference Server model repositories with ease.

Overview

tsbk (Triton Server Build Kit) simplifies the process of creating and managing Triton Inference Server deployments. It provides both a declarative YAML-based configuration system and a programmatic Python SDK, making it easy to:

  • Build Triton-compatible model repositories from simple configurations
  • Deploy and run Triton servers in Docker containers
  • Test model deployments with built-in validation framework
  • Integrate with mlflow_backend to deploy models registered in MLflow

Whether you're developing ML models locally or deploying them to production, tsbk streamlines the entire workflow.

Features

  • Declarative Configuration: Define model repositories using simple YAML files
  • Programmatic SDK: Build model repositories programmatically using Python
  • Automatic Testing: Built-in test framework with input/output validation
  • MLflow Integration: Seamlessly load models from MLflow registry
  • S3 Support: Fetch model artifacts from S3-compatible storage
  • Multi-Backend Support: ONNX, TensorRT, Python, ensemble models, and more
  • Docker Integration: Automatic Triton server deployment in containers
  • Test Plan Serialization: Create reusable test plans for CI/CD pipelines
  • HTTP & gRPC Support: Test models via both protocols

Installation

Using pip

pip install tsbk

Requirements

  • Python 3.11+
  • Docker (for running Triton servers)
  • AWS credentials (for S3 model storage, optional)
  • Databricks credentials (for MLflow model access, optional)
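For the optional integrations, credentials are typically supplied through the standard environment variables of each SDK (variable values below are placeholders; which variables your setup needs is an assumption):

```
# S3 access (standard AWS SDK variables)
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_DEFAULT_REGION=us-east-1

# MLflow on Databricks (standard Databricks SDK variables)
DATABRICKS_HOST=https://<workspace>.cloud.databricks.com
DATABRICKS_TOKEN=...
```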

Quick Start

CLI: YAML Configuration

  1. Create a model configuration (model-config.yaml):
name: quickstart
models:
  your-model-name:
    backend: onnxruntime
    versions:
      - artifact_uri: s3://your-bucket/model.onnx
  2. Build and run with a single command:
tsbk run model-config.yaml ./model-repo

This will:

  • Build the Triton model repository at ./model-repo
  • Launch Triton server in a Docker container
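As a rough sketch, the built repository follows Triton's standard layout (shown here for a single-version ONNX model; the model name comes from your configuration):

```
model-repo/
└── your-model-name/
    ├── config.pbtxt        # generated Triton model configuration
    └── 1/
        └── model.onnx      # fetched from the artifact_uri
```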

SDK: Python API

import tsbk

# Define your model repository
repo = tsbk.TritonModelRepo(
    name='quickstart',
    path='./model-repo',
    models={
        'your-model-name': tsbk.TritonModel(
            backend='onnxruntime',
            versions=[
                tsbk.TritonModelVersion(
                    artifact_uri='models:/your-onnx-model/1'  # MLflow model URI or S3 URI supported
                )
            ]
        )
    }
)

# Build the repository
repo.build()

# Run Triton server
repo.run()

CLI Commands

tsbk build

Build a Triton model repository from a configuration file.

tsbk build model-config.yaml ./model-repo

tsbk run

Build and run a Triton server from a configuration file.

# Run with defaults
tsbk run model-config.yaml ./model-repo

Optionally detach so the server keeps running in the background:

# Detach mode
tsbk run model-config.yaml ./model-repo --detach

Optionally run tests after starting the server:

# Run and test
tsbk run model-config.yaml ./model-repo --test

tsbk test

Test a running Triton server against your configuration. This requires test cases for your models, specified either in YAML or programmatically via the SDK; tsbk can also derive test data from MLflow models that define example_inputs.

# Test HTTP endpoint
tsbk test model-config.yaml ./model-repo --url http://localhost:8000

# Test gRPC endpoint
tsbk test model-config.yaml ./model-repo --url localhost:8001 --grpc
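Under the hood, HTTP tests exercise Triton's standard KServe v2 inference API. For reference, a minimal hand-rolled request body built with only the standard library looks roughly like this (the tensor name, shape, and model name are placeholders, not values tsbk requires):

```python
import json

def build_infer_request(input_name, shape, datatype, data):
    """Build a KServe v2 /infer request body for Triton's HTTP endpoint."""
    return {
        "inputs": [
            {"name": input_name, "shape": shape, "datatype": datatype, "data": data}
        ]
    }

payload = build_infer_request("input__0", [1, 4], "FP32", [0.1, 0.2, 0.3, 0.4])
print(json.dumps(payload))

# To POST against a running server (placeholder model name):
#   import urllib.request
#   req = urllib.request.Request(
#       "http://localhost:8000/v2/models/your-model-name/infer",
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   with urllib.request.urlopen(req) as resp:
#       result = json.load(resp)
```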

tsbk create-test-plan

Create a serialized test plan for CI/CD pipelines.

tsbk create-test-plan model-config.yaml ./model-repo test-plan.msgpack

tsbk run-test-plan

Execute a serialized test plan against a running server.

tsbk run-test-plan test-plan.msgpack --url http://localhost:8000
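A CI pipeline can create the plan once and replay it against a deployed server in a later job. A GitHub Actions-style sketch (job names, action versions, and the server URL are illustrative, not part of tsbk):

```
jobs:
  build-plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install tsbk
      - run: tsbk create-test-plan model-config.yaml ./model-repo test-plan.msgpack
      - uses: actions/upload-artifact@v4
        with:
          name: test-plan
          path: test-plan.msgpack

  run-plan:
    needs: build-plan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: test-plan
      - run: pip install tsbk
      - run: tsbk run-test-plan test-plan.msgpack --url http://triton.internal:8000
```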

Examples

Explore the examples/ directory in the repository for complete working examples.

Environment Variables

  • TSBK_DIR: Working directory for tsbk operations (default: ./.tsbk)
  • TSBK_S3_PREFIX: S3 prefix for temporary shared model artifacts
  • TSBK_K8S_SERVICE_ACCOUNT: Kubernetes service account used when running build jobs (default: default)

Troubleshooting

Docker Issues

If Triton server fails to start:

  • Ensure Docker is running
  • Check port availability (8000, 8001, 8002)
  • Verify Docker has sufficient resources (memory, disk space)
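To check whether Triton's default ports are free before launching, a small stdlib helper can probe them (the helper is not part of tsbk; the ports are Triton's HTTP, gRPC, and metrics defaults):

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when something accepted the connection
        return sock.connect_ex((host, port)) != 0

for port in (8000, 8001, 8002):  # HTTP, gRPC, metrics
    print(port, "free" if port_is_free(port) else "in use")
```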

Model Loading Errors

If models fail to load:

  • Verify artifact URIs are correct
  • Ensure model format matches backend (e.g., .onnx for onnxruntime)
  • Review Triton server logs: docker logs <container-id>

Test Failures

If tests fail unexpectedly:

  • Check tolerance settings (rtol, atol)
  • Verify input/output tensor shapes match model expectations
  • Review expected output values
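Numeric validation frameworks typically treat outputs as matching when |actual − expected| ≤ atol + rtol·|expected|; whether tsbk uses exactly this formula is an assumption, but it is useful for reasoning about why a comparison fails:

```python
def within_tolerance(actual, expected, rtol=1e-5, atol=1e-8):
    """Numpy-style closeness check for a pair of scalars."""
    return abs(actual - expected) <= atol + rtol * abs(expected)

# A tiny relative error passes; a larger one fails:
print(within_tolerance(1.000001, 1.0))  # True
print(within_tolerance(1.001, 1.0))     # False
```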

Roadmap

Future enhancements we're considering:

  • Kubernetes deployment support
  • TensorRT model optimization utilities
  • Performance benchmarking tools
  • Direct Hugging Face model support

Suggestions and contributions are welcome!
