tritonserver-buildkit
A powerful Python toolkit for building, deploying, and testing NVIDIA Triton Inference Server model repositories with ease.
Overview
tsbk (Triton Server Build Kit) simplifies the process of creating and managing Triton Inference Server deployments. It provides both a declarative YAML-based configuration system and a programmatic Python SDK, making it easy to:
- Build Triton-compatible model repositories from simple configurations
- Deploy and run Triton servers in Docker containers
- Test model deployments with built-in validation framework
- Integrate with mlflow_backend for deploying models registered with MLflow
Whether you're developing ML models locally or deploying them to production, tsbk streamlines the entire workflow.
Features
- Declarative Configuration: Define model repositories using simple YAML files
- Programmatic SDK: Build model repositories programmatically using Python
- Automatic Testing: Built-in test framework with input/output validation
- MLflow Integration: Seamlessly load models from MLflow registry
- S3 Support: Fetch model artifacts from S3-compatible storage
- Multi-Backend Support: ONNX, TensorRT, Python, ensemble models, and more
- Docker Integration: Automatic Triton server deployment in containers
- Test Plan Serialization: Create reusable test plans for CI/CD pipelines
- HTTP & gRPC Support: Test models via both protocols
Installation
Using pip
pip install tsbk
Requirements
- Python 3.11+
- Docker (for running Triton servers)
- AWS credentials (for S3 model storage, optional)
- Databricks credentials (for MLflow model access, optional)
Quick Start
CLI: YAML Configuration
- Create a model configuration (model-config.yaml):

name: quickstart
models:
  your-model-name:
    backend: onnxruntime
    versions:
      - artifact_uri: s3://your-bucket/model.onnx
- Build and run with a single command:
tsbk run model-config.yaml ./model-repo
This will:
- Build the Triton model repository at ./model-repo
- Launch Triton server in a Docker container
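For reference, the repository that gets built follows Triton's standard layout: one directory per model, numeric version subdirectories holding the model file, and a config.pbtxt alongside them. A minimal sketch of that structure in plain Python (scaffold_repo is a hypothetical helper for illustration, not part of tsbk):

```python
from pathlib import Path

def scaffold_repo(root: str, model_name: str, version: int = 1) -> Path:
    """Create the directory skeleton Triton expects:

    <root>/<model_name>/config.pbtxt
    <root>/<model_name>/<version>/   (the model file goes in here)
    """
    model_dir = Path(root) / model_name
    version_dir = model_dir / str(version)
    version_dir.mkdir(parents=True, exist_ok=True)
    (model_dir / "config.pbtxt").touch()  # backend and I/O config lives here
    return version_dir

version_dir = scaffold_repo("./model-repo", "your-model-name")
```

tsbk generates this layout (and the config.pbtxt contents) for you from the YAML above.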
SDK: Python API
import tsbk

# Define your model repository
repo = tsbk.TritonModelRepo(
    name='quickstart',
    path='./model-repo',
    models={
        'your-model-name': tsbk.TritonModel(
            backend='onnxruntime',
            versions=[
                tsbk.TritonModelVersion(
                    artifact_uri='models:/your-onnx-model/1'  # MLflow model URI or S3 URI supported
                )
            ]
        )
    }
)

# Build the repository
repo.build()

# Run Triton server
repo.run()
CLI Commands
tsbk build
Build a Triton model repository from a configuration file.
tsbk build model-config.yaml ./model-repo
tsbk run
Build and run a Triton server from a configuration file.
# Run with defaults
tsbk run model-config.yaml ./model-repo
Optionally detach from the server to run in the background:
# Detach mode
tsbk run model-config.yaml ./model-repo --detach
Optionally run tests after starting the server:
# Run and test
tsbk run model-config.yaml ./model-repo --test
tsbk test
Test a running Triton server against your configuration. This requires providing test cases for your models.
They can be specified in YAML or programmatically via the SDK. tsbk can also take test data from MLflow models that define example_inputs.
# Test HTTP endpoint
tsbk test model-config.yaml ./model-repo --url http://localhost:8000
# Test gRPC endpoint
tsbk test model-config.yaml ./model-repo --url localhost:8001 --grpc
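Before pointing tests at a server, it can help to confirm the server is actually up. Triton exposes a standard HTTP readiness endpoint at /v2/health/ready; a small hand-rolled check using only the standard library (this helper is a sketch, not part of tsbk):

```python
from urllib.request import urlopen

def triton_ready(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if Triton's HTTP readiness endpoint answers 200."""
    try:
        with urlopen(f"{base_url}/v2/health/ready", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, DNS failure, ...
        return False

if triton_ready("http://localhost:8000"):
    print("server is ready")
```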
tsbk create-test-plan
Create a serialized test plan for CI/CD pipelines.
tsbk create-test-plan model-config.yaml ./model-repo test-plan.msgpack
tsbk run-test-plan
Execute a serialized test plan against a running server.
tsbk run-test-plan test-plan.msgpack --url http://localhost:8000
Examples
Explore the examples/ directory for complete working examples:
- Config-based deployment: Using YAML configuration files
- SDK-based deployment: Using the Python SDK programmatically
Environment Variables
- TSBK_DIR: Working directory for tsbk operations (default: ./.tsbk)
- TSBK_S3_PREFIX: S3 prefix for temporary shared model artifacts
- TSBK_K8S_SERVICE_ACCOUNT: Kubernetes service account used when running build jobs (default: default)
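These behave like ordinary environment-variable overrides with fallbacks. The lookup pattern, sketched in plain Python (the variable names and defaults are from the list above; the helper itself is illustrative, not tsbk internals):

```python
import os

def tsbk_setting(name: str, default: str) -> str:
    """Read a TSBK_* setting from the environment, falling back to a default."""
    return os.environ.get(name, default)

work_dir = tsbk_setting("TSBK_DIR", "./.tsbk")
service_account = tsbk_setting("TSBK_K8S_SERVICE_ACCOUNT", "default")
```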
Troubleshooting
Docker Issues
If Triton server fails to start:
- Ensure Docker is running
- Check port availability (8000, 8001, 8002)
- Verify Docker has sufficient resources (memory, disk space)
Model Loading Errors
If models fail to load:
- Verify artifact URIs are correct
- Ensure the model format matches the backend (e.g., .onnx for onnxruntime)
- Review Triton server logs: docker logs <container-id>
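A quick way to catch format mismatches before Triton even starts is to compare the artifact's file extension against the configured backend. A rough sketch (the backend-to-extension table covers only a few common backends and is an assumption for illustration, not a tsbk feature):

```python
from pathlib import Path

# Common Triton backends and the model file extension each typically expects.
# Illustrative only, not exhaustive.
EXPECTED_EXT = {
    "onnxruntime": ".onnx",
    "tensorrt": ".plan",
    "pytorch": ".pt",
}

def extension_matches(backend: str, artifact_path: str) -> bool:
    expected = EXPECTED_EXT.get(backend)
    if expected is None:
        return True  # unknown backend: nothing to check
    return Path(artifact_path).suffix == expected

print(extension_matches("onnxruntime", "model.onnx"))  # True
print(extension_matches("onnxruntime", "model.plan"))  # False
```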
Test Failures
If tests fail unexpectedly:
- Check tolerance settings (rtol, atol)
- Verify input/output tensor shapes match model expectations
- Review expected output values
Acknowledgments
- Built on NVIDIA Triton Inference Server
- Integrates with MLflow for model management
Roadmap
Future enhancements we're considering:
- Kubernetes deployment support
- TensorRT model optimization utilities
- Performance benchmarking tools
- Direct Hugging Face model support
Suggestions and contributions are welcome!