anydeploy

CLI tool and library to export ML models to production formats and containerize them with Docker.
Export, serve, and containerize any ML model — plus auto-generate MCP servers for AI agents.
anydeploy is the last-mile deployment toolkit for ML models. It exports PyTorch or sklearn models to ONNX, TorchScript, or TFLite with smart defaults; generates a FastAPI server with health checks and OpenAPI docs; auto-creates a Model Context Protocol (MCP) server so any AI agent (Claude Desktop, Continue, Cursor) can call your model as a tool; and produces Dockerfiles + requirements files for reproducible deployment. Three deployment profiles (edge, balanced, quality) pick quantization and precision for you.
Built by Viet-Anh Nguyen at NRL.ai.
Why anydeploy?
- One-liner API — anydeploy.export(model, "onnx") handles shape inference, opset, and validation
- Plugin architecture — Register custom exporters, servers, or container targets (sketched below)
- Local-first — Everything runs on your machine; no cloud account needed
- Minimal core deps — Base install has zero heavy deps; torch/tf are optional
- Production-ready — MCP integration, FastAPI generation, Dockerfile scaffolding
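The plugin API itself isn't documented in this README, so the following registration sketch is purely hypothetical: anydeploy.Exporter and anydeploy.register_exporter are assumed names used for illustration only.

import anydeploy

# HYPOTHETICAL API: Exporter / register_exporter are assumed names,
# not confirmed by the documentation above.
class CoreMLExporter(anydeploy.Exporter):
    format = "coreml"

    def export(self, model, out, example_input=None, **opts):
        # convert `model` and write the artifact to `out`
        raise NotImplementedError

anydeploy.register_exporter(CoreMLExporter)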
Installation
pip install anydeploy
For optional features:
pip install anydeploy[onnx] # ONNX export + onnxruntime verification
pip install anydeploy[torch] # TorchScript export
pip install anydeploy[tflite] # TFLite conversion
pip install anydeploy[serve] # FastAPI + uvicorn server
pip install anydeploy[mcp] # Model Context Protocol server generation
pip install anydeploy[all] # everything
Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)
Quick Start
import anydeploy
import torch
model = torch.load("resnet50.pt").eval()
# 1. Export to ONNX with smart defaults (opset, dynamic axes, validation)
anydeploy.export(
    model,
    format="onnx",
    out="resnet50.onnx",
    example_input=torch.randn(1, 3, 224, 224),
    profile="balanced",  # edge | balanced | quality
)
# 2. Generate a FastAPI server with health check + OpenAPI docs
anydeploy.serve("resnet50.onnx", host="0.0.0.0", port=8000)
# 3. Generate an MCP server so Claude Desktop / Cursor can call the model
anydeploy.mcp("resnet50.onnx", out="my_mcp_server/", name="image-classifier")
# 4. Generate a Dockerfile + requirements.txt for reproducible deployment
anydeploy.containerize("resnet50.onnx", out="docker/", base="python:3.11-slim")
Models & Methods
Export formats
| Format | How it works | Notes |
|---|---|---|
| ONNX | torch.onnx.export with auto-derived dynamic axes + opset 17 defaults | Validates via onnxruntime after export |
| TorchScript | torch.jit.trace (default) or torch.jit.script | Python-free runtime |
| TFLite | torch -> onnx -> tf -> tflite via onnx-tf + TensorFlow converter | Mobile / embedded |
All exports include automatic shape inference, input/output naming, and a round-trip validation step that runs a dummy input through both the original and the exported model and compares outputs.
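That validation step is equivalent to running the check yourself. A minimal sketch with onnxruntime, assuming the single-input ResNet export from the Quick Start:

import numpy as np
import onnxruntime as ort
import torch

model = torch.load("resnet50.pt").eval()
dummy = torch.randn(1, 3, 224, 224)

# reference output from the original PyTorch model
with torch.no_grad():
    ref = model(dummy).numpy()

# output from the exported ONNX graph
sess = ort.InferenceSession("resnet50.onnx", providers=["CPUExecutionProvider"])
out = sess.run(None, {sess.get_inputs()[0].name: dummy.numpy()})[0]

# both should agree to within floating-point tolerance
assert np.allclose(ref, out, atol=1e-4)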
Deployment profiles
| Profile | Precision | Quantization | Intended target |
|---|---|---|---|
| edge | int8 | Post-training static quantization | Raspberry Pi, phones, MCUs |
| balanced (default) | fp16 | Optional fp16 conversion | Laptop / workstation CPU |
| quality | fp32 | None | Server / GPU inference |
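For the edge profile, post-training static quantization needs calibration data; anydeploy handles this internally. For reference, here is a minimal sketch of the same step done directly with onnxruntime, assuming the model's input tensor is named input and using random batches where real calibration would use representative data:

import numpy as np
from onnxruntime.quantization import CalibrationDataReader, QuantType, quantize_static

class RandomCalibration(CalibrationDataReader):
    # feeds a few random batches; use representative inputs in practice
    def __init__(self, input_name, n_batches=8):
        self._batches = iter(
            [{input_name: np.random.randn(1, 3, 224, 224).astype(np.float32)}
             for _ in range(n_batches)]
        )

    def get_next(self):
        return next(self._batches, None)

quantize_static(
    "resnet50.onnx",
    "resnet50_int8.onnx",
    RandomCalibration("input"),  # "input" is an assumed tensor name
    weight_type=QuantType.QInt8,
)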
FastAPI server generation
anydeploy.serve(model_path) generates and launches a FastAPI app with:
- POST /predict — accepts JSON or multipart image upload
- GET /health — liveness check
- GET /docs — interactive OpenAPI UI (Swagger)
- Automatic request/response Pydantic schemas inferred from the model's input/output shapes
- Optional batching, CORS, and API-key authentication
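The generated code isn't reproduced in this README; a minimal sketch of the two core endpoints (omitting image upload, batching, CORS, and auth), assuming an ONNX model served through onnxruntime:

import numpy as np
import onnxruntime as ort
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name

class PredictRequest(BaseModel):
    inputs: list  # nested list matching the model's input shape

@app.get("/health")
def health():
    return {"status": "ok"}

@app.post("/predict")
def predict(req: PredictRequest):
    x = np.asarray(req.inputs, dtype=np.float32)
    y = sess.run(None, {input_name: x})[0]
    return {"outputs": y.tolist()}

Run it with uvicorn server:app --host 0.0.0.0 --port 8000, which is roughly what anydeploy.serve automates.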
MCP (Model Context Protocol) server generation
anydeploy.mcp(model_path, name=...) generates a complete MCP server implementation that exposes your model as an AI-callable tool. Any MCP-compatible client — Claude Desktop, Cursor, Continue, Zed — can then invoke your model via natural language.
The generated server:
- Exposes a run_model tool with a JSON schema derived from model inputs
- Handles image decoding, tensor conversion, and postprocessing
- Ships with a claude_desktop_config.json snippet ready to copy
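The generated code isn't shown here either, but an equivalent server built on the official mcp Python SDK looks roughly like the sketch below (a single run_model tool over an ONNX session; the image-decoding path is omitted):

import numpy as np
import onnxruntime as ort
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("image-classifier")
sess = ort.InferenceSession("resnet50.onnx", providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name

@mcp.tool()
def run_model(inputs: list) -> list:
    """Run the exported model on a nested list shaped like its input tensor."""
    x = np.asarray(inputs, dtype=np.float32)
    return sess.run(None, {input_name: x})[0].tolist()

if __name__ == "__main__":
    mcp.run()  # stdio transport by default, which Claude Desktop expects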
Containerization
anydeploy.containerize(model_path) generates:
- Dockerfile — minimal base image (python-slim by default) with only the runtime dependencies your model needs
- requirements.txt — pinned versions discovered from the export step
- .dockerignore — sensible defaults
- docker-compose.yml (optional) — for multi-container setups
API Reference
| Function | Purpose |
|---|---|
| anydeploy.export(model, format, out, **opts) | Export to ONNX/TorchScript/TFLite |
| anydeploy.serve(model_path, host, port) | Launch a FastAPI server |
| anydeploy.generate_server(model_path, out) | Generate FastAPI code to disk |
| anydeploy.mcp(model_path, out, name) | Generate an MCP tool server |
| anydeploy.containerize(model_path, out) | Generate Dockerfile + requirements |
| anydeploy.quantize(model_path, mode="int8") | Post-training quantization |
| anydeploy.benchmark(model_path) | Measure latency + throughput |
CLI Usage
# Export
anydeploy export model.pt --format onnx --out model.onnx --profile edge
# Serve
anydeploy serve model.onnx --port 8000
# Generate MCP server
anydeploy mcp model.onnx --out mcp_server/ --name my-model
# Containerize
anydeploy containerize model.onnx --out docker/
# Benchmark
anydeploy benchmark model.onnx --runs 100
Examples
Train with traincv, deploy with anydeploy
import traincv, anydeploy
# Train a YOLOv8 detector
run = traincv.train("datasets/pets/", task="detect", model="yolov8n", epochs=50)
# Export to ONNX, edge-quantized
anydeploy.export(run.weights_path, format="onnx",
                 out="pets.onnx", profile="edge")
# Expose as an MCP tool for Claude Desktop
anydeploy.mcp("pets.onnx", out="pets_mcp/", name="pet-detector")
Auto-generate a Docker image and run it
import anydeploy
anydeploy.containerize("model.onnx", out="deploy/")
# Then:
# cd deploy && docker build -t my-model .
# docker run -p 8000:8000 my-model
Benchmark before and after quantization
import anydeploy
print(anydeploy.benchmark("model.onnx")) # fp32 baseline
anydeploy.quantize("model.onnx", mode="int8", out="model_int8.onnx")
print(anydeploy.benchmark("model_int8.onnx")) # int8 quantized
License
MIT (c) Viet-Anh Nguyen