
anydeploy

Export, serve, and containerize any ML model — plus auto-generate MCP servers for AI agents.


anydeploy is the last-mile deployment toolkit for ML models. It exports PyTorch or sklearn models to ONNX, TorchScript, or TFLite with smart defaults; generates a FastAPI server with health checks and OpenAPI docs; auto-creates a Model Context Protocol (MCP) server so any AI agent (Claude Desktop, Continue, Cursor) can call your model as a tool; and produces Dockerfiles + requirements files for reproducible deployment. Three deployment profiles (edge, balanced, quality) pick quantization and precision for you.

Built by Viet-Anh Nguyen at NRL.ai.

Why anydeploy?

  • One-liner API — anydeploy.export(model, "onnx") handles shape inference, opset selection, and validation
  • Plugin architecture — Register custom exporters, servers, or container targets
  • Local-first — Everything runs on your machine; no cloud account needed
  • Minimal core deps — Base install has zero heavy deps; torch/tf are optional
  • Production-ready — MCP integration, FastAPI generation, Dockerfile scaffolding

Installation

pip install anydeploy

For optional features:

pip install anydeploy[onnx]      # ONNX export + onnxruntime verification
pip install anydeploy[torch]     # TorchScript export
pip install anydeploy[tflite]    # TFLite conversion
pip install anydeploy[serve]     # FastAPI + uvicorn server
pip install anydeploy[mcp]       # Model Context Protocol server generation
pip install anydeploy[all]       # everything

Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)

Quick Start

import anydeploy
import torch

model = torch.load("resnet50.pt").eval()

# 1. Export to ONNX with smart defaults (opset, dynamic axes, validation)
anydeploy.export(
    model,
    format="onnx",
    out="resnet50.onnx",
    example_input=torch.randn(1, 3, 224, 224),
    profile="balanced",          # edge | balanced | quality
)

# 2. Generate a FastAPI server with health check + OpenAPI docs
anydeploy.serve("resnet50.onnx", host="0.0.0.0", port=8000)

# 3. Generate an MCP server so Claude Desktop / Cursor can call the model
anydeploy.mcp("resnet50.onnx", out="my_mcp_server/", name="image-classifier")

# 4. Generate a Dockerfile + requirements.txt for reproducible deployment
anydeploy.containerize("resnet50.onnx", out="docker/", base="python:3.11-slim")

Models & Methods

Export formats

Format       How it works                                                           Notes
ONNX         torch.onnx.export with auto-derived dynamic axes + opset 17 defaults   Validated via onnxruntime after export
TorchScript  torch.jit.trace (default) or torch.jit.script                          Python-free runtime
TFLite       torch -> onnx -> tf -> tflite via onnx-tf + the TensorFlow converter   Mobile / embedded

All exports include automatic shape inference, input/output naming, and a round-trip validation step that runs a dummy input through both the original and the exported model and compares outputs.
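
This check can be reproduced by hand. Here is a minimal sketch of such a round-trip validation, assuming a CPU PyTorch model and the onnx extra (the helper name validate_export is illustrative, not part of anydeploy's API):

import numpy as np
import onnxruntime as ort
import torch

def validate_export(model, onnx_path, example_input, atol=1e-4):
    # Run the dummy input through the original model...
    model.eval()
    with torch.no_grad():
        expected = model(example_input).numpy()

    # ...and through the exported ONNX graph via onnxruntime.
    session = ort.InferenceSession(onnx_path)
    input_name = session.get_inputs()[0].name
    actual = session.run(None, {input_name: example_input.numpy()})[0]

    # Fail loudly if the two outputs diverge beyond tolerance.
    np.testing.assert_allclose(expected, actual, atol=atol)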

Deployment profiles

Profile             Precision  Quantization                       Intended target
edge                int8       Post-training static quantization  Raspberry Pi, phones, MCUs
balanced (default)  fp16       Optional fp16 conversion           Laptop / workstation CPU
quality             fp32       None                               Server / GPU inference
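
For a sense of what the edge profile's int8 step involves, this is roughly what standard post-training static quantization looks like with onnxruntime's quantization tools. The calibration reader below feeds random data purely for illustration; real calibration should use representative inputs, and anydeploy's internals may differ:

import numpy as np
from onnxruntime.quantization import CalibrationDataReader, QuantType, quantize_static

class RandomCalibrationReader(CalibrationDataReader):
    # Static quantization needs sample inputs to calibrate activation ranges.
    def __init__(self, input_name="input", n_batches=8):
        self._batches = iter(
            {input_name: np.random.rand(1, 3, 224, 224).astype(np.float32)}
            for _ in range(n_batches)
        )

    def get_next(self):
        return next(self._batches, None)

quantize_static(
    "model.onnx",
    "model_int8.onnx",
    calibration_data_reader=RandomCalibrationReader(),
    weight_type=QuantType.QInt8,
)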

FastAPI server generation

anydeploy.serve(model_path) generates and launches a FastAPI app with:

  • POST /predict — accepts JSON or multipart image upload
  • GET /health — liveness check
  • GET /docs — interactive OpenAPI UI (Swagger)
  • Automatic request/response Pydantic schemas inferred from the model's input/output shapes
  • Optional batching, CORS, and API-key authentication
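
Once the server is up, the endpoints can be exercised with any HTTP client. A quick sketch with requests (the JSON field names below are illustrative; the actual schema is inferred from your model and shown at /docs):

import requests

# Liveness check
print(requests.get("http://localhost:8000/health").json())

# JSON prediction; payload shape depends on the model's input schema
resp = requests.post("http://localhost:8000/predict",
                     json={"inputs": [[0.1, 0.2, 0.3]]})
print(resp.json())

# Multipart image upload
with open("cat.jpg", "rb") as f:
    print(requests.post("http://localhost:8000/predict",
                        files={"file": f}).json())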

MCP (Model Context Protocol) server generation

anydeploy.mcp(model_path, name=...) generates a complete MCP server implementation that exposes your model as an AI-callable tool. Any MCP-compatible client — Claude Desktop, Cursor, Continue, Zed — can then invoke your model via natural language.

The generated server:

  • Exposes a run_model tool with a JSON schema derived from model inputs
  • Handles image decoding, tensor conversion, and postprocessing
  • Ships with a claude_desktop_config.json snippet ready to copy
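
The generated code is plain Python, so it can be inspected and extended before shipping. As a rough sketch of the shape such a server takes, using the official mcp SDK's FastMCP helper (the tool body here is illustrative, not anydeploy's exact output):

import numpy as np
import onnxruntime as ort
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("image-classifier")
session = ort.InferenceSession("resnet50.onnx")

@mcp.tool()
def run_model(pixels: list[list[list[float]]]) -> list[float]:
    """Run the model on one CHW float image and return raw scores."""
    x = np.asarray(pixels, dtype=np.float32)[np.newaxis, ...]
    input_name = session.get_inputs()[0].name
    return session.run(None, {input_name: x})[0].ravel().tolist()

if __name__ == "__main__":
    mcp.run()  # stdio transport by default, which MCP clients expect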

Containerization

anydeploy.containerize(model_path) generates:

  • Dockerfile — minimal base image (python-slim by default) with only the runtime dependencies your model needs
  • requirements.txt — pinned versions discovered from the export step
  • .dockerignore — sensible defaults
  • docker-compose.yml (optional) — for multi-container setups

API Reference

Function                                      Purpose
anydeploy.export(model, format, out, **opts)  Export to ONNX/TorchScript/TFLite
anydeploy.serve(model_path, host, port)       Launch a FastAPI server
anydeploy.generate_server(model_path, out)    Generate FastAPI code to disk
anydeploy.mcp(model_path, out, name)          Generate an MCP tool server
anydeploy.containerize(model_path, out)       Generate Dockerfile + requirements
anydeploy.quantize(model_path, mode="int8")   Post-training quantization
anydeploy.benchmark(model_path)               Measure latency + throughput

CLI Usage

# Export
anydeploy export model.pt --format onnx --out model.onnx --profile edge

# Serve
anydeploy serve model.onnx --port 8000

# Generate MCP server
anydeploy mcp model.onnx --out mcp_server/ --name my-model

# Containerize
anydeploy containerize model.onnx --out docker/

# Benchmark
anydeploy benchmark model.onnx --runs 100

Examples

Train with traincv, deploy with anydeploy

import traincv, anydeploy

# Train a YOLOv8 detector
run = traincv.train("datasets/pets/", task="detect", model="yolov8n", epochs=50)

# Export to ONNX, edge-quantized
anydeploy.export(run.weights_path, format="onnx",
                 out="pets.onnx", profile="edge")

# Expose as an MCP tool for Claude Desktop
anydeploy.mcp("pets.onnx", out="pets_mcp/", name="pet-detector")

Auto-generate a Docker image and run it

import anydeploy

anydeploy.containerize("model.onnx", out="deploy/")

# Then:
#   cd deploy && docker build -t my-model .
#   docker run -p 8000:8000 my-model

Benchmark before and after quantization

import anydeploy

print(anydeploy.benchmark("model.onnx"))              # fp32 baseline
anydeploy.quantize("model.onnx", mode="int8", out="model_int8.onnx")
print(anydeploy.benchmark("model_int8.onnx"))         # int8 quantized

License

MIT (c) Viet-Anh Nguyen
