CLI tool and library to export ML models to production formats and containerize them with Docker

Project description

anydeploy

Export, serve, and containerize any ML model — plus auto-generate MCP servers for AI agents.


anydeploy is the last-mile deployment toolkit for ML models. It exports PyTorch or sklearn models to ONNX, TorchScript, or TFLite with smart defaults; generates a FastAPI server with health checks and OpenAPI docs; auto-creates a Model Context Protocol (MCP) server so any AI agent (Claude Desktop, Continue, Cursor) can call your model as a tool; and produces Dockerfiles + requirements files for reproducible deployment. Three deployment profiles (edge, balanced, quality) pick quantization and precision for you.

Built by Viet-Anh Nguyen at NRL.ai.

Why anydeploy?

  • One-liner API — anydeploy.export(model, "onnx") handles shape inference, opset selection, and validation
  • Plugin architecture — Register custom exporters, servers, or container targets
  • Local-first — Everything runs on your machine; no cloud account needed
  • Minimal core deps — Base install has zero heavy deps; torch/tf are optional
  • Production-ready — MCP integration, FastAPI generation, Dockerfile scaffolding

Installation

pip install anydeploy

For optional features:

pip install anydeploy[onnx]      # ONNX export + onnxruntime verification
pip install anydeploy[torch]     # TorchScript export
pip install anydeploy[tflite]    # TFLite conversion
pip install anydeploy[serve]     # FastAPI + uvicorn server
pip install anydeploy[mcp]       # Model Context Protocol server generation
pip install anydeploy[all]       # everything

Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)

Quick Start

import anydeploy
import torch

model = torch.load("resnet50.pt").eval()

# 1. Export to ONNX with smart defaults (opset, dynamic axes, validation)
anydeploy.export(
    model,
    format="onnx",
    out="resnet50.onnx",
    example_input=torch.randn(1, 3, 224, 224),
    profile="balanced",          # edge | balanced | quality
)

# 2. Generate a FastAPI server with health check + OpenAPI docs
anydeploy.serve("resnet50.onnx", host="0.0.0.0", port=8000)

# 3. Generate an MCP server so Claude Desktop / Cursor can call the model
anydeploy.mcp("resnet50.onnx", out="my_mcp_server/", name="image-classifier")

# 4. Generate a Dockerfile + requirements.txt for reproducible deployment
anydeploy.containerize("resnet50.onnx", out="docker/", base="python:3.11-slim")

Models & Methods

Export formats

| Format | How it works | Notes |
|---|---|---|
| ONNX | torch.onnx.export with auto-derived dynamic axes and opset 17 defaults | Validated via onnxruntime after export |
| TorchScript | torch.jit.trace (default) or torch.jit.script | Python-free runtime |
| TFLite | torch → onnx → tf → tflite via onnx-tf + the TensorFlow converter | Mobile / embedded |

All exports include automatic shape inference, input/output naming, and a round-trip validation step that runs a dummy input through both the original and the exported model and compares outputs.
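The round-trip check boils down to an element-wise numeric comparison of the two outputs. A minimal sketch of such a check, using only the standard library (the helper name and tolerances are illustrative, not part of the anydeploy API):

```python
import math

def outputs_match(original_out, exported_out, rel_tol=1e-3, abs_tol=1e-5):
    """Compare flattened original vs. exported model outputs element-wise."""
    if len(original_out) != len(exported_out):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(original_out, exported_out)
    )

# fp32 round-trips typically agree to well within these tolerances
print(outputs_match([0.1, 0.2, 0.3], [0.1000001, 0.2, 0.3]))  # → True
print(outputs_match([0.1, 0.2], [0.5, 0.2]))                  # → False
```

In practice the comparison runs over flattened output tensors; fp16 or int8 exports would need looser tolerances than fp32.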

Deployment profiles

| Profile | Precision | Quantization | Intended target |
|---|---|---|---|
| edge | int8 | Post-training static quantization | Raspberry Pi, phones, MCUs |
| balanced (default) | fp16 | Optional fp16 conversion | Laptop / workstation CPU |
| quality | fp32 | None | Server / GPU inference |
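To make the edge profile concrete: post-training static quantization maps each fp32 value to an 8-bit integer via an affine scale and zero point. A minimal sketch of the arithmetic (illustrative only; anydeploy's actual implementation delegates to the backend toolchains):

```python
def quantize_int8(values):
    """Affine (asymmetric) uint8 quantization of a list of fp32 values."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 or 1.0          # avoid div-by-zero for constant inputs
    zero_point = round(-lo / scale)
    q = [max(0, min(255, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map the 8-bit codes back to approximate fp32 values."""
    return [(qi - zero_point) * scale for qi in q]

q, s, zp = quantize_int8([-1.0, 0.0, 0.5, 1.0])
approx = dequantize(q, s, zp)
# per-element reconstruction error is roughly scale / 2
```

This is why edge-profile models trade a small accuracy loss (bounded by the quantization step) for roughly 4x smaller weights and faster integer inference.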

FastAPI server generation

anydeploy.serve(model_path) generates and launches a FastAPI app with:

  • POST /predict — accepts JSON or multipart image upload
  • GET /health — liveness check
  • GET /docs — interactive OpenAPI UI (Swagger)
  • Automatic request/response Pydantic schemas inferred from the model's input/output shapes
  • Optional batching, CORS, and API-key authentication
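Schema inference from tensor shapes can be pictured as turning each dimension into a nested JSON-Schema array, with dynamic dimensions (like batch size) left unconstrained. A hedged sketch of the idea — the function below is illustrative and not anydeploy's internal code:

```python
def schema_from_shape(shape, dtype="float32"):
    """Derive a JSON-Schema-style description of a tensor input.

    A dimension of None (or -1) is treated as dynamic, e.g. the batch axis.
    """
    item = {"type": "number"}
    for dim in reversed(shape):
        item = {"type": "array", "items": item}
        if dim not in (None, -1):
            item["minItems"] = item["maxItems"] = dim
    item["description"] = f"{dtype} tensor of shape {tuple(shape)}"
    return item

# e.g. a dynamic-batch image input:
schema = schema_from_shape([None, 3, 224, 224])
```

A schema like this is what lets the generated /docs page render an accurate request body example without any hand-written Pydantic models.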

MCP (Model Context Protocol) server generation

anydeploy.mcp(model_path, name=...) generates a complete MCP server implementation that exposes your model as an AI-callable tool. Any MCP-compatible client — Claude Desktop, Cursor, Continue, Zed — can then invoke your model via natural language.

The generated server:

  • Exposes a run_model tool with a JSON schema derived from model inputs
  • Handles image decoding, tensor conversion, and postprocessing
  • Ships with a claude_desktop_config.json snippet ready to copy
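The client-side wiring is a small JSON file. The snippet below builds a config in the general style of claude_desktop_config.json — the server name and launch command are hypothetical placeholders, and the file anydeploy actually generates may differ:

```python
import json

# Hypothetical MCP client config; adjust command/args to the generated server.
config = {
    "mcpServers": {
        "image-classifier": {
            "command": "python",
            "args": ["my_mcp_server/server.py"],
        }
    }
}
print(json.dumps(config, indent=2))
```

Dropping a block like this into the MCP client's config is all it takes for the model to appear as a callable tool.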

Containerization

anydeploy.containerize(model_path) generates:

  • Dockerfile — minimal base image (python-slim by default) with only the runtime dependencies your model needs
  • requirements.txt — pinned versions discovered from the export step
  • .dockerignore — sensible defaults
  • docker-compose.yml (optional) — for multi-container setups
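The shape of such a generated Dockerfile can be sketched as a simple template. This is an illustrative rendering, not anydeploy's actual output:

```python
def render_dockerfile(model_file, base="python:3.11-slim", port=8000):
    """Render a minimal serving Dockerfile for an exported model (sketch)."""
    return "\n".join([
        f"FROM {base}",
        "WORKDIR /app",
        "COPY requirements.txt .",
        "RUN pip install --no-cache-dir -r requirements.txt",
        f"COPY {model_file} .",
        f"EXPOSE {port}",
        f'CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "{port}"]',
    ])

print(render_dockerfile("model.onnx"))
```

Pinning the base image and the requirements discovered at export time is what makes the resulting build reproducible.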

API Reference

| Function | Purpose |
|---|---|
| anydeploy.export(model, format, out, **opts) | Export to ONNX/TorchScript/TFLite |
| anydeploy.serve(model_path, host, port) | Launch a FastAPI server |
| anydeploy.generate_server(model_path, out) | Generate FastAPI code to disk |
| anydeploy.mcp(model_path, out, name) | Generate an MCP tool server |
| anydeploy.containerize(model_path, out) | Generate Dockerfile + requirements |
| anydeploy.quantize(model_path, mode="int8") | Post-training quantization |
| anydeploy.benchmark(model_path) | Measure latency + throughput |

CLI Usage

# Export
anydeploy export model.pt --format onnx --out model.onnx --profile edge

# Serve
anydeploy serve model.onnx --port 8000

# Generate MCP server
anydeploy mcp model.onnx --out mcp_server/ --name my-model

# Containerize
anydeploy containerize model.onnx --out docker/

# Benchmark
anydeploy benchmark model.onnx --runs 100

Examples

Train with traincv, deploy with anydeploy

import traincv, anydeploy

# Train a YOLOv8 detector
run = traincv.train("datasets/pets/", task="detect", model="yolov8n", epochs=50)

# Export to ONNX, edge-quantized
anydeploy.export(run.weights_path, format="onnx",
                 out="pets.onnx", profile="edge")

# Expose as an MCP tool for Claude Desktop
anydeploy.mcp("pets.onnx", out="pets_mcp/", name="pet-detector")

Auto-generate a Docker image and run it

import anydeploy

anydeploy.containerize("model.onnx", out="deploy/")

# Then:
#   cd deploy && docker build -t my-model .
#   docker run -p 8000:8000 my-model

Benchmark before and after quantization

import anydeploy

print(anydeploy.benchmark("model.onnx"))              # fp32 baseline
anydeploy.quantize("model.onnx", mode="int8", out="model_int8.onnx")
print(anydeploy.benchmark("model_int8.onnx"))         # int8 quantized
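A latency benchmark like this reduces to timing repeated calls after a warmup phase. A self-contained sketch of that pattern (the helper and its return keys are illustrative, not anydeploy's exact output format):

```python
import statistics
import time

def benchmark(fn, runs=100, warmup=10):
    """Measure per-call latency of fn; returns mean/p50/p95 in milliseconds."""
    for _ in range(warmup):          # warm caches / JIT before measuring
        fn()
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e3)
    samples.sort()
    return {
        "mean_ms": statistics.fmean(samples),
        "p50_ms": samples[len(samples) // 2],
        "p95_ms": samples[int(len(samples) * 0.95)],
        "throughput_rps": 1e3 / statistics.fmean(samples),
    }

stats = benchmark(lambda: sum(range(1000)))
```

Comparing p50/p95 before and after quantization (rather than the mean alone) shows whether int8 helps tail latency as well as the average case.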

License

MIT (c) Viet-Anh Nguyen

Project details


Download files

Download the file for your platform.

Source Distribution

anydeploy-0.2.4.tar.gz (38.1 kB)

Uploaded Source

Built Distribution


anydeploy-0.2.4-py3-none-any.whl (36.8 kB)

Uploaded Python 3

File details

Details for the file anydeploy-0.2.4.tar.gz.

File metadata

  • Download URL: anydeploy-0.2.4.tar.gz
  • Upload date:
  • Size: 38.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for anydeploy-0.2.4.tar.gz
Algorithm Hash digest
SHA256 1a9ce3d367e832ea39a5889908df0d7b20856526c45034bb33e0c0cf71d6e445
MD5 ee3f34577b9df8ce91e5ea8933487489
BLAKE2b-256 a180e6ad6b1d034580c74f38c5da9aa6a0ebda8ebb1c8e9f1403cb326618ef4a


File details

Details for the file anydeploy-0.2.4-py3-none-any.whl.

File metadata

  • Download URL: anydeploy-0.2.4-py3-none-any.whl
  • Upload date:
  • Size: 36.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for anydeploy-0.2.4-py3-none-any.whl
Algorithm Hash digest
SHA256 e6a4ba9e2fd71186bec0c68bf071e72e8f1c1ef59172aba306a830d633389cfe
MD5 c18f0eb6b67697f080688f258b51e33e
BLAKE2b-256 08e5d359c1b74199037d819c032d3e8a556a2f1fb5105cfad1947943c688c15b

