Modalkit
A powerful Python framework for deploying ML models on Modal with production-ready features
🎯 What Modalkit Offers Over Raw Modal
While Modal provides excellent serverless infrastructure, Modalkit adds a complete ML deployment framework:
🏗️ Standardized ML Architecture
- Structured Inference Pipeline: Enforced `preprocess()` → `predict()` → `postprocess()` pattern
- Consistent API Endpoints: `/predict_sync`, `/predict_batch`, `/predict_async` across all deployments
- Type-Safe Interfaces: Pydantic models ensure data validation at API boundaries
⚙️ Configuration-Driven Deployments
- YAML Configuration: Version-controlled deployment settings instead of scattered code
- Environment Management: Easy dev/staging/prod configs with override capabilities
- Reproducible Builds: Declarative infrastructure removes deployment inconsistencies
👥 Team-Friendly Workflows
- Shared Standards: All team members deploy models the same way
- Code Separation: Model logic decoupled from Modal deployment boilerplate
- Collaboration: Config files in git enable infrastructure review and collaboration
🚀 Production Features Out-of-the-Box
- Authentication Middleware: Built-in API key or Modal proxy auth
- Queue Integration: Async processing with multiple backend support
- Cloud Storage: Direct S3/GCS/R2 mounting without manual setup
- Batch Processing: Intelligent request batching for GPU efficiency
- Error Handling: Comprehensive error responses and logging
💡 Developer Experience
- Less Boilerplate: Focus on model code, not FastAPI/Modal setup
- Modern Tooling: Pre-configured with ruff, mypy, pre-commit hooks
- Testing Framework: Built-in patterns for testing ML deployments
In short: Modalkit transforms Modal from infrastructure primitives into a complete ML platform, letting teams deploy models consistently while maintaining Modal's performance and scalability.
✨ Key Features
- 🚀 Native Modal Integration: Seamless deployment on Modal's serverless infrastructure
- 🔐 Flexible Authentication: Modal proxy auth or custom API keys with AWS SSM support
- ☁️ Cloud Storage Support: Direct mounting of S3, GCS, and R2 buckets
- 🔄 Queue Integration: Built-in support for SQS and Taskiq for async workflows
- 📦 Batch Inference: Efficient batch processing with configurable batch sizes
- 🎯 Type Safety: Full Pydantic integration for request/response validation
- 🛠️ Developer Friendly: Pre-configured with modern Python tooling (ruff, pre-commit)
- 📊 Production Ready: Comprehensive error handling and logging
🚀 Quick Start
Installation
```bash
# Using pip
pip install git+https://github.com/prassanna-ravishankar/modalkit.git

# Using uv (recommended)
uv pip install git+https://github.com/prassanna-ravishankar/modalkit.git
```
1. Define Your Model
Create an inference class that inherits from InferencePipeline:
```python
from modalkit.inference import InferencePipeline
from pydantic import BaseModel
from typing import List


# Define input/output schemas with Pydantic
class TextInput(BaseModel):
    text: str
    language: str = "en"


class TextOutput(BaseModel):
    translated_text: str
    confidence: float


# Implement your model logic
class TranslationModel(InferencePipeline):
    def __init__(self, model_name: str, all_model_data_folder: str, common_settings: dict, *args, **kwargs):
        super().__init__(model_name, all_model_data_folder, common_settings)
        # Load your model here
        # self.model = load_model(...)

    def preprocess(self, input_list: List[TextInput]) -> dict:
        """Prepare inputs for the model"""
        texts = [item.text for item in input_list]
        return {"texts": texts, "languages": [item.language for item in input_list]}

    def predict(self, input_list: List[TextInput], preprocessed_data: dict) -> dict:
        """Run model inference"""
        # Your model prediction logic
        translations = [text.upper() for text in preprocessed_data["texts"]]  # Example
        return {"translations": translations, "scores": [0.95] * len(translations)}

    def postprocess(self, input_list: List[TextInput], raw_output: dict) -> List[TextOutput]:
        """Format model outputs"""
        return [
            TextOutput(translated_text=text, confidence=score)
            for text, score in zip(raw_output["translations"], raw_output["scores"])
        ]
```
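Outside Modal, the three stages simply compose in order. The sketch below shows that composition as a plain function, using dataclasses in place of the Pydantic models and the same toy uppercase "translation" for brevity (it is illustrative, not Modalkit's internal dispatch):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextInput:
    text: str
    language: str = "en"


@dataclass
class TextOutput:
    translated_text: str
    confidence: float


def run_pipeline(inputs: List[TextInput]) -> List[TextOutput]:
    # preprocess: pull raw fields out of the typed inputs
    texts = [item.text for item in inputs]
    # predict: the toy "translation" (uppercase) with a fixed score
    translations = [t.upper() for t in texts]
    scores = [0.95] * len(translations)
    # postprocess: wrap raw outputs back into typed results
    return [
        TextOutput(translated_text=t, confidence=s)
        for t, s in zip(translations, scores)
    ]


print(run_pipeline([TextInput(text="Hello world")])[0].translated_text)
# HELLO WORLD
```

Keeping each stage pure like this is what makes the pipeline easy to unit-test before it ever touches Modal.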
2. Create Your Modal App
```python
import modal
from modalkit.modalapp import ModalService, create_web_endpoints
from modalkit.modalutils import ModalConfig

# Initialize with your config
modal_config = ModalConfig()
app = modal.App(name=modal_config.app_name)


# Define your Modal app class
@app.cls(**modal_config.get_app_cls_settings())
class TranslationApp(ModalService):
    inference_implementation = TranslationModel
    model_name: str = modal.parameter(default="translation_model")
    modal_utils: ModalConfig = modal_config


# Create API endpoints
@app.function(**modal_config.get_handler_settings())
@modal.asgi_app(**modal_config.get_asgi_app_settings())
def web_endpoints():
    return create_web_endpoints(
        app_cls=TranslationApp,
        input_model=TextInput,
        output_model=TextOutput,
    )
```
3. Configure Your Deployment
Create a modalkit.yaml configuration file:
```yaml
# modalkit.yaml
app_settings:
  app_prefix: "translation-service"

  # Authentication configuration
  auth_config:
    # Option 1: Use API key from AWS SSM
    ssm_key: "/translation/api-key"
    auth_header: "x-api-key"
    # Option 2: Use hardcoded API key (not recommended for production)
    # api_key: "your-api-key-here"
    # auth_header: "x-api-key"

  # Container configuration
  build_config:
    image: "python:3.11-slim"  # or your custom image
    tag: "latest"
    workdir: "/app"
    env:
      MODEL_VERSION: "v1.0"

  # Deployment settings
  deployment_config:
    gpu: "T4"  # Options: T4, A10G, A100, or null for CPU
    concurrency_limit: 10
    container_idle_timeout: 300
    secure: false  # Set to true for Modal proxy auth

    # Cloud storage mounts (optional)
    cloud_bucket_mounts:
      - mount_point: "/mnt/models"
        bucket_name: "my-model-bucket"
        secret: "aws-credentials"
        read_only: true
        key_prefix: "models/"

  # Batch processing settings
  batch_config:
    max_batch_size: 32
    wait_ms: 100  # Wait up to 100ms to fill batch

  # Queue configuration (for async endpoints)
  queue_config:
    backend: "taskiq"  # or "sqs" for AWS SQS
    broker_url: "redis://localhost:6379"

# Model configuration
model_settings:
  local_model_repository_folder: "./models"
  common:
    cache_dir: "./cache"
    device: "cuda"  # or "cpu"
  model_entries:
    translation_model:
      model_path: "path/to/model.pt"
      vocab_size: 50000
```
4. Deploy to Modal
```bash
# Test locally
modal serve app.py

# Deploy to production
modal deploy app.py

# View logs
modal logs -f
```
5. Use Your API
```python
import requests

# For standard API key auth
headers = {"x-api-key": "your-api-key"}

# Synchronous endpoint
response = requests.post(
    "https://your-org--translation-service.modal.run/predict_sync",
    json={"text": "Hello world", "language": "en"},
    headers=headers,
)
print(response.json())
# {"translated_text": "HELLO WORLD", "confidence": 0.95}

# Asynchronous endpoint (returns immediately)
response = requests.post(
    "https://your-org--translation-service.modal.run/predict_async",
    json={"text": "Hello world", "language": "en"},
    headers=headers,
)
print(response.json())
# {"message_id": "550e8400-e29b-41d4-a716-446655440000"}

# Batch endpoint
response = requests.post(
    "https://your-org--translation-service.modal.run/predict_batch",
    json=[
        {"text": "Hello", "language": "en"},
        {"text": "World", "language": "en"},
    ],
    headers=headers,
)
print(response.json())
# [{"translated_text": "HELLO", "confidence": 0.95}, {"translated_text": "WORLD", "confidence": 0.95}]
```
🔐 Authentication
Modalkit provides flexible authentication options:
Option 1: Custom API Key (Default)
Configure with `secure: false` in your deployment config:
```yaml
# modalkit.yaml
deployment_config:
  secure: false

auth_config:
  # Store in AWS SSM (recommended)
  ssm_key: "/myapp/api-key"
  # OR hardcode (not recommended)
  # api_key: "sk-1234567890"
  auth_header: "x-api-key"
```

```python
# Client usage
headers = {"x-api-key": "your-api-key"}
response = requests.post(url, json=data, headers=headers)
```
Option 2: Modal Proxy Authentication
Configure with secure: true for Modal's built-in auth:
```yaml
# modalkit.yaml
deployment_config:
  secure: true  # Enables Modal proxy auth
```

```python
# Client usage
headers = {
    "Modal-Key": "your-modal-key",
    "Modal-Secret": "your-modal-secret",
}
response = requests.post(url, json=data, headers=headers)
```
💡 Tip: Modal proxy auth is recommended for production as it's managed by Modal and requires no additional setup.
⚙️ Configuration
Configuration Structure
Modalkit uses YAML configuration with two main sections:
```yaml
# modalkit.yaml
app_settings:                # Application deployment settings
  app_prefix: str            # Prefix for your Modal app name
  auth_config:               # Authentication configuration
  build_config:              # Container build settings
  deployment_config:         # Runtime deployment settings
  batch_config:              # Batch processing settings
  queue_config:              # Async queue settings

model_settings:              # Model-specific settings
  local_model_repository_folder: str
  common: dict               # Shared settings across models
  model_entries:             # Model-specific configurations
    model_name: dict
```
Environment Variables
Set configuration file location:
```bash
# Default location
export MODALKIT_CONFIG="modalkit.yaml"

# Multiple configs (later files override earlier ones)
export MODALKIT_CONFIG="base.yaml,prod.yaml"

# Other environment variables
export MODALKIT_APP_POSTFIX="-prod"  # Appended to app name
```
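The layering behaviour (later files override earlier ones) can be pictured as a recursive dictionary merge. This is an illustrative sketch, not Modalkit's actual merge implementation:

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# base.yaml sets CPU defaults; prod.yaml flips only the GPU
base = {"deployment_config": {"gpu": None, "concurrency_limit": 10}}
prod = {"deployment_config": {"gpu": "A10G"}}
print(deep_merge(base, prod))
# {'deployment_config': {'gpu': 'A10G', 'concurrency_limit': 10}}
```

Keys not mentioned in the later file survive untouched, which is what makes a small `prod.yaml` on top of a full `base.yaml` practical.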
Advanced Configuration Options
```yaml
deployment_config:
  # GPU configuration
  gpu: "T4"  # T4, A10G, A100, H100, or null

  # Resource limits
  concurrency_limit: 10
  container_idle_timeout: 300
  retries: 3

  # Memory/CPU (when gpu is null)
  memory: 8192  # MB
  cpu: 4.0      # cores

  # Volumes and mounts
  volumes:
    "/mnt/cache": "model-cache-vol"
  mounts:
    - local_path: "configs/prod.json"
      remote_path: "/app/config.json"
      type: "file"
```
☁️ Cloud Storage Integration
Modalkit seamlessly integrates with cloud storage providers through Modal's CloudBucketMount:
Supported Providers
| Provider | Configuration |
|---|---|
| AWS S3 | Native support with IAM credentials |
| Google Cloud Storage | Service account authentication |
| Cloudflare R2 | S3-compatible API |
| MinIO/Others | Any S3-compatible endpoint |
Quick Examples
AWS S3 Configuration
```yaml
cloud_bucket_mounts:
  - mount_point: "/mnt/models"
    bucket_name: "my-ml-models"
    secret: "aws-credentials"  # Modal secret name
    key_prefix: "production/"  # Only mount this prefix
    read_only: true
```
First, create the Modal secret:
```bash
modal secret create aws-credentials \
  AWS_ACCESS_KEY_ID=xxx \
  AWS_SECRET_ACCESS_KEY=yyy \
  AWS_DEFAULT_REGION=us-east-1
```
Google Cloud Storage
```yaml
cloud_bucket_mounts:
  - mount_point: "/mnt/datasets"
    bucket_name: "my-datasets"
    bucket_endpoint_url: "https://storage.googleapis.com"
    secret: "gcp-credentials"
```
Create secret from service account:
```bash
modal secret create gcp-credentials \
  --from-gcp-service-account path/to/key.json
```
Cloudflare R2
```yaml
cloud_bucket_mounts:
  - mount_point: "/mnt/artifacts"
    bucket_name: "ml-artifacts"
    bucket_endpoint_url: "https://accountid.r2.cloudflarestorage.com"
    secret: "r2-credentials"
```
Using Mounted Storage
```python
import json

import torch

from modalkit.inference import InferencePipeline


class MyInference(InferencePipeline):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Load model from mounted bucket
        model_path = "/mnt/models/my_model.pt"
        self.model = torch.load(model_path)
        # Load dataset
        with open("/mnt/datasets/vocab.json") as f:
            self.vocab = json.load(f)
```
Best Practices
- ✅ Use read-only mounts for model artifacts
- ✅ Mount only the prefixes you need via `key_prefix`
- ✅ Use separate buckets for models vs. data
- ✅ Cache frequently accessed files locally
- ❌ Avoid writing logs to mounted buckets
- ❌ Don't mount entire buckets if you only need specific files
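The "cache locally" practice can be as simple as copying a mounted file onto container-local disk on first access. A hypothetical helper (the paths and cache location are illustrative, not a Modalkit API):

```python
import shutil
from pathlib import Path


def cached(mounted_path: str, cache_dir: str = "/tmp/model-cache") -> Path:
    """Copy a file from a (possibly slow) bucket mount to local disk once.

    Subsequent calls return the local copy without touching the mount.
    """
    src = Path(mounted_path)
    dst = Path(cache_dir) / src.name
    if not dst.exists():
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
    return dst
```

With such a helper, a hot file like a model checkpoint would be loaded as `torch.load(cached("/mnt/models/my_model.pt"))` so repeated reads hit local disk instead of the bucket.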
🚀 Advanced Features
Async Queue Processing
Modalkit supports async processing with multiple queue backends:
```yaml
queue_config:
  backend: "taskiq"  # or "sqs"
  broker_url: "redis://redis:6379"
```

```python
# Async endpoint returns immediately
response = requests.post("/predict_async", json=data)
# {"message_id": "uuid", "status": "queued"}
```
Batch Processing
Configure intelligent batching for better GPU utilization:
```yaml
batch_config:
  max_batch_size: 32
  wait_ms: 100  # Max time to wait for batch to fill
```
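Conceptually, the server gathers pending requests until either `max_batch_size` is reached or `wait_ms` elapses. The size half of that policy is plain chunking; a sketch of the idea (not Modalkit's internals, which also handle the timing window):

```python
from typing import List


def chunk(requests: List[dict], max_batch_size: int = 32) -> List[List[dict]]:
    """Split pending requests into GPU-sized batches."""
    return [
        requests[i:i + max_batch_size]
        for i in range(0, len(requests), max_batch_size)
    ]


# 70 queued requests at max_batch_size=32 yield two full batches and a remainder
batches = chunk([{"id": i} for i in range(70)], max_batch_size=32)
print([len(b) for b in batches])
# [32, 32, 6]
```

Larger batches amortize per-call GPU overhead, while `wait_ms` bounds the latency cost of waiting for a batch to fill.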
Volume Reloading
Auto-reload Modal volumes for model updates:
```yaml
deployment_config:
  volumes:
    "/mnt/models": "model-volume"
  volume_reload_interval_seconds: 300  # Reload every 5 minutes
```
🛠️ Development
Setup
```bash
# Clone repository
git clone https://github.com/prassanna-ravishankar/modalkit.git
cd modalkit

# Install with uv (recommended)
uv sync

# Install pre-commit hooks
uv run pre-commit install
```
Testing
```bash
# Run all tests
uv run pytest --cov --cov-config=pyproject.toml --cov-report=xml

# Run specific tests
uv run pytest tests/test_modal_service.py -v

# Run with HTML coverage report
uv run pytest --cov=modalkit --cov-report=html
```
Code Quality
```bash
# Run all checks
uv run pre-commit run -a

# Run type checking
uv run mypy modalkit/

# Format code
uv run ruff format modalkit/ tests/

# Lint code
uv run ruff check modalkit/ tests/
```
📖 API Reference
Endpoints
| Endpoint | Method | Description | Returns |
|---|---|---|---|
| `/predict_sync` | POST | Synchronous inference | Model output |
| `/predict_async` | POST | Async inference (queued) | Message ID |
| `/predict_batch` | POST | Batch inference | List of outputs |
| `/health` | GET | Health check | Status |
InferencePipeline Methods
Your model class must implement:
```python
def preprocess(self, input_list: List[InputModel]) -> dict: ...
def predict(self, input_list: List[InputModel], preprocessed_data: dict) -> dict: ...
def postprocess(self, input_list: List[InputModel], raw_output: dict) -> List[OutputModel]: ...
```
🤝 Contributing
We welcome contributions! Please see our Contributing Guidelines for details.
Development Workflow
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests and linting (`uv run pytest && uv run pre-commit run -a`)
5. Commit your changes (pre-commit hooks will run automatically)
6. Push to your fork and open a Pull Request
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
Built with ❤️ using:
- Modal - Serverless infrastructure for ML
- FastAPI - Modern web framework
- Pydantic - Data validation
- Taskiq - Async task processing