Nitrous Oxide for your AI Infrastructure.
What is NOS?
NOS (torch-nos) is a fast and flexible PyTorch inference server, designed specifically for optimizing and running inference with popular foundational AI models.
Why use NOS?
- 👩‍💻 Easy-to-use: Built for PyTorch and designed to optimize, serve and auto-scale PyTorch models in production without compromising on developer experience.
- 🥷 Flexible: Run and serve several foundational AI models (Stable Diffusion, CLIP, Whisper) in a single place.
- 🔌 Pluggable: Plug your front-end to NOS with out-of-the-box high-performance gRPC/REST APIs, avoiding all kinds of ML model deployment hassles.
- 🚀 Scalable: Optimize and scale models easily for maximum HW performance without a PhD in ML, distributed systems or infrastructure.
- 📦 Extensible: Easily hack and add custom models, optimizations, and HW-support in a Python-first environment.
- ⚙️ HW-accelerated: Take full advantage of your underlying HW (GPUs, ASICs) without compromise.
- ☁️ Cloud-agnostic: Run on any cloud HW (AWS, GCP, Azure, Lambda Labs, On-Prem) with our ready-to-use inference server containers.
NOS inherits its name from Nitrous Oxide System, the performance-enhancing system typically used in racing cars. NOS is designed to be modular and easy to extend.
What can NOS do?
💬 Chat / LLM Agents (ChatGPT-as-a-Service)
NOS provides an OpenAI-compatible server with streaming support so that you can connect your favorite LLM client.
gRPC API ⚡

from nos.client import Client
client = Client("[::]:50051")
model = client.Module("meta-llama/Llama-2-7b-chat-hf")
response = model.chat(message="Tell me a story of 1000 words with emojis")

REST API

curl \
  -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-2-7b-chat-hf",
    "messages": [{"role": "user", "content": "Tell me a story of 1000 words with emojis"}],
    "temperature": 0.7, "stream": true
  }'
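Because the REST request above sets "stream": true, a client has to reassemble the reply from incremental events. The sketch below assumes the server emits events in the OpenAI streaming format (lines of the form `data: {json}` with the text under `choices[0].delta.content`, terminated by `data: [DONE]`); this README only states that the server is OpenAI-compatible, so treat that wire format as an assumption. The helper name `iter_stream_text` and the sample events are hypothetical and hand-written, not real server output.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent-event lines.

    Assumption: each event is a `data: {json}` line and the stream ends
    with a literal `data: [DONE]` sentinel.
    """
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data event.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample events for illustration only.
sample_events = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Once upon "}}]}',
    'data: {"choices": [{"delta": {"content": "a time"}}]}',
    "data: [DONE]",
]
story = "".join(iter_stream_text(sample_events))
print(story)
```

In practice the `lines` argument would be the decoded lines of the HTTP response body; the parsing logic stays the same.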
🏞️ Image Generation (Stable-Diffusion-as-a-Service)
Build Midjourney Discord bots in seconds.
gRPC API ⚡

from nos.client import Client
client = Client("[::]:50051")
sdxl = client.Module("stabilityai/stable-diffusion-xl-base-1-0")
image, = sdxl(prompts=["hippo with glasses in a library, cartoon styling"],
              width=1024, height=1024, num_images=1)

REST API

curl \
  -X POST http://localhost:8000/v1/infer \
  -H 'Content-Type: application/json' \
  -d '{
    "model_id": "stabilityai/stable-diffusion-xl-base-1-0",
    "inputs": {
      "prompts": ["hippo with glasses in a library, cartoon styling"],
      "width": 1024,
      "height": 1024,
      "num_images": 1
    }
  }'
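The same `/v1/infer` call can be prepared from Python with only the standard library. This is a minimal sketch: the endpoint path and the `model_id`, `inputs` and optional `method` fields are taken from the curl examples in this README, while the helper name `build_infer_request` and the `base_url` default are hypothetical. The response schema is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_infer_request(
    model_id: str,
    inputs: Dict[str, Any],
    method: Optional[str] = None,
    base_url: str = "http://localhost:8000",  # assumed local server address
) -> urllib.request.Request:
    """Build a POST request for the /v1/infer endpoint (hypothetical helper)."""
    body: Dict[str, Any] = {"model_id": model_id, "inputs": inputs}
    # "method" is only included when a specific model method is requested.
    if method is not None:
        body["method"] = method
    return urllib.request.Request(
        url=f"{base_url}/v1/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_infer_request(
    "stabilityai/stable-diffusion-xl-base-1-0",
    {
        "prompts": ["hippo with glasses in a library, cartoon styling"],
        "width": 1024,
        "height": 1024,
        "num_images": 1,
    },
)
print(request.get_method(), request.full_url)

# Sending the request needs a running server, so it is left commented out:
# with urllib.request.urlopen(request) as resp:
#     raw_body = resp.read()
```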
🧠 Text & Image Embedding (CLIP-as-a-Service)
Build scalable semantic search of images/videos in minutes.
gRPC API ⚡

from nos.client import Client
client = Client("[::]:50051")
clip = client.Module("openai/clip-vit-base-patch32")
txt_vec = clip.encode_text(texts=["fox jumped over the moon"])

REST API

curl \
  -X POST http://localhost:8000/v1/infer \
  -H 'Content-Type: application/json' \
  -d '{
    "model_id": "openai/clip-vit-base-patch32",
    "method": "encode_text",
    "inputs": {
      "texts": ["fox jumped over the moon"]
    }
  }'
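Once text and image embeddings are available, semantic search usually comes down to ranking candidates by vector similarity. The sketch below uses cosine similarity on toy 3-dimensional vectors; the function names, file names and vectors are hypothetical stand-ins, and real CLIP embeddings are much longer and would come from calls like `encode_text` above. It is one common way to do the ranking, not necessarily how the NOS demos implement it.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat a zero vector as having no similarity to anything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Sort candidates from most to least similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy vectors standing in for a text embedding and three image embeddings.
query_vec = [1.0, 0.0, 0.0]
image_vecs = {
    "moon.jpg": [0.0, 1.0, 0.0],
    "car.jpg": [-1.0, 0.0, 0.0],
    "fox.jpg": [0.9, 0.1, 0.0],
}
ranking = rank_by_similarity(query_vec, image_vecs)
print([name for name, _ in ranking])  # fox.jpg first, car.jpg last
```

For large collections, a vector index would replace the linear scan, but the scoring idea is the same.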
🎙️ Audio Transcription (Whisper-as-a-Service)
Perform real-time audio transcription using Whisper.
gRPC API ⚡

from pathlib import Path
from nos.client import Client
client = Client("[::]:50051")
model = client.Module("openai/whisper-large-v2")
# Upload the local audio file, then transcribe it via its remote path.
with client.UploadFile(Path("audio.wav")) as remote_path:
    response = model(path=remote_path)
# {"chunks": ...}

REST API

curl \
  -X POST http://localhost:8000/v1/infer/file \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'model_id=openai/whisper-large-v2' \
  -F 'file=@audio.wav'
🧐 Object Detection (YOLOX-as-a-Service)
Run classical computer-vision tasks in 2 lines of code.
gRPC API ⚡

from PIL import Image  # required for Image.open below
from nos.client import Client
client = Client("[::]:50051")
model = client.Module("yolox/medium")
response = model(images=[Image.open("image.jpg")])

REST API

curl \
  -X POST http://localhost:8000/v1/infer/file \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'model_id=yolox/medium' \
  -F 'file=@image.jpg'
⚒️ Custom models
Want to run models not supported by NOS? You can easily add your own models following the examples in the NOS Playground.
Text to video
model_id: str = "animate-diff"
Image to video
model_id: str = "stable-video-diffusion"
Text to 360-view images
model_id: str = "mv-dream"
📚 Documentation
- Quickstart
- Models
- Concepts: Architecture Overview, ModelSpec, ModelManager, Runtime Environments
- Demos: Building a Discord Image Generation Bot, Video Search Demo
📄 License
This project is licensed under the Apache-2.0 License.
🤝 Contributing
We welcome contributions! Please see our contributing guide for more information.
🔗 Quick Links
- 💬 Send us an email at support@autonomi.ai or join our Discord for help.
- 📣 Follow us on Twitter and LinkedIn to stay up to date on our products.
Hashes for torch_nos-0.1.1-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 8b55b130e9e08302b5fab1aedfaaf2350cf9a4d18063f99c04b8814c24ef56ee |
MD5 | 064f9443c9787b491b353d49b8d9774a |
BLAKE2b-256 | eb179fc0bcf42f9fd2017fe01cd9cbf8e5ad57b5b963c08cb068c15fb12417f8 |