Nitrous Oxide for your AI Infrastructure.
Website | Docs | Blog | Discord
What is NOS?
NOS (torch-nos) is a fast and flexible PyTorch inference server, specifically designed for optimizing and running inference of popular foundational AI models.
Why use NOS?
- 👩‍💻 Easy-to-use: Built for PyTorch and designed to optimize, serve and auto-scale PyTorch models in production without compromising on developer experience.
- 🥷 Flexible: Run and serve several foundational AI models (Stable Diffusion, CLIP, Whisper) in a single place.
- 🔌 Pluggable: Plug your front-end to NOS with out-of-the-box high-performance gRPC/REST APIs, avoiding all kinds of ML model deployment hassles.
- 🚀 Scalable: Optimize and scale models easily for maximum HW performance without a PhD in ML, distributed systems or infrastructure.
- 📦 Extensible: Easily hack and add custom models, optimizations, and HW-support in a Python-first environment.
- ⚙️ HW-accelerated: Take full advantage of your underlying HW (GPUs, ASICs) without compromise.
- ☁️ Cloud-agnostic: Run on any cloud HW (AWS, GCP, Azure, Lambda Labs, On-Prem) with our ready-to-use inference server containers.
NOS inherits its name from Nitrous Oxide System, the performance-enhancing system typically used in racing cars. NOS is designed to be modular and easy to extend.
What can NOS do?
💬 Chat / LLM Agents (ChatGPT-as-a-Service)
NOS provides an OpenAI-compatible server with streaming support so that you can connect your favorite LLM client.
gRPC API ⚡

from nos.client import Client
client = Client("[::]:50051")
model = client.Module("meta-llama/Llama-2-7b-chat-hf")
response = model.chat(message="Tell me a story of 1000 words with emojis", _stream=True)

REST API

curl \
  -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-2-7b-chat-hf",
    "messages": [{"role": "user", "content": "Tell me a story of 1000 words with emojis"}],
    "temperature": 0.7, "stream": true
  }'
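If the REST endpoint follows the OpenAI streaming convention (server-sent `data:` lines carrying JSON chunks, terminated by a `[DONE]` sentinel), the streamed response can be reassembled on the client side roughly as follows. This is a minimal sketch under that assumption, not NOS's own client code; the helper name `iter_stream_text` and the sample lines are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style streaming chat-completion lines."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not an SSE data field.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # Assumption: the stream ends with the literal [DONE] sentinel.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hypothetical sample lines, shaped like OpenAI chat-completion chunks.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Once upon"}}]}',
    'data: {"choices": [{"delta": {"content": " a time"}}]}',
    "data: [DONE]",
]
story = "".join(iter_stream_text(sample))
print(story)
```

In practice the `lines` argument would come from whatever HTTP client reads the response body line by line; the parser itself does not depend on a particular client.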
🏞️ Image Generation (Stable-Diffusion-as-a-Service)
Build Midjourney-style Discord bots in seconds.
gRPC API ⚡

from nos.client import Client
client = Client("[::]:50051")
sdxl = client.Module("stabilityai/stable-diffusion-xl-base-1-0")
image, = sdxl(prompts=["hippo with glasses in a library, cartoon styling"],
              width=1024, height=1024, num_images=1)

REST API

curl \
  -X POST http://localhost:8000/v1/infer \
  -H 'Content-Type: application/json' \
  -d '{
    "model_id": "stabilityai/stable-diffusion-xl-base-1-0",
    "inputs": {
      "prompts": ["hippo with glasses in a library, cartoon styling"],
      "width": 1024,
      "height": 1024,
      "num_images": 1
    }
  }'
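The same REST call can be prepared from Python with only the standard library. The sketch below mirrors the URL, header, and JSON fields of the `curl` example; the helper name `build_infer_request` is hypothetical, and the shape of the server's response is not documented here, so the code stops at building the request.

```python
import json
import urllib.request


def build_infer_request(model_id, inputs, method=None, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the /v1/infer endpoint."""
    body = {"model_id": model_id, "inputs": inputs}
    # Some models take an optional "method" field, e.g. "encode_text" for CLIP.
    if method is not None:
        body["method"] = method
    return urllib.request.Request(
        url=f"{base_url}/v1/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_infer_request(
    "stabilityai/stable-diffusion-xl-base-1-0",
    {
        "prompts": ["hippo with glasses in a library, cartoon styling"],
        "width": 1024,
        "height": 1024,
        "num_images": 1,
    },
)
print(request.get_method(), request.full_url)
```

With a NOS server running on port 8000, passing `request` to `urllib.request.urlopen` would send it.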
🧠 Text & Image Embedding (CLIP-as-a-Service)
Build scalable semantic search of images/videos in minutes.
gRPC API ⚡

from nos.client import Client
client = Client("[::]:50051")
clip = client.Module("openai/clip-vit-base-patch32")
txt_vec = clip.encode_text(texts=["fox jumped over the moon"])

REST API

curl \
  -X POST http://localhost:8000/v1/infer \
  -H 'Content-Type: application/json' \
  -d '{
    "model_id": "openai/clip-vit-base-patch32",
    "method": "encode_text",
    "inputs": {
      "texts": ["fox jumped over the moon"]
    }
  }'
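Once embeddings are available, semantic search reduces to ranking stored vectors by their similarity to a query vector. The following is a minimal sketch using cosine similarity in plain Python; the toy 3-dimensional vectors and file names are made up for illustration, and the exact type NOS returns for an embedding is not assumed here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, indexed_vecs):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in indexed_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy stand-ins for image embeddings; real CLIP vectors are much longer.
index = {
    "fox.jpg": [0.9, 0.1, 0.0],
    "moon.jpg": [0.1, 0.9, 0.1],
    "car.jpg": [0.0, 0.1, 0.9],
}
query = [0.8, 0.3, 0.0]  # stand-in for a text embedding
ranking = rank_by_similarity(query, index)
print([name for name, _ in ranking])
```

For large collections a vector index would replace the linear scan, but the ranking criterion stays the same.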
🎙️ Audio Transcription (Whisper-as-a-Service)
Perform real-time audio transcription using Whisper.
gRPC API ⚡

from pathlib import Path
from nos.client import Client
client = Client("[::]:50051")
model = client.Module("openai/whisper-small.en")
with client.UploadFile(Path("audio.wav")) as remote_path:
    response = model(path=remote_path)
    # {"chunks": ...}

REST API

curl \
  -X POST http://localhost:8000/v1/infer/file \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'model_id=openai/whisper-small.en' \
  -F 'file=@audio.wav'
🧐 Object Detection (YOLOX-as-a-Service)
Run classical computer-vision tasks in 2 lines of code.
gRPC API ⚡

from PIL import Image  # needed for Image.open below
from nos.client import Client
client = Client("[::]:50051")
model = client.Module("yolox/medium")
response = model(images=[Image.open("image.jpg")])

REST API

curl \
  -X POST http://localhost:8000/v1/infer/file \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'model_id=yolox/medium' \
  -F 'file=@image.jpg'
⚒️ Custom models
Want to run models not supported by NOS? You can easily add your own models following the examples in the NOS Playground.
📚 Documentation
- Tutorials
- Quickstart
- Models
- Concepts: Architecture Overview, ModelSpec, ModelManager, Runtime Environments
- Demos: Building a Discord Image Generation Bot, Video Search Demo
📄 License
This project is licensed under the Apache-2.0 License.
📡 Telemetry
NOS collects anonymous usage data using Sentry. This is used to help us understand how the community is using NOS and to help us prioritize features. You can opt out of telemetry by setting NOS_TELEMETRY_ENABLED=0.
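When NOS is launched from a Python script, the opt-out variable can be set on the child process environment. This is a small sketch; the helper name `telemetry_disabled_env` is hypothetical, and the command that actually starts NOS is left out because it depends on your setup.

```python
import os


def telemetry_disabled_env(base_env=None):
    """Return a copy of the environment with NOS telemetry switched off."""
    env = dict(os.environ if base_env is None else base_env)
    env["NOS_TELEMETRY_ENABLED"] = "0"
    return env


env = telemetry_disabled_env({"PATH": "/usr/bin"})
print(env["NOS_TELEMETRY_ENABLED"])
# Pass `env` to subprocess.run(..., env=env) for whichever command starts NOS.
```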
🤝 Contributing
We welcome contributions! Please see our contributing guide for more information.
🔗 Quick Links
- 💬 Send us an email at support@autonomi.ai or join our Discord for help.
- 📣 Follow us on Twitter and LinkedIn to keep up-to-date on our products.