Nitrous Oxide for your AI Infrastructure.
Website | Docs | Blog | Discord
What is NOS?
NOS (`torch-nos`) is a fast and flexible PyTorch inference server, designed specifically for optimizing and running inference on popular foundational AI models.
Why use NOS?
- 👩💻 Easy-to-use: Built for PyTorch and designed to optimize, serve, and auto-scale PyTorch models in production without compromising on developer experience.
- 🥷 Flexible: Run and serve several foundational AI models (Stable Diffusion, CLIP, Whisper) in a single place.
- 🔌 Pluggable: Plug your front-end to NOS with out-of-the-box high-performance gRPC/REST APIs, avoiding all kinds of ML model deployment hassles.
- 🚀 Scalable: Optimize and scale models easily for maximum HW performance without a PhD in ML, distributed systems or infrastructure.
- 📦 Extensible: Easily hack and add custom models, optimizations, and HW-support in a Python-first environment.
- ⚙️ HW-accelerated: Take full advantage of your underlying HW (GPUs, ASICs) without compromise.
- ☁️ Cloud-agnostic: Run on any cloud HW (AWS, GCP, Azure, Lambda Labs, On-Prem) with our ready-to-use inference server containers.
NOS inherits its name from Nitrous Oxide System, the performance-enhancing system typically used in racing cars. NOS is designed to be modular and easy to extend.
What can NOS do?
💬 Chat / LLM Agents (ChatGPT-as-a-Service)
NOS provides an OpenAI-compatible server with streaming support so that you can connect your favorite LLM client.
gRPC API ⚡:

```python
from nos.client import Client

client = Client("[::]:50051")

model = client.Module("meta-llama/Llama-2-7b-chat-hf")
response = model.chat(message="Tell me a story of 1000 words with emojis", _stream=True)
```

REST API:

```shell
curl \
  -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-2-7b-chat-hf",
    "messages": [{"role": "user", "content": "Tell me a story of 1000 words with emojis"}],
    "temperature": 0.7, "stream": true
  }'
```
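Because the REST endpoint follows the OpenAI streaming convention, streamed responses arrive as server-sent events: `data: {...}` lines ending with `data: [DONE]`. A minimal client-side parsing sketch, assuming that standard OpenAI chunk format (the helper name is ours, not part of NOS):

```python
import json

def iter_stream_content(lines):
    """Yield content deltas from OpenAI-style SSE lines (`data: {...}`)."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# Example with two synthetic chunks:
events = [
    'data: {"choices": [{"delta": {"content": "Once"}}]}',
    'data: {"choices": [{"delta": {"content": " upon"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_content(events)))  # Once upon
```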
🏞️ Image Generation (Stable-Diffusion-as-a-Service)
Build MidJourney-style Discord image-generation bots in seconds.
gRPC API ⚡:

```python
from nos.client import Client

client = Client("[::]:50051")

sdxl = client.Module("stabilityai/stable-diffusion-xl-base-1-0")
image, = sdxl(prompts=["hippo with glasses in a library, cartoon styling"],
              width=1024, height=1024, num_images=1)
```

REST API:

```shell
curl \
  -X POST http://localhost:8000/v1/infer \
  -H 'Content-Type: application/json' \
  -d '{
    "model_id": "stabilityai/stable-diffusion-xl-base-1-0",
    "inputs": {
      "prompts": ["hippo with glasses in a library, cartoon styling"],
      "width": 1024,
      "height": 1024,
      "num_images": 1
    }
  }'
```
🧠 Text & Image Embedding (CLIP-as-a-Service)
Build scalable semantic search of images/videos in minutes.
gRPC API ⚡:

```python
from nos.client import Client

client = Client("[::]:50051")

clip = client.Module("openai/clip-vit-base-patch32")
txt_vec = clip.encode_text(texts=["fox jumped over the moon"])
```

REST API:

```shell
curl \
  -X POST http://localhost:8000/v1/infer \
  -H 'Content-Type: application/json' \
  -d '{
    "model_id": "openai/clip-vit-base-patch32",
    "method": "encode_text",
    "inputs": {
      "texts": ["fox jumped over the moon"]
    }
  }'
```
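For semantic search, the usual recipe is to embed the query and the corpus, then rank by cosine similarity. A minimal ranking sketch in plain Python, assuming `encode_text` / `encode_image` return fixed-length float vectors (the helpers below are ours, not part of the NOS API):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, candidate_vecs):
    """Return candidate indices ordered best-match-first."""
    return sorted(range(len(candidate_vecs)),
                  key=lambda i: cosine_similarity(query_vec, candidate_vecs[i]),
                  reverse=True)

# Toy 2-D embeddings: the second candidate is nearly parallel to the query.
query = [1.0, 0.0]
docs = [[0.0, 1.0], [0.9, 0.1]]
print(rank(query, docs))  # [1, 0]
```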
🎙️ Audio Transcription (Whisper-as-a-Service)
Perform real-time audio transcription using Whisper.
gRPC API ⚡:

```python
from pathlib import Path
from nos.client import Client

client = Client("[::]:50051")

model = client.Module("openai/whisper-small.en")
with client.UploadFile(Path("audio.wav")) as remote_path:
    response = model(path=remote_path)
# {"chunks": ...}
```

REST API:

```shell
curl \
  -X POST http://localhost:8000/v1/infer/file \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'model_id=openai/whisper-small.en' \
  -F 'file=@audio.wav'
```
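The transcription response carries a list of chunks. A minimal post-processing sketch, assuming each chunk is a dict with a `text` field, mirroring the Hugging Face Whisper pipeline output (the field names are our assumption; check the NOS docs for the exact schema):

```python
def join_transcript(response):
    """Concatenate chunk texts into one transcript string,
    assuming a {"chunks": [{"text": ...}, ...]} response shape."""
    return "".join(chunk["text"] for chunk in response["chunks"])

# Synthetic example response:
response = {"chunks": [{"text": " Hello"}, {"text": " world."}]}
print(join_transcript(response))  # Hello world.
```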
🧐 Object Detection (YOLOX-as-a-Service)
Run classical computer-vision tasks in 2 lines of code.
gRPC API ⚡:

```python
from PIL import Image

from nos.client import Client

client = Client("[::]:50051")

model = client.Module("yolox/medium")
response = model(images=[Image.open("image.jpg")])
```

REST API:

```shell
curl \
  -X POST http://localhost:8000/v1/infer/file \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'model_id=yolox/medium' \
  -F 'file=@image.jpg'
```
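Detection outputs pair bounding boxes with confidence scores, so a common next step is thresholding. A minimal filtering sketch, assuming the per-image response carries parallel `bboxes`, `scores`, and `labels` lists (these field names are our assumption; consult the NOS model docs for the exact output schema):

```python
def filter_detections(detections, min_score=0.5):
    """Keep only detections whose confidence meets the threshold."""
    kept = []
    for bbox, score, label in zip(detections["bboxes"],
                                  detections["scores"],
                                  detections["labels"]):
        if score >= min_score:
            kept.append({"bbox": bbox, "score": score, "label": label})
    return kept

# Synthetic example: one confident detection, one below threshold.
detections = {
    "bboxes": [[10, 10, 50, 50], [0, 0, 5, 5]],
    "scores": [0.92, 0.31],
    "labels": ["person", "dog"],
}
print(filter_detections(detections))
# [{'bbox': [10, 10, 50, 50], 'score': 0.92, 'label': 'person'}]
```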
⚒️ Custom models
Want to run models not supported by NOS? You can easily add your own models following the examples in the NOS Playground.
📚 Documentation
- Tutorials
- Quickstart
- Models
- Concepts: Architecture Overview, ModelSpec, ModelManager, Runtime Environments
- Demos: Building a Discord Image Generation Bot, Video Search Demo
📄 License
This project is licensed under the Apache-2.0 License.
📡 Telemetry
NOS collects anonymous usage data using Sentry. This helps us understand how the community uses NOS and prioritize features. You can opt out of telemetry by setting `NOS_TELEMETRY_ENABLED=0`.
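For example, to opt out in the current shell session before launching the server:

```shell
# Disable NOS telemetry for this shell session
export NOS_TELEMETRY_ENABLED=0
```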
🤝 Contributing
We welcome contributions! Please see our contributing guide for more information.
🔗 Quick Links
- 💬 Send us an email at support@autonomi.ai or join our Discord for help.
- 📣 Follow us on Twitter and LinkedIn to keep up to date on our products.