Nitrous Oxide for your AI Infrastructure.

Website | Docs | Discord

What is NOS?

NOS (torch-nos) is a fast and flexible PyTorch inference server designed for optimizing and running inference on popular foundational AI models.

Why use NOS?

  • 👩‍💻 Easy-to-use: Built for PyTorch and designed to optimize, serve and auto-scale PyTorch models in production without compromising on developer experience.
  • 🥷 Flexible: Run and serve several foundational AI models (Stable Diffusion, CLIP, Whisper) in a single place.
  • 🔌 Pluggable: Plug your front-end to NOS with out-of-the-box high-performance gRPC/REST APIs, avoiding all kinds of ML model deployment hassles.
  • 🚀 Scalable: Optimize and scale models easily for maximum HW performance without a PhD in ML, distributed systems or infrastructure.
  • 📦 Extensible: Easily hack and add custom models, optimizations, and HW-support in a Python-first environment.
  • ⚙️ HW-accelerated: Take full advantage of your underlying HW (GPUs, ASICs) without compromise.
  • ☁️ Cloud-agnostic: Run on any cloud HW (AWS, GCP, Azure, Lambda Labs, On-Prem) with our ready-to-use inference server containers.

NOS inherits its name from Nitrous Oxide System, the performance-enhancing system typically used in racing cars. NOS is designed to be modular and easy to extend.


What can NOS do?

💬 Chat / LLM Agents (ChatGPT-as-a-Service)


NOS provides an OpenAI-compatible server with streaming support so that you can connect your favorite LLM client.

gRPC API ⚡

from nos.client import Client

client = Client("[::]:50051")

model = client.Module("meta-llama/Llama-2-7b-chat-hf")
response = model.chat(message="Tell me a story of 1000 words with emojis")

REST API

curl \
-X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
    "model": "meta-llama/Llama-2-7b-chat-hf",
    "messages": [{"role": "user", "content": "Tell me a story of 1000 words with emojis"}],
    "temperature": 0.7, "stream": true
  }'
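
With "stream": true, an OpenAI-compatible endpoint usually answers with server-sent events: one `data: {...}` line per chunk, ending with `data: [DONE]`. The sketch below assumes NOS follows that convention, which this README does not confirm. `collect_stream_text` is a hypothetical helper for illustration, not part of the NOS client, and the sample lines are hand-written stand-ins for a real response body.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the text deltas found in OpenAI-style SSE lines (assumed format)."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel used by the OpenAI API
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # The first chunk often carries only the role, so content may be absent.
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


# Hand-written sample lines, not captured from a real NOS server.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Once "}}]}',
    'data: {"choices": [{"delta": {"content": "upon a time"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))
```

In practice the lines would come from iterating over the HTTP response body of the curl request above rather than from a list.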

🏞️ Image Generation (Stable-Diffusion-as-a-Service)


Build Midjourney-style Discord bots in seconds.

gRPC API ⚡

from nos.client import Client

client = Client("[::]:50051")

sdxl = client.Module("stabilityai/stable-diffusion-xl-base-1-0")
image, = sdxl(prompts=["hippo with glasses in a library, cartoon styling"],
              width=1024, height=1024, num_images=1)

REST API

curl \
-X POST http://localhost:8000/v1/infer \
-H 'Content-Type: application/json' \
-d '{
    "model_id": "stabilityai/stable-diffusion-xl-base-1-0",
    "inputs": {
        "prompts": ["hippo with glasses in a library, cartoon styling"],
        "width": 1024,
        "height": 1024,
        "num_images": 1
    }
}'
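
The same REST call can be prepared from Python with only the standard library. This is a minimal sketch: `build_infer_request` is a hypothetical helper, not part of NOS, and it only mirrors the URL, header and JSON body of the curl example. The optional `method` field follows the CLIP example in this README. The response format is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_infer_request(
    model_id: str,
    inputs: Dict[str, Any],
    method: Optional[str] = None,
    base_url: str = "http://localhost:8000",  # address used in the curl example
) -> urllib.request.Request:
    """Build a POST request for /v1/infer without sending it."""
    body: Dict[str, Any] = {"model_id": model_id, "inputs": inputs}
    if method is not None:
        body["method"] = method  # e.g. "encode_text" for CLIP
    return urllib.request.Request(
        url=f"{base_url}/v1/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_infer_request(
    "stabilityai/stable-diffusion-xl-base-1-0",
    {
        "prompts": ["hippo with glasses in a library, cartoon styling"],
        "width": 1024,
        "height": 1024,
        "num_images": 1,
    },
)
print(request.get_method(), request.full_url)
```

Sending it requires a running NOS server, for example with `urllib.request.urlopen(request)`.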

🧠 Text & Image Embedding (CLIP-as-a-Service)


Build scalable semantic search of images/videos in minutes.

gRPC API ⚡

from nos.client import Client

client = Client("[::]:50051")

clip = client.Module("openai/clip-vit-base-patch32")
txt_vec = clip.encode_text(texts=["fox jumped over the moon"])

REST API

curl \
-X POST http://localhost:8000/v1/infer \
-H 'Content-Type: application/json' \
-d '{
    "model_id": "openai/clip-vit-base-patch32",
    "method": "encode_text",
    "inputs": {
        "texts": ["fox jumped over the moon"]
    }
}'
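
Semantic search on top of such embeddings usually comes down to ranking stored vectors by cosine similarity to a query vector. The sketch below illustrates that step only. It assumes the embeddings have already been converted to plain lists of floats; the function names and the toy 3-dimensional vectors are made up for illustration, and real CLIP embeddings are much larger.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for a text embedding and three image embeddings.
query_vec = [1.0, 0.0, 0.0]
image_vecs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_by_similarity(query_vec, image_vecs)
print([index for index, _ in ranking])
```

For large collections a vector index or a batched matrix product would replace the Python loops, but the ranking logic stays the same.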

🎙️ Audio Transcription (Whisper-as-a-Service)


Perform real-time audio transcription using Whisper.

gRPC API ⚡

from pathlib import Path
from nos.client import Client

client = Client("[::]:50051")

model = client.Module("openai/whisper-large-v2")
with client.UploadFile(Path("audio.wav")) as remote_path:
    response = model(path=remote_path)
# {"chunks": ...}

REST API

curl \
-X POST http://localhost:8000/v1/infer/file \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'model_id=openai/whisper-large-v2' \
-F 'file=@audio.wav'

🧐 Object Detection (YOLOX-as-a-Service)


Run classical computer-vision tasks in 2 lines of code.

gRPC API ⚡

from PIL import Image  # needed for Image.open below
from nos.client import Client

client = Client("[::]:50051")

model = client.Module("yolox/medium")
response = model(images=[Image.open("image.jpg")])

REST API

curl \
-X POST http://localhost:8000/v1/infer/file \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'model_id=yolox/medium' \
-F 'file=@image.jpg'

⚒️ Custom models


Want to run models not supported by NOS? You can easily add your own models following the examples in the NOS Playground.

Text to video

model_id: str = "animate-diff"

Image to video

model_id: str = "stable-video-diffusion"

Text to 360-view images

model_id: str = "mv-dream"

📚 Documentation

📄 License

This project is licensed under the Apache-2.0 License.

🤝 Contributing

We welcome contributions! Please see our contributing guide for more information.

Built Distribution

torch_nos-0.1.2-py3-none-any.whl (1.7 MB)

Uploaded Python 3

File details

Details for the file torch_nos-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: torch_nos-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 1.7 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for torch_nos-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 d89f29804905d2365c7e443799121d4a90da14fed7f6ab91d3fd9034ab0b560c
MD5 15f432a7d774a902f61c6e1b77defc7d
BLAKE2b-256 35ccc73f5d5fb640d8e312d349c9228fc41edc23ada71236232ed9e0557427db
