Nitrous Oxide for your AI Infrastructure.

⚡️ What is NOS?

NOS (torch-nos) is a fast and flexible PyTorch inference server, designed for optimizing and running inference on popular foundational AI models.

  • 👩‍💻 Easy-to-use: Built for PyTorch and designed to optimize, serve and auto-scale PyTorch models in production without compromising on developer experience.
  • 🥷 Flexible: Run and serve several foundational AI models (Stable Diffusion, CLIP, Whisper) in a single place.
  • 🔌 Pluggable: Plug your front-end into NOS with out-of-the-box high-performance gRPC/REST APIs, avoiding common ML model deployment hassles.
  • 🚀 Scalable: Optimize and scale models easily for maximum HW performance without a PhD in ML, distributed systems or infrastructure.
  • 📦 Extensible: Easily hack and add custom models, optimizations, and HW support in a Python-first environment.
  • ⚙️ HW-accelerated: Take full advantage of your underlying HW (GPUs, ASICs) without compromise.
  • ☁️ Cloud-agnostic: Run on any cloud HW (AWS, GCP, Azure, Lambda Labs, On-Prem) with our ready-to-use inference server containers.

NOS inherits its name from Nitrous Oxide System, the performance-enhancing system typically used in racing cars. NOS is designed to be modular and easy to extend.

🚀 Getting Started

Get started with the full NOS server by creating a conda environment, installing PyTorch, and then installing NOS via pip:

$ conda create -n nos-py38 python=3.8
$ conda activate nos-py38
$ conda install "pytorch>=2.0.1" torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
$ pip install "torch-nos[server]"
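
After installing, a quick sanity check can confirm that the packages are importable from the active environment. The sketch below is not part of NOS: `check_install` is a hypothetical helper built only on the standard library, and it assumes the package is imported as `nos`, as in the client examples in this README.

```python
import importlib.util


def check_install(packages=("torch", "nos")):
    """Return a {package_name: bool} map of which packages are importable."""
    # find_spec returns None for a top-level package that cannot be found.
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, found in check_install().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A "missing" entry usually means the wrong environment is active or the install step did not complete.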

If you only need the lightweight NOS client and want to run inference on your local machine (via Docker), you can install the client-only package:

$ conda create -n nos-py38 python=3.8
$ conda activate nos-py38
$ pip install torch-nos

🔥 Quickstart / Show me the code

Image Generation as-a-Service

gRPC API

from nos.client import Client

client = Client("[::]:50051")

sdxl = client.Module("stabilityai/stable-diffusion-xl-base-1-0")
image, = sdxl(prompts=["fox jumped over the moon"],
              width=1024, height=1024, num_images=1)

REST API

curl \
-X POST http://localhost:8000/infer \
-H 'Content-Type: application/json' \
-d '{
      "model_id": "stabilityai/stable-diffusion-xl-base-1-0",
      "inputs": {
          "prompts": ["fox jumped over the moon"],
          "width": 1024,
          "height": 1024,
          "num_images": 1
      }
    }'
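
The same REST call can be made from Python without extra dependencies. The sketch below is an illustration, not a NOS API: `build_infer_request` is a hypothetical helper that mirrors the curl payload above, and its optional `method` argument corresponds to the "method" field used in the CLIP REST example in this README. The response format is not described here, so the helper only builds the request.

```python
import json
import urllib.request


def build_infer_request(model_id, inputs, method=None,
                        url="http://localhost:8000/infer"):
    """Build (but do not send) a JSON POST request for the /infer endpoint."""
    payload = {"model_id": model_id, "inputs": inputs}
    if method is not None:
        # Only included when a specific model method is requested.
        payload["method"] = method
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_infer_request(
    "stabilityai/stable-diffusion-xl-base-1-0",
    {"prompts": ["fox jumped over the moon"],
     "width": 1024, "height": 1024, "num_images": 1},
)
```

To send it, pass `req` to `urllib.request.urlopen` while a NOS server is listening on the REST port.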

Text & Image Embedding-as-a-Service (CLIP-as-a-Service)

gRPC API

from nos.client import Client

client = Client("[::]:50051")

clip = client.Module("openai/clip")
txt_vec = clip.encode_text(texts=["fox jumped over the moon"])

REST API

curl \
-X POST http://localhost:8000/infer \
-H 'Content-Type: application/json' \
-d '{
      "model_id": "openai/clip",
      "method": "encode_text",
      "inputs": {
          "texts": ["fox jumped over the moon"]
      }
    }'
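
Embeddings such as the text vectors above are commonly compared with cosine similarity. The sketch below is a generic illustration rather than part of NOS: `cosine_similarity` is a hypothetical helper, and it assumes each embedding has already been converted to a plain list of floats (the return type of `encode_text` is not specified in this README).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

Values close to 1 indicate vectors pointing in nearly the same direction, which for CLIP-style embeddings is usually read as semantic similarity.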

📂 Directory Structure

├── docker         # Dockerfile for CPU/GPU servers
├── docs           # mkdocs documentation
├── examples       # example guides, jupyter notebooks, demos
├── makefiles      # makefiles for building/testing
├── nos
│   ├── cli        # CLI (hub, system)
│   ├── client     # gRPC / REST client
│   ├── common     # common utilities
│   ├── executors  # runtime executor (i.e. Ray)
│   ├── hub        # hub utilities
│   ├── managers   # model manager / multiplexer
│   ├── models     # model zoo
│   ├── proto      # protobuf defs for NOS gRPC service
│   ├── server     # server backend (gRPC)
│   └── test       # pytest utilities
├── requirements   # requirement extras (server, docs, tests)
├── scripts        # basic scripts
└── tests          # pytests (client, server, benchmark)

🛣 Roadmap

HW / Cloud Support

  • Commodity GPUs

    • NVIDIA GPUs (20XX, 30XX, 40XX)
    • AMD GPUs (RX 7000)
  • Cloud GPUs

    • NVIDIA (H100, A100, A10G, A30, T4, L4)
    • AMD (MI200, MI250)
  • Cloud Service Providers (via SkyPilot)

    • AWS, GCP, Azure
    • Opinionated Cloud: Lambda Labs, RunPod, etc
  • Cloud ASICs

📄 License

This project is licensed under the Apache-2.0 License.

📡 Telemetry

NOS collects anonymous usage data using Sentry. This helps us understand how the community is using NOS and prioritize features. You can opt out of telemetry by setting the environment variable NOS_TELEMETRY_ENABLED=0.
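
When NOS is driven from a Python script, the opt-out can be set in code. This is a minimal sketch under one assumption not stated in this README: that the variable is read when NOS is imported or started, so it needs to be set beforehand.

```python
import os

# Assumption: NOS reads this variable at import/start-up time, so it is set
# before any `import nos` statement in the same process.
os.environ["NOS_TELEMETRY_ENABLED"] = "0"
```

This only affects the current process and any child processes it launches; a server running in a separate container would need the variable passed into that container's environment.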

🤝 Contributing

We welcome contributions! Please see our contributing guide for more information.
