
SDK for NVIDIA services

Project description

NVIDIA SERVICES

NVIDIA recently announced NVIDIA NIM, which offers optimized inference microservices for deploying AI models at scale. Together with the NeMo services, NIM makes it possible to develop and deploy RAG-based applications in production quickly.

This package is the Python SDK for those services. The idea is that you write only a few lines of code to build applications with NVIDIA services; the boilerplate code lives in the SDK.

How to use the SDK

The SDK is published on PyPI. To install it, run:

pip install nvidia-services
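The examples below load the API key from a local .env file via python-dotenv. A minimal setup might look like the following; the variable name NVIDIA_API_KEY is the one the snippets expect, and the key value shown is a placeholder.

```shell
# Store the key in a .env file next to your script (do not commit this file)
echo 'NVIDIA_API_KEY=nvapi-your-key-here' > .env
```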

Two services are currently part of the SDK:

  1. Embedding
  2. Reranking

Example code for embedding

import os
from dotenv import load_dotenv
from nvidia_services.embeddings.nvidia_embeddings import NVIDIAEmbeddings

# Load the NVIDIA API key from a local .env file
load_dotenv()
NVIDIA_API_KEY = os.getenv("NVIDIA_API_KEY")
nv = NVIDIAEmbeddings(api_key=NVIDIA_API_KEY)

# One embedding vector is returned per input string
embeddings = nv.create_embed(["hello, how are you", "I am fine"])
for embedding in embeddings:
    print(embedding)
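A common next step is comparing the returned vectors, for example with cosine similarity. The helper below is a standalone sketch that assumes create_embed returns plain sequences of floats; it is demonstrated on small hand-made vectors so it runs without an API key.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# With real embeddings you would pass two vectors from nv.create_embed(...);
# hand-made vectors keep the example self-contained.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```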

Example code for reranker

import os
from dotenv import load_dotenv
from nvidia_services.retrievals.nvidia_reranker import NVDIARerankerMistral

load_dotenv()
NVIDIA_API_KEY = os.getenv("NVIDIA_API_KEY")
nv = NVDIARerankerMistral(api_key=NVIDIA_API_KEY)

# Candidate passages for the reranker to score against the query
passages = [
    {
      "text": "The Hopper GPU is paired with the Grace CPU using NVIDIA's ultra-fast chip-to-chip interconnect, delivering 900GB/s of bandwidth, 7X faster than PCIe Gen5. This innovative design will deliver up to 30X higher aggregate system memory bandwidth to the GPU compared to today's fastest servers and up to 10X higher performance for applications running terabytes of data."
    },
    {
      "text": "A100 provides up to 20X higher performance over the prior generation and can be partitioned into seven GPU instances to dynamically adjust to shifting demands. The A100 80GB debuts the world's fastest memory bandwidth at over 2 terabytes per second (TB/s) to run the largest models and datasets."
    },
    {
      "text": "Accelerated servers with H100 deliver the compute power—along with 3 terabytes per second (TB/s) of memory bandwidth per GPU and scalability with NVLink and NVSwitch™."
    }
  ]
# return_context picks the passage most relevant to the query
query = "What is the GPU memory bandwidth of H100 SXM?"
result = nv.return_context(input=query, passages=passages)
print(result)
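Conceptually, reranking scores every passage against the query and returns the best-scoring one. The standalone sketch below illustrates that selection step with made-up scores; with the SDK, the scores are assigned server-side by the reranking model, and the helper names here are illustrative only.

```python
# Hypothetical relevance scores for the H100 question; a real reranking
# model computes these from the query/passage pair.
scored = [
    {"text": "Grace Hopper chip-to-chip interconnect ...", "score": 1.2},
    {"text": "A100 80GB memory bandwidth ...", "score": 2.7},
    {"text": "H100 with 3 TB/s of memory bandwidth per GPU ...", "score": 9.4},
]

# Reranking reorders the passages by relevance, so the best context
# for the query comes first.
ranked = sorted(scored, key=lambda p: p["score"], reverse=True)
print(ranked[0]["text"])
```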

Example code to call the Mistral model

import os
from dotenv import load_dotenv
from nvidia_services.models.mistralai_models import MistralAIModels

load_dotenv()
NVIDIA_API_KEY = os.getenv("NVIDIA_API_KEY")
mistral = MistralAIModels(api_key=NVIDIA_API_KEY)

prompt = "Where is the Taj Mahal?"
# The response is streamed back chunk by chunk
result = mistral.generate_response(prompt=prompt)

for chunk in result:
    print(chunk.choices[0].delta.content)
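Instead of printing each chunk, you often want the full answer as one string. The sketch below assumes the streamed chunks follow the OpenAI-style shape used above (chunk.choices[0].delta.content, possibly None on the final chunk) and uses stand-in objects so it runs without the service.

```python
from types import SimpleNamespace

def make_chunk(text):
    """Stand-in for a streamed chunk with the shape used above."""
    delta = SimpleNamespace(content=text)
    choice = SimpleNamespace(delta=delta)
    return SimpleNamespace(choices=[choice])

# A fake stream; with the SDK this would be mistral.generate_response(...)
stream = [make_chunk("The Taj Mahal "), make_chunk("is in Agra, India."), make_chunk(None)]

# Concatenate the text deltas, treating a None delta as empty
answer = "".join(c.choices[0].delta.content or "" for c in stream)
print(answer)
```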


