
A Python SDK for Inference Gateway

Project description

Inference Gateway Python SDK

An SDK written in Python for the Inference Gateway.

Installation

pip install inference-gateway

Usage

Creating a Client

from inference_gateway.client import InferenceGatewayClient, Provider

client = InferenceGatewayClient("http://localhost:8080")

# With an authentication token (optional)
client = InferenceGatewayClient("http://localhost:8080", token="your-token")
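Hard-coding the token works for quick tests, but in practice you may prefer to read it from the environment. A minimal sketch; the `INFERENCE_GATEWAY_URL` and `INFERENCE_GATEWAY_TOKEN` variable names are illustrative, not an SDK convention:

```python
import os

def client_config(env=os.environ):
    """Resolve the gateway URL and optional auth token from the
    environment, falling back to the local default used above.
    The variable names here are an assumption, not part of the SDK."""
    url = env.get("INFERENCE_GATEWAY_URL", "http://localhost:8080")
    token = env.get("INFERENCE_GATEWAY_TOKEN")  # None means no auth
    return url, token

url, token = client_config()
# client = InferenceGatewayClient(url, token=token)
```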

Listing Models

To list all available models from all providers, use the list_models method:

models = client.list_models()
print("Available models: ", models)

List Provider's Models

To list available models for a specific provider, use the list_provider_models method:

models = client.list_provider_models(Provider.OPENAI)
print("Available OpenAI models: ", models)

Generating Content

To generate content using a model, use the generate_content method:

from inference_gateway.client import Provider, Role, Message

messages = [
    Message(
      Role.SYSTEM, 
      "You are an helpful assistant"
    ),
    Message(
      Role.USER, 
      "Hello!"
    ),
]

response = client.generate_content(
    provider=Provider.OPENAI,
    model="gpt-4",
    messages=messages
)
print("Assistant: ", response["response"]["content"])

Streaming Content

To stream content using a model, use the generate_content_stream method:

from inference_gateway.client import Provider, Role, Message

messages = [
    Message(
      Role.SYSTEM, 
      "You are an helpful assistant"
    ),
    Message(
      Role.USER, 
      "Hello!"
    ),
]

# Use SSE for streaming
for response in client.generate_content_stream(
    provider=Provider.OLLAMA,
    model="llama2",
    messages=messages,
    use_sse=True
):
    print("Event: ", response["event"])
    print("Assistant: ", response["data"]["content"])

# Or raw JSON response
for response in client.generate_content_stream(
    provider=Provider.GROQ,
    model="deepseek-r1",
    messages=messages,
    use_sse=False
):
    print("Assistant: ", response.content)

Health Check

To check the health of the API, use the health_check method:

is_healthy = client.health_check()
print("API Status: ", "Healthy" if is_healthy else "Unhealthy")

License

This SDK is distributed under the MIT License; see LICENSE for more information.

Project details


Download files

Download the file for your platform.

Source Distribution

inference_gateway-0.3.0.tar.gz (7.2 kB)

Uploaded Source

Built Distribution


inference_gateway-0.3.0-py3-none-any.whl (6.2 kB)

Uploaded Python 3

File details

Details for the file inference_gateway-0.3.0.tar.gz.

File metadata

  • Download URL: inference_gateway-0.3.0.tar.gz
  • Upload date:
  • Size: 7.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for inference_gateway-0.3.0.tar.gz
Algorithm Hash digest
SHA256 1d9ca57be531e688b72a3be93753753240c2d4964a42c764c4c35b894acb752a
MD5 517cde50c8cdf01dcca091de96747ace
BLAKE2b-256 500cbb512adb6a1fc1e624a5e28b6de8ab5b1c551548c1246f003b9e09540d81


File details

Details for the file inference_gateway-0.3.0-py3-none-any.whl.

File metadata

File hashes

Hashes for inference_gateway-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 db30f1260ab2351d13d62cf896227ec39d02d3a64c3ab6e80b39b5e500829c53
MD5 6a5147f978ba91f79a11bbbd66131872
BLAKE2b-256 8387af45792272012e6c1b8c6bb209b3dec261e58c45930984a2af205fedfb4e

