LangChain integration package for RunPod Serverless AI endpoints.

langchain-runpod

langchain-runpod integrates RunPod Serverless endpoints with LangChain.

It allows you to interact with custom large language models (LLMs) and chat models deployed on RunPod's cost-effective and scalable GPU infrastructure directly within your LangChain applications.

This package provides:

  • RunPod: For interacting with standard text-completion models.
  • ChatRunPod: For interacting with conversational chat models.

Installation

pip install -U langchain-runpod

Authentication

To use this integration, you need a RunPod API key.

  1. Obtain your API key from the RunPod API Keys page.
  2. Set it as an environment variable:
export RUNPOD_API_KEY="your-runpod-api-key"
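
If you prefer to set the key from within Python (for example in a notebook), here is a minimal sketch using only the standard library:

import getpass
import os

# Prompt for the key only if it is not already present in the environment.
if "RUNPOD_API_KEY" not in os.environ:
    os.environ["RUNPOD_API_KEY"] = getpass.getpass("RunPod API key: ")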

Alternatively, you can pass the api_key directly when initializing the RunPod or ChatRunPod classes, though using environment variables is recommended for security.
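
For completeness, a minimal sketch of passing the key explicitly (the endpoint ID is a placeholder):

from langchain_runpod import ChatRunPod

chat = ChatRunPod(
    endpoint_id="your-endpoint-id",  # Replace with your actual Endpoint ID
    api_key="your-runpod-api-key",   # Prefer the RUNPOD_API_KEY environment variable
)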

Basic Usage

You will also need the Endpoint ID for your deployed RunPod Serverless endpoint. Find this in the RunPod console under Serverless -> Endpoints.

LLM (RunPod)

Use the RunPod class for standard LLM interactions (text completion).

import os
from langchain_runpod import RunPod

# Ensure API key is set (or pass it as api_key="...")
# os.environ["RUNPOD_API_KEY"] = "your-runpod-api-key"

llm = RunPod(
    endpoint_id="your-endpoint-id", # Replace with your actual Endpoint ID
    model_name="runpod-llm", # Optional: For metadata
    temperature=0.7,
    max_tokens=100,
)

# Synchronous call
prompt = "What is the capital of France?"
response = llm.invoke(prompt)
print(f"Sync Response: {response}")

# Async call
# response_async = await llm.ainvoke(prompt)
# print(f"Async Response: {response_async}")

# Streaming (Simulated)
# print("Streaming Response:")
# for chunk in llm.stream(prompt):
#     print(chunk, end="", flush=True)
# print()

Chat Model (ChatRunPod)

Use the ChatRunPod class for conversational interactions.

import os
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_runpod import ChatRunPod

# Ensure API key is set (or pass it as api_key="...")
# os.environ["RUNPOD_API_KEY"] = "your-runpod-api-key"

chat = ChatRunPod(
    endpoint_id="your-endpoint-id", # Replace with your actual Endpoint ID
    model_name="runpod-chat", # Optional: For metadata
    temperature=0.7,
    max_tokens=256,
)

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What are the planets in our solar system?"),
]

# Synchronous call
response = chat.invoke(messages)
print(f"Sync Response:\n{response.content}")

# Async call
# response_async = await chat.ainvoke(messages)
# print(f"Async Response:\n{response_async.content}")

# Streaming (Simulated)
# print("Streaming Response:")
# for chunk in chat.stream(messages):
#     print(chunk.content, end="", flush=True)
# print()

Features and Limitations

API Interaction

  • Asynchronous Execution: RunPod Serverless endpoints are inherently asynchronous. This integration handles the underlying polling mechanism for the /run and /status/{job_id} endpoints automatically for both RunPod and ChatRunPod classes.
  • Synchronous Endpoint: While RunPod offers a /runsync endpoint, this integration primarily uses the asynchronous /run -> /status flow for better compatibility and handling of potentially long-running jobs. Polling parameters (poll_interval, max_polling_attempts) can be configured during initialization.
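
As a rough illustration, the polling behaviour can be tuned when the model is constructed; the values below are arbitrary examples, and the defaults shipped with the package may differ:

from langchain_runpod import RunPod

llm = RunPod(
    endpoint_id="your-endpoint-id",  # Replace with your actual Endpoint ID
    poll_interval=1.0,               # Seconds to wait between /status checks
    max_polling_attempts=120,        # Give up after this many checks
)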

Feature Support

The level of support for advanced LLM features depends heavily on the specific model and handler deployed on your RunPod endpoint. The RunPod API itself provides a generic interface.

  • Core Invoke/Gen (✅ Supported): Basic text generation and chat conversations work as expected (sync & async).
  • Streaming (⚠️ Simulated): The .stream() and .astream() methods work by getting the full response first and then yielding it chunk by chunk. True token-level streaming requires a WebSocket-enabled RunPod endpoint handler.
  • Tool Calling (↔️ Endpoint Dependent): No built-in support via standardized RunPod API parameters. Depends entirely on the endpoint handler interpreting tool descriptions/schemas passed in the input. Standard tests skipped.
  • Structured Output (↔️ Endpoint Dependent): No built-in support via standardized RunPod API parameters. Depends on the endpoint handler's ability to generate structured formats (e.g., JSON) based on input instructions. Standard tests skipped.
  • JSON Mode (↔️ Endpoint Dependent): No dedicated response_format parameter at the RunPod API level. Depends on the endpoint handler. Standard tests skipped.
  • Token Usage (❌ Not Available): The RunPod API does not provide standardized token usage fields. Usage metadata tests are marked xfail. Any token info must come from the endpoint handler's custom output.
  • Logprobs (❌ Not Available): The RunPod API does not provide logprobs.
  • Image Input (↔️ Endpoint Dependent): Standard tests pass, likely by adapting image URLs/data. Actual support depends on the endpoint handler.

Important Notes

  1. Endpoint Handler: Ensure your RunPod endpoint runs a compatible LLM server (e.g., vLLM, TGI, FastChat, text-generation-webui) that accepts standard inputs (like prompt or messages) and returns text output in a common format (direct string, or a dictionary containing keys like text, content, output, choices, etc.). The integration attempts to parse common formats, but custom handlers might require modifications to the parsing logic (e.g., overriding _process_response).
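
If your handler returns an output shape that the built-in parsing does not recognize, one option is to subclass the model and override _process_response. The sketch below is illustrative only: it assumes a handler that nests the generated text under output.generated_text, and it assumes _process_response receives the raw response dictionary; check the installed package source for the exact signature before relying on it.

from langchain_runpod import RunPod

class CustomHandlerRunPod(RunPod):
    """RunPod LLM with parsing adapted to a hypothetical custom handler."""

    def _process_response(self, response):
        # Hypothetical handler output: {"output": {"generated_text": "..."}}
        output = response.get("output") if isinstance(response, dict) else None
        if isinstance(output, dict) and "generated_text" in output:
            return output["generated_text"]
        # Fall back to the package's default parsing of common formats.
        return super()._process_response(response)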

Setting Up a RunPod Endpoint

  1. Go to RunPod Serverless in your RunPod console.
  2. Click "New Endpoint".
  3. Select a GPU and a suitable template (e.g., a template running vLLM, TGI, FastChat, or text-generation-webui with your desired model).
  4. Configure settings (such as FlashBoot or a custom container image if needed) and deploy.
  5. Once active, copy the Endpoint ID for use with this library.

For more details, refer to the RunPod Serverless Documentation.
