
LlamaIndex Retrievers Integration: DigitalOcean Gradient


Native LlamaIndex retriever integration for DigitalOcean Gradient Knowledge Base as a Service (KBaaS). This package bridges Gradient's knowledge base retrieval and the LlamaIndex ecosystem.

Features

  • 🔌 Native LlamaIndex Integration - Works seamlessly with RetrieverQueryEngine and other LlamaIndex components
  • 📦 Automatic Format Conversion - Converts Gradient KB results to NodeWithScore objects
  • 🎯 Preserves Metadata - Maintains document IDs, chunk IDs, sources, and relevance scores
  • ⚡ Async Support - Full support for both synchronous and asynchronous retrieval
  • 🔄 Simple API - Clean, intuitive interface following LlamaIndex patterns

Installation

pip install llama-index-retrievers-digitalocean-gradientai

Quick Start

Basic Usage

from llama_index.retrievers.digitalocean.gradientai import GradientKBRetriever

# Initialize retriever
retriever = GradientKBRetriever(
    knowledge_base_id="kb-your-uuid-here",
    api_token="your-digitalocean-access-token",  # DIGITALOCEAN_ACCESS_TOKEN
    num_results=5
)

# Direct retrieval
nodes = retriever.retrieve("What is machine learning?")

# Access results
for node in nodes:
    print(f"Score: {node.score}")
    print(f"Content: {node.node.text}")
    print(f"Metadata: {node.node.metadata}")

Integration with Query Engine

from llama_index.retrievers.digitalocean.gradientai import GradientKBRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.llms.digitalocean.gradientai import GradientAI

# Initialize retriever
retriever = GradientKBRetriever(
    knowledge_base_id="kb-your-uuid-here",
    api_token="your-digitalocean-access-token",  # DIGITALOCEAN_ACCESS_TOKEN
    num_results=5
)

# Initialize LLM (optional - for response generation)
llm = GradientAI(
    model="llama3.3-70b-instruct",
    model_access_key="your-model-access-key"  # MODEL_ACCESS_KEY
)

# Create query engine
query_engine = RetrieverQueryEngine.from_args(
    retriever=retriever,
    llm=llm
)

# Query with automatic retrieval + response generation
response = query_engine.query("Explain quantum computing")
print(response)

Async Usage

import asyncio
from llama_index.core import QueryBundle

async def async_retrieve():
    retriever = GradientKBRetriever(
        knowledge_base_id="kb-your-uuid-here",
        api_token="your-digitalocean-access-token"  # DIGITALOCEAN_ACCESS_TOKEN
    )

    query = QueryBundle(query_str="What are neural networks?")
    nodes = await retriever.aretrieve(query)

    return nodes

nodes = asyncio.run(async_retrieve())

Configuration Options

| Parameter           | Type    | Default  | Description                                               |
|---------------------|---------|----------|-----------------------------------------------------------|
| `knowledge_base_id` | `str`   | Required | Gradient Knowledge Base UUID                              |
| `api_token`         | `str`   | Required | DigitalOcean access token (`DIGITALOCEAN_ACCESS_TOKEN`)   |
| `num_results`       | `int`   | `5`      | Number of results to retrieve                             |
| `base_url`          | `str`   | `None`   | Custom API base URL (optional)                            |
| `timeout`           | `float` | `60.0`   | Request timeout in seconds                                |
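In practice the `api_token` usually comes from the `DIGITALOCEAN_ACCESS_TOKEN` environment variable rather than being hard-coded. A minimal resolution helper might look like this (the `resolve_token` helper is illustrative, not part of the package):

```python
import os

def resolve_token(explicit=None):
    """Prefer an explicitly passed token; otherwise read DIGITALOCEAN_ACCESS_TOKEN."""
    token = explicit or os.environ.get("DIGITALOCEAN_ACCESS_TOKEN")
    if not token:
        raise ValueError("Pass api_token or set DIGITALOCEAN_ACCESS_TOKEN")
    return token
```

You would then pass `api_token=resolve_token()` when constructing `GradientKBRetriever`, keeping secrets out of source code.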

Why Use This Instead of Manual SDK Calls?

Before (Manual SDK Integration):

# ❌ Manual approach - lots of boilerplate
response = gradient_client.retrieve.documents(
    knowledge_base_id=kb_id,
    num_results=5,
    query=query
)

# Extract text manually
docs = [result.text_content for result in response.results
        if hasattr(result, 'text_content')]

# ❌ Loses scores, metadata, and can't use with LlamaIndex components

After (Native Retriever):

# ✅ Clean, native integration
retriever = GradientKBRetriever(knowledge_base_id=kb_id, api_token=token)
nodes = retriever.retrieve(query)

# ✅ Full NodeWithScore objects with metadata and scores
# ✅ Works with all LlamaIndex retrieval patterns
# ✅ Supports re-ranking, filtering, composition

What Gets Preserved

The retriever automatically captures and preserves:

  • Text Content - The retrieved document/chunk text
  • Relevance Score - Similarity/relevance score from Gradient
  • Document ID - Source document identifier
  • Chunk ID - Specific chunk identifier
  • Source - Document source/origin
  • Custom Metadata - Any additional metadata from Gradient
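To illustrate the mapping, a raw result can be flattened into these preserved fields roughly as follows. Note the raw result shape used here is an assumption for illustration, not the documented Gradient wire format:

```python
def to_preserved_fields(result):
    """Map a raw KB result (assumed dict shape) to the fields the retriever keeps."""
    known = {"text_content", "score", "document_id", "chunk_id", "source"}
    return {
        "text": result.get("text_content", ""),
        "score": result.get("score"),
        "metadata": {
            "document_id": result.get("document_id"),
            "chunk_id": result.get("chunk_id"),
            "source": result.get("source"),
            # Any remaining keys ride along as custom metadata.
            **{k: v for k, v in result.items() if k not in known},
        },
    }
```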

Advanced Usage

Combining with Other Retrievers

from llama_index.core.retrievers import BaseRetriever

class HybridGradientRetriever(BaseRetriever):
    """Combine Gradient KB with another retriever."""

    def __init__(self, gradient_retriever, other_retriever):
        self.gradient = gradient_retriever
        self.other = other_retriever
        super().__init__()

    def _retrieve(self, query_bundle):
        gradient_nodes = self.gradient.retrieve(query_bundle)
        other_nodes = self.other.retrieve(query_bundle)
        # Combine, deduplicate, rerank...
        return gradient_nodes + other_nodes
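The "combine, deduplicate, rerank" step left as a comment can be sketched in plain Python. Here results are represented as dicts with a `score` and a `chunk_id` in their metadata, which is an assumption made for illustration:

```python
def merge_and_rank(primary, secondary, key=lambda n: n["metadata"]["chunk_id"]):
    """Concatenate two result lists, drop duplicates by key, sort by score descending."""
    seen, merged = set(), []
    for node in list(primary) + list(secondary):
        k = key(node)
        if k in seen:
            continue  # keep the first occurrence only
        seen.add(k)
        merged.append(node)
    return sorted(merged, key=lambda n: n.get("score") or 0.0, reverse=True)
```

The same idea applies to `NodeWithScore` objects by keying on the node ID instead of a metadata field.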

Using with Callbacks/Tracing

from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler

debug_handler = LlamaDebugHandler()
callback_manager = CallbackManager([debug_handler])

retriever = GradientKBRetriever(
    knowledge_base_id="kb-uuid",
    api_token="token",
    callback_manager=callback_manager
)

nodes = retriever.retrieve("query")
# View retrieval events in debug_handler

Requirements

  • Python 3.8+
  • llama-index-core>=0.10.0
  • gradient>=3.8.0

Development

# Clone repository
git clone https://github.com/digitalocean/llama-index-retrievers-digitalocean-gradientai
cd llama-index-retrievers-digitalocean-gradientai

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black .
ruff check . --fix

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built with ❤️ for the LlamaIndex and DigitalOcean communities.
