Python SDK for the Next Plaid ColBERT Search API

Next Plaid Client

A Python client library for the Next Plaid ColBERT Search API.

Installation

pip install next-plaid-client

Or install from source:

cd python-sdk
pip install -e .

For development (includes test dependencies):

pip install -e ".[dev]"

Quick Start

from next_plaid_client import NextPlaidClient, IndexConfig, SearchParams

# Create a client
client = NextPlaidClient("http://localhost:8080")

# Check server health
health = client.health()
print(f"Server status: {health.status}")
print(f"Loaded indices: {health.loaded_indices}")

# Create an index
config = IndexConfig(nbits=4, max_documents=10000)
client.create_index("my_index", config)

# Add documents with pre-computed embeddings
documents = [{"embeddings": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]}]
metadata = [{"title": "Document 1", "category": "science"}]
client.add_documents("my_index", documents, metadata)

# Or use text encoding (requires model loaded on server)
client.update_documents_with_encoding(
    "my_index",
    documents=["Paris is the capital of France."],
    metadata=[{"title": "Geography"}]
)

# Search with embeddings
results = client.search(
    "my_index",
    queries=[[[0.1, 0.2, 0.3]]],
    params=SearchParams(top_k=10)
)

# Or search with text (requires model)
results = client.search_with_encoding(
    "my_index",
    queries=["What is the capital of France?"],
    params=SearchParams(top_k=5)
)

# Print results
for result in results.results:
    for doc_id, score in zip(result.document_ids, result.scores):
        print(f"Document {doc_id}: {score:.4f}")
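The Quick Start passes each document as a dictionary holding a matrix of token vectors, and each query as a matrix of its own, which gives the triple nesting in `queries`. A minimal sketch of helpers that build and sanity-check those shapes before sending them is shown below. The names `to_document` and `to_queries` are hypothetical and not part of the client library; the only thing taken from the example above is the `{"embeddings": ...}` layout.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def to_document(token_vectors: Sequence[Sequence[float]]) -> Dict[str, Matrix]:
    """Wrap one document's token vectors in the {"embeddings": ...} layout."""
    rows = [[float(x) for x in vec] for vec in token_vectors]
    if not rows:
        raise ValueError("a document needs at least one token vector")
    dim = len(rows[0])
    if dim == 0 or any(len(row) != dim for row in rows):
        raise ValueError("all token vectors must share the same non-zero dimension")
    return {"embeddings": rows}


def to_queries(queries: Sequence[Sequence[Sequence[float]]]) -> List[Matrix]:
    """Validate each query matrix; the result is a list of matrices."""
    return [to_document(query)["embeddings"] for query in queries]


# Same values as in the Quick Start
documents = [to_document([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])]
queries = to_queries([[[0.1, 0.2, 0.3]]])
```

Checking dimensions on the client side is optional; it only turns a malformed payload into an early local error.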

API Reference

Client Initialization

client = NextPlaidClient(
    base_url="http://localhost:8080",  # API server URL
    timeout=30.0,                       # Request timeout in seconds
    headers={"Authorization": "..."}    # Optional headers
)

Health & Monitoring

health = client.health()
# Returns: HealthResponse with status, version, loaded_indices, indices info

Index Management

# List all indices
indices = client.list_indices()  # Returns: List[str]

# Get index info
info = client.get_index("my_index")  # Returns: IndexInfo

# Create index
config = IndexConfig(nbits=4, max_documents=10000)
client.create_index("my_index", config)

# Update index config
client.update_index_config("my_index", max_documents=5000)

# Delete index
client.delete_index("my_index")

Document Management

from next_plaid_client import Document

# Add documents with embeddings
doc = Document(embeddings=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
client.add_documents("my_index", [doc], metadata=[{"title": "Doc 1"}])

# Update documents (batched)
client.update_documents("my_index", documents=[...], metadata=[...])

# Update with text encoding (requires model)
client.update_documents_with_encoding(
    "my_index",
    documents=["Text to encode..."],
    metadata=[{"title": "Doc"}]
)

# Delete documents
result = client.delete_documents("my_index", document_ids=[5, 10, 15])
# Returns: DeleteDocumentsResponse with deleted count
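For a large corpus it may be convenient to split documents and metadata into aligned chunks and send one request per chunk. The sketch below is only an assumption about how one might do that on the client side; `batched` is a hypothetical helper, and whether `update_documents` needs or benefits from this is not stated here.

```python
from typing import Any, Iterator, List, Sequence, Tuple


def batched(
    documents: Sequence[Any],
    metadata: Sequence[Any],
    batch_size: int,
) -> Iterator[Tuple[List[Any], List[Any]]]:
    """Yield aligned (documents, metadata) chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if len(documents) != len(metadata):
        raise ValueError("documents and metadata must have the same length")
    for start in range(0, len(documents), batch_size):
        end = start + batch_size
        yield list(documents[start:end]), list(metadata[start:end])


# Possible usage with the client shown above (not executed here):
# for docs, meta in batched(all_documents, all_metadata, batch_size=256):
#     client.update_documents("my_index", documents=docs, metadata=meta)
```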

Search Operations

from next_plaid_client import SearchParams

params = SearchParams(
    top_k=10,           # Number of results per query
    n_ivf_probe=8,      # IVF cells to probe
    n_full_scores=4096  # Number of candidate documents to re-rank
)

# Search with embeddings
results = client.search("my_index", queries=[...], params=params)

# Search within a subset
results = client.search("my_index", queries=[...], subset=[0, 5, 10])

# Search with metadata filter
results = client.search_filtered(
    "my_index",
    queries=[...],
    filter_condition="category = ? AND score > ?",
    filter_parameters=["science", 90]
)

# Search with text queries (requires model)
results = client.search_with_encoding("my_index", queries=["query text"])

# Filtered search with text (requires model)
results = client.search_filtered_with_encoding(
    "my_index",
    queries=["query text"],
    filter_condition="category = ?",
    filter_parameters=["science"]
)
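The filter examples above pair a condition string containing `?` placeholders with a list of parameters in the same order. A small sketch of building such a pair from a dictionary of equality constraints follows. `equality_filter` is a hypothetical helper, not a library function, and it covers only `AND`-joined equality checks.

```python
from typing import Any, List, Mapping, Tuple


def equality_filter(fields: Mapping[str, Any]) -> Tuple[str, List[Any]]:
    """Build ("a = ? AND b = ?", [value_a, value_b]) from {"a": ..., "b": ...}."""
    if not fields:
        raise ValueError("at least one field is required")
    for name in fields:
        # Column names are interpolated into the string, so restrict them
        # to plain identifiers; values always travel as parameters.
        if not name.isidentifier():
            raise ValueError(f"invalid field name: {name!r}")
    condition = " AND ".join(f"{name} = ?" for name in fields)
    return condition, list(fields.values())


condition, parameters = equality_filter({"category": "science", "year": 2024})
# Possible usage (not executed here):
# client.search_filtered("my_index", queries=[...],
#                        filter_condition=condition,
#                        filter_parameters=parameters)
```

Keeping values out of the condition string and passing them as parameters avoids quoting problems with user-supplied input.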

Metadata Management

# Get all metadata
metadata = client.get_metadata("my_index")

# Add metadata
client.add_metadata("my_index", metadata=[{"title": "Doc 1"}])

# Get metadata count
count = client.get_metadata_count("my_index")

# Check which documents exist
check = client.check_metadata("my_index", document_ids=[0, 5, 10])

# Query metadata with SQL conditions
result = client.query_metadata(
    "my_index",
    condition="category = ?",
    parameters=["science"]
)

# Get metadata by IDs or condition
metadata = client.get_metadata_by_ids("my_index", document_ids=[0, 5])

Text Encoding

# Encode texts (requires model)
result = client.encode(
    texts=["Hello world", "Test document"],
    input_type="document"  # or "query"
)
# Returns: EncodeResponse with embeddings
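As background on why every text becomes a list of vectors rather than a single vector: ColBERT-style retrieval scores a query against a document by late interaction, taking for each query token the best-matching document token and summing those maxima. The sketch below illustrates that scoring rule in plain Python. `maxsim` is a hypothetical name, and this is a conceptual illustration only; the server presumably relies on its compressed index and approximations rather than an exhaustive computation like this.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query: Sequence[Vector], document: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best dot product with any document token."""
    return sum(max(dot(q, d) for d in document) for q in query)


query_tokens = [[1.0, 0.0], [0.0, 1.0]]
document_tokens = [[1.0, 0.0], [0.5, 0.5]]
score = maxsim(query_tokens, document_tokens)  # 1.0 + 0.5 = 1.5
```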

Exception Handling

from next_plaid_client import (
    NextPlaidError,
    IndexNotFoundError,
    IndexExistsError,
    ValidationError,
    RateLimitError,
    ModelNotLoadedError,
)

try:
    client.get_index("nonexistent")
except IndexNotFoundError as e:
    print(f"Index not found: {e.message}")
except ValidationError as e:
    print(f"Invalid request: {e.message}")
except RateLimitError as e:
    print(f"Rate limited: {e.message}")
except ModelNotLoadedError as e:
    print(f"Model not loaded: {e.message}")
except NextPlaidError as e:
    print(f"API error: {e.message} (code: {e.code})")
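Since rate limiting surfaces as a dedicated exception, one common pattern is to retry the call with exponential backoff. The sketch below is a generic helper under that assumption. `call_with_retry` is a hypothetical name, and the exception type is passed in as an argument so the block runs without the library installed; in real use you would pass `RateLimitError`.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Type[Exception],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on retry_on with delays base_delay, 2x, 4x, ..."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


# Possible usage (not executed here):
# info = call_with_retry(lambda: client.get_index("my_index"), RateLimitError)
```

Other exceptions propagate immediately, so errors such as a missing index are not retried.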

Context Manager

with NextPlaidClient("http://localhost:8080") as client:
    health = client.health()
    # Session is automatically closed when exiting the context

Async Client

The library also provides an async client for use with asyncio:

import asyncio
from next_plaid_client import AsyncNextPlaidClient, IndexConfig, SearchParams

async def main():
    # Using async context manager (recommended)
    async with AsyncNextPlaidClient("http://localhost:8080") as client:
        # Check server health
        health = await client.health()
        print(f"Server status: {health.status}")

        # Create an index
        config = IndexConfig(nbits=4, max_documents=10000)
        await client.create_index("my_index", config)

        # Search with text queries (requires model)
        results = await client.search_with_encoding(
            "my_index",
            queries=["What is the capital of France?"],
            params=SearchParams(top_k=5)
        )

        for result in results.results:
            for doc_id, score in zip(result.document_ids, result.scores):
                print(f"Document {doc_id}: {score:.4f}")

# Run the async function
asyncio.run(main())

Async Methods

AsyncNextPlaidClient exposes the same methods as NextPlaidClient, with the same signatures; each call returns a coroutine that must be awaited:

# Health & Monitoring
health = await client.health()

# Index Management
indices = await client.list_indices()
info = await client.get_index("my_index")
await client.create_index("my_index", config)
await client.delete_index("my_index")
await client.update_index_config("my_index", max_documents=5000)

# Document Management
await client.add_documents("my_index", documents, metadata)
await client.update_documents("my_index", documents)
await client.update_documents_with_encoding("my_index", ["text..."])
result = await client.delete_documents("my_index", [0, 1, 2])

# Search
results = await client.search("my_index", queries)
results = await client.search_filtered("my_index", queries, "category = ?", ["science"])
results = await client.search_with_encoding("my_index", ["query text"])
results = await client.search_filtered_with_encoding("my_index", ["query"], "category = ?", ["science"])

# Metadata
metadata = await client.get_metadata("my_index")
await client.add_metadata("my_index", [{"title": "Doc 1"}])
count = await client.get_metadata_count("my_index")
check = await client.check_metadata("my_index", [0, 1, 2])
result = await client.query_metadata("my_index", "category = ?", ["science"])
metadata = await client.get_metadata_by_ids("my_index", document_ids=[0, 1])

# Encoding
result = await client.encode(["Hello world"], input_type="document")
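Because each async method returns a coroutine, several requests can be issued concurrently with `asyncio.gather`. The sketch below shows one way to do that while capping the number of requests in flight. `search_many` and its `limit` parameter are hypothetical; the only client call used is `search_with_encoding(index, queries=...)` as shown above, and the client is typed loosely so that any object with that method works.

```python
import asyncio
from typing import Any, List


async def search_many(
    client: Any,
    index: str,
    query_batches: List[List[str]],
    limit: int = 4,
) -> List[Any]:
    """Run one text search per batch concurrently, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(batch: List[str]) -> Any:
        async with semaphore:
            return await client.search_with_encoding(index, queries=batch)

    # gather preserves input order, so results line up with query_batches
    return await asyncio.gather(*(run_one(batch) for batch in query_batches))


# Possible usage (not executed here):
# async with AsyncNextPlaidClient("http://localhost:8080") as client:
#     all_results = await search_many(client, "my_index", [["q1"], ["q2", "q3"]])
```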

Running Tests

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=next_plaid_client --cov-report=html

License

Apache-2.0
