Official Python SDK for Vectorless — structure-preserving document retrieval without embeddings

vectorless-sdk

Official Python SDK for Vectorless — structure-preserving document retrieval.


Install

pip install vectorless-sdk

Quick Start

from vectorless import VectorlessClient

# Deployed instance with API key
client = VectorlessClient(
    base_url="https://api.vectorless.dev",
    api_key="vl_live_...",
)

# Self-hosted, no auth needed
# client = VectorlessClient(base_url="http://localhost:8080")

# 1. Ingest a document
result = client.ingest_document(
    "./research-paper.pdf",
    filename="research-paper.pdf",
    metadata={"department": "engineering"},
)

# 2. Wait for processing (parsing → summarizing → ready)
doc = client.wait_for_ready(
    result.document_id,
    on_progress=lambda s: print(f"Status: {s}"),
)

# 3. Explore the document tree
tree = client.get_document_tree(doc.id)
for section in tree.sections:
    print("  " * section.depth + f"{section.title} ({section.tokens} tokens)")

# 4. Query — LLM navigates the tree to find relevant sections
response = client.query(doc.id, "What methodology was used?")
for section in response.sections:
    print(f"\n## {section.title}\n{section.content}")
print(f"Strategy: {response.strategy} | {response.elapsed_ms}ms")
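To make the indentation in step 3 concrete, here is the same loop run over made-up data. OutlineSection is a hypothetical stand-in that carries only the three fields used above (title, depth, tokens), and the sketch assumes top-level sections have depth 0, which this README does not state.

```python
from dataclasses import dataclass


@dataclass
class OutlineSection:
    """Hypothetical stand-in for one entry of tree.sections."""
    title: str
    depth: int
    tokens: int


def render_outline(sections):
    """Return one indented line per section, two spaces per depth level."""
    return ["  " * s.depth + f"{s.title} ({s.tokens} tokens)" for s in sections]


# Made-up sample data, listed in document order.
sample = [
    OutlineSection("Introduction", 0, 420),
    OutlineSection("Background", 1, 310),
    OutlineSection("Methodology", 0, 1250),
]

for line in render_outline(sample):
    print(line)
```

Each nested section is shifted two spaces to the right of its parent, so the printed outline mirrors the document hierarchy.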

Async Support

Every method has an async counterpart:

import asyncio

from vectorless import AsyncVectorlessClient


async def main() -> None:
    async with AsyncVectorlessClient(
        base_url="https://api.vectorless.dev",
        api_key="vl_live_...",
    ) as client:
        result = await client.ingest_document(b"# Hello\n\nWorld", filename="hello.md")
        doc = await client.wait_for_ready(result.document_id)

        # Concurrent section fetch (the IDs here are placeholders)
        sections = await client.get_sections(["sec_1", "sec_2", "sec_3"])


asyncio.run(main())
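How the SDK performs the concurrent section fetch internally is not described here. As a rough illustration of the general pattern, the sketch below fans out several requests with asyncio.gather. fetch_section is a hypothetical stand-in that simulates a network call; it is not part of the SDK.

```python
import asyncio


async def fetch_section(section_id):
    """Hypothetical stand-in for a single-section request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"id": section_id, "content": f"content of {section_id}"}


async def fetch_sections(section_ids):
    """Start all requests at once; gather returns results in input order."""
    return await asyncio.gather(*(fetch_section(sid) for sid in section_ids))


sections = asyncio.run(fetch_sections(["sec_1", "sec_2", "sec_3"]))
print([s["id"] for s in sections])
```

Because the three coroutines run concurrently, the total wait is close to one simulated request rather than three.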

Transport Protocols

Choose the wire protocol at init time:

# HTTP/REST — default, uses httpx
client = VectorlessClient(transport="http")

# ConnectRPC — protobuf JSON encoding, native streaming
client = VectorlessClient(transport="connect")

┌──────────────────────────────────────────────────┐
│     VectorlessClient / AsyncVectorlessClient     │
├──────────────────────────────────────────────────┤
│              Transport Abstraction               │
│ ┌────────────────────┐ ┌───────────────────────┐ │
│ │ HttpTransport      │ │ ConnectTransport      │ │
│ │ AsyncHttpTransport │ │ AsyncConnectTransport │ │
│ │ REST/JSON          │ │ ConnectRPC JSON       │ │
│ │ SSE streaming      │ │ Native streaming      │ │
│ │ httpx              │ │ httpx                 │ │
│ └──────────┬─────────┘ └───────────┬───────────┘ │
│            │                       │             │
│    vectorless-server       vectorless-server     │
└──────────────────────────────────────────────────┘

Streaming Queries

Watch retrieval progress in real-time:

# Sync
for event in client.query_stream(doc_id, "Explain the results"):
    if event.type == "section_selected" and event.section:
        print(f"Found: {event.section.title}")
    if event.type == "completed":
        print(f"Done in {event.elapsed_ms}ms")

# Async (client is an AsyncVectorlessClient; run inside a coroutine)
async for event in client.query_stream(doc_id, "Explain the results"):
    ...
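
The event-handling loop can be tried without a server. The sketch below replays a made-up event sequence through the same checks. FakeEvent, FakeSection, and fake_stream are hypothetical stand-ins that carry only the fields used above (type, section.title, elapsed_ms); the real event objects may have more.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class FakeSection:
    """Hypothetical stand-in for a selected section."""
    title: str


@dataclass
class FakeEvent:
    """Hypothetical stand-in carrying only the fields used above."""
    type: str
    section: Optional[FakeSection] = None
    elapsed_ms: Optional[int] = None


def fake_stream() -> Iterator[FakeEvent]:
    """Yield a made-up event sequence in place of client.query_stream(...)."""
    yield FakeEvent("section_selected", section=FakeSection("Results"))
    yield FakeEvent("section_selected", section=FakeSection("Discussion"))
    yield FakeEvent("completed", elapsed_ms=840)


def collect(events):
    """Gather selected section titles and the final elapsed time."""
    titles: List[str] = []
    elapsed = None
    for event in events:
        if event.type == "section_selected" and event.section:
            titles.append(event.section.title)
        elif event.type == "completed":
            elapsed = event.elapsed_ms
    return titles, elapsed


titles, elapsed = collect(fake_stream())
print(titles, elapsed)
```

Collecting into a list is only for demonstration; in an interactive tool you would usually act on each event as it arrives.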

API Reference

Client Configuration

Parameter     Type                  Default                    Description
base_url      str                   "http://localhost:8080"    Server URL
api_key       str                   env VECTORLESS_API_KEY     Bearer token
transport     "http" | "connect"    "http"                     Wire protocol
timeout       float                 30.0                       Request timeout (seconds)
max_retries   int                   3                          Retry attempts
retry_delay   float                 0.5                        Base retry delay (seconds)
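
The table calls retry_delay a base delay but does not say how it grows between attempts. A common choice is exponential backoff, and the sketch below computes that schedule from the default values. The doubling factor is an assumption; the SDK may use a different policy or add jitter.

```python
def backoff_delays(max_retries=3, retry_delay=0.5, factor=2.0):
    """Delay before each retry, assuming the base delay is multiplied
    by `factor` after every attempt (an assumed policy)."""
    return [retry_delay * factor ** attempt for attempt in range(max_retries)]


delays = backoff_delays()
print(delays)
```

Under this assumption the defaults add up to 3.5 seconds of waiting across all three retries, on top of the per-request timeout.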

Methods

Both VectorlessClient (sync) and AsyncVectorlessClient (async) expose:

Method                                 Returns                       Description
health()                               HealthResponse                Server liveness check
version()                              VersionResponse               Server build version
ingest_document(source, **opts)        IngestDocumentResponse        Upload a document
get_document(id)                       Document                      Get document metadata
list_documents(**opts)                 ListDocumentsResponse         Paginated document list
delete_document(id)                    None                          Delete document + sections
wait_for_ready(id, **opts)             Document                      Poll until processed
get_document_tree(id)                  DocumentTree                  Hierarchical outline
get_section(id)                        Section                       Full section content
get_sections(ids)                      list[Section]                 Multi-section fetch
query(doc_id, query, **opts)           QueryResponse                 Retrieve relevant sections
query_stream(doc_id, query, **opts)    Iterator[QueryStreamEvent]    Stream results
close()                                None                          Release resources
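
wait_for_ready is described as polling until the document is processed. The sketch below shows what such a loop might look like. It is not the SDK's implementation. fetch_status is a hypothetical callable that returns the current status string, and the "failed" status, the interval, and the timeout are assumptions; only parsing, summarizing, and ready appear in this README.

```python
import time


def poll_until_ready(fetch_status, on_progress=None, interval=0.0, timeout=5.0):
    """Call fetch_status() until it reports "ready" (illustrative only)."""
    deadline = time.monotonic() + timeout
    last = None
    while True:
        status = fetch_status()
        # Report each status once, when it first appears.
        if status != last and on_progress is not None:
            on_progress(status)
        last = status
        if status == "ready":
            return status
        if status == "failed":  # assumed terminal status
            raise RuntimeError("document processing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("document was not ready in time")
        time.sleep(interval)


# Scripted statuses stand in for successive server responses.
scripted = iter(["parsing", "parsing", "summarizing", "ready"])
seen = []
final = poll_until_ready(lambda: next(scripted), on_progress=seen.append)
print(seen, final)
```

Reporting only status changes keeps the progress callback quiet while the document stays in the same stage.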

Error Handling

from vectorless import (
    VectorlessError,
    AuthenticationError,
    NotFoundError,
    RateLimitError,
)

try:
    doc = client.get_document("doc_123")
except NotFoundError:
    print("Document not found")
except AuthenticationError:
    print("Check your API key")
except RateLimitError as e:
    print(f"Rate limited, retry after {e.retry_after}s")
except VectorlessError as e:
    print(f"Error {e.status}: {e.message}")

Exception                Status    When
AuthenticationError      401       Missing or invalid API key
PermissionDeniedError    403       Insufficient permissions
NotFoundError            404       Document or section not found
ValidationError          400       Invalid request parameters
ConflictError            409       Idempotency conflict
RateLimitError           429       Too many requests
TimeoutError             408       Request timed out
ServerError              500       Internal server error
DocumentFailedError      422       Document processing failed
StreamError              -         Stream interrupted
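
The table pairs each HTTP status with an exception class. As a sketch of how such a mapping could be expressed, the code below builds stand-in classes from the table rows and picks one by status code. The classes are local stand-ins, not the ones exported by the vectorless package, and the fallback rule (any other 5xx maps to ServerError, anything else to the base class) is an assumption.

```python
class VectorlessError(Exception):
    """Stand-in base class; the real one lives in the vectorless package."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status
        self.message = message


# Status codes and names copied from the table above.
STATUS_TO_NAME = {
    400: "ValidationError",
    401: "AuthenticationError",
    403: "PermissionDeniedError",
    404: "NotFoundError",
    408: "TimeoutError",
    409: "ConflictError",
    422: "DocumentFailedError",
    429: "RateLimitError",
    500: "ServerError",
}

# One stand-in subclass per table row.
STAND_INS = {
    status: type(name, (VectorlessError,), {})
    for status, name in STATUS_TO_NAME.items()
}


def error_for_status(status, message):
    """Pick the stand-in class for a status code (fallback rule is assumed)."""
    cls = STAND_INS.get(status)
    if cls is None:
        cls = STAND_INS[500] if status >= 500 else VectorlessError
    return cls(status, message)


err = error_for_status(404, "Document not found")
print(type(err).__name__, err.status)
```

Since every stand-in derives from VectorlessError, a final except VectorlessError clause catches anything the more specific handlers miss, as in the example above.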

Context Manager

# Sync
with VectorlessClient(api_key="vl_...") as client:
    doc = client.get_document("doc_123")

# Async
async with AsyncVectorlessClient(api_key="vl_...") as client:
    doc = await client.get_document("doc_123")

Environment Variables

Variable               Description
VECTORLESS_API_KEY     API key fallback
VECTORLESS_BASE_URL    Base URL fallback
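
"Fallback" suggests these variables are read only when the matching constructor argument is omitted. The sketch below shows that precedence: explicit argument, then environment, then default. resolve_setting is a hypothetical helper, and the exact order is an assumption about the SDK rather than documented behavior.

```python
import os


def resolve_setting(explicit, env_name, default=None, environ=None):
    """Prefer the explicit argument, then the environment, then the default."""
    env = os.environ if environ is None else environ
    if explicit is not None:
        return explicit
    return env.get(env_name, default)


# A plain dict stands in for os.environ so the example is repeatable.
fake_env = {"VECTORLESS_BASE_URL": "https://vectorless.internal.example"}

base_url = resolve_setting(None, "VECTORLESS_BASE_URL", "http://localhost:8080", fake_env)
api_key = resolve_setting(None, "VECTORLESS_API_KEY", None, fake_env)
override = resolve_setting(
    "http://127.0.0.1:9000", "VECTORLESS_BASE_URL", "http://localhost:8080", fake_env
)
print(base_url, api_key, override)
```

Keeping credentials in the environment avoids hard-coding an API key in source files.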

Requirements

  • Python 3.9+
  • Dependencies: httpx, pydantic

License

MIT
