LangChain VectorStore integration for Seahorse API Gateway

These details have not been verified by PyPI

Project description

LangChain Seahorse VectorStore

LangChain VectorStore integration for Seahorse API Gateway - A high-performance vector database for semantic search and RAG applications.

Features

LangChain Compatible: Full implementation of LangChain VectorStore interface
Schema-Aware Column Resolution: Dense and sparse vector columns are auto-resolved from GET /v2/data/schema
Hybrid Search: Dense, Sparse, and Hybrid (RRF) search modes
Dual Embedding Support: Use Seahorse's built-in embeddings or bring your own (OpenAI, Cohere, etc.)
Metadata Filtering: Filter search results by metadata
Batch Processing: Efficient handling of large datasets (auto-batched; max 50 rows/request, max 32KB text/row)
Indexing & Health Monitoring: get_indexed_row_count() returns a typed IndexedRowCount model for tracking index build progress; health() provides a drop-in liveness probe
Type-Safe: Complete type hints for Python 3.8+
Well-Tested: Comprehensive unit and integration tests
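
The batch limits in the feature list (max 50 rows per request, max 32KB of text per row) are applied by the SDK automatically. As a rough illustration of what such batching involves, the sketch below validates row sizes and splits texts into request-sized groups. The helper name `plan_batches`, the reading of "32KB" as 32 * 1024 UTF-8 bytes, and the choice to raise instead of truncating are assumptions made for this example; it is not the SDK's internal code.

```python
from typing import Iterator, List, Sequence

MAX_ROWS_PER_REQUEST = 50            # limit quoted in the feature list
MAX_TEXT_BYTES_PER_ROW = 32 * 1024   # assumption: "32KB" means 32 * 1024 UTF-8 bytes


def plan_batches(
    texts: Sequence[str],
    max_rows: int = MAX_ROWS_PER_REQUEST,
    max_bytes: int = MAX_TEXT_BYTES_PER_ROW,
) -> Iterator[List[str]]:
    """Yield lists of at most ``max_rows`` texts after checking each row's size."""
    # Validate every row first so nothing is yielded for an invalid input set.
    for index, text in enumerate(texts):
        size = len(text.encode("utf-8"))
        if size > max_bytes:
            raise ValueError(
                f"text at index {index} is {size} bytes; limit is {max_bytes}"
            )
    # Slice the validated texts into consecutive request-sized groups.
    for start in range(0, len(texts), max_rows):
        yield list(texts[start:start + max_rows])
```

A helper like this is only useful for estimating how many requests a large ingest will need; add_texts() already batches on its own.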

Installation

# Using pip
pip install langchain-seahorse

# Using uv (recommended)
uv add langchain-seahorse

Quick Start

Basic Usage with Built-in Embeddings

from seahorse_vector_store import SeahorseVectorStore

# Initialize vectorstore
vectorstore = SeahorseVectorStore(
    api_key="your-seahorse-api-key",
    base_url="https://your-table-uuid.api.seahorse.dnotitia.ai",
)

# Add documents
ids = vectorstore.add_texts(
    texts=[
        "Machine learning is a subset of AI.",
        "Deep learning uses neural networks.",
    ],
    metadatas=[
        {"source": "doc1.pdf", "page": 1},
        {"source": "doc2.pdf", "page": 5},
    ]
)

# Search
docs = vectorstore.similarity_search(
    query="What is machine learning?",
    k=2
)

for doc in docs:
    print(doc.page_content)
    print(doc.metadata)

Using External Embeddings

from seahorse_vector_store import SeahorseVectorStore
from langchain_openai import OpenAIEmbeddings

vectorstore = SeahorseVectorStore(
    api_key="your-seahorse-api-key",
    base_url="https://your-table-uuid.api.seahorse.dnotitia.ai",
    embedding=OpenAIEmbeddings(api_key="your-openai-key"),
    use_builtin_embedding=False,
)

# Use as normal...

Hybrid Search (Dense + Sparse)

from seahorse_vector_store import SeahorseVectorStore, SearchMode

vectorstore = SeahorseVectorStore(
    api_key="your-api-key",
    base_url="https://your-table-uuid.api.seahorse.dnotitia.ai",
)

# Default: Hybrid search (Dense + Sparse with RRF fusion)
docs = vectorstore.similarity_search("machine learning", k=5)

# Pure Dense search
docs = vectorstore.similarity_search(
    "machine learning", k=5, retrieval_mode=SearchMode.DENSE
)

# Pure Sparse search (BM25-based)
docs = vectorstore.similarity_search(
    "machine learning", k=5, retrieval_mode=SearchMode.SPARSE
)

Metadata Filtering

# Search with metadata filter
docs = vectorstore.similarity_search(
    query="neural networks",
    k=5,
    filter={"source": "doc1.pdf", "page": 1}
)

Indexing Status & Health Check

# Per-index indexing progress (typed model). Top-level counts are
# writer-based; ``stats.readable`` adds a reader-node view (segments
# deduplicated, ``row_count - deleted_row_count`` with saturating subtraction).
stats = vectorstore.get_indexed_row_count()
print(stats.total_row_count)
for idx in stats.indexed_counts:
    print(f"{idx.index_name} ({idx.index_type}): {idx.indexed_row_count}")

# Skip the reader-node ``readable`` view when only writer counts are needed
stats = vectorstore.get_indexed_row_count(readable=False)

# Lightweight liveness probe — True on 200 OK, False on any SeahorseAPIError
if not vectorstore.health():
    raise RuntimeError("Seahorse backend is unreachable")
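
Because health() returns a plain bool, it fits into a simple startup retry loop. The following is a minimal sketch under stated assumptions: `wait_until_healthy` is a hypothetical helper, not part of the SDK, and it accepts any zero-argument callable so it can be tried without a backend. In real use you would pass `vectorstore.health` as the probe.

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call ``probe`` up to ``attempts`` times; return True on the first success."""
    for attempt in range(attempts):
        if probe():
            return True
        # Wait a fixed delay between tries, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

The `sleep` parameter exists only so tests can skip real waiting; a fixed delay is used here for simplicity, and exponential backoff would be a reasonable alternative.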

🔧 Configuration

Environment Variables

You can set API credentials via environment variables:

export SEAHORSE_API_KEY="your-api-key"
export SEAHORSE_BASE_URL="https://your-table-uuid.api.seahorse.dnotitia.ai"

Then use them in your code:

import os
from seahorse_vector_store import SeahorseVectorStore

vectorstore = SeahorseVectorStore(
    api_key=os.environ["SEAHORSE_API_KEY"],
    base_url=os.environ["SEAHORSE_BASE_URL"],
)
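
Indexing os.environ directly raises a bare KeyError when a variable is missing. A small loader can report every missing variable at once. This is a sketch only: `load_seahorse_settings` is a hypothetical helper, and the only names taken from this README are the two environment variables and the `api_key` / `base_url` constructor arguments.

```python
import os
from typing import Dict, Mapping, Optional

REQUIRED_VARS = ("SEAHORSE_API_KEY", "SEAHORSE_BASE_URL")


def load_seahorse_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return constructor kwargs, or raise naming every missing variable."""
    source = os.environ if env is None else env
    # Treat unset and empty values the same way.
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "api_key": source["SEAHORSE_API_KEY"],
        "base_url": source["SEAHORSE_BASE_URL"],
    }
```

The result can then be unpacked into the constructor, for example `SeahorseVectorStore(**load_seahorse_settings())`.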

Advanced Options

vectorstore = SeahorseVectorStore(
    api_key="your-api-key",
    base_url="https://your-table-uuid.api.seahorse.dnotitia.ai",
    use_builtin_embedding=True,  # Use Seahorse embeddings
    # dense_column / sparse_column are optional explicit overrides.
    # If omitted, the SDK resolves them from GET /v2/data/schema.
)

Primary Key Behavior

Seahorse uses mandatory content-hash primary keys.

add_texts() and from_texts() always return IDs generated by Seahorse's primary-key rules.
Caller-provided custom IDs, including LangChain Document.id, are not persisted as the stored row ID.
Use the IDs returned by insert operations as the source of truth for later delete workflows.
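
If your application has its own document identifiers, one way to follow the guidance above is to keep a client-side mapping from your IDs to the row IDs that Seahorse returns. The sketch below is hypothetical throughout: the helper names are invented for illustration, and it assumes add_texts() returns exactly one ID per input text in input order, which this README does not state explicitly.

```python
from typing import Dict, List, Sequence


def map_app_ids_to_row_ids(
    app_ids: Sequence[str], row_ids: Sequence[str]
) -> Dict[str, str]:
    """Pair each application ID with the row ID returned at the same position."""
    if len(app_ids) != len(row_ids):
        raise ValueError(
            f"expected {len(app_ids)} returned IDs, got {len(row_ids)}"
        )
    return dict(zip(app_ids, row_ids))


def row_ids_for(app_ids: Sequence[str], mapping: Dict[str, str]) -> List[str]:
    """Translate application IDs to stored row IDs, skipping unknown ones."""
    return [mapping[app_id] for app_id in app_ids if app_id in mapping]


# Intended usage (not executed here, since it needs a live backend):
#   row_ids = vectorstore.add_texts(texts)
#   mapping = map_app_ids_to_row_ids(my_ids, row_ids)
#   vectorstore.delete(ids=row_ids_for(["my-id-1"], mapping))
```

The mapping has to be persisted by the caller (for example in the application's own database) if deletes happen in a later process.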

📖 API Reference

SeahorseVectorStore

Main class for interacting with Seahorse as a vector store.

Synchronous Methods

add_texts(texts, metadatas=None, **kwargs) - Add texts to the vector store
similarity_search(query, k=4, filter=None, **kwargs) - Search for similar documents
similarity_search_with_score(query, k=4, filter=None, **kwargs) - Search with distance scores
similarity_search_by_vector(embedding, k=4, filter=None, **kwargs) - Search by vector
similarity_search_by_vector_with_score(embedding, k=4, filter=None, **kwargs) - Search by vector with scores
delete(ids=None, **kwargs) - Delete documents by IDs
from_texts(texts, embedding=None, metadatas=None, **kwargs) - Create vectorstore from texts
get_indexed_row_count(readable=True) - Per-index indexed row counts as IndexedRowCount
health() - Lightweight liveness probe (returns bool)

Async Methods

aadd_texts(texts, metadatas=None, **kwargs) - Add texts asynchronously
asimilarity_search(query, k=4, filter=None, **kwargs) - Search asynchronously
asimilarity_search_with_score(query, k=4, filter=None, **kwargs) - Search with scores asynchronously
asimilarity_search_by_vector(embedding, k=4, filter=None, **kwargs) - Search by vector asynchronously
asimilarity_search_by_vector_with_score(embedding, k=4, filter=None, **kwargs) - Search by vector with scores asynchronously
adelete(ids=None, **kwargs) - Delete documents asynchronously
aget_indexed_row_count(readable=True) - Per-index indexed row counts (async)
ahealth() - Async liveness probe

Search Modes

SearchMode.HYBRID (default) - Dense + Sparse with RRF fusion
SearchMode.DENSE - Pure dense vector search
SearchMode.SPARSE - Pure sparse (BM25) search
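
For readers unfamiliar with RRF (Reciprocal Rank Fusion), the general technique gives each document a score of 1 / (k + rank) in every ranked list where it appears and sums those scores. The sketch below shows that textbook formula in plain Python. It illustrates the general method only: the function name `rrf_fuse`, the constant k = 60 (a common default in the literature), and the tie handling are assumptions, and Seahorse performs its fusion server-side with parameters not documented here.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[Hashable]], k: int = 60) -> List[Hashable]:
    """Fuse ranked ID lists with Reciprocal Rank Fusion; best fused score first."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit of each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Stable sort: documents with equal scores keep first-seen order.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


dense_hits = ["a", "b", "c"]    # illustrative IDs from a dense search
sparse_hits = ["c", "a", "d"]   # illustrative IDs from a sparse search
fused = rrf_fuse([dense_hits, sparse_hits])
```

Documents that rank well in both lists ("a" and "c" here) rise above documents that appear in only one.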

Not Supported

max_marginal_relevance_search() - ⚠️ MMR search is not supported by Seahorse API

Testing

Setup for Integration Tests

Create a .env file in the project root with your Seahorse credentials:

# Copy the example file
cp .env.example .env

# Edit .env and add your credentials
SEAHORSE_API_KEY=your-api-key
SEAHORSE_BASE_URL=https://your-table-uuid.api.seahorse.dnotitia.ai

Running Tests

# Run unit tests
uv run pytest tests/unit/

# Run basic integration tests (requires .env file with API credentials)
uv run pytest tests/integration/ \
  --ignore=tests/integration/test_ollama_embeddings.py \
  --ignore=tests/integration/test_rag_pipeline.py

# Run all tests with coverage
uv run pytest --cov=seahorse_vector_store --cov-report=term-missing

# Skip integration tests
uv run pytest -m "not integration"

Running Ollama Integration Tests (Optional)

For advanced tests using Ollama LLM and embeddings:

# 1. Install Ollama dependencies (Python 3.9+ required)
uv pip install langchain langchain-ollama

# 2. Start Ollama server
ollama serve

# 3. Download models
ollama pull qwen3-embedding:8b  # For embeddings
ollama pull qwen3:8b             # For RAG

# 4. Run Ollama tests
uv run pytest tests/integration/test_ollama_embeddings.py -v
uv run pytest tests/integration/test_rag_pipeline.py -v

# 5. Run all integration tests (including Ollama)
uv run pytest tests/integration/ -v

Note: Ollama tests are skipped automatically if Ollama is not available or the required models are not installed.

Examples

See the examples/ directory for complete examples:

basic_usage.py - Basic vectorstore operations
async_usage.py - Async/await operations for better performance
rag_pipeline.py - Building a RAG (Retrieval-Augmented Generation) pipeline
metadata_filtering.py - Advanced metadata filtering techniques
external_embeddings.py - Using external embeddings (OpenAI, Cohere, etc.)

Documentation

API Reference - Complete API documentation
Tutorial

Requirements

Python 3.8+
langchain-core >= 0.2.0
httpx >= 0.27.0
pydantic >= 2.0.0

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

Console: Seahorse Console

Development Status

This package is in beta; its APIs are stabilizing.

Current version: 0.4.1

Made by the Seahorse Team

Project details

Release history

0.4.1 (this version) - Apr 30, 2026
0.4.0 - Apr 22, 2026
0.3.0 - Apr 14, 2026
0.2.0 - Mar 3, 2026
0.1.0 - Nov 27, 2025

Download files

Source Distribution

langchain_seahorse-0.4.1.tar.gz (265.8 kB view details)

Uploaded Apr 30, 2026 Source

Built Distribution

langchain_seahorse-0.4.1-py3-none-any.whl (29.0 kB view details)

Uploaded Apr 30, 2026 Python 3

File details

Details for the file langchain_seahorse-0.4.1.tar.gz.

File metadata

Download URL: langchain_seahorse-0.4.1.tar.gz
Upload date: Apr 30, 2026
Size: 265.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.8 {"installer":{"name":"uv","version":"0.11.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"22.04","id":"jammy","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for langchain_seahorse-0.4.1.tar.gz
Algorithm	Hash digest
SHA256	`a571be0e836bfe1f2a074d6e7c0580d7c934782d69a414bd1197255b2843f669`
MD5	`8f6ac5ddea8a6f53380f7f9410f5a5b0`
BLAKE2b-256	`e769afe3bae87b81af87939f706d60925bd3645c4bf16e1a3d8dec798656e25e`

File details

Details for the file langchain_seahorse-0.4.1-py3-none-any.whl.

File metadata

Download URL: langchain_seahorse-0.4.1-py3-none-any.whl
Upload date: Apr 30, 2026
Size: 29.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.8 {"installer":{"name":"uv","version":"0.11.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"22.04","id":"jammy","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for langchain_seahorse-0.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e794319fb6ccacdfccf00d47b4976d210229a5a85f2c999ff96b6c3848f19c1b`
MD5	`39a492d867f07565a6f1c2caa3464243`
BLAKE2b-256	`fd89f913d3af61234b8e505fb07a89af5e6727d2676fcfff8f24580bf5b128b9`
