LangChain VectorStore integration for Seahorse API Gateway

LangChain Seahorse VectorStore

LangChain VectorStore integration for Seahorse API Gateway - A high-performance vector database for semantic search and RAG applications.

Features

  • LangChain Compatible: Full implementation of LangChain VectorStore interface
  • Hybrid Search: Dense, Sparse, and Hybrid (RRF) search modes
  • Dual Embedding Support: Use Seahorse's built-in embeddings or bring your own (OpenAI, Cohere, etc.)
  • Metadata Filtering: Filter search results by metadata
  • Batch Processing: Efficient handling of large datasets (auto-batched; max 50 rows/request, max 32KB text/row)
  • Type-Safe: Complete type hints for Python 3.8+
  • Well-Tested: Comprehensive unit and integration tests
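The package batches requests for you, so the following is only an illustration of what the quoted limits imply for your input data. It is a minimal sketch, not the package's own implementation: the helper names are hypothetical, and it assumes "32KB" means 32 * 1024 bytes of UTF-8 text.

```python
from typing import Iterator, List, Sequence

MAX_ROWS_PER_REQUEST = 50            # limit quoted in the feature list
MAX_TEXT_BYTES_PER_ROW = 32 * 1024   # assumption: "32KB" = 32 * 1024 UTF-8 bytes


def check_row_sizes(texts: Sequence[str]) -> List[int]:
    """Return the indexes of texts whose UTF-8 encoding exceeds the per-row limit."""
    return [
        i
        for i, text in enumerate(texts)
        if len(text.encode("utf-8")) > MAX_TEXT_BYTES_PER_ROW
    ]


def iter_batches(
    texts: Sequence[str], size: int = MAX_ROWS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` texts."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])
```

Running `check_row_sizes` before ingestion lets you split or truncate oversized texts up front instead of finding out from a failed request.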

Installation

# Using pip
pip install langchain-seahorse

# Using uv (recommended)
uv add langchain-seahorse

Quick Start

Basic Usage with Built-in Embeddings

from seahorse_vector_store import SeahorseVectorStore

# Initialize vectorstore
vectorstore = SeahorseVectorStore(
    api_key="your-seahorse-api-key",
    base_url="https://your-table-uuid.api.seahorse.dnotitia.ai",
)

# Add documents
ids = vectorstore.add_texts(
    texts=[
        "Machine learning is a subset of AI.",
        "Deep learning uses neural networks.",
    ],
    metadatas=[
        {"source": "doc1.pdf", "page": 1},
        {"source": "doc2.pdf", "page": 5},
    ]
)

# Search
docs = vectorstore.similarity_search(
    query="What is machine learning?",
    k=2
)

for doc in docs:
    print(doc.page_content)
    print(doc.metadata)

Using External Embeddings

from seahorse_vector_store import SeahorseVectorStore
from langchain_openai import OpenAIEmbeddings

vectorstore = SeahorseVectorStore(
    api_key="your-seahorse-api-key",
    base_url="https://your-table-uuid.api.seahorse.dnotitia.ai",
    embedding=OpenAIEmbeddings(api_key="your-openai-key"),
    use_builtin_embedding=False,
)

# Use as normal...

Hybrid Search (Dense + Sparse)

from seahorse_vector_store import SeahorseVectorStore, SearchMode

vectorstore = SeahorseVectorStore(
    api_key="your-api-key",
    base_url="https://your-table-uuid.api.seahorse.dnotitia.ai",
)

# Default: Hybrid search (Dense + Sparse with RRF fusion)
docs = vectorstore.similarity_search("machine learning", k=5)

# Pure Dense search
docs = vectorstore.similarity_search(
    "machine learning", k=5, retrieval_mode=SearchMode.DENSE
)

# Pure Sparse search (BM25-based)
docs = vectorstore.similarity_search(
    "machine learning", k=5, retrieval_mode=SearchMode.SPARSE
)

Metadata Filtering

# Search with metadata filter
docs = vectorstore.similarity_search(
    query="neural networks",
    k=5,
    filter={"source": "doc1.pdf", "page": 1}
)
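The README does not spell out how a multi-key filter is evaluated. The sketch below shows one plausible reading, in which every key/value pair must match exactly (a logical AND of equality checks). This is an assumption about the semantics, not documented Seahorse behavior, and `matches_filter` is a hypothetical client-side helper, not part of the package.

```python
from typing import Any, Mapping


def matches_filter(metadata: Mapping[str, Any], flt: Mapping[str, Any]) -> bool:
    """Return True if every filter key is present in metadata with an equal value.

    Assumed semantics: AND across keys, exact equality per key.
    """
    return all(
        key in metadata and metadata[key] == value
        for key, value in flt.items()
    )


rows = [
    {"source": "doc1.pdf", "page": 1},
    {"source": "doc1.pdf", "page": 2},
    {"source": "doc2.pdf", "page": 1},
]
# Under the assumed semantics, only the first row passes this filter.
kept = [row for row in rows if matches_filter(row, {"source": "doc1.pdf", "page": 1})]
```

If the results you get back do not agree with this reading, check the Seahorse API documentation for the actual filter rules.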

🔧 Configuration

Environment Variables

You can set API credentials via environment variables:

export SEAHORSE_API_KEY="your-api-key"
export SEAHORSE_BASE_URL="https://your-table-uuid.api.seahorse.dnotitia.ai"

Then use them in your code:

import os
from seahorse_vector_store import SeahorseVectorStore

vectorstore = SeahorseVectorStore(
    api_key=os.environ["SEAHORSE_API_KEY"],
    base_url=os.environ["SEAHORSE_BASE_URL"],
)
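Reading `os.environ[...]` directly raises a bare `KeyError` when a variable is unset. As a minimal sketch, a small loader can report every missing variable at once. `load_seahorse_settings` is a hypothetical helper, not part of the package. The returned keys match the `api_key` and `base_url` constructor arguments shown above.

```python
import os
from typing import Dict, Mapping, Optional

REQUIRED_VARS = ("SEAHORSE_API_KEY", "SEAHORSE_BASE_URL")


def load_seahorse_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect Seahorse credentials from the environment, failing with a clear message."""
    source = os.environ if env is None else env
    # Treat empty strings the same as unset variables.
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "api_key": source["SEAHORSE_API_KEY"],
        "base_url": source["SEAHORSE_BASE_URL"],
    }
```

The result can then be unpacked into the constructor, for example `SeahorseVectorStore(**load_seahorse_settings())`.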

Advanced Options

vectorstore = SeahorseVectorStore(
    api_key="your-api-key",
    base_url="https://your-table-uuid.api.seahorse.dnotitia.ai",
    use_builtin_embedding=True,  # Use Seahorse embeddings
    dense_column="dense_vector",   # Dense vector column name
    sparse_column="sparse_vector", # Sparse vector column name
)

📖 API Reference

SeahorseVectorStore

Main class for interacting with Seahorse as a vector store.

Synchronous Methods

  • add_texts(texts, metadatas=None, **kwargs) - Add texts to the vector store
  • similarity_search(query, k=4, filter=None, **kwargs) - Search for similar documents
  • similarity_search_with_score(query, k=4, filter=None, **kwargs) - Search with distance scores
  • similarity_search_by_vector(embedding, k=4, filter=None, **kwargs) - Search by vector
  • similarity_search_by_vector_with_score(embedding, k=4, filter=None, **kwargs) - Search by vector with scores
  • delete(ids=None, **kwargs) - Delete documents by IDs
  • from_texts(texts, embedding=None, metadatas=None, **kwargs) - Create vectorstore from texts
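As a sketch of how `add_texts` and `delete` combine, the hypothetical helper below replaces a set of documents by deleting the old IDs and adding the new texts. It relies only on the two signatures listed above. It is not atomic: if `add_texts` fails after `delete` succeeds, the old documents are already gone.

```python
from typing import Any, Dict, List, Optional, Sequence


def replace_documents(
    store: Any,
    old_ids: Sequence[str],
    texts: Sequence[str],
    metadatas: Optional[List[Dict[str, Any]]] = None,
) -> List[str]:
    """Delete `old_ids` (if any), add `texts`, and return the new IDs."""
    if old_ids:
        store.delete(ids=list(old_ids))
    return store.add_texts(texts=list(texts), metadatas=metadatas)
```

`store` is expected to be a `SeahorseVectorStore`, but any object exposing the same two methods will work, which keeps the helper easy to unit-test with a stub.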

Async Methods

  • aadd_texts(texts, metadatas=None, **kwargs) - Add texts asynchronously
  • asimilarity_search(query, k=4, filter=None, **kwargs) - Search asynchronously
  • asimilarity_search_with_score(query, k=4, filter=None, **kwargs) - Search with scores asynchronously
  • asimilarity_search_by_vector(embedding, k=4, filter=None, **kwargs) - Search by vector asynchronously
  • asimilarity_search_by_vector_with_score(embedding, k=4, filter=None, **kwargs) - Search by vector with scores asynchronously
  • adelete(ids=None, **kwargs) - Delete documents asynchronously
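The async methods pay off when several requests can be in flight at once. The sketch below runs a batch of queries concurrently with `asyncio.gather`. `search_many` is a hypothetical helper that uses only the `asimilarity_search` signature listed above.

```python
import asyncio
from typing import Any, List, Sequence


async def search_many(store: Any, queries: Sequence[str], k: int = 4) -> List[list]:
    """Run one asimilarity_search per query concurrently.

    Results are returned in the same order as `queries`.
    """
    tasks = [store.asimilarity_search(query, k=k) for query in queries]
    return list(await asyncio.gather(*tasks))
```

From synchronous code this can be driven with `asyncio.run(search_many(vectorstore, ["query one", "query two"]))`. For large query lists, consider bounding concurrency (for example with `asyncio.Semaphore`) so you do not flood the API.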

Search Modes

  • SearchMode.HYBRID (default) - Dense + Sparse with RRF fusion
  • SearchMode.DENSE - Pure dense vector search
  • SearchMode.SPARSE - Pure sparse (BM25) search
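For intuition about hybrid mode, here is a minimal sketch of Reciprocal Rank Fusion (RRF) over two ranked ID lists. Seahorse performs the fusion on the server, and its exact parameters are not stated here, so the constant `k=60` (a commonly used default for RRF) and the tie-breaking rule are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document ID for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_ranking = ["a", "b", "c"]   # e.g. order returned by dense search
sparse_ranking = ["b", "d", "a"]  # e.g. order returned by sparse search
fused = rrf_fuse([dense_ranking, sparse_ranking])
# fused == ["b", "a", "d", "c"]: documents found by both modes rise to the top
```

Because RRF uses only ranks, it needs no score normalization between the dense and sparse result lists.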

Not Supported

  • max_marginal_relevance_search() - ⚠️ MMR search is not supported by the Seahorse API
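If you need diversity-aware results, one possible workaround is to over-fetch candidates and rerank them on the client. The sketch below implements the standard MMR selection rule in plain Python. It is not part of the package, and it assumes you can obtain embedding vectors for the query and for each candidate yourself (for example from your own embedding model); this README does not say whether search results include vectors.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(
    query_vec: Sequence[float],
    candidate_vecs: Sequence[Sequence[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return indexes of up to k candidates chosen by Maximal Marginal Relevance.

    lambda_mult=1.0 ranks purely by relevance; lower values favor diversity.
    """
    relevance = [cosine(query_vec, vec) for vec in candidate_vecs]
    selected: List[int] = []
    remaining = list(range(len(candidate_vecs)))

    def mmr_score(i: int) -> float:
        # Penalize similarity to the closest already-selected candidate.
        redundancy = max(
            (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
            default=0.0,
        )
        return lambda_mult * relevance[i] - (1.0 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

A typical use would be to fetch more candidates than you need with `similarity_search`, embed them, and keep the documents at the indexes returned by `mmr_select`.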

Testing

Setup for Integration Tests

Create a .env file in the project root with your Seahorse credentials:

# Copy the example file
cp .env.example .env

# Edit .env and add your credentials
SEAHORSE_API_KEY=your-api-key
SEAHORSE_BASE_URL=https://your-table-uuid.api.seahorse.dnotitia.ai

Running Tests

# Run unit tests
uv run pytest tests/unit/

# Run basic integration tests (requires .env file with API credentials)
uv run pytest tests/integration/ \
  --ignore=tests/integration/test_ollama_embeddings.py \
  --ignore=tests/integration/test_rag_pipeline.py

# Run all tests with coverage
uv run pytest --cov=seahorse_vector_store --cov-report=term-missing

# Skip integration tests
uv run pytest -m "not integration"

Running Ollama Integration Tests (Optional)

For advanced tests using Ollama LLM and embeddings:

# 1. Install Ollama dependencies (Python 3.9+ required)
uv pip install langchain langchain-ollama

# 2. Start Ollama server
ollama serve

# 3. Download models
ollama pull qwen3-embedding:8b  # For embeddings
ollama pull qwen3:8b             # For RAG

# 4. Run Ollama tests
uv run pytest tests/integration/test_ollama_embeddings.py -v
uv run pytest tests/integration/test_rag_pipeline.py -v

# 5. Run all integration tests (including Ollama)
uv run pytest tests/integration/ -v

Note: Ollama tests will automatically skip if Ollama is not available or required models are not installed.

Examples

See the examples/ directory for complete examples:

  • basic_usage.py - Basic vectorstore operations
  • async_usage.py - Async/await operations for running requests concurrently
  • rag_pipeline.py - Building a RAG (Retrieval-Augmented Generation) pipeline
  • metadata_filtering.py - Advanced metadata filtering techniques
  • external_embeddings.py - Using external embeddings (OpenAI, Cohere, etc.)

Requirements

  • Python 3.8+
  • langchain-core >= 0.2.0
  • httpx >= 0.27.0
  • pydantic >= 2.0.0

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Development Status

This package is in beta. The API is stabilizing but may still change between releases.

Current version: 0.2.0


Made by the Seahorse Team
