
An integration package connecting Superlinked and LangChain

langchain-superlinked

Integration package that exposes Superlinked retrieval capabilities via the standard LangChain retriever interface. It lets you plug a Superlinked-powered retriever into LangChain RAG pipelines while keeping your vector storage and schema choices flexible.

Install

pip install -U langchain-superlinked superlinked

Quickstart

import superlinked.framework as sl
from langchain_superlinked import SuperlinkedRetriever

# Define the schema of the documents to index.
class DocumentSchema(sl.Schema):
    id: sl.IdField
    content: sl.String

doc_schema = DocumentSchema()

# Embed the content field with a sentence-transformers model and index it.
text_space = sl.TextSimilaritySpace(
    text=doc_schema.content,
    model="sentence-transformers/all-MiniLM-L6-v2",
)
index = sl.Index([text_space])

# Build a query whose search text is supplied at run time as "query_text".
query = (
    sl.Query(index)
    .find(doc_schema)
    .similar(text_space.text, sl.Param("query_text"))
    .select([doc_schema.content])
)

# Start an in-memory app and load two sample documents.
source = sl.InMemorySource(schema=doc_schema)
executor = sl.InMemoryExecutor(sources=[source], indices=[index])
app = executor.run()
source.put([
    {"id": "1", "content": "Machine learning processes data efficiently."},
    {"id": "2", "content": "NLP understands human language."},
])

# Wrap the app and query as a LangChain retriever; k is passed as a runtime parameter.
retriever = SuperlinkedRetriever(sl_client=app, sl_query=query, page_content_field="content")
docs = retriever.invoke("artificial intelligence", k=2)

See more end-to-end examples in docs/.


Local development

Prerequisites: Python 3.10–3.13, uv installed.

  • Setup: uv sync --all-extras --dev && uv run pre-commit install
  • Lint & type-check: uv run ruff check . && uv run ruff format --check . && uv run mypy langchain_superlinked
  • Unit tests: make test
  • Integration tests: make integration_tests (skips if langchain_tests isn’t installed)
  • Smoke test: make smoke
  • Run examples: uv run python docs/quickstart_examples.py

CI/CD overview

On every push or pull request to main, GitHub Actions runs the following jobs on a Python 3.10/3.11/3.12 matrix:

  • Lint: ruff check . and ruff format --check .
  • Type-check: mypy langchain_superlinked
  • Tests: unit (network disabled) and integration (skips if standard tests unavailable)
  • Smoke test: imports the package and symbols
  • Build: python -m build to produce sdist and wheel (no publish)

Workflow file: .github/workflows/ci.yml.
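
The smoke test step above ("imports the package and symbols") can be approximated with a few lines of standard-library Python. This is only a sketch of the idea, not the script the CI workflow actually runs; the helper name smoke_check is hypothetical.

```python
import importlib


def smoke_check(module_name: str, symbols: list[str]) -> list[str]:
    """Import a module and return the requested symbol names it does not expose.

    Hypothetical helper for illustration; an empty list means the check passed.
    """
    module = importlib.import_module(module_name)
    return [name for name in symbols if not hasattr(module, name)]
```

For this package, a call such as smoke_check("langchain_superlinked", ["SuperlinkedRetriever"]) would be expected to return an empty list once the package is installed.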


Releasing

  • Preferred: tag-based OIDC publish
    • Ensure PyPI Trusted Publisher is configured for this repo.
    • Bump version in pyproject.toml using semantic versioning.
    • Tag and push: git tag vX.Y.Z && git push origin vX.Y.Z
    • CI will build and publish automatically.
  • Manual (fallback):
    • Build artifacts: make dist
    • Validate: uv run twine check dist/*
    • Publish to PyPI: uv run twine upload -r pypi dist/*

After publishing, open or refresh the docs PR in the LangChain monorepo to reference the new version if needed. See LangChain’s integration guide, "How to contribute an integration", for the process.
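
Because the tag-based flow relies on the tag and the version in pyproject.toml agreeing, a pre-tag check can catch mismatches early. The sketch below assumes the version is declared statically on a line of the form version = "X.Y.Z"; the function name and the regular expressions are illustrative and not part of this repository.

```python
import re

# A release tag of the form vX.Y.Z, as used in the tagging step.
SEMVER_TAG = re.compile(r"^v(\d+\.\d+\.\d+)$")
# Assumption: pyproject.toml declares the version statically on its own line.
VERSION_LINE = re.compile(r'^version\s*=\s*"([^"]+)"', re.MULTILINE)


def tag_matches_pyproject(tag: str, pyproject_text: str) -> bool:
    """Return True if `tag` (vX.Y.Z) equals the version declared in the given text."""
    tag_match = SEMVER_TAG.match(tag)
    version_match = VERSION_LINE.search(pyproject_text)
    if tag_match is None or version_match is None:
        return False
    return tag_match.group(1) == version_match.group(1)
```

Reading pyproject.toml into a string and passing it together with the intended tag lets you confirm the two match before running git tag.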


Implementation overview

  • Primary entrypoint: langchain_superlinked/retrievers.py exposes SuperlinkedRetriever, a BaseRetriever.
  • Construction:
    • sl_client: Superlinked App (e.g., from InMemoryExecutor.run()).
    • sl_query: Superlinked QueryDescriptor built via sl.Query(...).find(...).similar(...).select(...).
    • page_content_field: field from Superlinked results mapped to Document.page_content.
    • Optional metadata_fields: copied into Document.metadata in addition to the always-present id.
  • Behavior:
    • Accepts runtime parameters (e.g., k, weights, filters) and forwards them to the Superlinked query.
    • Handles missing fields gracefully; returns an empty list on upstream exceptions.
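
The mapping described above can be pictured with a dependency-free sketch. It is not the code in langchain_superlinked/retrievers.py: Doc stands in for LangChain's Document, run_query stands in for executing the Superlinked query, and skipping rows that lack the content field is one assumed reading of "handles missing fields gracefully".

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, Optional


@dataclass
class Doc:
    """Stand-in for LangChain's Document, to keep the sketch dependency-free."""
    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def rows_to_docs(
    run_query: Callable[..., Iterable[dict[str, Any]]],
    page_content_field: str,
    metadata_fields: Optional[list[str]] = None,
    **params: Any,
) -> list[Doc]:
    """Run a query with runtime params and map each result row to a Doc."""
    try:
        # Runtime parameters (e.g. k, weights, filters) are forwarded unchanged.
        rows = list(run_query(**params))
    except Exception:
        # Mirrors the documented behavior: upstream errors yield an empty list.
        return []
    docs = []
    for row in rows:
        if page_content_field not in row:
            continue  # assumption: rows without the content field are skipped
        metadata = {"id": row.get("id")}  # id is always present in metadata
        for name in metadata_fields or []:
            if name in row:  # missing optional fields are simply omitted
                metadata[name] = row[name]
        docs.append(Doc(page_content=str(row[page_content_field]), metadata=metadata))
    return docs
```

Swallowing exceptions keeps a RAG chain running when the backend fails, at the cost of making failures look like "no results", so callers may want to log or monitor empty responses.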

Scope and non-goals

This package aims to be the minimal, well-typed LangChain integration layer for Superlinked retrievers. It intentionally does not include:

  • Dynamic schema inference or auto-generation for arbitrary datasets. Rationale: datasets vary widely; a robust solution requires additional assumptions (typing, transforms, index strategy), which goes beyond the minimal integration. We recommend implementing this in a separate helper package or cookbook code layered on top (e.g., “schema builders” that emit Superlinked schemas and indices for your domain). The examples in docs/ illustrate patterns for composing spaces (text, categorical, numeric, recency) that such builders could automate.
  • Non-retriever integrations (custom LLMs, embeddings, caches, loaders). These can live in separate packages if needed.

If you have concrete requirements for dynamic schema construction, please open an issue with sample data and desired retrieval behavior so we can discuss an extensible approach that stays decoupled from the core integration.


License

MIT (see LICENSE)
