LangChain integration for Infino — vector, BM25, and hybrid retrieval over one engine on object storage.
langchain-infino
LangChain over Infino — vector, full-text (BM25), hybrid, and SQL-native retrieval over one copy of your data on object storage.
Most "vector database" LangChain integrations expose only the vector slice of
their engine. Infino keeps your data in Apache Parquet on object storage and
runs SQL, BM25, vector, and hybrid (RRF) retrieval over it from a single
in-process engine — no separate search cluster or vector store to keep in
sync. This package surfaces that whole retrieval surface, not just
similarity_search.
Infino never embeds: you bring a LangChain Embeddings object, and the
integration supplies the vectors.
Installation
pip install langchain-infino
Or with uv:
uv add langchain-infino
Requires Python 3.9+. infino, langchain-core, pyarrow, and numpy are
installed as dependencies. Bring your own embeddings provider separately (e.g.
pip install langchain-openai).
Quickstart
import infino
from langchain_infino import InfinoVectorStore
from langchain_openai import OpenAIEmbeddings
# A local path or an S3 URI for durable storage; "memory://" is ephemeral.
connection = infino.connect("./data")
embedding = OpenAIEmbeddings() # dim must match the table; 1536 here
store = InfinoVectorStore.from_texts(
["Infino runs search on object storage.", "One engine for SQL, BM25, and vectors."],
embedding,
connection=connection,
table_name="docs",
dim=1536,
)
docs = store.similarity_search("search on S3", k=2)
retriever = store.as_retriever()
Core concepts
- InfinoVectorStore wraps a single Infino table: the text, its embedding, the document id, declared metadata columns, and a JSON catch-all. Use from_texts to create and populate one; construct directly to open an existing table.
- Identity: caller-controlled ids live on Document.id (not in metadata). add_texts is an idempotent upsert: re-adding an id overwrites the existing row, and omitted ids are generated.
- Metadata, two tiers: keys you name in metadata_columns= become real scalar columns you can filter on; everything else round-trips losslessly through a JSON catch-all but isn't filterable. The schema is fixed at table creation, so adding a filterable key means recreating the table.
- Scores: for vector distance, smaller is nearer; for BM25 and RRF, larger is better. similarity_search_with_relevance_scores normalizes to [0, 1] (higher = better) for cosine, l2, and l2sq.
- Retrievers: as_retriever() (vector), as_bm25_retriever() (lexical), and as_hybrid_retriever() (RRF fusion).
- Dimensions: the embedding dimension must be in the range [16, 4096] (engine limit) and match the table's declared dim.
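The two-tier metadata model can be pictured with a small standalone sketch. It does not use langchain-infino; split_metadata and merge_metadata are hypothetical helper names, and the JSON encoding shown is an assumption about how a catch-all column could round-trip undeclared keys.

```python
import json


def split_metadata(metadata, declared):
    """Split one metadata dict into (declared column values, JSON catch-all)."""
    columns = {key: metadata[key] for key in declared if key in metadata}
    extras = {key: value for key, value in metadata.items() if key not in declared}
    return columns, json.dumps(extras, sort_keys=True)


def merge_metadata(columns, catch_all_json):
    """Rebuild the original metadata dict from the two tiers."""
    merged = dict(json.loads(catch_all_json))
    merged.update(columns)
    return merged


original = {"category": "ml", "year": 2024, "reviewer": "sam", "tags": ["opt", "sgd"]}
columns, catch_all = split_metadata(original, declared=("category", "year"))
restored = merge_metadata(columns, catch_all)
```

Only the keys in columns would be usable in a filter; reviewer and tags survive the round trip but stay opaque to the engine.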
Adding and managing documents
# Generated ids on the common path; returns them.
ids = store.add_texts(["a new note"], metadatas=[{"source": "inbox"}])
# Caller ids are upserted — re-adding "doc-1" overwrites in place.
store.add_texts(["v2 of the note"], ids=["doc-1"])
# Fetch by id (skips missing, order not guaranteed); delete by id.
store.get_by_ids(["doc-1"])
store.delete(["doc-1"])
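The upsert semantics above can be modelled with a plain dictionary. This is a toy stand-in, not the package's implementation: ToyTable is a hypothetical class, and generating ids with uuid4 is an assumption about what "generated" could mean.

```python
import uuid


class ToyTable:
    """In-memory stand-in for a table keyed by document id."""

    def __init__(self):
        self.rows = {}

    def add_texts(self, texts, ids=None):
        # Omitted ids are generated; supplied ids overwrite any existing row.
        if ids is None:
            ids = [str(uuid.uuid4()) for _ in texts]
        if len(ids) != len(texts):
            raise ValueError("ids and texts must have the same length")
        for doc_id, text in zip(ids, texts):
            self.rows[doc_id] = text
        return list(ids)

    def get_by_ids(self, ids):
        # Missing ids are skipped rather than raising.
        return [self.rows[doc_id] for doc_id in ids if doc_id in self.rows]

    def delete(self, ids):
        for doc_id in ids:
            self.rows.pop(doc_id, None)
        return True


table = ToyTable()
table.add_texts(["v1 of the note"], ids=["doc-1"])
table.add_texts(["v2 of the note"], ids=["doc-1"])  # overwrites in place
generated = table.add_texts(["a new note"])
```

Re-running the same add_texts call with the same ids leaves the table unchanged, which is what makes the operation safe to retry.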
Similarity search
store.similarity_search("vector databases", k=4)
store.similarity_search_with_score("vector databases", k=4) # raw distance
store.similarity_search_with_relevance_scores("vector databases", k=4) # [0, 1]
store.similarity_search_by_vector(query_vector, k=4) # query_vector: list[float]
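To make the difference between raw distance and a [0, 1] relevance score concrete, here is one possible mapping. It is an assumption for illustration, not the formula the package uses: it presumes unit-length vectors, so cosine and l2 distances fall in [0, 2] and l2sq in [0, 4], and relevance_from_distance is a hypothetical name.

```python
def relevance_from_distance(distance, metric):
    """Map a smaller-is-nearer distance to a higher-is-better score in [0, 1].

    Assumes unit-length vectors; the upper bounds below follow from that.
    """
    max_distance = {"cosine": 2.0, "l2": 2.0, "l2sq": 4.0}
    if metric not in max_distance:
        raise ValueError(f"no normalization defined for metric {metric!r}")
    score = 1.0 - distance / max_distance[metric]
    # Clamp in case the unit-length assumption does not hold exactly.
    return min(1.0, max(0.0, score))
```

A distance of 0 maps to 1.0 under every metric, and the largest distance the assumption allows maps to 0.0.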
Metadata filtering
Promote the keys you want to filter on to real columns, then pass the
LangChain operator form. Supports equality, $eq / $ne / $gt / $gte /
$lt / $lte, $in / $nin, and $and / $or / $not.
import pyarrow as pa
store = InfinoVectorStore.from_texts(
texts, embedding,
connection=connection, table_name="papers", dim=1536,
metadata_columns=[
pa.field("category", pa.large_utf8(), nullable=False),
pa.field("year", pa.int64(), nullable=False),
],
metadatas=[{"category": "ml", "year": 2024} for _ in texts],
)
store.similarity_search("optimizers", k=4, filter={"category": "ml"})
store.similarity_search("optimizers", k=4, filter={"year": {"$gte": 2023}})
store.similarity_search("optimizers", k=4,
filter={"$or": [{"category": "ml"}, {"year": {"$lt": 2000}}]})
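Since these filters end up as a SQL WHERE over the declared columns, it can help to see how the operator form might lower to SQL. The sketch below is a simplified illustration, not the package's translator: to_where and quote are hypothetical names, column names are not validated against the schema, and literals are inlined rather than bound as parameters.

```python
COMPARATORS = {"$eq": "=", "$ne": "<>", "$gt": ">", "$gte": ">=", "$lt": "<", "$lte": "<="}


def quote(value):
    """Render a Python scalar as a SQL literal."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported literal type: {type(value).__name__}")


def to_where(filter_):
    """Lower a LangChain-style operator dict to a SQL WHERE expression."""
    clauses = []
    for key, value in filter_.items():
        if key in ("$and", "$or"):
            joiner = " AND " if key == "$and" else " OR "
            clauses.append("(" + joiner.join(to_where(sub) for sub in value) + ")")
        elif key == "$not":
            clauses.append("NOT (" + to_where(value) + ")")
        elif isinstance(value, dict):
            for op, operand in value.items():
                if op in COMPARATORS:
                    clauses.append(f"{key} {COMPARATORS[op]} {quote(operand)}")
                elif op in ("$in", "$nin"):
                    items = ", ".join(quote(item) for item in operand)
                    keyword = "IN" if op == "$in" else "NOT IN"
                    clauses.append(f"{key} {keyword} ({items})")
                else:
                    raise ValueError(f"unsupported operator: {op}")
        else:
            # A bare value is shorthand for equality.
            clauses.append(f"{key} = {quote(value)}")
    return " AND ".join(clauses)
```

For example, to_where({"year": {"$gte": 2023}}) yields year >= 2023. A production translator would also reject keys that are not declared metadata columns.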
Text-pushdown pre-filter
For a text predicate, push it into the kNN instead of post-filtering the
top-k. The engine prunes to rows matching the full-text terms before
ranking, so exactly k nearest matching rows come back — no over-fetch, no
under-return. filter_mode is "or" (default) or "and"; filter_column
defaults to the text column.
store.similarity_search("cancel my plan", k=10, filter_query="subscription billing")
It is reachable from any retriever via search_kwargs:
retriever = store.as_retriever(search_kwargs={"k": 10, "filter_query": "billing"})
filter (structured, post-rank SQL WHERE) and filter_query (text,
pre-rank pushdown) are distinct paths and cannot be combined in one call.
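The practical difference between the two paths shows up when few of the nearest rows match the predicate. The toy below is only an illustration: a one-dimensional "pos" value stands in for vector distance, whitespace tokenization stands in for the full-text index, and pre_filter and post_filter are hypothetical names.

```python
def matches(text, terms, mode="or"):
    """Return True if the text contains any ("or") or all ("and") of the terms."""
    words = set(text.lower().split())
    hits = [term in words for term in terms]
    return all(hits) if mode == "and" else any(hits)


def post_filter(rows, query_pos, terms, k):
    # Rank first, then drop non-matching rows: may return fewer than k.
    nearest = sorted(rows, key=lambda row: abs(row["pos"] - query_pos))[:k]
    return [row["id"] for row in nearest if matches(row["text"], terms)]


def pre_filter(rows, query_pos, terms, k, mode="or"):
    # Prune to matching rows first, then rank: returns k rows when k exist.
    candidates = [row for row in rows if matches(row["text"], terms, mode)]
    candidates.sort(key=lambda row: abs(row["pos"] - query_pos))
    return [row["id"] for row in candidates[:k]]


rows = [
    {"id": "a", "pos": 0.1, "text": "cancel my gym plan"},
    {"id": "b", "pos": 0.2, "text": "cancel subscription today"},
    {"id": "c", "pos": 0.3, "text": "meal plan ideas"},
    {"id": "d", "pos": 0.9, "text": "billing dispute for subscription"},
]
terms = ["subscription", "billing"]
```

With k=2 and the query at position 0.0, post_filter keeps only "b" because "a" is discarded after ranking, while pre_filter returns "b" and "d".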
Maximal marginal relevance (MMR)
store.max_marginal_relevance_search("transformers", k=4, fetch_k=20, lambda_mult=0.5)
Infino's vector column isn't projectable and there's no point-lookup, so MMR
re-embeds the fetch_k candidates' text to score them against each other.
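For readers unfamiliar with MMR, the selection loop can be sketched in pure Python. This is a generic version of the algorithm under stated assumptions, not the package's code: candidate_vecs stands for the re-embedded fetch_k candidates, cosine similarity is assumed as the similarity measure, and mmr is a hypothetical function name.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, candidate_vecs, k=4, lambda_mult=0.5):
    """Return indices of up to k candidates, trading relevance for diversity."""
    selected = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:

        def score(index):
            relevance = cosine(query_vec, candidate_vecs[index])
            # Redundancy is the similarity to the closest already-picked item.
            redundancy = max(
                (cosine(candidate_vecs[index], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [[0.9, 0.1], [0.9, 0.12], [0.5, -0.5]]
```

With lambda_mult=1.0 the loop degenerates to plain relevance ranking; lower values push near-duplicates such as the second candidate further down.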
Hybrid (RRF) retrieval
BM25 and vector search fused by reciprocal-rank fusion in a single SQL call — no separate reranking round-trip.
retriever = store.as_hybrid_retriever(k=4)
retriever.invoke("neural network training")
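Reciprocal-rank fusion itself is small enough to sketch. The version below is the textbook formula, where each document scores the sum of 1 / (rrf_k + rank) over the rankings it appears in. The constant 60 is the conventional default and an assumption here, since the value Infino uses is not stated; rrf_fuse is a hypothetical name.

```python
def rrf_fuse(rankings, k=4, rrf_k=60):
    """Fuse several ranked id lists into one list of (id, score), best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rrf_k + rank)
    # Larger fused score is better; break ties by id for determinism.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return [(doc_id, scores[doc_id]) for doc_id in ordered[:k]]


bm25_ranking = ["d1", "d2", "d3"]
vector_ranking = ["d3", "d1", "d4"]
fused = rrf_fuse([bm25_ranking, vector_ranking], k=3)
```

Because only ranks enter the formula, the BM25 and vector scores never need to be put on a common scale.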
BM25 retrieval
Pure lexical ranking over the FTS-indexed text column.
retriever = store.as_bm25_retriever(k=4) # OR by default
retriever = store.as_bm25_retriever(k=4, mode="and") # require all terms
retriever.invoke("gradient descent")
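The OR and AND modes can be illustrated with a compact BM25 scorer. This is a generic sketch, not Infino's FTS implementation: whitespace tokenization, the Lucene-style idf, and the parameters k1=1.2 and b=0.75 are all assumptions, and bm25_search is a hypothetical name.

```python
import math
from collections import Counter


def bm25_search(docs, query, k=4, mode="or", k1=1.2, b=0.75):
    """Return up to k (doc_index, score) pairs, highest BM25 score first."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    terms = list(dict.fromkeys(query.lower().split()))  # unique, order kept
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    doc_freq = {term: sum(term in tokens for tokens in tokenized) for term in terms}

    results = []
    for index, tokens in enumerate(tokenized):
        counts = Counter(tokens)
        present = [term for term in terms if counts[term] > 0]
        # "or" needs at least one term; "and" needs every term.
        if not present or (mode == "and" and len(present) < len(terms)):
            continue
        score = 0.0
        for term in present:
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = counts[term]
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        results.append((index, score))
    results.sort(key=lambda item: (-item[1], item[0]))
    return results[:k]


docs = [
    "gradient descent converges slowly",
    "stochastic gradient methods",
    "descent into the valley",
]
or_hits = bm25_search(docs, "gradient descent", mode="or")
and_hits = bm25_search(docs, "gradient descent", mode="and")
```

In OR mode all three documents are candidates and the one containing both terms ranks first; AND mode keeps only that document.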
Self-query
InfinoTranslator plugs into LangChain's SelfQueryRetriever, lowering an
LLM's structured query to a SQL WHERE over the declared metadata columns —
the full comparison and boolean surface, not a reduced DSL. Pass it as the
structured_query_translator (see LangChain's self-query docs for the
metadata_field_info setup):
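# Import path for langchain 0.x; adjust if your LangChain version moved it.
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_infino import InfinoTranslator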
retriever = SelfQueryRetriever.from_llm(
llm,
store,
document_contents="research papers",
metadata_field_info=metadata_field_info,
structured_query_translator=InfinoTranslator(),
)
retriever.invoke("ML papers since 2023")
SQL-native search
The escape hatch for anything the typed methods don't cover — joins, custom
WHERE, or the vector_search / hybrid_search table functions. Project the
store's columns (doc_id, page_content, declared metadata,
_metadata_json, and optionally score) and the rows map back to
Documents.
# Serialize the query embedding as a comma-separated literal for the SQL call.
qv = ",".join(map(str, embedding.embed_query("fox")))
store.search_by_sql(f"""
SELECT doc_id, page_content, _metadata_json, score
FROM hybrid_search('docs', 'page_content', 'fox', 'embedding', '{qv}', 10)
ORDER BY score DESC
""")
Semantic LLM cache
Caches model responses keyed by prompt meaning: a lookup embeds the prompt and returns a hit when a stored prompt for the same model lands within a distance threshold. One small Infino table, no extra infrastructure.
from langchain_core.globals import set_llm_cache
from langchain_infino import InfinoSemanticCache
set_llm_cache(InfinoSemanticCache(connection, embedding, dim=1536))
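The lookup rule described above (same model, nearest stored prompt, within a distance threshold) can be sketched without any engine. Everything here is illustrative: ToySemanticCache is a hypothetical class, toy_embed is a bag-of-words stand-in for a real Embeddings object, and cosine distance is assumed as the metric.

```python
import math

VOCAB = ("capital", "france", "bake", "bread")


def toy_embed(text):
    """Toy embedding: counts of a few vocabulary words (an assumption)."""
    words = text.lower().replace("?", "").split()
    return [float(words.count(word)) for word in VOCAB]


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


class ToySemanticCache:
    def __init__(self, embed, score_threshold=0.05):
        self.embed = embed
        self.score_threshold = score_threshold
        self.entries = []  # (model, prompt vector, response)

    def update(self, prompt, model, response):
        self.entries.append((model, self.embed(prompt), response))

    def lookup(self, prompt, model):
        """Return the nearest cached response for this model, or None."""
        query = self.embed(prompt)
        best = None
        for entry_model, vector, response in self.entries:
            if entry_model != model:
                continue  # hits never cross models
            distance = cosine_distance(query, vector)
            if distance <= self.score_threshold and (best is None or distance < best[0]):
                best = (distance, response)
        return best[1] if best is not None else None


cache = ToySemanticCache(toy_embed)
cache.update("What is the capital of France?", "model-a", "Paris")
```

A tighter score_threshold trades hit rate for a lower chance of returning an answer to a merely similar prompt.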
Async
The async methods (aadd_texts, asimilarity_search, …) are inherited from
VectorStore, which offloads the synchronous engine calls to a thread via
run_in_executor, so the event loop stays free while an engine call runs.
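The offloading pattern can be demonstrated with the standard library alone. The sketch below uses asyncio's own loop.run_in_executor rather than LangChain's helper, and blocking_search is a hypothetical stand-in for a synchronous engine call.

```python
import asyncio
import time


def blocking_search(query):
    """Stand-in for a synchronous engine call."""
    time.sleep(0.2)
    return [f"result for {query}"]


async def asearch(query):
    # Run the blocking call on the default thread pool and await its result.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_search, query)


async def main():
    ticks = 0

    async def heartbeat():
        # Keeps counting only if the event loop is free to schedule it.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    beat = asyncio.create_task(heartbeat())
    results = await asearch("vector databases")
    beat.cancel()
    return results, ticks


results, ticks = asyncio.run(main())
```

The heartbeat task keeps ticking while the search sleeps in a worker thread, which is the behaviour the inherited async methods rely on.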
API reference
- InfinoVectorStore(connection, table_name, embedding, *, dim, metric="cosine", text_column="page_content", vector_column="embedding", id_column="doc_id", metadata_columns=()): opens an existing table.
  - from_texts(texts, embedding, metadatas=None, *, connection, table_name, dim, ids=None, metric="cosine", n_cent=64, text_column=..., vector_column=..., id_column=..., metadata_columns=()) -> InfinoVectorStore: creates and populates the table.
  - add_texts(texts, metadatas=None, *, ids=None) -> list[str]: idempotent upsert.
  - similarity_search(query, k=4, filter=None, *, filter_query=None, filter_column=None, filter_mode=None) -> list[Document]
  - similarity_search_with_score(...), similarity_search_by_vector(...)
  - max_marginal_relevance_search(query, k=4, fetch_k=20, lambda_mult=0.5, filter=None, ...)
  - delete(ids) -> bool, get_by_ids(ids) -> list[Document]
  - search_by_sql(sql) -> list[Document]
  - as_retriever(...), as_hybrid_retriever(k=4), as_bm25_retriever(k=4, mode=None)
- InfinoHybridRetriever, InfinoBM25Retriever: BaseRetrievers wrapping a store.
- InfinoTranslator: StructuredQuery → SQL filter, for SelfQueryRetriever.
- InfinoSemanticCache(connection, embedding, *, dim, table_name="langchain_llm_cache", score_threshold=0.05)
metric is "cosine" (default), "l2sq" / "l2", or "negdot" / "dot".
See Infino for engine internals.
Development
make install # pip install -e ".[test,lint]"
make unit # unit tests (no engine)
make integration # integration + compliance tests (real Infino on a temp dir)
make lint type # ruff + mypy
make build # build sdist + wheel into dist/
make smoke # build the wheel, install it in a clean venv, run the smoke test
make clean # remove build artifacts and caches
License
Apache-2.0.