LangChain VectorStore integration for Pixeltable multimodal data infrastructure.
langchain-pixeltable
LangChain VectorStore integration backed by Pixeltable -- multimodal data infrastructure with built-in embedding indexes, metadata filtering, computed column lineage, and incremental computation.
Installation
pip install langchain-pixeltable
Quick Start
Works with any LangChain Embeddings model -- cloud or local:
from langchain_pixeltable import PixeltableVectorStore
from langchain_huggingface import HuggingFaceEmbeddings  # no API key needed

vs = PixeltableVectorStore.from_texts(
    texts=[
        "Pixeltable handles multimodal data",
        "LangChain builds LLM applications",
        "Vector databases store embeddings",
    ],
    embedding=HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2"),
    metadatas=[
        {"category": "infra"},
        {"category": "framework"},
        {"category": "infra"},
    ],
    table_name="mydir.docs",
)

# Similarity search
results = vs.similarity_search("multimodal data management", k=2)
for doc in results:
    print(doc.page_content)
Filtered Similarity Search
The filter parameter maps to Pixeltable's .where() clause -- predicates are evaluated before ranking, so only matching rows participate in the similarity sort:
# Only search within "infra" documents
results = vs.similarity_search(
    "data storage", k=5, filter={"category": "infra"},
)

# With scores
results = vs.similarity_search_with_score(
    "embeddings", k=3, filter={"category": "infra"},
)
for doc, score in results:
    print(f"[{score:.3f}] {doc.page_content}")
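To see why filtering before ranking matters, here is a minimal sketch in plain Python. It does not use Pixeltable or LangChain: the `rows` data, the toy 2-D vectors, and the helper names `cosine`, `prefilter_search`, and `postfilter_search` are all made up for illustration. It contrasts the pre-filter behaviour described above with filtering a top-k list after the fact.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy rows: the two "framework" rows are closest to the query below.
rows = [
    {"text": "a", "category": "framework", "vec": [1.0, 0.0]},
    {"text": "b", "category": "framework", "vec": [0.9, 0.1]},
    {"text": "c", "category": "infra", "vec": [0.5, 0.5]},
    {"text": "d", "category": "infra", "vec": [0.0, 1.0]},
]

def prefilter_search(query, k, category):
    """Apply the predicate first, then rank only the matching rows."""
    candidates = [r for r in rows if r["category"] == category]
    ranked = sorted(candidates, key=lambda r: cosine(query, r["vec"]), reverse=True)
    return [r["text"] for r in ranked[:k]]

def postfilter_search(query, k, category):
    """Rank everything, keep the top k, then drop rows that fail the predicate."""
    ranked = sorted(rows, key=lambda r: cosine(query, r["vec"]), reverse=True)
    return [r["text"] for r in ranked[:k] if r["category"] == category]

query = [1.0, 0.0]
pre = prefilter_search(query, k=2, category="infra")
post = postfilter_search(query, k=2, category="infra")
```

With this data, the pre-filtered search returns both "infra" rows, while the post-filtered search returns nothing because the top two overall hits are both "framework" rows. Filtering first avoids that failure mode: the result is only short of `k` when fewer than `k` rows match the predicate.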
Access the Underlying Pixeltable Table
The .table property gives direct access to the Pixeltable table for operations beyond the VectorStore interface -- computed columns, lineage, version history, and arbitrary predicates:
import numpy as np
import pixeltable as pxt

t = vs.table

# Inspect all data
t.select(t.text, t.metadata, t.embedding).collect()

# Add a computed column -- auto-backfills all existing rows
# (my_word_counter is a user-defined function, assumed to be defined elsewhere)
t.add_computed_column(word_count=my_word_counter(t.text))

# New inserts via the wrapper automatically populate the computed columns
vs.add_texts(["New document"], metadatas=[{"category": "infra"}])

# WHERE on computed columns + similarity
# (query_vec is the query's embedding vector, assumed to be computed beforehand)
sim = t.embedding.similarity(vector=np.array(query_vec, dtype=np.float32))
results = (
    t.where(t.word_count > 5)
    .order_by(sim, asc=False)
    .limit(3)
    .select(t.text, t.word_count, sim=sim)
    .collect()
)
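The snippet above relies on a word-counting function and on computed columns that backfill existing rows and fill in on new inserts. The sketch below is a conceptual model of that behaviour in plain Python, not Pixeltable's implementation: `ToyTable` and `count_words` are hypothetical names, and in a real setup a function like `count_words` would need to be registered with Pixeltable (for example through its UDF decorator) before it can be applied to a column.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated words in a string."""
    return len(text.split())

class ToyTable:
    """In-memory stand-in that mimics backfill and compute-on-insert."""

    def __init__(self):
        self.rows = []
        self.computed = {}  # column name -> (function, source column)

    def insert(self, **values):
        row = dict(values)
        # Every registered computed column is evaluated for the new row.
        for name, (fn, source) in self.computed.items():
            row[name] = fn(row[source])
        self.rows.append(row)

    def add_computed_column(self, name, fn, source):
        self.computed[name] = (fn, source)
        # Backfill: evaluate the new column for rows that already exist.
        for row in self.rows:
            row[name] = fn(row[source])

table = ToyTable()
table.insert(text="Pixeltable handles multimodal data")
table.add_computed_column("word_count", count_words, source="text")  # backfills row 0
table.insert(text="New document")  # computed at insert time
```

After these calls, both rows carry a `word_count` value even though the first row was inserted before the column existed.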
Connect to an Existing Pixeltable Table
Connect to any existing Pixeltable table -- including tables with multimodal columns like images or video:
from langchain_openai import OpenAIEmbeddings
from langchain_pixeltable import PixeltableVectorStore

vs = PixeltableVectorStore.from_existing_table(
    table_name="mydir.existing_docs",
    embedding=OpenAIEmbeddings(),
    text_column="content",
    embedding_column="content_embedding",
)
results = vs.similarity_search("search query", filter={"source": "arxiv"})
Use as a LangChain Retriever
retriever = vs.as_retriever(search_kwargs={"k": 5})
docs = retriever.invoke("What is Pixeltable?")
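A retriever returns a list of LangChain `Document` objects, and a common next step is to fold them into a prompt. The following sketch shows one way to do that. To keep it runnable without LangChain installed it uses a stand-in class, `FakeDocument`, that has the same `page_content` and `metadata` attributes; `FakeDocument`, `format_context`, and the prompt wording are illustrative choices, not part of this package.

```python
from dataclasses import dataclass, field

@dataclass
class FakeDocument:
    """Stand-in with the same two attributes as a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_context(docs):
    """Number each retrieved passage and tag it with its category, if any."""
    lines = []
    for i, doc in enumerate(docs, start=1):
        category = doc.metadata.get("category", "unknown")
        lines.append(f"[{i}] ({category}) {doc.page_content}")
    return "\n".join(lines)

# In real use, docs would come from retriever.invoke(...).
docs = [
    FakeDocument("Pixeltable handles multimodal data", {"category": "infra"}),
    FakeDocument("Vector databases store embeddings"),
]
context = format_context(docs)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: What is Pixeltable?"
```

Because `format_context` only reads `page_content` and `metadata`, it should accept the documents returned by `retriever.invoke` unchanged.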
Why Pixeltable as a Vector Backend?
- Metadata filtering via .where(): Filter on metadata fields before ranking, not post-hoc
- Computed column lineage: Add derived columns that auto-backfill and auto-compute on new inserts
- Persistent and versioned: Data survives restarts; every change is tracked
- Incremental: Only new/changed rows get re-embedded
- Multimodal native: Images, video, audio, and documents alongside text
- Any embedding model: Works with OpenAI, Hugging Face, or any local model
- No external services: Embedded PostgreSQL, no Docker required