Typed SQLAlchemy helpers for ParadeDB
Project description
Simple, Elastic-quality search for Postgres
Website • Docs • Community • Blog • Changelog
sqlalchemy-paradedb
ParadeDB integration for SQLAlchemy: typed helpers for BM25 indexes, search predicates, scoring, snippets, facets, and migration ergonomics.
Requirements & Compatibility
| Component | Supported |
|---|---|
| Python | 3.10+ |
| SQLAlchemy | 2.0.32+ |
| ParadeDB | 0.22.0+ |
| PostgreSQL | 15+ (with ParadeDB extension) |
Installation
uv add sqlalchemy-paradedb
For local development:
uv sync --extra test --extra dev
Quick Start
Prerequisites
Install pg_search in your Postgres database and connect SQLAlchemy to that database.
Create a BM25 Index
from sqlalchemy import Index
from paradedb.sqlalchemy import indexing
products_bm25_idx = Index(
"products_bm25_idx",
indexing.BM25Field(Product.id),
indexing.BM25Field(
Product.description,
tokenizer=indexing.tokenize.unicode(lowercase=True),
),
indexing.BM25Field(
Product.category,
tokenizer=indexing.tokenize.literal(),
),
postgresql_using="bm25",
postgresql_with={"key_field": "id"},
)
For JSON columns named metadata, use metadata_ as the ORM attribute name.
Query with ParadeDB Predicates
from sqlalchemy import select
from sqlalchemy.orm import Session
from paradedb.sqlalchemy import pdb, search
stmt = (
select(Product.id, Product.description)
.where(search.match_any(Product.description, "running", "shoes"))
.order_by(pdb.score(Product.id).desc())
.limit(10)
)
with Session(engine) as session:
rows = session.execute(stmt).all()
Rows + Facets in a Single Query
from sqlalchemy import select
from sqlalchemy.orm import Session
from paradedb.sqlalchemy import facets, search
base = (
select(Product.id, Product.description)
.where(search.match_all(Product.description, "running"))
.order_by(Product.id)
.limit(10)
)
stmt = facets.with_rows(
base,
agg=facets.multi(
facets.value_count(field="id"),
facets.terms(field="category", size=10),
),
key_field=Product.id,
)
with Session(engine) as session:
rows = session.execute(stmt).all()
facet_payload = facets.extract(rows)
Search Patterns
Fuzzy Matching
from paradedb.sqlalchemy import search
search.term(Product.description, "shose", distance=1)
search.match_any(Product.description, "wirless", distance=1, prefix=True)
search.term(Product.description, "rnnuing", distance=1, transpose_cost_one=True)
Use fuzzy options on term, match_any, or match_all; there is no separate search.fuzzy(...) helper.
Phrase Prefix and More-Like-This
from paradedb.sqlalchemy import search
search.phrase_prefix(Product.description, ["running", "sh"])
search.more_like_this(Product.id, document_id=1, fields=["description"])
Proximity Composition
from sqlalchemy import select
from paradedb.sqlalchemy import search
prox = search.prox_array("running").within(1, search.prox_regex("sho.*"), ordered=True)
stmt = select(Product.id).where(search.proximity(Product.description, prox))
Indexing and Tokenizers
Tokenizer config can be expressed as a structured mapping:
from sqlalchemy import Index
from paradedb.sqlalchemy import indexing
products_bm25_idx = Index(
"products_bm25_idx",
indexing.BM25Field(Product.id),
indexing.BM25Field(
Product.description,
tokenizer=indexing.tokenize.from_config(
{
"tokenizer": "simple",
"filters": ["lowercase", "stemmer"],
"stemmer": "english",
"alias": "description_simple",
}
),
),
indexing.BM25Field(
Product.description,
tokenizer=indexing.tokenize.from_config(
{
"tokenizer": "ngram",
"args": [3, 8],
"named_args": {"prefix_only": True},
"alias": "description_ngram",
}
),
),
postgresql_using="bm25",
postgresql_with={"key_field": "id"},
)
Validate that a field is indexed with the expected tokenizer:
from paradedb.sqlalchemy import indexing
indexing.assert_indexed(engine, Product.category, tokenizer="literal")
Inspect BM25 metadata for a mapped table:
from paradedb.sqlalchemy import indexing
meta = indexing.describe(engine, Product.__table__)
Alembic Operations
Import once in migration environment startup so Alembic registers ParadeDB operations:
import paradedb.sqlalchemy.alembic # noqa: F401
Use custom operations in migrations:
op.create_bm25_index(
"products_bm25_idx",
"products",
["id", "description"],
key_field="id",
table_schema="public",
)
op.reindex_bm25("products_bm25_idx", concurrently=True, schema="public")
op.drop_bm25_index("products_bm25_idx", if_exists=True, schema="public")
op.reindex_bm25(..., concurrently=True) must run outside a transaction (autocommit block).
Diagnostics Helpers
paradedb.sqlalchemy.diagnostics exposes wrapper functions for ParadeDB diagnostics:
from paradedb.sqlalchemy import diagnostics
indexes = diagnostics.paradedb_indexes(engine)
segments = diagnostics.paradedb_index_segments(engine, "products_bm25_idx")
check = diagnostics.paradedb_verify_index(engine, "products_bm25_idx", sample_rate=0.1)
all_checks = diagnostics.paradedb_verify_all_indexes(engine, schema_pattern="public")
Common Errors
with_rows requires ORDER BY
from sqlalchemy import select
from paradedb.sqlalchemy import facets
# Missing order_by(...)
base = select(Product.id).limit(10)
facets.with_rows(base, agg=facets.value_count(field="id"), key_field=Product.id)
with_rows requires LIMIT
from sqlalchemy import select
from paradedb.sqlalchemy import facets
# Missing limit(...)
base = select(Product.id).order_by(Product.id)
facets.with_rows(base, agg=facets.value_count(field="id"), key_field=Product.id)
with_rows requires a ParadeDB predicate
from sqlalchemy import select
from paradedb.sqlalchemy import facets
# ensure_predicate=False disables automatic search.all(...) injection
facets.with_rows(
select(Product.id).order_by(Product.id).limit(10),
agg=facets.value_count(field="id"),
key_field=Product.id,
ensure_predicate=False,
)
tokenizer config requires 'tokenizer'
from paradedb.sqlalchemy import indexing
indexing.tokenize.from_config({"filters": ["lowercase"]})
Examples
Documentation
- ParadeDB Official Docs: https://docs.paradedb.com
- ParadeDB Website: https://paradedb.com
Contributing
See CONTRIBUTING.md for development setup, running tests, linting, and the PR workflow.
Support
If you're missing a feature or have found a bug, please open a GitHub Issue.
To get community support, you can:
- Post a question in the ParadeDB Slack Community
- Ask for help on our GitHub Discussions
If you need commercial support, please contact the ParadeDB team.
License
sqlalchemy-paradedb is licensed under the MIT License.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sqlalchemy_paradedb-0.5.0.tar.gz.
File metadata
- Download URL: sqlalchemy_paradedb-0.5.0.tar.gz
- Upload date:
- Size: 64.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c1ae1a770f4c3f74c1fe048c5da2f0bf1894a6505692d0e0c23c8b81229f9ea9
|
|
| MD5 |
dfbc5caea3d048474b5efc38ce925713
|
|
| BLAKE2b-256 |
441f7f393d6960fc895c4b822893785466dedbdf6d270dbff3e8f12f355b6326
|
File details
Details for the file sqlalchemy_paradedb-0.5.0-py3-none-any.whl.
File metadata
- Download URL: sqlalchemy_paradedb-0.5.0-py3-none-any.whl
- Upload date:
- Size: 31.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d18bc09a2a64b5b64d07957dd05dcb0ead0e36667660813f0e0149263104ad52
|
|
| MD5 |
a925fefc8b577601955cd1ec699c346a
|
|
| BLAKE2b-256 |
6370c01412a2f060a56aab8201a69022ed56689e74adeeb44e3aa0143f5689bb
|