Zero-config semantic search for any MongoDB database.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

varmabudharaju

These details have not been verified by PyPI

Project description

mongosemantic

Zero-config semantic search for any MongoDB database.

Semantic search in the web UI — a natural-language query over 23k movie plots returns Cold-War spy films with scores, score bars, and CSV/JSONL/JSON export

A meaning-only query — none of these results contain the words "spies" or "blackmail" as keywords. 17 ms over 45k embedded chunks via the embedded HNSW index, on a plain self-hosted replica set.

What is it

mongosemantic is a Python toolkit — CLI, web dashboard, and MCP server — that adds search-by-meaning to the MongoDB you already run. Point it at a database, pick a text field, and it embeds your documents with a local model, keeps the embeddings in sync as your data changes, and answers natural-language queries ("a washed-up boxer gets one last shot at redemption") from the CLI, a browser, or directly to AI agents over MCP.

No separate vector database. No ETL. No embedding API bill. Works on Atlas, self-hosted replica sets, and standalone MongoDB 7.0+.

Why it exists

Adding semantic search to an existing MongoDB app today usually means one of three things:

Bolt on a vector database (Pinecone, Weaviate, Qdrant, …) — new infrastructure, an ETL pipeline to keep two systems consistent, and a second source of truth to operate forever.
Use Atlas Vector Search directly — solid when you're on Atlas with Search-index slots to spare, but the embedding pipeline is still on you: chunking, batching, re-embedding on update, model upgrades. The managed alternatives are constrained — Atlas auto-embedding is a metered preview tied to one model family, and the Community 8.2 search preview requires running a separate mongot binary.
Send documents to an embedding API — your data leaves your machines, and you pay per token, forever.

mongosemantic is the missing pipeline-and-product layer on top of the database you already have: free local embeddings (your data never leaves the box), any MongoDB 7.0+ topology with zero extra infrastructure, and all the unglamorous parts handled — change-stream sync, a self-healing job queue, chunking, index management, online model migration. Search by meaning is one apply away.

Why you'd pick it

Local-first and free — embeddings are computed on your machine with sentence-transformers; documents never leave your network and there is no per-token meter. OpenAI/Ollama models are opt-in, not required.
Any MongoDB, zero new infra — Atlas, replica set, or standalone 7.0+. No vector database, no sidecar process, no ETL. Embeddings live in MongoDB next to your data (a shadow collection, or inline on the doc).
Fast without Atlas — an embedded HNSW index makes self-hosted search ~15 ms over 45k chunks instead of a 2.5 s brute-force scan.
Search quality built in — hybrid semantic+keyword search on every topology, metadata filters that need no reindex, and a local cross-encoder reranker. Capabilities that usually require Atlas Search tiers or external services, all local.
Never stale — change streams (or polling on standalone) re-embed documents as they change; a self-healing job queue reclaims stalled work automatically.
Model freedom — five models from free-local to OpenAI, and an online migration command that swaps a collection to a new model with near-zero downtime and a rollback archive.
Three interfaces, one state — CLI for scripts, a web dashboard for humans, and an MCP server so Claude Desktop / Cursor / any AI agent can query your data by meaning. All share the same saved connection.

How to use it

pip install mongosemantic

export MONGOSEMANTIC_URI="mongodb+srv://user:pass@cluster.mongodb.net/my_db"
export MONGOSEMANTIC_DB="my_db"

mongosemantic inspect --collection articles        # 1. score fields for suitability
mongosemantic apply   --collection articles --field body   # 2. configure + create indexes
mongosemantic index   --collection articles        # 3. bulk-embed existing docs
mongosemantic worker &                             # 4. keep embeddings in sync
mongosemantic search  "budget travel"              # 5. search by meaning

That's the whole loop: inspect tells you which fields are worth embedding, apply configures the collection (and creates Atlas indexes where applicable), index enqueues existing documents, the worker embeds them and stays running to catch changes, and search queries by meaning — with --filter, --rerank, and --hybrid available from day one.

Prefer a UI? The dashboard does everything the CLI does, plus observability:

mongosemantic ui                          # http://127.0.0.1:8080

It runs an embedded worker, so ui alone is a complete deployment. Localhost-bound by default with CSRF protection, rate limiting, and security headers — bind to a non-loopback address only behind your own auth proxy.

Wiring it into an AI agent is one command:

mongosemantic integrate claude          # writes Claude Desktop config (restart Claude)
mongosemantic serve --transport sse     # or run as a standalone SSE server on :8090

Want data to try it on? See Demo data below — a seeded movies collection makes every example in the next section reproducible.

See it in action

A tour of every feature, in the order you'd meet them. Every screenshot is real and reproducible — .capture.yaml defines each shot and capture run regenerates the full set against a seeded database. docs/test-evidence-0.9.md collects the feature-by-feature proof for the 0.9.0 search upgrades.

Connect — with topology detection

Paste a MongoDB URI (or rely on env vars) and mongosemantic detects what it's talking to — Atlas, self-hosted replica set, or standalone — and adapts everything downstream: which search engine to use, whether sync can use change streams or must poll, which indexes to create. The connection is saved once and shared by the CLI, the dashboard, and the MCP server.

Connection page — connected to a self-hosted replica set with topology detection, saved-connection card, and developer help

Browse collections

Every collection in the database, with its configured/not-configured status at a glance. Configured collections show their embedding model and storage mode (shadow or inline); the rest are one click away from setup. The migrate action lives here too.

Collections browser — configured collections show model and storage mode; the rest are one click from setup

Inspect fields before you commit

Embedding the wrong field wastes hours of compute. The inspect page scores every field of a collection for semantic-search suitability — text length, fill rate, type consistency — and shows sample documents underneath, so you pick the right fields the first time. Also available as mongosemantic inspect -c <coll> and the inspect_collection MCP tool.

Inspect page — every field of the movies collection scored for semantic-search suitability, with sample documents below

Configure semantic search

One form: discovered fields as checkboxes (with their suitability badges), shadow vs. inline storage, optional chunking for long documents, and the embedding model. On Atlas, apply auto-creates the vector and BM25 Search indexes; on every topology it creates the $text index that powers the hybrid keyword leg.

Watch the indexing pipeline

Bulk-embedding 23k documents shouldn't be a black box. The indexing dashboard shows completed / in-flight / pending / failed tiles, a live worker heartbeat, per-field progress bars, and a recent-activity feed — with retry and reindex one click away. The job queue is self-healing: stale in-flight jobs are reclaimed and dead worker heartbeats pruned automatically.

Indexing dashboard — completed/in-flight/pending/failed tiles, live worker dot, per-field progress, activity feed

Search by meaning

The core feature, everywhere you work. The web version is the hero shot at the top of this page; here is the same engine from the CLI — a meaning-only query finding Cold-War spy thrillers with no keyword overlap:

CLI semantic search over 23k movie plots — finds Cold-War spy thrillers from a meaning-only query

Open the full document behind a result

Search results show the matched chunk and its score; clicking any row slides in the complete source document, so you never lose the connection between a match and the record it came from.

Click any search result to slide in the full source document

Filter with plain MongoDB queries

Narrow any semantic search with a regular MongoDB query over the source documents — no reindex, no schema change, works on every search path:

mongosemantic search "a detective hunting a serial killer" -c movies \
  --filter '{"year": {"$lt": 1960}}'

The shot below is exactly that: the same noir query, constrained to pre-1960 — every modern serial-killer film drops out and the classics surface. Local paths (brute-force, embedded HNSW) pre-filter the matching _ids and are exact; Atlas paths over-fetch ×5 and post-match. $where/$function/$accumulator/$text/$expr are rejected; invalid filters error loudly (exit 2 on the CLI, HTTP 400 in the web UI).

Metadata filtering — the same noir query constrained to year < 1960 with a plain MongoDB filter; only pre-1960 classics return

Rerank with a local cross-encoder

Vector similarity is a great first pass but a mediocre judge of final order. --rerank turns search into two-stage retrieval: over-fetch limit×5 candidates, re-score each (query, chunk) pair with a local cross-encoder (cross-encoder/ms-marco-MiniLM-L-6-v2, ~80 MB, CPU, loaded once per process), and return the top hits — original similarity kept as vector_score:

mongosemantic search "a washed-up boxer gets one last shot at redemption" \
  -c movies --rerank

In the shot: the Reranked badge on every row, the decisive winner (an actual washed-up-boxer redemption plot), and score bars normalized per result set. A bonus: cross-encoder scores are comparable across collections, even ones embedded with different models.

Cross-encoder reranking — top candidates re-scored locally; every hit carries a reranked badge and bars are normalized per result set

Hybrid search — on every topology

Combine semantic similarity with keyword matching for queries that mix meaning and specific terms — "MongoDB 7.0 replica set issues" benefits from semantic (catches "replica set" → "replication") plus keyword (anchors on "7.0").

mongosemantic search "Godzilla attacks a city" -c movies --hybrid

Two paths, picked automatically:

Atlas with Search indexes — native $rankFusion over $vectorSearch plus BM25 $search; the search path verifies both indexes actually exist before relying on them.
Everywhere else — self-hosted 7.0+ (standalone or replica set), and Atlas clusters whose Search indexes are cap-blocked (e.g. the free-tier 3-index budget): client-side reciprocal-rank fusion over a classic MongoDB $text index on the shadow's chunk text plus the vector leg, with the same 1/(60+rank), 0.6/0.4 weighting as $rankFusion.

The shot below is hybrid running against a plain self-hosted replica set — no Atlas anywhere. Inline-mode collections fall back to pure semantic with a clear notice. Filter, rerank, and hybrid all compose, and all three are available in the CLI, the web UI, and the MCP tools.

Hybrid search on a self-hosted replica set — semantic + $text keyword legs fused with client-side RRF; no Atlas required

Keep an eye on the whole system

The overview dashboard answers "is everything healthy?" in one screen: detected topology, total embeddings, job-queue depth and failures, per-collection indexing activity, and live worker heartbeats.

Overview dashboard — topology, embedding totals, job-queue health, per-collection indexing activity

Explore the embedding space

A 2D PCA projection of sampled embeddings with K-means clusters and TF-IDF keyword labels per cluster — the fastest way to sanity-check that your embeddings actually capture structure (movie genres, product categories) before you build on them. Click any point to inspect the document behind it.

Explore embeddings — K-means clusters over a 2D PCA projection, TF-IDF keyword labels per cluster

Run safe aggregations

A read-only pipeline runner for poking at your data without leaving the dashboard: quick examples, table/JSON views, stats line, CSV/JSON export. Hard-capped at 10 s and 100 documents, with $out/$merge/$function blocked — safe to expose to teammates.

Read-only aggregation runner — quick examples, table view, stats line, CSV/JSON export

Migrate models with near-zero downtime

Embedding models improve; your search should too. migrate re-embeds a shadow-mode collection with a new model into a temp shadow, then swaps it into place with an atomic renameCollection — search serves the old model up to the swap instant, the new model immediately after. The previous shadow is archived (articles_embeddings_archive_{timestamp}) for rollback; drop it with --drop-archive once verified.

mongosemantic migrate --collection articles --model local-better

Migrate model — per-collection embedding-model swap with near-zero downtime; the old shadow is archived for rollback

Let AI agents query your MongoDB over MCP

mongosemantic integrate claude writes the Claude Desktop config in one command; serve runs the same server over stdio or SSE for Cursor and any other MCP client. Agents get meaning-based search over your data — with filter, rerank, and hybrid — plus safe read-only introspection tools.

MCP page — one command wires mongosemantic into Claude Desktop; eleven tools exposed to any MCP client

Eleven tools are exposed:

Tool	What it does
`semantic_search`	Find rows in one collection by meaning; optional `filter` (MongoDB query) and `rerank` params
`hybrid_search`	Semantic + keyword, fused via Atlas `$rankFusion` or client-side RRF on every other topology; takes `filter` / `rerank` too
`search_all_collections`	Cross-collection fanout, merged by score; `rerank` makes scores comparable across models
`list_collections`	Every collection + its configured/not-configured status
`list_configured`	Just the ones with semantic search wired up
`inspect_collection`	Field-by-field suitability scoring
`get_sample_documents`	Real rows, embedding sub-doc stripped
`get_status`	Topology + total embeddings + job-queue counts
`safe_aggregation`	Read-only pipeline runner (10s, 100-row, no `$out`/`$merge`/`$function`)
`get_schema_context`	Compact schema summary for AI-generated aggregations
`migrate_model`	Switch a collection's embedding model with near-zero downtime

Status (v0.9.0)

Connect to Atlas / replica set / standalone — saved connection shared by UI, CLI, and MCP server
Inspect a collection, score fields for suitability
Configure shadow-mode or inline-mode semantic search on one or more fields
Real chunking — long documents split into overlapping chunks, search ranks per chunk
Bulk-embed existing documents
Sync in real time (change streams) or on a schedule (polling)
Search via native Atlas $vectorSearch, embedded HNSW (non-Atlas), or brute-force aggregation
CLI: inspect / apply / index / search / worker / status / retry / reindex / reindex-hnsw / migrate / teardown / ui / serve / integrate
Web UI — connection, collections, inspect, configure, indexing, search, query, dashboard, visualize, MCP, guide
Embedded worker — mongosemantic ui alone keeps embeddings in sync; no second terminal
Self-healing job queue — stale in-flight jobs reclaimed, dead worker heartbeats pruned automatically
MCP server for Claude Desktop / Cursor / any MCP client (stdio + SSE)
Hybrid search on every topology — Atlas $rankFusion (live-verified on 8.0.24) or client-side RRF with a $text index elsewhere (--hybrid / UI toggle / hybrid_search MCP tool)
Metadata filtering — plain MongoDB queries over source fields on every search path (--filter / UI input / MCP filter param), no reindex needed
Local cross-encoder reranking — two-stage retrieval with ms-marco-MiniLM-L-6-v2 (--rerank / UI toggle / MCP rerank param)
Online model migration — mongosemantic migrate + migrate_model MCP tool, atomic renameCollection swap
Visualize page — K-means clusters over a 2D PCA projection, TF-IDF keyword labels, click-to-inspect
Search & query export — CSV / JSONL / JSON from the search page, CSV / JSON from the aggregation runner

Known limitations

Free-tier Atlas (M0/M2/M5) caps search indexes at 3 per cluster. Each shadow-mode field costs 2 (vectorSearch + BM25), each inline field 1. apply and migrate detect the cap and degrade gracefully — cap-blocked collections keep hybrid search via client-side RRF over a classic $text index, so keyword matching survives the cap. The Atlas paths — $vectorSearch, hybrid $rankFusion, migration index carry-over — are live-verified against a free-tier M0 on MongoDB 8.0.24; see docs/atlas-setup.md for the runbook. Inline-mode with a real Atlas vector index is the one path verified only through its brute-force fallback (the M0 cap leaves it no index slot).
Atlas-path filters are over-fetch + post-match. On $vectorSearch paths a --filter is applied after the source $lookup, over a ×5 over-fetched candidate set — a highly selective filter may return fewer than limit rows. Local paths (brute-force, embedded HNSW) pre-filter the matching _ids and are exact.

Embedding models

Model	Dimensions	Cost	Notes
`local-fast` (MiniLM)	384	Free	Default. Runs on your machine.
`local-better` (MPNet)	768	Free	Higher quality, slower.
`openai-small`	1536	~$0.02/1M tokens	Multilingual.
`openai-large`	3072	~$0.13/1M tokens	Highest quality.
`ollama-nomic`	768	Free	Self-hosted via Ollama.

Select via MONGOSEMANTIC_MODEL or --model on apply.

Deployment topologies

Topology	Sync	Search (shadow mode)	Search (inline mode)	Realistic scale
Atlas	Change streams	`$vectorSearch` (HNSW, native)	`$vectorSearch`	Millions
Self-hosted replica set	Change streams	Embedded HNSW (in-process)	Brute-force aggregation	Hundreds of thousands
Self-hosted standalone	Polling (`updated_at` watermark)	Embedded HNSW (in-process)	Brute-force aggregation	Hundreds of thousands

Embedded HNSW: when you run mongosemantic ui against a non-Atlas cluster, an HNSW graph is built from the shadow collection in a background thread and persisted under ~/.cache/mongosemantic/hnsw/. Queries hit the graph at ~O(log N) — ~15 ms warm on 45k chunks vs ~2.5 s brute-force. Indexes rebuild automatically when enough rows go stale; force a rebuild with mongosemantic reindex-hnsw --all.

Inline-mode collections still take the brute-force path on non-Atlas (HNSW for inline is a follow-up). For datasets in the hundreds of thousands, prefer shadow mode or Atlas.

Development

git clone https://github.com/varmabudharaju/mongosemantic
cd mongosemantic
pip install -e ".[dev,openai]"
docker compose up -d                          # replica set + standalone
MONGOSEMANTIC_RUN_INTEGRATION=1 python3 -m pytest -v

The README screenshots are reproducible: .capture.yaml at the repo root defines every shot (real Chromium renders of the dashboard, real Terminal captures of the CLI). Regenerate them with capture run against a seeded database.

Demo data

Two seed scripts ship with the repo:

# Small hand-curated corpus (~185 articles + 38 products + 10 recipes).
# Fast, offline, good for fast iteration.
python3 scripts/seed_demo.py

# MongoDB's official sample_mflix — 23,539 movies with plots, genres, cast.
# ~40 MB download, ideal for realistic semantic-search demos.
python3 scripts/seed_mflix.py

After seeding either dataset:

# For mflix:
mongosemantic apply  -c movies -f title -f plot
mongosemantic index  -c movies
mongosemantic worker --once     # processes all pending jobs, then exits
mongosemantic search "spies blackmail and intrigue in cold war Berlin" -c movies

Project docs

If you want to dig in further:


`docs/ARCHITECTURE.md`	Module map, data flow diagrams, storage layout, key design decisions. The technical reference.
`docs/HANDOFF.md`	Current state: what's working, what's not live-tested, what was intentionally left out, what's worth shipping next.
`CONTRIBUTING.md`	Dev setup, test strategy, where to add a new CLI command / embedding provider / web route / MCP tool / search mode.
`docs/atlas-setup.md`	10-minute runbook for verifying the Atlas-specific paths (`$vectorSearch`, hybrid `$rankFusion`, migration index name carry-over) against a free-tier M0 cluster.
`CHANGELOG.md`	Per-version summary of what landed and why.

License

MIT

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

varmabudharaju

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.9.0

Jun 25, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mongosemantic-0.9.0.tar.gz (225.7 kB view details)

Uploaded Jun 25, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

mongosemantic-0.9.0-py3-none-any.whl (167.6 kB view details)

Uploaded Jun 25, 2026 Python 3

File details

Details for the file mongosemantic-0.9.0.tar.gz.

File metadata

Download URL: mongosemantic-0.9.0.tar.gz
Upload date: Jun 25, 2026
Size: 225.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mongosemantic-0.9.0.tar.gz
Algorithm	Hash digest
SHA256	`15f63837c4b715bf53db8b7199db80e6e454fe16267b68764f0d633b04fe307a`
MD5	`e05237a71d65dd4ca797901d5fb2407c`
BLAKE2b-256	`3fae26c4967b839e576008f06e73c0641f97f8d11e342ad563c0985578046fa5`

See more details on using hashes here.

Provenance

The following attestation bundles were made for mongosemantic-0.9.0.tar.gz:

Publisher: publish.yml on varmabudharaju/mongosemantic

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mongosemantic-0.9.0.tar.gz
- Subject digest: 15f63837c4b715bf53db8b7199db80e6e454fe16267b68764f0d633b04fe307a
- Sigstore transparency entry: 1945538070
- Sigstore integration time: Jun 25, 2026
Source repository:
- Permalink: varmabudharaju/mongosemantic@46870acb903bd76c1db64744e4246a7414ef8153
- Branch / Tag: refs/heads/main
- Owner: https://github.com/varmabudharaju
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@46870acb903bd76c1db64744e4246a7414ef8153
- Trigger Event: workflow_dispatch

File details

Details for the file mongosemantic-0.9.0-py3-none-any.whl.

File metadata

Download URL: mongosemantic-0.9.0-py3-none-any.whl
Upload date: Jun 25, 2026
Size: 167.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mongosemantic-0.9.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`297a13ff64c6fab2285a8c35c2c9cb47621d6872a17932ad45ac7a69964102c3`
MD5	`6cbc103bddd9c0d87490ffc398584090`
BLAKE2b-256	`180e0f0b7c7165876af84e20e9a389c3f9c41a600165e41fb6851d8abb87cf4b`

See more details on using hashes here.

Provenance

The following attestation bundles were made for mongosemantic-0.9.0-py3-none-any.whl:

Publisher: publish.yml on varmabudharaju/mongosemantic

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mongosemantic-0.9.0-py3-none-any.whl
- Subject digest: 297a13ff64c6fab2285a8c35c2c9cb47621d6872a17932ad45ac7a69964102c3
- Sigstore transparency entry: 1945538105
- Sigstore integration time: Jun 25, 2026
Source repository:
- Permalink: varmabudharaju/mongosemantic@46870acb903bd76c1db64744e4246a7414ef8153
- Branch / Tag: refs/heads/main
- Owner: https://github.com/varmabudharaju
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@46870acb903bd76c1db64744e4246a7414ef8153
- Trigger Event: workflow_dispatch

mongosemantic 0.9.0

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

mongosemantic

What is it

Why it exists

Why you'd pick it

How to use it

See it in action

Connect — with topology detection

Browse collections

Inspect fields before you commit

Configure semantic search

Watch the indexing pipeline

Search by meaning

Open the full document behind a result

Filter with plain MongoDB queries

Rerank with a local cross-encoder

Hybrid search — on every topology

Keep an eye on the whole system

Explore the embedding space

Run safe aggregations

Migrate models with near-zero downtime

Let AI agents query your MongoDB over MCP

Status (v0.9.0)

Known limitations

Embedding models

Deployment topologies

Development

Demo data

Project docs

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance