agent-native SQLite vector retrieval
flexvec
Composable vector retrieval with SQL.
flexvec is a Python library and agent-native CLI that reshapes vector search scores before selection. Suppress a topic, weight by recency, spread across subtopics, project a direction through embedding space — all in one SQL statement. Runs in-process on SQLite with optional local MCP serving. No hosted service.
pip install flexvec
pip install "flexvec[mcp]" # agent MCP server + embeddings
Agent-native SQLite vectorization
FlexVec can turn an existing SQLite table into an agent-ready retrieval surface. Commands are JSON-first so agents can inspect, prepare, index, verify, and serve one database deterministically.
flexvec inspect app.db --json
Create a retrieval contract:
{
"table": "docs",
"id_column": "id",
"text_columns": ["title", "body"],
"metadata_cols": ["created_at"]
}
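Before running `prepare`, it can help to confirm that every column named in the spec exists in the target table. The check below is a minimal sketch using only the standard library; `check_spec` is a hypothetical helper, not part of flexvec, and the `docs` table is created here purely for illustration.

```python
import json
import sqlite3

def check_spec(db: sqlite3.Connection, spec: dict) -> list[str]:
    """Return the spec columns that are missing from the target table."""
    # pragma_table_info is SQLite's table-valued form of PRAGMA table_info.
    rows = db.execute(
        "SELECT name FROM pragma_table_info(?)", (spec["table"],)
    ).fetchall()
    existing = {row[0] for row in rows}
    wanted = [spec["id_column"], *spec["text_columns"], *spec["metadata_cols"]]
    return [col for col in wanted if col not in existing]

spec = json.loads("""
{
  "table": "docs",
  "id_column": "id",
  "text_columns": ["title", "body"],
  "metadata_cols": ["created_at"]
}
""")

# Illustrative table only; in practice this would be your existing app.db.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE docs (id TEXT PRIMARY KEY, title TEXT, body TEXT, created_at TEXT)"
)

missing = check_spec(db, spec)
```

An empty `missing` list means the contract lines up with the table schema; anything else is a column to fix in the spec before indexing.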
Prepare and index the database:
flexvec prepare app.db --spec spec.json --json
flexvec index app.db --spec spec.json --json
flexvec doctor app.db --json
Query or hand the DB to an agent over MCP:
flexvec sql app.db "SELECT v.id, v.score, c.content FROM vec_ops('similar:refund policy') v JOIN _raw_chunks c ON c.id = v.id LIMIT 10" --json
flexvec sql app.db "SELECT k.id, k.rank, c.content FROM keyword('refund policy') k JOIN _raw_chunks c ON c.id = k.id LIMIT 10" --no-embed --json
flexvec mcp app.db
prepare creates FlexVec-owned _flexvec_meta, _raw_chunks, and chunks_fts
surfaces inside the target database. FlexVec does not require a Flex registry,
Flex cells, services, modules, or Labs packages.
If those tables already exist, prepare/index return warnings before reusing
or rebuilding them; copy the DB first or choose custom table names in the spec
when reuse is not intended.
Getting started
Your table
Any SQLite database with an embedding column works.
CREATE TABLE chunks (
id TEXT PRIMARY KEY,
content TEXT,
embedding BLOB -- float32, L2-normalized
);
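The schema comment asks for float32, L2-normalized vectors. One way to produce such a BLOB is sketched below. It assumes the column holds the raw float32 bytes of the array in native byte order, which the source does not spell out; `to_blob` and `from_blob` are hypothetical helpers, not flexvec functions.

```python
import sqlite3
import numpy as np

def to_blob(vec) -> bytes:
    """L2-normalize a vector and serialize it as raw float32 bytes."""
    v = np.asarray(vec, dtype=np.float32)
    norm = np.linalg.norm(v)
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return (v / norm).astype(np.float32).tobytes()

def from_blob(blob: bytes) -> np.ndarray:
    """Read raw float32 bytes back into a 1-D array."""
    return np.frombuffer(blob, dtype=np.float32)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chunks (id TEXT PRIMARY KEY, content TEXT, embedding BLOB)")
db.execute(
    "INSERT INTO chunks VALUES (?, ?, ?)",
    ("c1", "hello", to_blob([3.0, 4.0])),
)

blob = db.execute("SELECT embedding FROM chunks WHERE id = 'c1'").fetchone()[0]
restored = from_blob(blob)  # unit-length version of [3, 4]
```

Normalizing at write time means a plain dot product at query time equals cosine similarity.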
Connect
Load embeddings into memory once. Every query after that is a matmul.
import sqlite3
from flexvec import VectorCache, register_vec_ops, execute, get_embed_fn
db = sqlite3.connect("my.db")
cache = VectorCache()
cache.load_from_db(db, "chunks", "embedding", "id")
register_vec_ops(db, {"chunks": cache}, get_embed_fn())
Search
Write SQL. flexvec handles the vector math behind the scenes.
rows = execute(db, """
SELECT v.id, v.score, c.content
FROM vec_ops('similar:authentication patterns') v
JOIN chunks c ON v.id = c.id
ORDER BY v.score DESC LIMIT 5
""")
Examples
Suppress and diversify
Find authentication patterns without drowning in deployment and testing discussions.
SELECT v.id, v.score, c.content
FROM vec_ops(
'similar:authentication patterns
diverse suppress:deployment suppress:testing',
'SELECT id FROM chunks WHERE length(content) > 200') v
JOIN chunks c ON v.id = c.id
ORDER BY v.score DESC LIMIT 10
suppress: pushes deployment and testing content out of the results. diverse spreads across subtopics instead of returning ten variations of the same match. The pre-filter scopes to chunks over 200 characters — cutting out noise before anything gets scored.
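To build intuition for what suppression and diversification do to a score array, here is a rough numpy sketch. It is not flexvec's implementation: it assumes suppression subtracts a weighted similarity to the unwanted topic, and it uses greedy maximal marginal relevance (MMR) as a stand-in for `diverse`. The function names and the `weight` and `lam` parameters are illustrative.

```python
import numpy as np

def suppress(scores, matrix, suppress_vecs, weight=0.5):
    """Penalize candidates by their similarity to each suppressed topic."""
    out = scores.copy()
    for s in suppress_vecs:
        # Only positive similarity to the unwanted topic is penalized.
        out -= weight * np.maximum(matrix @ s, 0.0)
    return out

def mmr_select(scores, matrix, k, lam=0.7):
    """Greedy MMR: trade relevance against redundancy with earlier picks."""
    selected = []
    remaining = list(range(len(scores)))
    while remaining and len(selected) < k:
        if selected:
            # Highest similarity of each candidate to anything already chosen.
            redundancy = (matrix[remaining] @ matrix[selected].T).max(axis=1)
        else:
            redundancy = np.zeros(len(remaining))
        gain = lam * scores[remaining] - (1 - lam) * redundancy
        best = remaining[int(np.argmax(gain))]
        selected.append(best)
        remaining.remove(best)
    return selected

# Three unit vectors: rows 0 and 1 are duplicates, row 2 is a different topic.
matrix = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
query = np.array([0.8, 0.6])
scores = matrix @ query

suppressed = suppress(scores, matrix, [np.array([0.0, 1.0])])
picked = mmr_select(scores, matrix, k=2)
```

In this toy data, plain top-2 would return the two duplicate rows, while the MMR pass picks one of them plus the distinct row.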
Hybrid retrieval
Find the session where you actually fixed that OOM error — not just the logs.
SELECT k.id, k.rank, v.score, c.content
FROM keyword('OOM') k
JOIN vec_ops('similar:memory limit debugging worker crash fix') v ON k.id = v.id
JOIN chunks c ON k.id = c.id
ORDER BY v.score DESC LIMIT 10
keyword('OOM') finds every chunk containing the term. vec_ops() scores by relevance to debugging and fixing. The JOIN keeps only chunks that match both — exact term plus semantic relevance.
Tokens
Tokens reshape scores. They compose freely in a single string.
| token | what it does |
|---|---|
| similar:TEXT | search for this concept |
| suppress:TEXT | push this topic out of results (stackable) |
| diverse | spread across subtopics instead of ten versions of the same answer |
| decay:N | favor recent content, with an N-day half-life |
| centroid:id1,id2 | "more like these": search from the average of examples |
| from:A to:B | find content along a conceptual arc |
| pool:N | how many candidates to score (default 500) |
'similar:auth diverse suppress:oauth decay:7' — four operations, one query.
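The arithmetic behind three of these tokens can be sketched in a few lines of numpy. This is an illustration of the stated ideas (half-life decay, averaging examples, a direction between two concepts), assuming multiplicative decay and unit-length vectors; how flexvec combines them internally is not specified here, and the function names are hypothetical.

```python
import numpy as np

def apply_decay(scores, age_days, half_life_days):
    """Halve a score's weight for every half_life_days of age."""
    ages = np.asarray(age_days, dtype=float)
    return np.asarray(scores, dtype=float) * 0.5 ** (ages / half_life_days)

def centroid(vectors):
    """Average the example vectors and re-normalize to unit length."""
    mean = np.mean(np.asarray(vectors, dtype=float), axis=0)
    return mean / np.linalg.norm(mean)

def direction(a, b):
    """Unit vector pointing from concept a toward concept b."""
    d = np.asarray(b, dtype=float) - np.asarray(a, dtype=float)
    return d / np.linalg.norm(d)

# decay:7 applied to three equally relevant chunks aged 0, 7, and 14 days
decayed = apply_decay([1.0, 1.0, 1.0], [0, 7, 14], half_life_days=7)

# centroid of two example vectors, and the arc from the first to the second
center = centroid([[1.0, 0.0], [0.0, 1.0]])
arc = direction([1.0, 0.0], [0.0, 1.0])
```

With a 7-day half-life, a chunk one week old keeps half its score and a chunk two weeks old keeps a quarter.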
How it works
Every query runs three phases in one SQL statement.
SQL pre-filter → numpy modulation → SQL compose
- SQL pre-filter narrows what enters scoring — by date, type, length, or any SQL expression.
- numpy modulation scores candidates and reshapes the score array with tokens before selection.
- SQL compose joins results back to your tables for grouping, filtering, or reranking.
Queries never modify the database. Results materialize as a temp table that SQL composes over.
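The three phases can be imitated by hand with `sqlite3` and numpy. The sketch below is a simplified stand-in for what `vec_ops()` does, not its actual code: the temp table name `_scores`, the toy two-dimensional embeddings, and the raw float32 BLOB encoding are all assumptions made for illustration.

```python
import sqlite3
import numpy as np

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chunks (id TEXT PRIMARY KEY, content TEXT, embedding BLOB)")
docs = {
    "a": ("login tokens and sessions", [1.0, 0.0]),
    "b": ("deploy", [0.0, 1.0]),
    "c": ("password hashing for login", [0.6, 0.8]),
}
for cid, (content, vec) in docs.items():
    db.execute(
        "INSERT INTO chunks VALUES (?, ?, ?)",
        (cid, content, np.asarray(vec, dtype=np.float32).tobytes()),
    )

query = np.array([1.0, 0.0], dtype=np.float32)

# Phase 1: SQL pre-filter decides which rows enter scoring.
rows = db.execute(
    "SELECT id, embedding FROM chunks WHERE length(content) > 10"
).fetchall()
ids = [row[0] for row in rows]
matrix = np.vstack([np.frombuffer(row[1], dtype=np.float32) for row in rows])

# Phase 2: numpy scores every surviving candidate with one matmul.
scores = matrix @ query

# Phase 3: scores land in a temp table so ordinary SQL can compose over them.
db.execute("CREATE TEMP TABLE _scores (id TEXT PRIMARY KEY, score REAL)")
db.executemany("INSERT INTO _scores VALUES (?, ?)", zip(ids, scores.tolist()))
results = db.execute("""
    SELECT s.id, s.score, c.content
    FROM _scores s
    JOIN chunks c ON c.id = s.id
    ORDER BY s.score DESC LIMIT 2
""").fetchall()
```

The short "deploy" chunk is dropped by the pre-filter before any vector math runs, and the temp table lives outside the main schema, so the `chunks` table itself is untouched.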
Performance
No index. Brute-force matmul on a numpy matrix.
| corpus | matmul | full pipeline |
|---|---|---|
| 250K | 5ms | 19ms |
| 500K | 7ms | 37ms |
| 1M | 17ms | 82ms |
128 dimensions, Nomic Embed v1.5 (Matryoshka). Pre-filtering narrows candidates before the matmul — scoped queries run in single-digit ms.
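For a sense of what brute-force search looks like in practice, the sketch below scores every row with a single matmul and picks the top results with `np.argpartition`. It uses random unit vectors rather than real embeddings and is not the benchmark code behind the table above; `top_k` is a hypothetical helper.

```python
import numpy as np

def top_k(matrix, query, k):
    """Return indices and scores of the k most similar rows, best first."""
    scores = matrix @ query
    k = min(k, len(scores))
    # argpartition isolates the k largest scores without a full sort.
    idx = np.argpartition(-scores, k - 1)[:k]
    # Sort only those k winners into descending order.
    idx = idx[np.argsort(-scores[idx])]
    return idx, scores[idx]

rng = np.random.default_rng(0)
matrix = rng.standard_normal((10_000, 128)).astype(np.float32)
matrix /= np.linalg.norm(matrix, axis=1, keepdims=True)  # L2-normalize rows

query = matrix[42]  # use a stored row as the query, so it should rank first
idx, scores = top_k(matrix, query, 5)
```

Because rows are unit length, the dot product is cosine similarity, and the queried row comes back first with a score of about 1.0.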
Install
pip install flexvec # core (numpy only)
pip install "flexvec[embed]" # + ONNX embedder
pip install "flexvec[mcp]" # + MCP server and embedder
See also
- arXiv paper — architecture and evaluation
- flex — search and retrieval for AI agents (uses flexvec)
- getflex.dev
MIT · Python 3.10+ · SQLite · numpy