SQL Backend Library for AI, Datatables and Frontend Apps

Project description

filtersql

A Python library to build parameterized SQL queries from plain dicts. I use it mainly for three things:

DataTables backends;
cursor-based pagination (similar to how Microsoft Access works);
as a bridge between LLM-generated JSON and the database.

Supports PostgreSQL, SQLite, MySQL and Oracle.

Install

pip install filtersql

Quick start

from filtersql import Datasource

columns = [
    {'field': 'id'},
    {'field': 'first_name'},
    {'field': 'last_name'},
    {'field': 'company_name'},
]

filters = [
    {'field': 'first_name', 'operator': 'icontains', 'value': 'John'},
    {'field': 'last_name',  'operator': 'icontains', 'value': 'Smith'},
]

q = Datasource(
    source  = 'sample_uk',
    columns = columns,
    dbms    = 'Pg',
)

query, values = q.select(
    filters = filters,
    order   = [{'field': 'first_name', 'order': 'asc'}],
    limit   = {'start': 0, 'length': 10},
)

print(query)
# select
#   "id",
#   "first_name",
#   "last_name",
#   "company_name"
# from
#   "sample_uk"
# where
#   ("first_name" ilike '%' || ? || '%' and "last_name" ilike '%' || ? || '%')
# order by
#   "first_name" asc
# limit 10 offset 0

print(values)
# ['John', 'Smith']

rows = execute_db(query, values)

Using filtersql with AI / LLM pipelines

The idea is simple: the LLM parses the user question and returns a JSON with the filters, Pydantic validates the structure, and filtersql builds the SQL. The values are always passed as parameters so there is no risk of SQL injection even if the model produces unexpected output.

Basic flow

user question â†’ LLM â†’ JSON â†’ Pydantic â†’ filtersql â†’ parameterized SQL

Define the Pydantic schema

You tell the LLM exactly which fields and operators it can use. Anything outside the list gets rejected by Pydantic before it reaches the database.

from pydantic import BaseModel, Field
from typing import List, Literal

class SQLFilterSchema(BaseModel):
    field: Literal[
        'doc_type',
        'doc_date',
        'author',
        'title',
        'attributes->>supplier',   # JSONB path
        'attributes->>amount',     # JSONB path
    ] = Field(description="Column name or JSONB path")
    operator: Literal[
        '=', '!=', '>', '>=', '<', '<=',
        'icontains', 'istarts_with',
        'in', 'notin', 'null', 'notnull'
    ] = Field(description="Comparison operator")
    value: str = Field(description="Value to search for")

class QuerySchema(BaseModel):
    semantic_query: str = Field(description="Cleaned query for vector/FTS search")
    sql_filters: List[SQLFilterSchema] = Field(description="SQL filters")

Call the LLM with structured output

ai_response = ai_client.models.generate_content(
    model='gemini-flash',
    contents=user_question,
    config=GenerateContentConfig(
        system_instruction=YOUR_INTENT_PARSER_PROMPT,
        response_mime_type="application/json",
        response_schema=QuerySchema,
        temperature=0.0,
    )
)
parsed = QuerySchema.model_validate_json(ai_response.text)

Build the WHERE clause

from filtersql import Datasource

filters = [f.model_dump() for f in parsed.sql_filters]

ds = Datasource(
    source      = 'documents',
    columns     = columns,
    dbms        = 'Pg',
    placeholder = '%s',
    filters     = filters,
)

# if you need only the WHERE part to append to your own query:
where_clause, values = ds.where()
# â†’ " and ("doc_type"=%s and "doc_date">=%s and "doc_date"<=%s)"
# â†’ ['CONTRACT', '2025-01-01', '2025-12-31']

# or the full SELECT:
query, values = ds.select()

A few things that help in practice

Pass the current date in the prompt. If you don't, the model tends to guess on things like "this year" or "recent documents" and gets it wrong.

Give explicit value mappings. For example tell the model that "contract" must become doc_type = 'DB_CONTRACT' exactly. Free-form values produce inconsistent filters.

Separate semantic from deterministic. Ask the model to put dates, document types and identifiers in sql_filters, and leave the conceptual topic in semantic_query for vector or full-text search. Mixing them in the embedding adds noise.

Always add a fallback. If the LLM call fails, fall back to empty filters and use the raw question for semantic search.

try:
    parsed = QuerySchema.model_validate_json(ai_response.text)
except Exception:
    parsed = QuerySchema(semantic_query=user_question, sql_filters=[])

JSONB fields and value_type

When a JSONB field contains numeric or date values, you need to tell Postgres the type explicitly or it will compare as text. Add value_type to the filter:

{'field': 'attributes->>amount', 'operator': '>=', 'value': '10000', 'value_type': 'numeric'}
# â†’ ("attributes"->>'amount')::numeric >= %s::numeric

{'field': 'attributes->>expiry_date', 'operator': '<=', 'value': '2025-12-31', 'value_type': 'date'}
# â†’ ("attributes"->>'expiry_date')::date <= %s::date

Without the cast, '9' > '10' in Postgres text comparison, which gives wrong results for numeric ranges.

Flask + DataTables example

from filtersql import Datasource
from filtersql.utils import parseDatatableArgs

@app.route("/api/table.json")
def table_json():
    a = parseDatatableArgs(request.args)

    columns = [
        {'field': 'id'},
        {'field': 'first_name'},
        {'field': 'last_name'},
        {'field': 'company_name'},
    ]

    # build filters from DataTable per-column search
    filters = []
    for k in a['columns'].values():
        if k['search']['value']:
            col = columns[int(k['data'])]
            filters.append({
                'field':    col['field'],
                'operator': 'icontains',
                'value':    k['search']['value'],
            })

    # build order from DataTable
    order = [
        {'field': columns[int(k['column'])]['field'], 'order': k['dir']}
        for k in a['order'].values()
    ]

    q = Datasource(
        source      = 'sample_uk',
        columns     = columns,
        dbms        = 'SQLite',
        placeholder = '?',
    )

    query, values = q.select()
    records_total = execute_db("select count(*) from ({}) t".format(query), values)[0][0]

    query, values = q.select(filters=filters)
    records_filtered = execute_db("select count(*) from ({}) t".format(query), values)[0][0]

    query, values = q.select(
        filters = filters,
        order   = order,
        limit   = {'start': int(request.args.get('start')), 'length': int(request.args.get('length'))},
    )
    rows = execute_db(query, values)

    return jsonify(
        draw            = int(request.args.get('draw')),
        recordsTotal    = records_total,
        recordsFiltered = records_filtered,
        data            = rows,
    )

Access-style pagination

Cursor-based pagination without offset. Useful for large tables where LIMIT x OFFSET y gets slow, or when you need to navigate record by record like in Microsoft Access.

q = Datasource(
    source  = 'sample_uk',
    columns = columns,
    order   = [{'field': 'id', 'order': 'asc'}],
    dbms    = 'SQLite',
    move    = 'forwards',
    filters = [{'field': 'id', 'value': last_seen_id, 'type': 'move'}],
)

query, values = q.select()

Use move='backwards' to go in reverse, or move='find' to jump to a specific record. Multi-column cursors are supported - just add more entries to filters with type='move'.

Full-text search (PostgreSQL only)

Two operators for full-text search with websearch_to_tsquery:

fts - for a pre-built tsvector column:

{'field': 'tsv_content', 'operator': 'fts', 'value': 'oxygen supply'}
# â†’ "tsv_content" @@ websearch_to_tsquery('italian', %s)

fts_query - builds the tsvector on the fly from a text column:

{'field': 'description', 'operator': 'fts_query', 'value': 'oxygen supply'}
# â†’ to_tsvector('italian', coalesce("description", '')) @@ websearch_to_tsquery('italian', %s)

Default language is italian. You can change it:

q = Datasource(..., fts_language='english')

Filter operators

Operator	Description
`=` `!=` `>` `>=` `<` `<=`	Comparison
`contains`	Case-sensitive substring (`LIKE '%x%'`)
`starts_with`	Case-sensitive prefix
`ends_with`	Case-sensitive suffix
`icontains`	Case-insensitive substring (`ILIKE`)
`istarts_with`	Case-insensitive prefix
`iends_with`	Case-insensitive suffix
`null` / `notnull`	NULL checks (no value bound)
`in` / `notin`	List membership
`reverse_in`	Value in a set of columns - `%s IN (col1, col2)`
`fts`	Full-text search on tsvector column (Pg only)
`fts_query`	Full-text search on text column (Pg only)

JSONB column access in PostgreSQL uses field->>key notation:

{'field': 'metadata->>status', 'operator': '=', 'value': 'active'}

Supported databases

DBMS	`dbms` value	Placeholder
PostgreSQL	`'Pg'`	`%s`
SQLite	`'SQLite'`	`?`
MySQL	`'mysql'`	`?`
Oracle	`'Oracle'`	`?`

License

MIT

Project details

Release history Release notifications | RSS feed

This version

0.4.2

Jul 2, 2026

0.4.1

Jul 1, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

filtersql-0.4.2.tar.gz (13.6 kB view details)

Uploaded Jul 2, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

filtersql-0.4.2-py3-none-any.whl (10.0 kB view details)

Uploaded Jul 2, 2026 Python 3

File details

Details for the file filtersql-0.4.2.tar.gz.

File metadata

Download URL: filtersql-0.4.2.tar.gz
Upload date: Jul 2, 2026
Size: 13.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for filtersql-0.4.2.tar.gz
Algorithm	Hash digest
SHA256	`f0722384fe3ad08780f1dd23517e1891ef1eb617bd5e84f445538bcbda1752c3`
MD5	`0cfbdebe84dec5297e4c0b1712d8778f`
BLAKE2b-256	`795e9c0679d0dcad87e9b1c02bd344f3a10f86c11aff1caaee93427954c7bff2`

See more details on using hashes here.

File details

Details for the file filtersql-0.4.2-py3-none-any.whl.

File metadata

Download URL: filtersql-0.4.2-py3-none-any.whl
Upload date: Jul 2, 2026
Size: 10.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for filtersql-0.4.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ee5ed213b99c6b46f5b5845cc8cd1f386d3f778e0e983acd3e3f6e1a35a7fa11`
MD5	`ce1eced8d90034248696dd0196d4d0f6`
BLAKE2b-256	`9db276ae9b337fc093321298ab9a953d7b569a44cadf411ad495d3d521cfe9ee`

See more details on using hashes here.

filtersql 0.4.2

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

filtersql

Install

Quick start

Using filtersql with AI / LLM pipelines

Basic flow

Define the Pydantic schema

Call the LLM with structured output

Build the WHERE clause

A few things that help in practice

JSONB fields and value_type

Flask + DataTables example

Access-style pagination

Full-text search (PostgreSQL only)

Filter operators

Supported databases

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes