A lightweight wrapper for in‑memory SQLite FTS5 full‑text search in Python.

Project description

PyFTS5

PyFTS5 is a lightweight Python library that wraps an in‑memory SQLite database with FTS5 full‑text search capabilities. It provides a simple, Pythonic API to add documents (with optional IDs) and perform various full‑text searches using common operators. The library also leverages SQLite’s built‑in highlighting function to mark matched terms.
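To make the idea concrete, here is a minimal sketch of what such a wrapper could look like, built only on the standard-library sqlite3 module. The class name TinyFTS, its methods, and the table name docs are hypothetical and chosen for illustration; this is not PyFTS5's actual implementation, and it assumes your Python's SQLite build includes FTS5.

```python
import sqlite3


class TinyFTS:
    """Hypothetical stand-in for illustration; not PyFTS5's actual class."""

    def __init__(self, docs=()):
        # Everything lives in memory and disappears when the connection closes.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE VIRTUAL TABLE docs USING fts5(content)")
        for doc_id, content in docs:
            self.add(content, doc_id)

    def add(self, content, doc_id=None):
        if doc_id is None:
            # No ID given: let SQLite assign the rowid.
            self.conn.execute("INSERT INTO docs(content) VALUES (?)", (content,))
        else:
            # Caller-supplied ID is stored as the FTS5 rowid.
            self.conn.execute(
                "INSERT INTO docs(rowid, content) VALUES (?, ?)", (doc_id, content)
            )

    def search(self, query):
        # "rank" is FTS5's built-in relevance ordering (best match first).
        cur = self.conn.execute(
            "SELECT rowid, content FROM docs WHERE docs MATCH ? ORDER BY rank",
            (query,),
        )
        return cur.fetchall()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.conn.close()
        return False  # do not swallow exceptions


with TinyFTS([(101, "The quick brown fox"), (102, "Never jump over the lazy dog")]) as db:
    db.add("An extra note about a lazy afternoon")  # ID assigned by SQLite
    for doc_id, content in db.search("lazy"):
        print(doc_id, content)
```

Storing the caller's ID as the rowid keeps the sketch to a single table; a real library might store IDs differently.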

Features

  • In‑Memory Database: Uses an in‑memory SQLite database for fast, lightweight searches.
  • FTS5 Integration: Leverages SQLite’s C‑implemented FTS5 extension for efficient full‑text indexing and search.
  • Document IDs: Allows you to insert documents with specific IDs, so that search results carry a stable identifier that you control.
  • Helper Search Methods: Supports common queries:
    • Generic search: Direct MATCH queries.
    • Phrase search: Exact phrase matching.
    • Prefix search: Match tokens beginning with a given prefix.
    • Boolean operators: AND, OR, NOT combinations.
    • NEAR search: Find terms within a specified distance.
  • Highlighting: Optionally return search results with matched tokens highlighted (using SQLite’s highlight() function).
  • Context Manager: Implements __enter__ and __exit__ so that the connection is automatically closed when finished.
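The query types listed above correspond to standard FTS5 MATCH syntax. The sketch below runs each one against a plain sqlite3 connection rather than through PyFTS5, so the helper match_ids and the table name docs are illustrative assumptions, not part of the library's API. It assumes the default unicode61 tokenizer, which is case-insensitive and does no stemming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(content)")
conn.executemany(
    "INSERT INTO docs(rowid, content) VALUES (?, ?)",
    [
        (101, "The quick brown fox jumps over the lazy dog"),
        (102, "Never jump over the lazy dog quickly"),
        (103, "A quick movement of the enemy will jeopardize six gunboats"),
        (104, "Quick thinking leads to quick decisions"),
    ],
)


def match_ids(query):
    """Return the IDs of documents matching an FTS5 query, in ID order."""
    cur = conn.execute(
        "SELECT rowid FROM docs WHERE docs MATCH ? ORDER BY rowid", (query,)
    )
    return [row[0] for row in cur]


phrase_ids = match_ids('"lazy dog"')        # exact phrase
prefix_ids = match_ids("jump*")             # tokens starting with "jump"
and_ids = match_ids("quick AND dog")        # both terms required
or_ids = match_ids("fox OR enemy")          # either term
not_ids = match_ids("quick NOT dog")        # first term without the second
near_ids = match_ids("NEAR(quick fox, 2)")  # at most 2 tokens in between

for name, ids in [
    ("phrase", phrase_ids), ("prefix", prefix_ids), ("AND", and_ids),
    ("OR", or_ids), ("NOT", not_ids), ("NEAR", near_ids),
]:
    print(f"{name}: {ids}")
```

Note that "quick AND dog" matches only document 101: document 102 contains "quickly", which is a different token from "quick" when no stemming tokenizer is configured. A prefix query such as quick* would match both.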

Installation

Via Poetry

Clone the repository and run:

poetry install

The package is published on PyPI (version 0.1.0), so it can also be installed with pip:

pip install PyFTS5

Usage

Below is a quick example:

from fts_search_db import FullTextSearchDB

# Prepare some documents as (doc_id, content) pairs.
docs = [
    (101, "The quick brown fox jumps over the lazy dog"),
    (102, "Never jump over the lazy dog quickly"),
    (103, "A quick movement of the enemy will jeopardize six gunboats"),
    (104, "Quick thinking leads to quick decisions"),
]

# Use the class as a context manager for automatic cleanup.
with FullTextSearchDB(docs) as fts_db:
    # Perform a generic search (without highlighting).
    results = fts_db.search("quick AND dog")
    for doc_id, content in results:
        print(f"Doc {doc_id}: {content}")

    # Perform a search with highlighting enabled.
    highlighted = fts_db.search("quick AND dog", highlight=True, hl_prefix="<<", hl_suffix=">>")
    for doc_id, hl_text, content in highlighted:
        print(f"Doc {doc_id} (highlighted): {hl_text}")
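For reference, the highlighted output in the example can be reproduced with SQLite's highlight() auxiliary function directly. This sketch assumes that hl_prefix and hl_suffix are passed through as highlight()'s opening and closing markers; that mapping is a guess based on the parameter names, and the table name docs is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(content)")
conn.executemany(
    "INSERT INTO docs(rowid, content) VALUES (?, ?)",
    [
        (101, "The quick brown fox jumps over the lazy dog"),
        (102, "Never jump over the lazy dog quickly"),
    ],
)

# highlight(table, column_index, open_marker, close_marker) wraps each
# matched token of the given column in the two markers.
rows = conn.execute(
    "SELECT rowid, highlight(docs, 0, ?, ?), content "
    "FROM docs WHERE docs MATCH ?",
    ("<<", ">>", "quick AND dog"),
).fetchall()

for doc_id, hl_text, content in rows:
    print(f"Doc {doc_id} (highlighted): {hl_text}")
# prints: Doc 101 (highlighted): The <<quick>> brown fox jumps over the lazy <<dog>>
```

The original text is returned unchanged apart from the markers, so the case of matched tokens is preserved.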

Testing

Unit tests are written with pytest. Run them with:

poetry run pytest

License

This project is licensed under the MIT License.

Download files

Source Distribution

pyfts5-0.1.0.tar.gz (4.1 kB view details)

Uploaded Source

Built Distribution

pyfts5-0.1.0-py3-none-any.whl (5.0 kB view details)

Uploaded Python 3

File details

Details for the file pyfts5-0.1.0.tar.gz.

File metadata

  • Download URL: pyfts5-0.1.0.tar.gz
  • Size: 4.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for pyfts5-0.1.0.tar.gz
Algorithm Hash digest
SHA256 545a566c948391f663113642b73ac08b642313c959772eb5c7100d2781086fc6
MD5 acafefacf5656bf8554c0d56ce83896a
BLAKE2b-256 31d540fb3a04d0c4772e069bf56a1168cbf8b40333f65eee5df1086bff4bdc90

Provenance

The following attestation bundles were made for pyfts5-0.1.0.tar.gz:

Publisher: python-publish.yml on raulsperoni/PyFTS5

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pyfts5-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pyfts5-0.1.0-py3-none-any.whl
  • Size: 5.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for pyfts5-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d946f747a1c7b0cd517edcae12d4f22ac9cc273458be41a0bd9fa7947017dec0
MD5 eb0eecaf3893523a0d8b14f903916d3b
BLAKE2b-256 1f208b0b87ccc269adc86345e1f34e234b92214776a84a525ccfd32310ca4aa2

Provenance

The following attestation bundles were made for pyfts5-0.1.0-py3-none-any.whl:

Publisher: python-publish.yml on raulsperoni/PyFTS5

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
