A lightweight wrapper for in‑memory SQLite FTS5 full‑text search in Python.
PyFTS5
PyFTS5 is a lightweight Python library that wraps an in‑memory SQLite database with FTS5 full‑text search capabilities. It provides a simple, Pythonic API to add documents (with optional IDs) and perform various full‑text searches using common operators. The library also leverages SQLite’s built‑in highlighting function to mark matched terms.
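To make the idea concrete, here is a minimal sketch of what wrapping an in‑memory FTS5 table can look like with only the standard‑library sqlite3 module. The table name docs and the column name content are assumptions chosen for illustration; PyFTS5's internals may differ.

```python
import sqlite3

# Hypothetical table and column names; PyFTS5's internals may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(content)")

# FTS5 tables expose an integer rowid, which can serve as the document ID.
conn.executemany(
    "INSERT INTO docs(rowid, content) VALUES (?, ?)",
    [
        (101, "The quick brown fox jumps over the lazy dog"),
        (102, "Never jump over the lazy dog quickly"),
    ],
)

# MATCH runs a full-text query against the index.
rows = conn.execute(
    "SELECT rowid, content FROM docs WHERE docs MATCH ? ORDER BY rowid",
    ("lazy",),
).fetchall()
conn.close()

for doc_id, content in rows:
    print(f"Doc {doc_id}: {content}")
```

Because the rowid is supplied at insert time, each result row carries the same identifier the caller provided.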
Features
- In‑Memory Database: Uses an in‑memory SQLite database for fast, lightweight searches.
- FTS5 Integration: Leverages SQLite’s C‑implemented FTS5 extension for efficient full‑text indexing and search.
- Document IDs: Allows you to insert documents with specific IDs, so that search results carry a persistent identifier.
- Helper Search Methods: Supports common queries:
  - Generic search: Direct MATCH queries.
  - Phrase search: Exact phrase matching.
  - Prefix search: Match tokens beginning with a given prefix.
  - Boolean operators: AND, OR, NOT combinations.
  - NEAR search: Find terms within a specified distance.
- Highlighting: Optionally return search results with matched tokens highlighted (using SQLite’s highlight() function).
- Context Manager: Implements __enter__ and __exit__ so that the connection is automatically closed when finished.
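The query types listed above map onto FTS5's own query syntax. The sketch below runs each of them directly through sqlite3 rather than through PyFTS5's helper methods, whose names are not shown in this README. The table name docs and the helper match_ids are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(content)")
conn.executemany(
    "INSERT INTO docs(rowid, content) VALUES (?, ?)",
    [
        (101, "The quick brown fox jumps over the lazy dog"),
        (102, "Never jump over the lazy dog quickly"),
        (103, "A quick movement of the enemy will jeopardize six gunboats"),
        (104, "Quick thinking leads to quick decisions"),
    ],
)

def match_ids(query):
    """Return the sorted rowids of documents matching an FTS5 query string."""
    sql = "SELECT rowid FROM docs WHERE docs MATCH ? ORDER BY rowid"
    return [row[0] for row in conn.execute(sql, (query,))]

phrase = match_ids('"lazy dog"')        # exact phrase
prefix = match_ids("jump*")             # tokens starting with "jump"
boolean = match_ids("quick NOT fox")    # AND, OR, NOT must be uppercase
near = match_ids("NEAR(quick fox, 2)")  # terms within a given distance

# highlight(table, column_index, open_marker, close_marker) wraps matched tokens.
highlighted = conn.execute(
    "SELECT highlight(docs, 0, '<<', '>>') FROM docs WHERE docs MATCH ?",
    ("fox",),
).fetchone()[0]
conn.close()

print(phrase, prefix, boolean, near)
print(highlighted)
```

Note that the default tokenizer does no stemming, so "quick" does not match "quickly" unless a prefix query such as quick* is used.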
Installation
Via Poetry
Clone the repository and run:
poetry install
Alternatively, install the released package from PyPI with pip:
pip install PyFTS5
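FTS5 is an optional compile‑time extension of SQLite, so it may be worth confirming that the SQLite build bundled with your Python interpreter includes it. The helper below is a hypothetical check, not part of PyFTS5; it simply tries to create an FTS5 table in a throwaway in‑memory database.

```python
import sqlite3

def fts5_available():
    """Return True if this Python's SQLite build can create FTS5 tables."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(content)")
        return True
    except sqlite3.OperationalError:
        # Raised with "no such module: fts5" when the extension is missing.
        return False
    finally:
        conn.close()

print("SQLite version:", sqlite3.sqlite_version)
print("FTS5 available:", fts5_available())
```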
Usage
Below is a quick example:
from fts_search_db import FullTextSearchDB

# Prepare some documents as (doc_id, content) pairs.
docs = [
    (101, "The quick brown fox jumps over the lazy dog"),
    (102, "Never jump over the lazy dog quickly"),
    (103, "A quick movement of the enemy will jeopardize six gunboats"),
    (104, "Quick thinking leads to quick decisions"),
]

# Use the class as a context manager for automatic cleanup.
with FullTextSearchDB(docs) as fts_db:
    # Perform a generic search (without highlighting).
    results = fts_db.search("quick AND dog")
    for doc_id, content in results:
        print(f"Doc {doc_id}: {content}")

    # Perform a search with highlighting enabled.
    highlighted = fts_db.search("quick AND dog", highlight=True, hl_prefix="<<", hl_suffix=">>")
    for doc_id, hl_text, content in highlighted:
        print(f"Doc {doc_id} (highlighted): {hl_text}")
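For readers curious how a wrapper with this shape could be put together, the following is a rough sketch built on sqlite3 alone. The class TinyFTS, its table layout, and its default highlight markers are all hypothetical and are not taken from PyFTS5's source; only the calling pattern mirrors the example above.

```python
import sqlite3

class TinyFTS:
    """Hypothetical stand-in for a wrapper class; not PyFTS5's actual code."""

    def __init__(self, docs=()):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE VIRTUAL TABLE docs USING fts5(content)")
        self.conn.executemany(
            "INSERT INTO docs(rowid, content) VALUES (?, ?)", docs
        )

    def search(self, query, highlight=False, hl_prefix="[", hl_suffix="]"):
        """Return (id, content) rows, or (id, highlighted, content) rows."""
        if highlight:
            sql = (
                "SELECT rowid, highlight(docs, 0, ?, ?), content "
                "FROM docs WHERE docs MATCH ? ORDER BY rank"
            )
            params = (hl_prefix, hl_suffix, query)
        else:
            sql = "SELECT rowid, content FROM docs WHERE docs MATCH ? ORDER BY rank"
            params = (query,)
        return self.conn.execute(sql, params).fetchall()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Close the connection whether or not the block raised.
        self.conn.close()
        return False  # do not swallow exceptions

with TinyFTS([(101, "The quick brown fox jumps over the lazy dog")]) as db:
    hits = db.search("quick AND dog", highlight=True, hl_prefix="<<", hl_suffix=">>")

for doc_id, hl_text, content in hits:
    print(f"Doc {doc_id} (highlighted): {hl_text}")
```

Returning False from __exit__ lets any exception raised inside the with block propagate after the connection has been closed.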
Testing
Unit tests are written with pytest. Run them with:
poetry run pytest
License
This project is licensed under the MIT License.