qante - Query ANnotated TExt
Project description
Motivation
Extracting valuable data from unstructured text often results in code that is hard to read, brittle, and difficult to maintain. The problem is that regular expressions embedded directly in program control flow do not provide the right level of abstraction. We propose a query language, based on the tuple relational calculus, that facilitates data extraction. Developers can express their intent declaratively, making their code much easier to write, read, and maintain.
Solution
This package lets programmers state what they are searching for using higher-level concepts: queries are built from tags, locations, and expressions on location relations.
The location of a string of characters within the document is the interval defined by its starting and ending positions.
Locations are grouped into sets named by tags. Tags can be used in conjunctions and disjunctions of interval relations to query for tuples of locations.
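The ideas above can be illustrated with a minimal sketch in plain Python. It does not use the qante API: the helper names (tag_locations, before, within) and the sample text are hypothetical, chosen only to show locations as intervals, tags as named sets of locations, and a query as a conjunction of interval relations over tuples of locations.

```python
import re
from itertools import product

def tag_locations(text, pattern):
    """Return the set of (start, end) intervals where pattern matches text."""
    return {m.span() for m in re.finditer(pattern, text)}

def before(a, b):
    """Interval relation: a ends at or before the start of b."""
    return a[1] <= b[0]

def within(a, b, gap):
    """Interval relation: b starts at most `gap` characters after a ends."""
    return 0 <= b[0] - a[1] <= gap

text = "Width: 12 cm. Height: 30 cm."

# Each tag names a set of locations (hypothetical tag names).
tags = {
    "label": tag_locations(text, r"[A-Z][a-z]+(?=:)"),
    "number": tag_locations(text, r"\d+"),
}

# Query: every (label, number) tuple of locations where the label
# comes before the number AND the number follows within 2 characters.
pairs = sorted(
    (lab, num)
    for lab, num in product(tags["label"], tags["number"])
    if before(lab, num) and within(lab, num, gap=2)
)

# Map each location back to the text it covers.
result = [(text[lab[0]:lab[1]], text[num[0]:num[1]]) for lab, num in pairs]
print(result)
```

The regular expressions appear only where tags are defined; the query itself is stated in terms of tags and relations, which is the separation the package aims for.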
Documentation
We invite you to watch the YouTube video of our presentation, part of the playlist for PyData Global 2022.
We presented this material from our GitHub repo:
pydataG22.pdf: slides of our talk.
ipynb/pydata.ipynb: a Jupyter notebook with examples.
RELEASE_NOTES.rst: describes updates for each release.
Use one of these pip or python commands (Python 3 or above) to install from PyPI:
    pip install qante
    python -m pip install qante
Use Python docstrings for API documentation:
    python    # rev 3 or above
    from qante.tagger import Tagger
    from qante.query import Query
    from qante import loctuple as lt
    from qante.table import get_table
    help(Tagger)     # annotate text with tags using tagRE('tagname', regexp)
    help(Query)      # syntax for querying annotated text
    help(lt)         # predicates used by queries
    help(get_table)  # extract table (as dictionaries) from text
See also: “API Documentation” at the end of our Jupyter notebook.
We welcome your questions by email at qante{at}asgard.com