qante - QUery ANnotated TeXt

Motivation

Extracting valuable data from unstructured text often results in code that is hard to read, brittle, and difficult to maintain. The problem is that regular expressions embedded directly in the program's control flow do not provide the right level of abstraction. We propose a query language, based on the tuple relational calculus, that facilitates data extraction. Developers can state their intent declaratively, which makes their code much easier to write, read, and maintain.

Solution

This package lets programmers say what they are searching for in terms of higher-level concepts: a query is built from tags, locations, and expressions over relations between locations.

The location of a string of characters within the document is the interval defining its starting and ending position.
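As a rough illustration of this idea (not qante's actual API), Python's standard re module already reports match positions as start/end pairs. The sketch below assumes a half-open (start, end) convention and uses a hypothetical helper named locations; qante's own representation may differ.

```python
import re

def locations(pattern, text):
    """Return (start, end) intervals for every match of pattern in text.

    Hypothetical helper for illustration only. Intervals are half-open,
    so text[start:end] is the matched string.
    """
    return [m.span() for m in re.finditer(pattern, text)]

text = "Invoice 42 is due on 2022-12-01."
number_locs = locations(r"\d+", text)

# Each location can be turned back into the string it covers.
for start, end in number_locs:
    print((start, end), text[start:end])
```

Keeping the interval rather than the matched string is what makes it possible to reason later about where one match sits relative to another.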

Locations are grouped into sets named by tags. Tags can be used in conjunctions and disjunctions of interval relations to query for tuples of locations.
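To make tags and interval relations more concrete, here is a minimal plain-Python sketch. It does not use qante's query language: the names tag, before, and close_to are hypothetical, locations are assumed to be half-open (start, end) pairs, and the query is written as an ordinary comprehension over the Cartesian product of two tagged sets.

```python
import re
from itertools import product

def tag(pattern, text):
    """Set of half-open (start, end) locations matching pattern (hypothetical helper)."""
    return {m.span() for m in re.finditer(pattern, text)}

def before(a, b):
    """Interval a ends at or before the start of interval b."""
    return a[1] <= b[0]

def close_to(a, b, gap):
    """The start of b is at most `gap` characters past the end of a."""
    return b[0] - a[1] <= gap

text = "Total: 15 USD. Shipping: 4 USD. Ref 9981."

# Each tag names a set of locations.
tags = {
    "amount": tag(r"\d+", text),
    "currency": tag(r"USD", text),
}

# Conjunction of two relations: an amount immediately followed by a currency.
pairs = sorted(
    (a, c)
    for a, c in product(tags["amount"], tags["currency"])
    if before(a, c) and close_to(a, c, gap=1)
)

# Recover the text covered by each location in the resulting tuples.
values = [(text[a[0]:a[1]], text[c[0]:c[1]]) for a, c in pairs]
print(values)
```

In this sketch the trailing reference number is tagged as an amount but is excluded from the result because no currency location follows it. A disjunction would combine the same predicates with `or` instead of `and`.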

We invite you to view our talk at PyData Global 2022.

Use pip or python (version 3 or above) to install from PyPI:

pip install qante
python -m pip install qante

API documentation is available from the Python docstrings:

python    # version 3 or above
  import qante, qante.loc
  help(qante)
  help(qante.loc)


Source Distribution

qante-0.0.1.tar.gz (28.3 kB)
