An information-extraction-focused regex library that uses constant-delay algorithms.

Implementation of a constant-delay algorithm for regular document spanners

This implementation is based on the paper "Constant delay algorithms for regular document spanners" by Fernando Florenzano, Cristian Riveros, Martín Ugarte, Stijn Vansummeren and Domagoj Vrgoč.

Directory structure

The C++ implementation is under the /src folder.

The /exp folder contains different experiments to compare our library with others.

The /tests folder contains all the automatic tests for our code.

Build instructions

Python/SWIG

Assuming that you are on a Debian-based distro, first install the following dependencies:

sudo apt install g++ cmake swig libboost-dev python3-dev

After that, in this directory, run:

mkdir -pv build && cd build
cmake -DSWIG=true ..
make

After compilation, build/bin/SWIG will contain rematch.py (the bindings interface) and _rematchswiglib.so (the shared library binary), which you can use to interface with REmatch from Python.
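As a minimal sketch of how the build output could be made visible to Python, the helper below looks for rematch.py under build/bin/SWIG and puts that directory on sys.path. The function names find_rematch_bindings and make_importable are hypothetical, and the sketch deliberately calls nothing inside the generated module, since its API is not described here.

```python
import importlib
import importlib.util
import sys
from pathlib import Path


def find_rematch_bindings(build_dir="build"):
    """Return the SWIG output directory if it contains rematch.py, else None."""
    swig_dir = Path(build_dir) / "bin" / "SWIG"
    if (swig_dir / "rematch.py").is_file():
        return swig_dir
    return None


def make_importable(swig_dir):
    """Put the SWIG output directory on sys.path and report whether
    a module named `rematch` can now be located (it is not imported)."""
    path = str(Path(swig_dir).resolve())
    if path not in sys.path:
        sys.path.insert(0, path)
    importlib.invalidate_caches()
    return importlib.util.find_spec("rematch") is not None


if __name__ == "__main__":
    swig_dir = find_rematch_bindings()
    if swig_dir is None:
        print("rematch.py not found; build with -DSWIG=true first")
    else:
        print("rematch importable:", make_importable(swig_dir))
```

Run from the repository root, this only reports whether the bindings can be found; the import itself also needs _rematchswiglib.so to sit next to rematch.py, as the build places it.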

CLI tool

cmake -H. -Bbuild/Release
cmake --build build/Release

If you want to use a debugger such as gdb, add -DCMAKE_BUILD_TYPE=Debug to the first CMake command.

Command line use

After building, the binary will be located in the build/Release/bin folder. To try it, run:

build/Release/bin/rematch --help

Examples:

Get all spans corresponding to a single letter a:

build/Release/bin/rematch -d document.txt -e '.*!x{a}.*'
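To make "all spans corresponding to a single letter a" concrete, here is a small pure-Python illustration that does not call REmatch. It reads !x{a} as a capture named x around one occurrence of a, and lists the offsets each match would cover. The half-open [start, end) convention and the function name single_letter_spans are assumptions made for this sketch; the tool's actual output format may differ.

```python
def single_letter_spans(document, letter="a"):
    """List (start, end) offsets of every occurrence of `letter`.

    Offsets are half-open [start, end); this convention is an assumption
    made for the illustration, not REmatch's documented output format.
    """
    return [(i, i + 1) for i, ch in enumerate(document) if ch == letter]


if __name__ == "__main__":
    # One line per span that the variable x could be bound to.
    for start, end in single_letter_spans("banana"):
        print(f"x = [{start}, {end})")
```

For the document "banana" the sketch yields three spans, one per a, which is the set of results the expression above describes.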

Get all spans corresponding to a pattern in a file:

build/Release/bin/rematch -d document.txt -r regex.txt

Get benchmark stats (execution time, number of outputs, memory usage, etc.):

build/Release/bin/rematch -d document.txt -r regex.txt -o benchmark
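The three invocations above differ only in their flags, so a thin wrapper can assemble them from Python. The sketch below uses only the flags shown in the examples (-d, -e, -r, -o). The helper names rematch_command and run_rematch are hypothetical, and the rule that exactly one of -e or -r is given simply mirrors the examples; the CLI itself may be more permissive.

```python
import subprocess


def rematch_command(document, expression=None, regex_file=None,
                    output=None, binary="build/Release/bin/rematch"):
    """Build the argv list for a rematch call using only the flags shown above."""
    # Assumption: an inline expression (-e) and a regex file (-r) are alternatives.
    if (expression is None) == (regex_file is None):
        raise ValueError("pass exactly one of expression or regex_file")
    cmd = [binary, "-d", document]
    cmd += ["-e", expression] if expression is not None else ["-r", regex_file]
    if output is not None:
        cmd += ["-o", output]
    return cmd


def run_rematch(*args, **kwargs):
    """Run the binary and return its standard output as text."""
    result = subprocess.run(rematch_command(*args, **kwargs),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting issues with expressions such as .*!x{a}.*, which would otherwise need the single quotes used in the examples.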

Testing

We are using Boost.Test for unit testing.

To add more tests, create a new folder under tests/[test_name_folder]/ whose name starts with the prefix test, and follow the same structure (same file names) as the existing folders.

Built Distributions

  • pyrematch-0.1.0-cp39-cp39-win_amd64.whl (178.2 kB): CPython 3.9, Windows x86-64
  • pyrematch-0.1.0-cp39-cp39-win32.whl (148.0 kB): CPython 3.9, Windows x86
  • pyrematch-0.1.0-cp38-cp38-win_amd64.whl (178.5 kB): CPython 3.8, Windows x86-64
  • pyrematch-0.1.0-cp38-cp38-win32.whl (148.1 kB): CPython 3.8, Windows x86
  • pyrematch-0.1.0-cp37-cp37m-win_amd64.whl (178.5 kB): CPython 3.7m, Windows x86-64
  • pyrematch-0.1.0-cp37-cp37m-win32.whl (148.1 kB): CPython 3.7m, Windows x86
  • pyrematch-0.1.0-cp36-cp36m-win_amd64.whl (178.7 kB): CPython 3.6m, Windows x86-64
  • pyrematch-0.1.0-cp36-cp36m-win32.whl (148.1 kB): CPython 3.6m, Windows x86
  • pyrematch-0.1.0-cp35-cp35m-win_amd64.whl (178.2 kB): CPython 3.5m, Windows x86-64
  • pyrematch-0.1.0-cp35-cp35m-win32.whl (147.9 kB): CPython 3.5m, Windows x86
