A regex library focused on information extraction that uses constant-delay algorithms.
Project description
Implementation of constant delay algorithm for regular document spanners
This implementation is based on the paper "Constant delay algorithms for regular document spanners" by Fernando Florenzano, Cristian Riveros, Martín Ugarte, Stijn Vansummeren and Domagoj Vrgoč.
Directory structure
The C++ implementation is under the /src folder.
The /exp folder contains experiments that compare our library with others.
The /tests folder contains all the automated tests for our code.
Build instructions
Python/SWIG
Assuming that you are on a Debian-based distro, first install the following dependencies:
sudo apt install g++ cmake swig libboost-dev python3-dev
After that, in this directory, run:
mkdir -pv build && cd build
cmake -DSWIG=true ..
make
After compilation, build/bin/SWIG will contain rematch.py (the bindings interface) and _rematchswiglib.so (the shared library binary), which you can use to interface with REmatch from Python.
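As a minimal sketch of how the generated files might be made importable, assuming the build layout described above: the helper names find_rematch_bindings and add_bindings_to_path are hypothetical and not part of REmatch. The code only checks that both files exist and prepends their directory to sys.path; it does not call any REmatch API.

```python
import sys
from pathlib import Path


def find_rematch_bindings(build_dir="build"):
    """Return the SWIG output directory if both binding files exist, else None.

    Hypothetical helper: the file names and the bin/SWIG location are taken
    from the build instructions; nothing here is provided by REmatch itself.
    """
    swig_dir = Path(build_dir) / "bin" / "SWIG"
    expected = ("rematch.py", "_rematchswiglib.so")
    if all((swig_dir / name).is_file() for name in expected):
        return swig_dir
    return None


def add_bindings_to_path(build_dir="build"):
    """Prepend the bindings directory to sys.path; return True on success."""
    swig_dir = find_rematch_bindings(build_dir)
    if swig_dir is None:
        return False
    sys.path.insert(0, str(swig_dir))
    return True
```

If add_bindings_to_path() returns True, a later import of the rematch module should resolve to the freshly built bindings.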
CLI tool
cmake -H. -Bbuild/Release
cmake --build build/Release
If you want to use a debugger such as gdb, add -DCMAKE_BUILD_TYPE=Debug to the first CMake command.
Command line use
After building, the binary file will be located in the build/Release/bin folder. To try it, simply run:
build/Release/bin/rematch --help
Examples:
Get all spans corresponding to a single letter a:
build/Release/bin/rematch -d document.txt -e '.*!x{a}.*'
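To illustrate what "all spans corresponding to a single letter a" means, here is a plain-Python sketch that does not use REmatch at all. It reads !x{a} as "capture in a variable x a span matching a" and lists every such span. The function name single_letter_spans is hypothetical, and the (start, end) convention with an exclusive end is an assumption; the CLI's actual output format may differ.

```python
def single_letter_spans(document, letter="a"):
    """List (start, end) pairs, end exclusive, one per occurrence of `letter`.

    Illustration only: a naive scan, not the constant-delay algorithm
    that the library implements.
    """
    return [(i, i + 1) for i, ch in enumerate(document) if ch == letter]


print(single_letter_spans("banana"))  # prints [(1, 2), (3, 4), (5, 6)]
```

Unlike a conventional first-match regex search, every occurrence yields its own output span.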
Get all spans corresponding to a pattern in a file:
build/Release/bin/rematch -d document.txt -r regex.txt
Get benchmark stats (execution time, number of outputs, memory usage, etc.):
build/Release/bin/rematch -d document.txt -r regex.txt -o benchmark
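The invocations above can also be assembled from a script. The following is a sketch, assuming only the flags shown in the examples (-d, -e, -r and -o benchmark); the function name rematch_command is hypothetical, and treating -e and -r as mutually exclusive is an assumption of this sketch, not something the CLI is documented to enforce.

```python
import shlex


def rematch_command(document, expression=None, regex_file=None,
                    benchmark=False, binary="build/Release/bin/rematch"):
    """Build the argument list for one of the example invocations.

    Assumption: exactly one of `expression` (-e) or `regex_file` (-r) is given.
    """
    if (expression is None) == (regex_file is None):
        raise ValueError("pass exactly one of expression or regex_file")
    cmd = [binary, "-d", document]
    if expression is not None:
        cmd += ["-e", expression]
    else:
        cmd += ["-r", regex_file]
    if benchmark:
        cmd += ["-o", "benchmark"]
    return cmd


# Show the shell-quoted form of the first example.
print(shlex.join(rematch_command("document.txt", expression=".*!x{a}.*")))
```

The returned list can be passed directly to subprocess.run(cmd, check=True), which avoids shell quoting issues with characters such as ! and {}.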
Testing
We are using Boost.Test for unit testing.
To add more tests, create a new folder under tests/ (tests/[test_name_folder]/) whose name starts with the prefix test. Follow the same structure (same file names) as the existing folders.