A modification of the CPython `re` module in which a pattern matches any string that is a prefix of a string the original regex would match
Modified Regex for Syntactically Constrained LLM Sampling
The scs_re package is a modified version of the Python re module that returns a match for any prefix of a string that matches the original regex pattern. For example, the pattern ^abcd matches the string abc. This behavior is used to check whether candidate beams are still valid under a given regex pattern while a language model is generating output.
Note: This behavior is only expected for patterns that begin with ^ and end with $, and that therefore define a match constraint over all characters in a sequence.
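To make the prefix semantics concrete, here is a minimal sketch that emulates them with the stock re module for the special case of a purely literal pattern. It is not how scs_re is implemented (the package modifies the _sre matching engine); the helper names prefix_pattern_for_literal and filter_beams are hypothetical and exist only for this illustration.

```python
import re


def prefix_pattern_for_literal(literal: str) -> re.Pattern:
    """Build a stock-`re` pattern that accepts every prefix of `literal`.

    For "abcd" this yields (?:a(?:b(?:c(?:d)?)?)?)?, i.e. each further
    character is optional, but only if all earlier characters matched.
    """
    nested = ""
    for ch in reversed(literal):
        nested = f"(?:{re.escape(ch)}{nested})?"
    return re.compile(nested)


def filter_beams(beams, pattern):
    """Keep only the beams that could still grow into a full match."""
    # fullmatch plays the role of the ^...$ anchors in the original pattern.
    return [beam for beam in beams if pattern.fullmatch(beam) is not None]


accepts_prefix = prefix_pattern_for_literal("abcd")
beams = ["", "a", "abc", "abcd", "abx", "abcde", "bcd"]
print(filter_beams(beams, accepts_prefix))
# → ['', 'a', 'abc', 'abcd']
```

The beams abx, abcde, and bcd are dropped because no continuation of them can match ^abcd$. Hand-building such a pattern quickly becomes impractical for general regexes, which is presumably why the package changes the matching engine instead.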
Building from source
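We use the _sre module from the Python 3.11 branch of CPython (python/cpython, under Modules/_sre). You can export the original source by running the command below. It relies on GitHub's Subversion bridge, which GitHub has since retired, so it may no longer work; in that case, copy Modules/_sre from a checkout of the 3.11 branch instead.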
svn export https://github.com/python/cpython/branches/3.11/Modules/_sre _sre
The module requires the Python 3.11 headers to compile. To build and install Python 3.11.3 from source, run
wget https://www.python.org/ftp/python/3.11.3/Python-3.11.3.tar.xz
tar xf Python-3.11.3.tar.xz
cd Python-3.11.3
./configure --enable-optimizations --enable-shared
make -j$(nproc)
sudo make altinstall
Then install the package by running sudo python3.11 setup.py install. The Python version you install into must be the same version whose headers the module was compiled against.
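Before installing, it can help to confirm that the interpreter you are about to use is a 3.11 build and to see which header directory it reports. The following is a small sketch using only the standard library; the REQUIRED constant and the is_supported helper are assumptions made for this example and are not part of scs_re.

```python
import sys
import sysconfig

# Minor version the _sre sources were taken from (assumed from the text above).
REQUIRED = (3, 11)


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the major.minor version equals REQUIRED."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    print("interpreter:", sys.version.split()[0])
    # Directory holding Python.h for this interpreter.
    print("header dir: ", sysconfig.get_path("include"))
    print("supported:  ", is_supported())
```

Run it with the same executable you will use for setup.py (for example python3.11) so that the reported version and header directory are the ones the build will see.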
File details
Details for the file scs_re-1.0.2-py3.11-linux-x86_64.egg.
File metadata
- Download URL: scs_re-1.0.2-py3.11-linux-x86_64.egg
- Upload date:
- Size: 313.5 kB
- Tags: Egg
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ee59fc38b57811ce277ed279d9ec151382cedc83aa5cbd6156fd4e3e6a0ed3a0 |
| MD5 | 070bfdaa787db037db9c73a0ff6a2de7 |
| BLAKE2b-256 | eeee30fe8c65b3b9de5f6ef1be3d0aeadcecd00a0dc1edf964ecbfb0bb14a0b3 |