A modification of the CPython `re` module in which a pattern matches any string that is a prefix of a string matching its regex

Modified Regex for Syntactically Constrained LLM Sampling

The scs_re package is a modified version of the Python re module that returns a match for any string that is a prefix of a string matching the original regex pattern. For example, the pattern ^abcd matches the string abc, because abc can be extended to abcd. This behavior is used to check whether beams are still valid under a particular regex pattern during a language model's output generation.

Note: This behavior is only expected for patterns that begin with ^ and end with $, and that therefore define a match constraint over all characters in a sequence.
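The prefix-matching semantics can be illustrated with the standard re module alone. The following is a minimal sketch, not how scs_re is implemented: it brute-forces completions over a small alphabet to decide whether a candidate string could still grow into a full match. The helper names is_viable_prefix and filter_beams, the alphabet, and the max_extra bound are all hypothetical and chosen only for illustration.

```python
import re
from itertools import product


def is_viable_prefix(pattern, text, alphabet, max_extra=4):
    """Return True if `text` plus at most `max_extra` characters drawn
    from `alphabet` fully matches `pattern`.

    Brute-force stand-in for prefix matching (illustrative only);
    the cost grows exponentially with `max_extra`.
    """
    compiled = re.compile(pattern)
    for extra in range(max_extra + 1):
        for suffix in product(alphabet, repeat=extra):
            if compiled.fullmatch(text + "".join(suffix)):
                return True
    return False


def filter_beams(pattern, beams, alphabet, max_extra=4):
    """Keep only the beams that can still be completed into a match."""
    return [b for b in beams if is_viable_prefix(pattern, b, alphabet, max_extra)]


# Hypothetical partial outputs produced during generation.
beams = ["abc", "abd", "ab", "abcd", "abcde"]
kept = filter_beams(r"^abcd$", beams, alphabet="abcde")
print(kept)  # ['abc', 'ab', 'abcd']
```

The beams abd and abcde are dropped because no continuation of them can satisfy ^abcd$, while ab and abc are kept because they are prefixes of a matching string. A modified regex engine avoids the enumeration entirely, which is what makes this check practical during sampling.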

Building from source

We use the _sre module from the 3.11 branch of CPython (python/cpython, under Modules/_sre). You can export the original source by running

svn export https://github.com/python/cpython/branches/3.11/Modules/_sre _sre

The module requires the Python 3.11 headers to compile. To build and install Python 3.11 from source, run

wget https://www.python.org/ftp/python/3.11.3/Python-3.11.3.tar.xz
tar xf Python-3.11.3.tar.xz
cd Python-3.11.3
./configure --enable-optimizations --enable-shared
make -j$(nproc)
sudo make altinstall

Then install the package by running sudo python3.11 setup.py install. The Python interpreter you install into must be the same version as the one whose headers the module was compiled against.
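Because the interpreter and header versions must agree, it can help to check the interpreter before building. The snippet below is a small sketch using only the standard library; the function name check_build_interpreter is hypothetical, and the check assumes the headers live in the include directory reported by sysconfig.

```python
import os
import sys
import sysconfig


def check_build_interpreter(required=(3, 11)):
    """Report whether this interpreter is the required minor version and
    whether Python.h is present in its include directory."""
    version_ok = sys.version_info[:2] == required
    include_dir = sysconfig.get_paths()["include"]
    header_ok = os.path.isfile(os.path.join(include_dir, "Python.h"))
    return version_ok, header_ok, include_dir


version_ok, header_ok, include_dir = check_build_interpreter()
print(f"interpreter is 3.11: {version_ok}")
print(f"Python.h found in {include_dir}: {header_ok}")
```

Running it with the same interpreter used for setup.py (for example python3.11) shows which headers the build would pick up.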

Built Distribution

scs_re-1.0.2-py3.11-linux-x86_64.egg (313.5 kB view details)

File details

Details for the file scs_re-1.0.2-py3.11-linux-x86_64.egg.

File metadata

  • Download URL: scs_re-1.0.2-py3.11-linux-x86_64.egg
  • Size: 313.5 kB
  • Tags: Egg
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.10

File hashes

Hashes for scs_re-1.0.2-py3.11-linux-x86_64.egg

Algorithm     Hash digest
SHA256        ee59fc38b57811ce277ed279d9ec151382cedc83aa5cbd6156fd4e3e6a0ed3a0
MD5           070bfdaa787db037db9c73a0ff6a2de7
BLAKE2b-256   eeee30fe8c65b3b9de5f6ef1be3d0aeadcecd00a0dc1edf964ecbfb0bb14a0b3
