regex template

Compiled regular expressions with auto-escaped interpolations using Python 3.14's t-strings.

This only supports Python 3.14 (which is not yet released) because it relies on t-strings.

The problem: escaping regular expressions

Have you ever tried to use user input or variables in a regular expression and run into escaping issues?

For example, if you want to match a file extension that's stored in a variable:

>>> import re
>>> extension = ".txt"
>>> pattern = re.compile(rf"^.*{extension}$")
>>> text = "filetxt"
>>> if pattern.search(text):
...     print(f"{text} matched")
... else:
...     print(f"{text} did not match")
...
filetxt matched

Special regular expression characters like ., *, +, and ? need to be escaped when they should match literally. Here the unescaped . in ".txt" matches any character, so "filetxt" matches even though it contains no dot.

We can use the re.escape function to manually escape each replacement field:

>>> import re
>>> extension = ".txt"
>>> pattern = re.compile(rf"^.*{re.escape(extension)}$")
>>> text = "filetxt"
>>> if pattern.search(text):
...     print(f"{text} matched")
... else:
...     print(f"{text} did not match")
...
filetxt did not match

This is tedious, especially with multiple f-string replacement fields.
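To make the tedium concrete, here is a small sketch using only the standard library's re.escape. The variable names (stem, extension) and the sample strings are illustrative and not part of regex-template:

```python
import re

stem = "notes(1)"
extension = ".txt"

# re.escape backslash-escapes regex metacharacters so they match literally.
escaped_extension = re.escape(extension)

# With f-strings, every replacement field needs its own re.escape call.
pattern = re.compile(rf"^{re.escape(stem)}{re.escape(extension)}$")

matches_literal = pattern.search("notes(1).txt") is not None
matches_other = pattern.search("notes1xtxt") is not None

print(escaped_extension, matches_literal, matches_other)
```

Forgetting re.escape on any single field silently changes what the pattern matches, which is the failure mode shown in the first example.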

The solution: auto-escaping with t-strings

The regex_template.compile function automatically escapes interpolated variables when using t-strings, while leaving the main pattern unescaped:

>>> import regex_template as ret
>>> extension = ".txt"
>>> pattern = ret.compile(rt"^.*{extension}$")
>>> text = "filetxt"
>>> if pattern.search(text):
...     print(f"{text} matched")
... else:
...     print(f"{text} did not match")
...
filetxt did not match

Replacement fields ({...}) are automatically escaped.

Note that regex_template.compile only accepts t-strings.
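The idea behind auto-escaping can be sketched without t-strings. The following is not regex-template's implementation: it is a stand-in that parses an ordinary str.format-style template with string.Formatter so that it runs on older Python versions. The helper name compile_escaped and its keyword-argument interface are hypothetical.

```python
import re
from string import Formatter


def compile_escaped(template: str, flags: int = 0, **values: object) -> re.Pattern[str]:
    """Hypothetical helper: escape each {field}, keep literal text unescaped."""
    parts = []
    for literal, field, spec, _conversion in Formatter().parse(template):
        parts.append(literal)  # literal regex text is passed through as-is
        if field is not None:
            text = format(values[field], spec or "")
            parts.append(re.escape(text))  # interpolated value is escaped
    return re.compile("".join(parts), flags)


pattern = compile_escaped(r"^.*{extension}$", extension=".txt")
print(pattern.pattern)
```

The key design point is that the template's literal text and its interpolated values are kept apart until the last moment, so only the values are escaped. A t-string gives a library that separation directly, which is why regex_template.compile accepts only t-strings.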

Safe interpolation

If you need specific replacement fields to be left unescaped, use the :safe format specifier:

>>> import regex_template as ret
>>> part = "[^/]+"
>>> pattern = ret.compile(rt"/home/({part:safe})/Documents")
>>> text = "/home/trey/Documents"
>>> if match := pattern.search(text):
...     print(f"Matched Documents for user {match[1]}")
...
Matched Documents for user trey
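The difference between an escaped and an unescaped field can be illustrated with plain re. This sketch does not use regex-template; the pattern names (escaped, raw) are illustrative:

```python
import re

part = "[^/]+"
text = "/home/trey/Documents"

# Escaped: the fragment is treated as literal text.
escaped = re.compile(rf"/home/({re.escape(part)})/Documents")

# Unescaped: the fragment is treated as regex syntax.
raw = re.compile(rf"/home/({part})/Documents")

escaped_match = escaped.search(text)
raw_match = raw.search(text)

print(escaped_match, raw_match[1])
```

Because an unescaped field can change the meaning of the whole pattern, it is best reserved for fragments you wrote yourself rather than for user input.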

Format specifiers

All standard Python format specifiers work normally and are applied before escaping:

>>> import regex_template as ret
>>> tracks = [(1, "Gloria"), (2, "Redondo Beach")]
>>> filename = "01 Gloria.mp3"
>>> for n, name in tracks:
...     pattern = ret.compile(rt"{n:02d}\ {name}\.mp3")
...     if pattern.fullmatch(filename):
...         print(f"Track {n} found!")
...
Track 1 found!
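Why the order matters can be seen with plain re: formatting may itself produce characters that need escaping. This is a sketch of the ordering only, not the library's code, and the price example is made up for illustration:

```python
import re

price = 3.5

formatted = format(price, ".2f")   # format spec is applied first
escaped = re.escape(formatted)     # then the result is escaped

# In verbose mode the unescaped space in the pattern is ignored.
pattern = re.compile(rf"\$ {escaped}", re.VERBOSE)

exact = pattern.fullmatch("$3.50") is not None
lookalike = pattern.fullmatch("$3x50") is not None

print(formatted, escaped, exact, lookalike)
```

If escaping happened before formatting, the dot introduced by the .2f specifier would stay unescaped and "$3x50" would match.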

Verbose mode

By default, regex_template.compile enables verbose mode (re.VERBOSE) to encourage the use of more readable regular expressions:

import regex_template as ret

username = "trey"
hostname = "farnsworth"

# SSH log entry pattern
pattern = ret.compile(rt"""
    ^
    (\w{{3}} \s+ \d{{1,2}}) \s+         # Month and day ("Jan 1")
    (\d{{2}} : \d{{2}} : \d{{2}}) \s+   # Time ("14:23:45")
    {hostname} \s+                      # Server hostname (auto-escaped)
    sshd \[\d+\] : \s+                  # sshd process
    Accepted \s+ \w+ \s+                # Authentication method
    for \s+ {username} \s+              # Username (auto-escaped)
    from \s+ ([\d.]+) \s+               # IP address
    port \s+ \d+                        # Port number
""")

with open("sshd.log") as log_file:
    for line in log_file:
        if match := pattern.search(line):
            print(f"Login from IP {match[3]}")

You can set verbose=False to disable this:

pattern = ret.compile(
    rt"^(\w+ \d+ \d+:\d+:\d+) {hostname} .* for {username} from ([\d.]+)",
    verbose=False,
)
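For readers new to verbose mode, here is a short sketch of how re.VERBOSE behaves in the standard library, independent of regex-template. The sample log line is invented for illustration:

```python
import re

# Whitespace and "#" comments inside the pattern are ignored in verbose mode.
time_pattern = re.compile(r"""
    (\d{2}) : (\d{2}) : (\d{2})   # hours, minutes, seconds
""", re.VERBOSE)

match = time_pattern.search("Jan  1 14:23:45 farnsworth")
hours, minutes, seconds = match.groups()

# Literal whitespace has to be written explicitly, for example as \s+.
port_pattern = re.compile(r"port \s+ \d+", re.VERBOSE)
port_found = port_pattern.search("port 22") is not None

# re.escape also escapes spaces and "#", so escaped values survive verbose mode.
escaped_value = re.escape("a b#c")

print(hours, minutes, seconds, port_found, escaped_value)
```

In this plain raw string the quantifier braces are written once ({2}). In the t-string example above they are doubled ({{3}}) because a single brace would start a replacement field.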

Installation

You can install regex-template with pip (you'll need to be on Python 3.14):

pip install regex-template

Or if you have uv installed and you'd like to play with it right now (Python 3.14 will be auto-installed):

uvx --with regex-template python

You can then import regex_template like this:

import regex_template as ret

Testing

This project uses hatch.

To run the tests:

hatch test

To see code coverage:

hatch test --cover
hatch run cov-html
open htmlcov/index.html

License

regex-template is distributed under the terms of the MIT license.
