A back-pocket regex cookbook

re101

A compendium of commonly-used regular expressions.

This package pertains specifically to regular expressions embedded inside Python and compiled with Python's re module.

Tested on Python 3.10, 3.11, 3.12, 3.13, and 3.14.

Install

uv add re101        # or: pip install re101

Develop

This project uses uv for environment and build management (uv_build backend), ruff for lint/format, and ty for type-checking.

uv sync
uv run pytest
uv run ruff check
uv run ty check

Introduction

All importable objects are compiled regular expressions. For instance, US_PHONENUM matches sequences that follow the North American Numbering Plan (NANP) format. In plain English, it matches what would qualify as a "North American telephone number":

>>> from re101 import US_PHONENUM
>>> text = """
... Ross McFluff: +1 (834) 345.1254 155 Elm Street
... Ronald Heathmore: 892-345-3428 436 Finley Avenue
... Frank Burger: 541-7625 662 South Dogwood Way
... Heather Albrecht: 5483264584 919 Park Place"""

>>> US_PHONENUM.findall(text)
['+1 (834) 345.1254', '892-345-3428', '541-7625', '5483264584']
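
Because the exported constants are compiled patterns, the other methods of a compiled pattern, such as finditer() and sub(), should be available on them as well. The sketch below shows the idea with a deliberately simplified stand-in pattern so that it runs without the package installed. PHONE_STANDIN is a hypothetical name, not part of re101, and it only handles the plain NNN-NNN-NNNN form.

```python
import re

# Stand-in for illustration only; this is NOT the package's US_PHONENUM.
# It matches just the hyphenated 10-digit form NNN-NNN-NNNN.
PHONE_STANDIN = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

text = "Ronald Heathmore: 892-345-3428 436 Finley Avenue"

# finditer() yields match objects, which carry positions as well as text.
spans = [(m.group(), m.start(), m.end()) for m in PHONE_STANDIN.finditer(text)]

# sub() replaces every match, for example to redact numbers before logging.
redacted = PHONE_STANDIN.sub("[phone]", text)
print(redacted)
```

Swapping the stand-in for an imported constant such as US_PHONENUM should work the same way, since both are ordinary compiled patterns.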

Currently, the package supports regexes related to:

  • email addresses
  • whitespace
  • words/tokens
  • phone numbers
  • IP addresses
  • URLs
  • integers, decimals, numbers
  • geographic information
  • personally identifiable information

Naming Conventions

Objects exported by the package use one of three naming styles: UPPERCASE, CamelCase, or lower_case.

  • UPPERCASE: These are compiled regular expressions, of type re.Pattern[str], which is the result of re.compile().
  • CamelCase: These are classes whose __new__() method returns a compiled regular expression and accepts a few additional parameters that customize the compiled result. For instance, the Number class lets you allow or disallow leading zeros and commas.
  • lower_case: These are traditional functions built around the package's regex constants. Their call signatures and return types vary from function to function.
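
The CamelCase convention can be sketched as follows. This is a minimal illustration of a class whose __new__() hands back a compiled pattern rather than an instance. SimpleNumber and its parameter names are hypothetical and are not the signature of the package's Number class.

```python
import re


class SimpleNumber:
    """Hypothetical factory class: calling it returns a compiled pattern."""

    def __new__(cls, allow_commas: bool = True, allow_leading_zeros: bool = False):
        if allow_leading_zeros:
            lead = r"\d"
            plain = r"\d+"
        else:
            lead = r"[1-9]"
            # A lone "0" is still a valid integer without leading zeros.
            plain = r"(?:0|[1-9]\d*)"
        if allow_commas:
            # One to three digits, then one or more ",ddd" groups.
            grouped = lead + r"\d{0,2}(?:,\d{3})+"
            body = f"(?:{grouped}|{plain})"
        else:
            body = plain
        # The lookarounds stop a match from starting or ending inside a
        # longer run of digits and commas.
        return re.compile(rf"(?<![\d,]){body}(?![\d,])")


default = SimpleNumber()
strict = SimpleNumber(allow_commas=False)
print(default.findall("Order 1,234 shipped with 56 items"))
```

Because __new__() returns an object that is not an instance of the class, Python skips __init__(), and the caller simply receives a re.Pattern to use like any other.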

Disclaimer

Use these regular expressions with care. It is unlikely that any of them covers 100% of the cases it is intended to cover; they are built to handle "99.x%" of cases. As with all regular expressions, a balance must be struck: covering an incremental 0.1% of cases often requires a disproportionate amount of work and code.

If you do notice egregious mistakes or omissions, please consider submitting an issue or pull request.

Please assume these expressions are "US-centric" unless noted otherwise. For instance, the zipcodes expression looks only for XXXXX or XXXXX-XXXX zip codes.
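
To make the "US-centric" caveat concrete, here is a rough sketch of a pattern that accepts only the XXXXX and XXXXX-XXXX forms. ZIP_SKETCH is a hypothetical name written for this example; it is not the package's own expression, which may differ in its details.

```python
import re

# Illustrative stand-in, not the package's expression.
# Five digits, optionally followed by a hyphen and four more digits.
# The lookarounds reject five-digit slices of longer digit runs.
ZIP_SKETCH = re.compile(r"(?<!\d)\d{5}(?:-\d{4})?(?!\d)")

sample = "Mail to 10001 or 94105-1234, not K1A 0B1 or 123456"
found = ZIP_SKETCH.findall(sample)
print(found)
```

The Canadian postal code K1A 0B1 is skipped entirely, which is the kind of omission to expect from patterns written with US formats in mind.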

Sources

Citations are included for "unique" regexes that are copied from a single source. More "generic" regexes that can be found in similar form in multiple public sources may not be cited here.

  1. Goyvaerts, Jan & Steven Levithan. Regular Expressions Cookbook, 2nd ed. Sebastopol: O'Reilly, 2012.
  2. Friedl, Jeffrey. Mastering Regular Expressions, 3rd ed. Sebastopol: O'Reilly, 2006.
  3. Goyvaerts, Jan. Regular Expressions: The Complete Tutorial. https://www.regular-expressions.info/.
  4. Python.org documentation: re module. https://docs.python.org/3/library/re.html
  5. Kuchling, A.M. "Regular Expression HOWTO." https://docs.python.org/3/howto/regex.html
  6. Python.org documentation: ipaddress module. Copyright 2007 Google Inc. Licensed to PSF under a Contributor Agreement. https://docs.python.org/3/library/ipaddress.html
  7. nerdsrescueme/regex.txt. https://gist.github.com/nerdsrescueme/1237767

To-Do List

These patterns are not currently implemented:

  • Dates and times (both ISO-8601 and more informal, such as those that can be parsed by Python's dateutil)
  • Money/currency (including a leading or trailing sign, numbers, and punctuation)
