A wrapper for Python's re library for advanced regex pattern management

Basic usage

The Engine loads regular expression pattern templates written in *.json files from the provided directory, then builds and compiles them as shown below.

Example template models/dates.json:

{
  "day": [
    "3[01]",
    "[12][0-9]",
    "0?[1-9]"
  ],
  "month": [
    "0?[1-9]",
    "1[012]"
  ],
  "year": [
    "\\d{4}"
  ],
  "date": [
    "{{day}}/{{month}}/{{year}}",
    "{{year}}-{{month}}-{{day}}"
  ],
  "patterns": [
    "{{date}}"
  ]
}

This template results in the following regex (the alternatives within each group appear longest first):

(?P<date_0>(?P<day_0>[12][0-9]|0?[1-9]|3[01])/(?P<month_0>0?[1-9]|1[012])/(?P<year_0>\d{4})|(?P<year_1>\d{4})-(?P<month_1>0?[1-9]|1[012])-(?P<day_1>[12][0-9]|0?[1-9]|3[01]))
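
The expansion can be illustrated with a minimal sketch that uses only the standard library. It is not the library's implementation: the expand function, the PLACEHOLDER pattern and the longest-first sorting rule are assumptions inferred from the template and the regex shown above.

```python
import re

# The models/dates.json template from above, written as a Python dict.
TEMPLATE = {
    "day": ["3[01]", "[12][0-9]", "0?[1-9]"],
    "month": ["0?[1-9]", "1[012]"],
    "year": [r"\d{4}"],
    "date": ["{{day}}/{{month}}/{{year}}", "{{year}}-{{month}}-{{day}}"],
    "patterns": ["{{date}}"],
}

# Matches a {{key}} placeholder (hypothetical helper, not part of the library).
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def expand(pattern, template, counters):
    """Recursively replace {{key}} placeholders with numbered named groups."""

    def substitute(match):
        key = match.group(1)
        index = counters.get(key, 0)
        counters[key] = index + 1
        # Assumption: alternatives are joined longest first, which
        # reproduces the ordering seen in the regex above.
        alternatives = sorted(template[key], key=len, reverse=True)
        body = "|".join(expand(alt, template, counters) for alt in alternatives)
        return "(?P<{}_{}>{})".format(key, index, body)

    return PLACEHOLDER.sub(substitute, pattern)


regex = expand(TEMPLATE["patterns"][0], TEMPLATE, {})
print(regex)
```

Each occurrence of a key gets its own counter value, which is why the second date layout produces year_1, month_1 and day_1 rather than reusing the names from the first layout.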

The engine can then be queried as follows:

from replus import Engine  # assumed import path; not shown in the original

engine = Engine('models')
for match in engine.parse("Look at this date: 2012-12-10"):
    print(match)
    # <[Match date] span(19, 29): 2012-12-10>

    # Groups are looked up by key; the numbered name is resolved for you.
    date = match.group('date')
    print(date)
    # <[Group date_0] span(19, 29): 2012-12-10>

    day = date.group('day')
    print(day)
    # <[Group day_1] span(27, 29): 10>

    month = date.group('month')
    print(month)
    # <[Group month_1] span(24, 26): 12>

    year = date.group('year')
    print(year)
    # <[Group year_1] span(19, 23): 2012>
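
For comparison, the same lookup can be done with the plain re module against the generated regex shown earlier. This is only a sketch of what the wrapper saves you from doing by hand: with plain re you have to work out which numbered variant of each key (year_0 or year_1, and so on) actually took part in the match.

```python
import re

# The generated regex from the dates example above.
DATE_REGEX = re.compile(
    r"(?P<date_0>"
    r"(?P<day_0>[12][0-9]|0?[1-9]|3[01])/(?P<month_0>0?[1-9]|1[012])/(?P<year_0>\d{4})"
    r"|(?P<year_1>\d{4})-(?P<month_1>0?[1-9]|1[012])-(?P<day_1>[12][0-9]|0?[1-9]|3[01])"
    r")"
)

text = "Look at this date: 2012-12-10"
found = DATE_REGEX.search(text)

# Keep only the named groups that participated in the match.
spans = {
    name: found.span(name)
    for name, value in found.groupdict().items()
    if value is not None
}
print(spans)
```

Only the groups of the second date layout are present in spans, which lines up with the day_1, month_1 and year_1 names in the output above.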

Match objects have the following attributes:

  • type: the type of match (e.g. “dates”);

  • match: the underlying re.Match object;

  • re: the regex pattern;

  • all_group_names: the names of all child groups.

Both Match and Group objects have the following attributes:

  • value: the string value of the match/group;

  • start: the beginning of the match/group relative to the input string;

  • end: the end of the match/group relative to the input string;

  • offset: (start, end);

  • length: end - start.

Group objects have the following attributes:

  • name: the actual group name (e.g. date_1);

  • key: the group key (e.g. date);

Both Match and Group objects can be serialized to a dict with the serialize() method and to a JSON string with the json attribute.
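
The attributes listed above can be pictured with a small stand-in class. GroupView is hypothetical and is not the library's Group class; in particular, the fields returned by serialize() and the way key is derived from name are assumptions made for illustration.

```python
import json
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupView:
    """Hypothetical stand-in for a Group object (not the library's class)."""

    name: str   # actual group name, e.g. "year_1"
    value: str  # matched text
    start: int  # start index in the input string
    end: int    # end index in the input string

    @property
    def key(self):
        # Assumption: the key is the name without its numeric suffix.
        return self.name.rsplit("_", 1)[0]

    @property
    def offset(self):
        return (self.start, self.end)

    @property
    def length(self):
        return self.end - self.start

    def serialize(self):
        # Assumption: the real serialize() may return different fields.
        return {
            "name": self.name,
            "key": self.key,
            "value": self.value,
            "start": self.start,
            "end": self.end,
            "length": self.length,
        }

    @property
    def json(self):
        return json.dumps(self.serialize())


found = re.search(r"(?P<year_1>\d{4})", "Look at this date: 2012-12-10")
year = GroupView("year_1", found.group("year_1"), *found.span("year_1"))
print(year.key, year.offset, year.length)
print(year.json)
```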

Secondary features

There are two useful secondary features:

  • non-capturing groups: these are specified by using the “!” prefix in the group name

  • dynamic backreferences: prefix a key with # to reference a previously matched group of that key, and append @<n> to choose how many occurrences back to look (without @<n>, the most recent one is used)

Template:

{
  "!number": [
    "\\d"
  ],
  "abg": [
    "alpha",
    "beta",
    "gamma"
  ],
  "patterns": [
    "This is an unnamed number group: {{number}}.",
    "I can match {{abg}} and {{abg}}, and then re-match the last {{#abg}} or the second last {{#abg@2}}"
  ]
}

This generates the following regexes:

This is an unnamed number group: (?:\d).

I can match (?P<abg_0>alpha|gamma|beta) and (?P<abg_1>alpha|gamma|beta), and then re-match the last (?P=abg_1) or the second last (?P=abg_0)
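
Both generated regexes are ordinary Python patterns, so their behaviour can be checked with the plain re module. The sample strings below are made up for illustration.

```python
import re

# "!number" became a non-capturing group, so the pattern has no groups at all.
# Note that the trailing "." is unescaped and therefore matches any character.
UNNAMED = re.compile(r"This is an unnamed number group: (?:\d).")

# {{#abg}} became a backreference to abg_1, {{#abg@2}} one to abg_0.
BACKREF = re.compile(
    r"I can match (?P<abg_0>alpha|gamma|beta) and (?P<abg_1>alpha|gamma|beta), "
    r"and then re-match the last (?P=abg_1) or the second last (?P=abg_0)"
)

good = "I can match alpha and beta, and then re-match the last beta or the second last alpha"
swapped = "I can match alpha and beta, and then re-match the last alpha or the second last beta"

print(UNNAMED.groups)                        # number of capturing groups
print(BACKREF.fullmatch(good) is not None)   # backreferences agree
print(BACKREF.fullmatch(swapped) is None)    # backreferences disagree
```

A backreference matches the exact text captured earlier, not the pattern, which is why the swapped sentence is rejected.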

N.B.: because templates are JSON files, a regex escape sequence such as \d must be written with a doubled backslash in the template: \\d
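
The reason is JSON's own escaping rules, which can be seen with the standard json module alone. The snippet below parses template text directly from a string rather than from a file; the variable names are illustrative.

```python
import json
import re

# What the template file contains: a doubled backslash inside the JSON string.
template_text = r'{"year": ["\\d{4}"]}'
year_pattern = json.loads(template_text)["year"][0]

# After JSON decoding, a single backslash remains, which is what re expects.
print(year_pattern)
print(re.fullmatch(year_pattern, "2012") is not None)

# A single backslash is not a valid JSON escape, so decoding fails.
try:
    json.loads(r'{"year": ["\d{4}"]}')
    single_backslash_ok = True
except json.JSONDecodeError:
    single_backslash_ok = False
print(single_backslash_ok)
```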

Current limitations

None known
