python port of super-expressive

super-expressive-python

This is a Python port of super-expressive. It lets regular expressions be written in Python syntax and idioms, which gives better editor and version-control support and makes the expressions easier to read and edit.

Notes on implementation choices are in notes.md.

Example

>>> import superexpressive as se
>>> se.to_regex(
...     se.START_OF_INPUT,
...     se.optional("0x"),
...     se.capture(se.range(("a", "f"), ("A", "F"), ("0", "9")), se.exactly(4)),
...     se.END_OF_INPUT,
...     compile=False
... )
'^(?:0x)?([a-fA-F0-9]{4})$'
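
As a rough illustration of what the generated pattern does, the sketch below passes the string shown above to Python's standard `re` module. It uses the pattern literal directly instead of calling `superexpressive`, so it runs without the package installed; the helper name `parse_hex4` is hypothetical.

```python
import re

# Pattern string copied from the example output above.
HEX4 = re.compile(r'^(?:0x)?([a-fA-F0-9]{4})$')


def parse_hex4(text):
    """Return the four captured hex digits, or None if `text` does not match."""
    match = HEX4.match(text)
    return match.group(1) if match else None


print(parse_hex4("0xBEEF"))   # the "0x" prefix is optional and not captured
print(parse_hex4("beef"))
print(parse_hex4("0xBEEF1"))  # five hex digits, so no match
```

Only the parenthesised part is captured, because the optional prefix sits in a non-capturing group.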

Installation

pip install super-expressive-python

API

superexpressive.ANY_CHAR( = '.')

Matches any character except a newline.

superexpressive.CARRIAGE_RETURN( = '\\r')

Matches a carriage return.

superexpressive.DIGIT( = '\\d')

Matches any digit character. With the ASCII flag this is equivalent to the range 0-9; in Unicode string patterns it also matches other decimal digits.

superexpressive.END_OF_INPUT( = '$')

Matches the end of the string or just before the newline at the end of the string.
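
The "just before the newline" case is easy to overlook. A minimal sketch using the standard `re` module directly (the variable names are illustrative only) compares `$` with `\Z`, which matches only at the very end of the string:

```python
import re

plain = re.search(r'abc$', 'abc')          # matches at the end of the string
before_nl = re.search(r'abc$', 'abc\n')    # also matches before a final newline
strict = re.search(r'abc\Z', 'abc\n')      # \Z does not skip the trailing newline

print(plain is not None, before_nl is not None, strict is not None)
```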

superexpressive.NEWLINE( = '\\n')

Matches a newline character.

superexpressive.NON_DIGIT( = '\\D')

Matches any non-digit character; this is the inverse of DIGIT.

superexpressive.NON_WHITESPACE_CHAR( = '\\S')

Matches any non-whitespace character; this is the inverse of WHITESPACE_CHAR.

superexpressive.NON_WORD( = '\\W')

Matches the complement of WORD

superexpressive.NON_WORD_BOUNDARY( = '\\B')

Matches the empty string, but not at the start or end of a word.

superexpressive.ONE_OR_MORE( = '+')

Matches 1 or more (greedy) repetitions of the preceding expression

superexpressive.ONE_OR_MORE_LAZY( = '+?')

Non-greedy match for one or more repetitions of the previous expression

superexpressive.OPTIONAL( = '?')

Matches 0 or 1 (greedy) of the preceding RE.

superexpressive.START_OF_INPUT( = '^')

Matches the start of the string.

superexpressive.TAB( = '\\t')

Matches a tab character.

superexpressive.WHITESPACE_CHAR( = '\\s')

Matches any whitespace character

superexpressive.WORD( = '\\w')

Matches any alphanumeric character a-z, A-Z, 0-9, or underscore in bytes patterns or string patterns with the ASCII flag. In string patterns without the ASCII flag, it will match the range of Unicode alphanumeric characters (letters plus digits plus underscore).
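
The effect of the ASCII flag can be seen with a short sketch that uses the `\w` escape directly in the standard `re` module. The sample text is made up, and the accented letters are written as escapes so the result does not depend on how the file is encoded.

```python
import re

text = "na\u00efve caf\u00e9_1"  # "naïve café_1"

# Without re.ASCII, accented letters count as word characters.
unicode_words = re.findall(r'\w+', text)

# With re.ASCII, only a-z, A-Z, 0-9 and underscore do.
ascii_words = re.findall(r'\w+', text, flags=re.ASCII)

print(unicode_words)
print(ascii_words)
```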

superexpressive.WORD_BOUNDARY( = '\\b')

Matches the empty string, but only at the start or end of a word.
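
A small sketch of the difference between `\b` and `\B`, written against the standard `re` module with an invented sample string. It records where each pattern matches rather than what it matches, since both find the same three letters.

```python
import re

text = "cat concat cats"

# \bcat\b: "cat" as a whole word only.
whole_word = [m.start() for m in re.finditer(r'\bcat\b', text)]

# \Bcat\b: "cat" at the end of a longer word.
word_suffix = [m.start() for m in re.finditer(r'\Bcat\b', text)]

print(whole_word)   # offset of the standalone "cat"
print(word_suffix)  # offset of the "cat" inside "concat"
```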

superexpressive.ZERO_OR_MORE( = '*')

Matches 0 or more (greedy) repetitions of the preceding RE. Greedy means that it will match as many repetitions as possible.

superexpressive.ZERO_OR_MORE_LAZY( = '*?')

Non-greedy version of the zero or more match
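
To make the greedy/lazy distinction concrete, here is a minimal sketch using the quantifier strings above in plain `re` patterns. The input strings are made up for illustration.

```python
import re

markup = "<a><b>"

# '*' is greedy: it runs to the last '>' it can reach.
greedy_star = re.search(r'<.*>', markup).group()

# '*?' is lazy: it stops at the first '>' that lets the match succeed.
lazy_star = re.search(r'<.*?>', markup).group()

# The same contrast applies to '+' and '+?'.
greedy_plus = re.search(r'\d+', "12345").group()
lazy_plus = re.search(r'\d+?', "12345").group()

print(greedy_star, lazy_star, greedy_plus, lazy_plus)
```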

superexpressive.any_of(*args)

Match any of the given arguments.

>>> import superexpressive as se
>>> se.any_of('A', 'F', 'dkja')
'(?:A|F|dkja)'

TODO: is a non-capturing group really necessary here?

  • Return type

    str

superexpressive.any_of_chars(*args)

A length 1 item that matches any of the included characters.

>>> import superexpressive as se
>>> se.any_of_chars('A', 'F', 'dkja')
'[AFdkja]'
  • Return type

    str
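
The two outputs above look similar but behave differently: `any_of` treats each argument as a whole alternative, while `any_of_chars` merges all arguments into one character class. The sketch below runs both documented output strings through the standard `re` module on an invented input.

```python
import re

text = "A dkja F d"

# Output of any_of('A', 'F', 'dkja'): 'dkja' must appear as a unit.
alternatives = re.findall(r'(?:A|F|dkja)', text)

# Output of any_of_chars('A', 'F', 'dkja'): any single listed character.
single_chars = re.findall(r'[AFdkja]', text)

print(alternatives)
print(single_chars)
```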

superexpressive.anything_but_chars(*args)

A length 1 item that matches anything but the included characters.

>>> import superexpressive as se
>>> se.anything_but_chars('A', 'F', 'dkja')
'[^AFdkja]'
  • Return type

    str

superexpressive.anything_but_range(*args)

An item that matches anything but a range of characters.

>>> import superexpressive as se
>>> se.anything_but_range(('A', 'F'))
'[^A-F]'
  • Return type

    str

superexpressive.anything_but_string(string)

Match anything except the provided string.

>>> import superexpressive as se
>>> se.anything_but_string('test')
'(?:[^t][^e][^s][^t])'
  • Return type

    str
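
Going only by the output string shown above, the pattern requires every position to differ from the corresponding character of the given string. A short sketch with the standard `re` module (inputs chosen for illustration) shows what that implies in practice:

```python
import re

# Output string documented above for anything_but_string('test').
not_test = re.compile(r'(?:[^t][^e][^s][^t])')

results = {
    word: not_test.fullmatch(word) is not None
    for word in ("abcd", "test", "best", "abc")
}
print(results)
```

Note that "best" is rejected as well, because it shares letters with "test" at the same positions, and "abc" is rejected because the pattern always consumes four characters.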

superexpressive.assert_ahead(*args)

Check, but do not consume, that the regex matches the next part of the string.

>>> import superexpressive as se
>>> se.assert_ahead('test')
'(?=test)'

TODO: actual example of using this

  • Return type

    str
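
One possible use of a lookahead, sketched with the `(?=...)` syntax directly in the standard `re` module: pick out numbers that are followed by a unit without including the unit in the match. The sample text and variable name are illustrative.

```python
import re

# (?=px) checks that "px" follows the digits but leaves it out of the match.
pixel_values = re.findall(r'\d+(?=px)', "12px 5em 300px")

print(pixel_values)
```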

superexpressive.assert_behind(*args)

Check, but do not consume, that the regex matches the previous part of the string.

>>> import superexpressive as se
>>> se.assert_behind('test')
'(?<=test)'

TODO: actual example of using this

  • Return type

    str
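
A lookbehind can be sketched the same way with the `(?<=...)` syntax in the standard `re` module. Here it selects numbers that directly follow a dollar sign; the sample text is made up.

```python
import re

# (?<=\$) requires a "$" immediately before the digits, without matching it.
dollar_amounts = re.findall(r'(?<=\$)\d+', "cost $40, qty 3, tax $7")

print(dollar_amounts)
```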

superexpressive.assert_not_ahead(*args)

Check, but do not consume, that the regex does not match the next part of the string.

>>> import superexpressive as se
>>> se.assert_not_ahead('test')
'(?!test)'

TODO: actual example of using this

  • Return type

    str
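
A negative lookahead is useful for excluding one continuation while allowing all others. The sketch below uses the `(?!...)` syntax directly in the standard `re` module on an invented word list.

```python
import re

# Match words starting with "foo" unless "bar" comes right after it.
not_foobar = re.findall(r'foo(?!bar)\w*', "foobar foobaz food")

print(not_foobar)
```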

superexpressive.assert_not_behind(*args)

Check, but do not consume, that the regex does not match the previous part of the string.

>>> import superexpressive as se
>>> se.assert_not_behind('test')
'(?<!test)'

TODO: actual example of using this

  • Return type

    str
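
As a counterpart to the positive lookbehind, this sketch uses `(?<!...)` in the standard `re` module to find numbers that are not dollar amounts. The `\b` is included so the match cannot start in the middle of a number; the sample text is illustrative.

```python
import re

# (?<!\$) rejects positions preceded by "$"; \b forces a start at a number's first digit.
plain_numbers = re.findall(r'(?<!\$)\b\d+', "cost $40, qty 3, tax $7")

print(plain_numbers)
```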

superexpressive.at_least(length)

Match the previous pattern at least length times, greedily.

>>> import superexpressive as se
>>> se.at_least(4)
'{4,}'
>>> se.DIGIT + se.at_least(6)
'\\d{6,}'
  • Return type

    str
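
To see the open-ended quantifier at work, the sketch below applies the second output string above (`\d{6,}`) with the standard `re` module. The inputs are invented.

```python
import re

six_or_more = re.compile(r'\d{6,}')

too_short = six_or_more.fullmatch("12345")        # five digits: no match
just_enough = six_or_more.fullmatch("123456")     # six digits: match
longest = six_or_more.search("id 1234567890 x")   # greedy: takes all ten digits

print(too_short, just_enough.group(), longest.group())
```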

superexpressive.back_reference(index)

Refer to an earlier captured group by 1-based index.

>>> import superexpressive as se
>>> se.back_reference(2)
'\\2'

TODO: actual example of using this

  • Return type

    str
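
One common use of a numbered backreference is detecting a repeated word. The sketch below writes the `\1` reference directly in a standard `re` pattern; the pattern and sample sentence are illustrative and not taken from the package.

```python
import re

# (\w+) captures a word; \1 requires the same text to appear again.
repeated_word = re.compile(r'\b(\w+) \1\b')

match = repeated_word.search("this is is a test")

print(match.group(0))  # the doubled text
print(match.group(1))  # the captured word
```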

superexpressive.between(minl, maxl)

Match the previous pattern between minl and maxl times, greedily.

>>> import superexpressive as se
>>> se.between(4,8)
'{4,8}'
>>> se.DIGIT + se.between(6,8)
'\\d{6,8}'
  • Return type

    str
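
The bounded quantifier is greedy up to its maximum. A brief sketch using the output string above (`\d{6,8}`) with the standard `re` module, on made-up inputs:

```python
import re

six_to_eight = re.compile(r'\d{6,8}')

# Greedy: takes as many digits as allowed, i.e. eight of the ten.
longest_allowed = six_to_eight.search("1234567890").group()

below_min = six_to_eight.fullmatch("12345")       # five digits
above_max = six_to_eight.fullmatch("123456789")   # nine digits

print(longest_allowed, below_min, above_max)
```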

superexpressive.capture(*args, name=None)

A group that captures its contents.

>>> import superexpressive as se
>>> se.capture(se.range(("a", "f"), ("0", "9")), 'XXX')
'([a-f0-9]XXX)'
  • Return type

    str

superexpressive.exactly(length)

Match the previous pattern exactly length times.

>>> import superexpressive as se
>>> se.exactly(4)
'{4}'
>>> se.DIGIT + se.exactly(6)
'\\d{6}'
  • Return type

    str

superexpressive.from_regex(pattern)

It would be useful to provide a “labeling” function that generates the code from a given regex, as part of a debugging suite.

  • Return type

    str

superexpressive.group(*args)

A group that does not capture its contents.

>>> import superexpressive as se
>>> se.group(se.range(("a", "f"), ("0", "9")), 'XXX')
'(?:[a-f0-9]XXX)'
  • Return type

    str
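
The practical difference between `capture` and `group` shows up in what the match object exposes. The sketch below feeds the two documented output strings to the standard `re` module; the input "aXXX" is invented.

```python
import re

capturing = re.match(r'([a-f0-9]XXX)', "aXXX")        # output of capture(...)
non_capturing = re.match(r'(?:[a-f0-9]XXX)', "aXXX")  # output of group(...)

print(capturing.groups())       # one captured group
print(non_capturing.groups())   # no captured groups
print(non_capturing.group(0))   # the overall match is still available
```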

superexpressive.named_back_reference(name)

Refer to an earlier captured group by name.

>>> import superexpressive as se
>>> se.named_back_reference('test')
'\\k<test>'

TODO: actual example of using this

  • Return type

    str
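
For comparison, Python's own `re` module spells named groups `(?P<name>...)` and named backreferences `(?P=name)`. The sketch below uses that spelling and also checks how `re` treats the `\k<name>` form shown in the output above. It is an observation about the standard library in recent CPython versions, not a statement about how the package intends the string to be used; the group name `word` is illustrative.

```python
import re

# Named group plus named backreference, in the syntax Python's re accepts.
repeated_word = re.compile(r'\b(?P<word>\w+) (?P=word)\b')
match = repeated_word.search("it was was fine")
print(match.group("word"))

# Check whether re accepts the \k<name> spelling.
try:
    re.compile(r'\k<word>')
    k_form_accepted = True
except re.error:
    k_form_accepted = False
print(k_form_accepted)
```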

superexpressive.optional(*args)

An optional non-capturing group of the items inside.

>>> import superexpressive as se
>>> se.optional(se.DIGIT)
'(?:\\d)?'
  • Return type

    str

superexpressive.range(*args, negate=False)

An item that matches a range of characters by ascii code.

>>> import superexpressive as se
>>> se.range(('A', 'F'))
'[A-F]'
  • Return type

    str

superexpressive.re_flags_to_string(flags=0)

Turn a set of re flags into a string suitable for inclusion in a regex.

>>> import re
>>> import superexpressive as se
>>> se.re_flags_to_string(re.A)
'(?a)'
>>> se.re_flags_to_string(re.IGNORECASE | re.LOCALE)
'(?iL)'
>>> se.re_flags_to_string()
''
  • Return type

    str
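
An inline flag string placed at the start of a pattern has the same effect as passing the flag to the `re` function. A minimal sketch with the standard `re` module, using invented sample text:

```python
import re

text = "Hello HELLO hello"

# '(?i)' at the start of the pattern is equivalent to flags=re.IGNORECASE.
inline = re.findall(r'(?i)hello', text)
by_argument = re.findall(r'hello', text, flags=re.IGNORECASE)

# '(?a)' restricts \w to ASCII, so the accented letter ends the match.
ascii_only = re.findall(r'(?a)\w+', "caf\u00e9")

print(inline, by_argument, ascii_only)
```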

superexpressive.to_regex(*args, flags=0, compile=True)

Turn a collection of strings into a regex.

If compile is True, return a compiled pattern object (the result of re.compile). If False, return the regex as a plain string in Python syntax.

>>> import superexpressive as se
>>> se.to_regex(
...     se.START_OF_INPUT,
...     se.optional("0x"),
...     se.capture(se.range(("a", "f"), ("A", "F"), ("0", "9")), se.exactly(4)),
...     se.END_OF_INPUT,
...     compile=False
... )
'^(?:0x)?([a-fA-F0-9]{4})$'
>>> se.to_regex(compile=False)
''

TODO: More tests, like flags

  • Return type

    Union[str, compile]
