Skip to main content

Regular expressions for objects

Project description

Lacking a proper name, REfO stands for “Regular Expressions for Objects”.

It’s a python library that supplies a functionality very similar to the python re module (regular expressions) but for arbitrary sequences of objects instead of strings (sequences of characters).

In addition to that, it’s possible to match each object in a sequence with not only equality, but an arbitrary python function. For example, if you have a sequence of integers you can make a regular expression that asks for a even number followed by a prime number followed by a 3-divisible number.

This software was written by Rafael Carrascosa while working at Machinalis in the first months of 2012.

Contact: rcarrascosa@machinalis.com or rafacarrascosa xyz gmail.com (replace “ xyz “ with “@”)

How to use it

The syntax is a little bit different than python’s re, and similar to that of pyparsing, you have to more-or-less explicitly build the syntax tree of your regular expression. For instance:

“ab” is Literal(“a”) + Literal(“b”)

“a*” is Star(Literal(“a”))

“(ab)+|(bb)*?” is:

a = Literal(“a”) b = Literal(“b”) regex = Plus(a + b) | Star(b + b, greedy=False)

You can also assign a group to any sub-match and later on retrieve the matched content, for instance:

regex = Group(Plus(a + b), “foobar”) | (b + b) m = match(regex, “abab”) print m.span(“foobar”) # prints (0, 4)

For more, check out the examples in the examples folder.

How we use it

At Machinalis we use REfO for applications similar to that in examples/words.py, check it out!

About the implementation

I use a Thompson-like virtual machine aproach, which ensures polynomial time worst-case complexity. See examples/poly_time.py for an example of this.

The implementation is heavily based on Russ Cox notes, see http://swtch.com/~rsc/regexp/regexp2.html for the source.

If you go to read the code, some glossary:

  • RE – regular expression

  • VM – virtual machine

  • Epsilon transitions – All VM instructions that do not consume a symbol

    or stop the thread (for example an Accept).

Acknowledgements

Thanks Russ Cox for sharing the awesome info and insights on your web site.

Thanks Javier Mansilla for reviewing the code and being enthusiastic about it.

Thanks Machinalis for everything :)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

REfO-0.13.tar.gz (12.0 kB view details)

Uploaded Source

File details

Details for the file REfO-0.13.tar.gz.

File metadata

  • Download URL: REfO-0.13.tar.gz
  • Upload date:
  • Size: 12.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for REfO-0.13.tar.gz
Algorithm Hash digest
SHA256 f5c9fd5d6ae083b15071a27ebdf02353c7af9ecc92e22b67f961ecdf5e0355f4
MD5 0548a33f4713d2d2f8932195df918de6
BLAKE2b-256 51b1fe3af8c96aa634a9502bec5622aadc2e27880843a1da2c5869435772b075

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page