contex·PyPI

Contextual string manipulation

Project description

This library provides two related abstractions, StringContext and MatchContext.

The problem with our abstractions

I’ll present two “problems” that this library attempts to solve. The first one is rather contrived; the second one is more realistic. Afterwards I will show how contex can help.

Problem 1

You have a string such as "abcde" and you want to “surround” index 2 with parentheses. So you write ''.join([string[:2], "(", string[2], ")", string[3:]]), or something similar. This is inelegant, bug-prone, and hard to read! StringContext tries to solve this problem.
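For comparison, here is a minimal sketch of the slicing approach wrapped in a helper, so that the index arithmetic lives in one place. The name surround and its parameters are illustrative assumptions; they are not part of contex.

```python
def surround(string, index, left="(", right=")"):
    """Return `string` with the character at `index` wrapped in `left`/`right`."""
    if not 0 <= index < len(string):
        raise IndexError("index out of range")
    before = string[:index]      # everything left of the target character
    focus = string[index]        # the character being wrapped
    after = string[index + 1:]   # everything right of it (note the + 1)
    return before + left + focus + right + after


print(surround("abcde", 2))  # ab(c)de
```

Even in this small helper, the three pieces before, focus and after show up naturally, and the + 1 is easy to get wrong.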

Problem 2

You have a bunch of files of the form Photo<number>.jpg. The only problem is that all the numbers are one too high, so that "Photo034.jpg" should actually be "Photo033.jpg". This is not a hard problem, or it should not be, but it doesn’t feel very good to solve it.

The almost-solutions

One attempt is to use re.sub for this. You could just do this:

>>> import re
>>> re.sub(r'([0-9]+)', lambda match: '{:0>3}'.format(int(match.group(1)) - 1), 'Photo034.jpg')
'Photo033.jpg'

But this is a fragile solution, because it can’t deal with more complicated filenames. What happens when you have filenames such as "Vacation2008Photo_034.jpg"? The pattern now matches the year as well, so 2008 gets decremented along with the photo number. So you decide to “do it right”, and you end up with this:

>>> regex = r'(?<={})([0-9]+)(?={})'.format('Vacation2008Photo_', r'\.jpg')
>>> regex
'(?<=Vacation2008Photo_)([0-9]+)(?=\\.jpg)'
>>> re.sub(regex, lambda match: '{:0>3}'.format(int(match.group(1)) - 1), 'Vacation2008Photo_034.jpg')
'Vacation2008Photo_033.jpg'

What a wonderful sight! This works, but obviously isn’t desirable.
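To make the trade-off concrete, the sketch below uses only the standard re module. It first shows how the naive pattern also rewrites the year, and then does the replacement by hand with a named group and Match.span. The helper names decrement and decrement_number are hypothetical and chosen for illustration only.

```python
import re


def decrement(number_text):
    """Decrement a zero-padded number, keeping a minimum width of 3."""
    return '{:0>3}'.format(int(number_text) - 1)


# The naive pattern matches every run of digits, so the year changes too.
naive = re.sub(r'[0-9]+', lambda m: decrement(m.group(0)),
               'Vacation2008Photo_034.jpg')
print(naive)  # Vacation2007Photo_033.jpg


def decrement_number(filename, pattern):
    """Rewrite only the text captured by the named group 'number'."""
    m = re.match(pattern, filename)
    if m is None:
        return filename  # leave non-matching names untouched
    start, end = m.span('number')
    # Manual index handling: before + new focus + after.
    return filename[:start] + decrement(m.group('number')) + filename[end:]


pattern = r'Vacation2008Photo_(?P<number>[0-9]+)\.jpg'
print(decrement_number('Vacation2008Photo_034.jpg', pattern))
# Vacation2008Photo_033.jpg
```

The second version avoids lookarounds, but it still juggles start and end indices by hand, which is the kind of bookkeeping the rest of this README is about.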

Contex to the rescue

It is my thesis that our abstractions aren’t fit for this sort of problem. The problems above “hit it where it hurts”, so to speak, because in order to be solved elegantly they require context, and in the one-dimensional world of strings this means: What came before? What came after? Which part of the string are we focusing on right now? This is exactly what StringContext is. It contains 3 parts: before, focus, and after:

>>> import contex
>>> contex.T('Hello')
StringContext('', 'Hello', '')
>>> contex.T('abcde')[2:]
StringContext('ab', 'cde', '')
>>> contex.T('abcde')[2:][0]
StringContext('', 'a', 'bcde')
>>> contex.T('abcde')[2]
StringContext('ab', 'c', 'de')
>>> view = contex.T('abcde')[2]
>>> view.before, view.focus, view.after
('ab', 'c', 'de')
>>> view.replace(lambda focus: '({})'.format(focus))
StringContext('ab', '(c)', 'de')
>>> str(view.replace(lambda focus: '({})'.format(focus)))
'ab(c)de'

As you can see, slicing shifts the focus of the string: “I want to look at this part now”. These points are true of both StringContext and MatchContext:

They are treated as immutable objects: methods that “change” stuff don’t mutate the object but return a new version.
All methods operate on the full string, not merely the focus point. So StringContext.reverse doesn’t reverse the focus only, it reverses everything; StringContext.search searches everything; you get the picture.
The 3 composite parts are normal strings. I rejected the idea of a tree of StringContext objects because it seemed too complicated, more confusing than useful.
Methods that need str arguments also accept StringContext arguments: they are converted with str automatically.
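The points above can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the contex implementation: the class name Ctx is hypothetical, it supports only indexing, contiguous slicing and replace, and it is written to reproduce the outputs shown in the session above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ctx:
    """Hypothetical stand-in for a before/focus/after view of a string."""
    before: str
    focus: str
    after: str

    def __str__(self):
        return self.before + self.focus + self.after

    def __getitem__(self, key):
        full = str(self)  # indexing applies to the full string, not the focus
        if isinstance(key, slice):
            start, stop, step = key.indices(len(full))
            if step != 1:
                raise ValueError("only contiguous slices are supported")
            stop = max(start, stop)
        else:
            start = key + len(full) if key < 0 else key
            if not 0 <= start < len(full):
                raise IndexError("index out of range")
            stop = start + 1
        return Ctx(full[:start], full[start:stop], full[stop:])

    def replace(self, func):
        # Returns a new object; the frozen dataclass cannot be mutated.
        return Ctx(self.before, str(func(self.focus)), self.after)


view = Ctx('', 'abcde', '')[2]
print(view.replace(lambda focus: '({})'.format(focus)))  # ab(c)de
```

Because __getitem__ always works on the full string, Ctx('', 'abcde', '')[2:][0] focuses on 'a' rather than 'c', matching the behaviour shown for contex.T.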

MatchContext

MatchContext is what you get when you do regular expression searches. It’s a subclass of StringContext and contains information relevant to the match/search it was created for, namely the “span” of each regex group that happened to match. A span is a (start, end) tuple of indices into the string; start and end are both referred to as points. It also contains useful methods pertaining to these regex groups, like MatchContext.group and MatchContext.expand.

Q: But what happens to the regex spans when you manipulate the string, for example with MatchContext.replace?

A: They move around in sensible ways. The details can be found in the docstring for MatchContext.replace, but the gist of it is that if focus grows in length by 3 when you replace it, then any point at the very end of focus or after it also moves forward by 3. If the point is before focus then it stays the same. If it is in the middle of focus then it might “shrink” if focus becomes too small to contain it.
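The rule described above can be sketched as a small function. This is my reading of the prose, not code taken from contex: the name shift_point is hypothetical, and treating a point exactly at the start of focus as “before” is an assumption. The docstring of MatchContext.replace remains the authority.

```python
def shift_point(point, focus_start, focus_end, new_length):
    """Return where `point` lands after the focus is replaced.

    The focus occupies [focus_start, focus_end) before the replacement
    and [focus_start, focus_start + new_length) afterwards.
    """
    if point <= focus_start:
        return point                  # before the focus: unchanged (assumed for ==)
    if point >= focus_end:
        delta = new_length - (focus_end - focus_start)
        return point + delta          # at the end or after: shifted by the growth
    # Strictly inside the focus: clamp to the new end of the focus.
    return min(point, focus_start + new_length)


# In 'Vacation2008Photo_034.jpg' the number '034' spans (18, 21).
# Replacing it with a 5-character string grows the focus by 2.
print(shift_point(21, 18, 21, 5))  # 23
print(shift_point(25, 18, 21, 5))  # 27
```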

This is how you’d use it to solve problem number 2:

>>> contex.match('Vacation2008Photo_034.jpg', r'Vacation2008Photo_(?P<number>[0-9]+)\.jpg')
MatchContext('', 'Vacation2008Photo_034.jpg', '')
>>> m = contex.match('Vacation2008Photo_034.jpg', r'Vacation2008Photo_(?P<number>[0-9]+)\.jpg')
>>> m.group("number")
MatchContext('Vacation2008Photo_', '034', '.jpg')
>>> m.group('number').replace(lambda num: '{:0>3}'.format(int(num) - 1))
MatchContext('Vacation2008Photo_', '033', '.jpg')
>>> result = m.group('number').replace(lambda num: '{:0>3}'.format(int(num) - 1))
>>> str(result)
'Vacation2008Photo_033.jpg'

The .group method is like slicing: it says “I want to look at this part of the string now”.

Conclusion

This is not to say that there’s anything wrong with strings as we use them now, or that these abstractions can serve as a replacement. It’s rather to say that in solving certain problems they make you do dirty things, like fiddling around with indices and consequently introducing off-by-one bugs. I’ve shown that contex can solve problems like those above nicely. How often can this contextual abstraction be of use? I don’t really know.

Using Contex

The contex package contains 4 functions: T(string), search(string, pattern, flags=0), match(string, pattern, flags=0) and find(string, substring). T is for bringing a string into the world of contex by converting it into a StringContext object; search and match are for regex searches; find is for normal string search. contex also contains the StringContext and MatchContext classes.
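As a rough picture of what a plain substring search with context could look like, here is a standard-library sketch. The name find_parts, the tuple return value and the None result for a missing substring are all assumptions of this sketch; this README does not say what contex.find returns.

```python
def find_parts(string, substring):
    """Split `string` into (before, focus, after) around the first match."""
    index = string.find(substring)
    if index == -1:
        return None  # assumption: signal "not found" with None
    end = index + len(substring)
    return string[:index], string[index:end], string[end:]


print(find_parts('Photo034.jpg', '034'))  # ('Photo', '034', '.jpg')
```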

Installing

Install with $ pip3 install contex. If Python 3 is the default Python on your system, you may replace pip3 with pip.

Developing

Contex is documented and tested. Run $ nosetests or $ python3 setup.py test to run the tests. The code is hosted at https://notabug.org/Uglemat/Contex

License

The library is licensed under the GNU General Public License 3 or later. This README file is public domain.
