
**wikify** your texts!
*micro-framework for text wikification*

goals - avoid conflicts between text modification rules,
and be easy to extend and debug

**author**: anatoly techtonik <>
**license**: Public Domain


#### the problem and solution

this example is pasted from real-world replacement rules of the
Roundup issue tracker:

>>> import re
>>> rules = [
...     # link to debian bug tracker
...     (re.compile(r'debian:\#(?P<id>\d+)'),
...      r'<a href="http://bugs.debian.org/\g<id>">debian#\g<id></a>'),
...
...     # link to local issue
...     (re.compile(r'\#(?P<id>\d+)'),
...      r'<a href="issue\g<id>">#\g<id></a>'),
... ]
>>> text = "debian:#222"
>>> for search, replace in rules:
...     text = search.sub(replace, text)
>>> text
'<a href="http://bugs.debian.org/222">debian<a href="issue222">#222</a></a>'

expected output is:

'<a href="http://bugs.debian.org/222">debian#222</a>'

the solution:

>>> import wikify
>>> wrules = [wikify.RegexpRule(s, r) for s, r in rules]
>>> wikify.wikify("debian:#222", wrules)
'<a href="http://bugs.debian.org/222">debian#222</a>'

#### usage

1. define rules that match and process parts of text
2. text = wikify(text, rules)

`rule` is a function (or an object with a run() method) that takes
text and returns either `None` (meaning no match) or the text split
into three parts [ not-matched, processed, the-rest ]. the `processed`
part of the text is returned already modified by the rule.

example of a rule in action:

>>> import wikify
>>> wikify.rule_link_wikify('wikify your texts!')
('', '<a href="">wikify</a>', ' your texts!')

and its source code:

def rule_link_wikify(text):
    """ replace `wikify` text with a link to repository """
    if 'wikify' not in text:
        return None
    res = text.split('wikify', 1)
    site = ''
    url = '<a href="%s">wikify</a>' % site
    return (res[0], url, res[1])

using the rule with wikify to get processed text:

>>> from wikify import wikify, rule_link_wikify
>>> wikify('wikify your texts!', rule_link_wikify)
'<a href="">wikify</a> your texts!'

you probably want to change the url and the searched string, so to
avoid rewriting the rule from scratch, **wikify** provides some
ready-made rules.

#### API

###### RegexpRule(search, replace=r'\0')
wikify rule class. `search` is a regexp, `replace` can be a string
with backreferences (like \0, \1 etc.) or a callable that receives
a match object and returns a replacement string.

r = RegexpRule('(\d+)', '[\\1]')
print(wikify('wrap list 1 2 3 45', r))
# wrap list [1] [2] [3] [45]

in comparison to standard `re.sub`, RegexpRule expands \0 in
replacement template to the whole matched string.
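
for comparison, plain `re.sub` has no `\0` backreference at all; the
whole match is spelled `\g<0>` there:

```python
import re

# standard library equivalent: re.sub spells the whole match as \g<0>,
# while RegexpRule also accepts \0 for it
text = re.sub(r'\d+', r'[\g<0>]', 'wrap list 1 2 3 45')
print(text)
# wrap list [1] [2] [3] [45]
```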

###### tracker_link_rule(url)
chained function rule (a function that returns a list of rules) that
replaces references like `#123` or `issue #123` with a link to `url`
with the issue number appended.

w = tracker_link_rule('')
print(wikify('issue #123, &#8121;', w))
# <a href="">issue #123</a>, &#8121;
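
the matching behavior can be approximated with plain `re` (a
standalone sketch, not the wikify implementation; the base url is a
made-up placeholder):

```python
import re

# standalone sketch of what a tracker link rule matches (case
# insensitive), using plain re; 'http://example.com/issue' is a
# made-up placeholder, not a real tracker url
base = 'http://example.com/issue'
issue_ref = re.compile(r'(issue\s*)?#(?P<id>\d+)', re.IGNORECASE)
linked = issue_ref.sub(r'<a href="%s\g<id>">\g<0></a>' % base,
                       'see Issue #123 and #45')
print(linked)
# see <a href="http://example.com/issue123">Issue #123</a> and <a href="http://example.com/issue45">#45</a>
```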

###### wikify(text, rules)
`rules` argument can be a single rule or a list of rules. **wikify**
ensures that text processed by one rule is not reachable by the others.
if you process text without **wikify**, with just a series of
replacement commands, a later replacement may affect the text just
pasted by a previous one. **wikify** was made to prevent this from
happening.

#### using as a Sphinx extension
**wikify** is also a Sphinx extension. the following lines, if added
to `conf.py`, will link issue numbers on the `changes` page to the
bugtracker for the `sphinx` project:

extensions = ['wikify']

# setup wikify extension to convert issue references to links
from wikify import RegexpRule, tracker_link_rule
wikify_html_rules = [
    # PR#123 or pull request #123
    RegexpRule('(PR|pull request\s)\s*#(\d+)',
               '<a href="\\2">\\0</a>'),
    # issue #123 or just #123
    tracker_link_rule(''),
]
wikify_html_pages = ['changes']

#### operation (flat algorithm)
for each region
- find region in processed text
- process text matched by region
- exclude processed text from further processing

note: (flat algorithm) doesn't process nested markup,
such as:

*`bold preformatted text`*

example - replace all wiki:something with HTML links

- [x] wrap text into list with single item
- [x] split text into three parts using regexp `wiki:\w+`
- [x] copy 1st part (not-matched) into the resulting list
- [x] replace matched part with link, insert (processed)
into the resulting list
- [ ] process (the-rest) until text list doesn't change
- [x] repeat the above for the rest of rules, skipping
(processed) parts
- [x] reassemble text from the list
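
the steps above can be sketched in plain python (a minimal
illustration of the flat algorithm, not the actual wikify source):

```python
import re

def flat_wikify(text, rules):
    # minimal sketch of the flat algorithm: keep a list of
    # (chunk, processed) pairs; chunks marked processed are
    # excluded from all later rules
    chunks = [(text, False)]
    for search, replace in rules:
        result = []
        for chunk, done in chunks:
            if done:
                result.append((chunk, True))
                continue
            pos = 0
            for m in search.finditer(chunk):
                result.append((chunk[pos:m.start()], False))  # not-matched
                result.append((m.expand(replace), True))      # processed
                pos = m.end()
            result.append((chunk[pos:], False))               # the-rest
        chunks = result
    return ''.join(chunk for chunk, _ in chunks)

rules = [
    (re.compile(r'debian:\#(?P<id>\d+)'), r'debian#\g<id>-linked'),
    (re.compile(r'\#(?P<id>\d+)'), r'#\g<id>-local'),
]
print(flat_wikify('debian:#222 and #333', rules))
# debian#222-linked and #333-local
```

note how the second rule rewrites `#333` but never reaches the `#222`
already pasted by the first rule.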

#### roadmap
- [ ] optimize - measure performance of using indexes
instead of text chunks
- [x] write docs
- [x] upload to PyPI

#### history
- 1.5 - fixed major flaw in subst order for single rule
- 1.4 - support named group replacements in RegexpRule
- 1.3 - renamed create_tracker_link_rule to tracker_link_rule
- 1.2 - converted create_regexp_rule to RegexpRule class
- 1.1 - allow rules to be classes (necessary for Sphinx)
- 1.0 - use wikify as Sphinx extension

- 0.9 - case insensitive match in tracker link rule
- 0.8 - python 3 compatibility
- 0.7 - fixed major flaw in text replacements mapping
- 0.6 - flatten nested rule lists
- 0.5 - helper to build rules to link tracker references
- 0.4 - accept single rule in wikify in addition to list
- 0.3 - allow callables in replacements for regexp rules
- 0.2 - helper to build regexp based rules
- 0.1 - proof of concept, production ready, no API sugar and optimizations
