degrotesque - a tiny web type setter

The script loads an HTML page, or several in batch, one after the other, and in each replaces some commonly used non-typographic characters, such as ", ', and -, with their typographic counterparts to improve the page's appearance.

E.g.:

"Well - that's not what I had expected."

will become:

“Well — that's not what I had expected.”

(Uhm, uhm, for those who don't see it: the starting and ending quotes have been replaced by “ and ”, respectively, the ' by an apostrophe entity, and the - by a —.)

The script has the following options:

  • --input/-i: The file or the folder to process
  • --recursive/-r: Process the folder recursively, if a folder is given
  • --no-backup/-B: Do not generate backup files
  • --actions/-a: The names of the actions to apply
  • --extensions/-e: The extensions of the files to process
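
For readers who want to see these options as code, here is a minimal argparse sketch in Python 3. Only the flag names come from the list above. The `build_parser` helper, the defaults, and the comma-separated form of the `--actions` and `--extensions` values are assumptions made for illustration, not degrotesque's actual implementation.

```python
import argparse


def build_parser():
    """Return a parser with the five options listed above (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="a tiny web type setter")
    parser.add_argument("-i", "--input", required=True,
                        help="the file or the folder to process")
    parser.add_argument("-r", "--recursive", action="store_true",
                        help="process the folder recursively")
    parser.add_argument("-B", "--no-backup", action="store_true",
                        help="do not generate backup files")
    # Assumption: several values are given as one comma-separated string.
    parser.add_argument("-a", "--actions", default=None,
                        help="names of the actions to apply")
    parser.add_argument("-e", "--extensions", default=None,
                        help="extensions of the files to process")
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["-i", "site", "-r", "-a", "quotes.english,dashes"])
print(args.input, args.recursive, args.no_backup, args.actions.split(","))
```

Leaving `--actions` and `--extensions` at `None` lets the caller fall back to the defaults described in this document.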

The replacements are applied to all of the text, meaning everything that is not between a < and a >. And yes, the script is smart enough to skip the contents of the elements "pre", "style", "script", "code", and "<?".

The default actions are: quotes.english, dashes, ellipsis, math, apostroph. All implemented actions are listed below, as are the default extensions of the files that are processed if a folder is given.

There are some caveats, yes:

  • If you embed HTML code in HTML (not supported by HTML, but who cares), it may result in odd behaviour.
  • If you have PHP pages and combine PHP-generated and plain HTML text, it may result in odd behaviour.

There may be other such cases, so you should check your pages for correctness after applying degrotesque.

degrotesque is licensed under the GPL v3.0.

Well, have fun. If you have any questions or comments, let me know.

Named Actions

The following action sets are currently implemented.

Action Name     From Opening String  From Closing String  To Opening String  To Closing String
quotes.english  " '"                 "'"                  " ‘"               "’"
                """                  """                  "“"                "”"
quotes.french   "<"                  ">"                  "‹"                "›"
                "<<"                 ">>"                 "«"                "»"
quotes.german   " '"                 "'"                  " ‚"               "’"
                """                  """                  "„"                "”"
to_quotes       " '"                 "'"                  " <q>"             "</q>"
                """                  """                  "<q>"              "</q>"
                "<<"                 ">>"                 "<q>"              "</q>"
commercial      "(c)"                                     "©"
                "(C)"                                     "©"
                "(r)"                                     "®"
                "(R)"                                     "®"
                "(tm)"                                    "™"
                "(TM)"                                    "™"
dashes          " - "                                     "—"
bullets         "*"                                       "•"
ellipsis        "..."                                     "…"
apostrophe      "'"                                       "'"
math            "+/-"                                     "±"
                "1/2"                                     "½"
                "1/4"                                     "¼"
                "~"                                       "≈"
                "!="                                      "≠"
                "<="                                      "≤"
                ">="                                      "≥"
dagger          "**"                                      "‡"
                "*"                                       "†"
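
To make the table concrete, here is a small Python 3 sketch of how such action sets could be stored and applied. The rules are transcribed from a few rows of the table above; the `ACTIONS` dictionary, `replace_pair`, and `apply_actions` are hypothetical names, and the pairing strategy (first opening string, then the next closing string) is an assumption rather than a description of what degrotesque does internally.

```python
# A 2-tuple is a plain replacement; a 4-tuple is
# (from opening, from closing, to opening, to closing).
ACTIONS = {
    "quotes.english": [(" '", "'", " ‘", "’"),
                       ('"', '"', "“", "”")],
    "commercial": [("(c)", "©"), ("(C)", "©"), ("(r)", "®"),
                   ("(R)", "®"), ("(tm)", "™"), ("(TM)", "™")],
    "ellipsis": [("...", "…")],
    "math": [("+/-", "±"), ("1/2", "½"), ("1/4", "¼"), ("~", "≈"),
             ("!=", "≠"), ("<=", "≤"), (">=", "≥")],
}


def replace_pair(text, opening, closing, new_opening, new_closing):
    """Replace each opening string and the next closing string after it."""
    out = []
    pos = 0
    while True:
        start = text.find(opening, pos)
        if start < 0:
            break
        end = text.find(closing, start + len(opening))
        if end < 0:
            break  # an unmatched opening string is left unchanged
        out.append(text[pos:start])
        out.append(new_opening)
        out.append(text[start + len(opening):end])
        out.append(new_closing)
        pos = end + len(closing)
    out.append(text[pos:])
    return "".join(out)


def apply_actions(text, names):
    """Apply the named action sets to a piece of plain text, in order."""
    for name in names:
        for rule in ACTIONS[name]:
            if len(rule) == 2:
                text = text.replace(rule[0], rule[1])
            else:
                text = replace_pair(text, *rule)
    return text


sample = apply_actions('"Wait..." (c) 2020, +/- 1/2',
                       ["quotes.english", "commercial", "ellipsis", "math"])
print(sample)
```

The order of the action names matters in this sketch, since each action set works on the output of the previous one.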

Default Extensions

Files with the following extensions are processed by default:

  • html, htm, xhtml,
  • php, phtml, phtm, php2, php3, php4, php5,
  • asp,
  • jsp, jspx,
  • shtml, shtm, sht, stm,
  • vbhtml,
  • ppthtml,
  • ssp, jhtml
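
As an illustration of how a folder could be filtered by these extensions, here is a short Python 3 sketch. The extension list is copied from above; the `collect_files` function and its parameters are hypothetical, and the behaviour for a single file (returned regardless of its extension) is an assumption based on the statement that the extensions apply when a folder is given.

```python
import os

DEFAULT_EXTENSIONS = [
    "html", "htm", "xhtml",
    "php", "phtml", "phtm", "php2", "php3", "php4", "php5",
    "asp", "jsp", "jspx",
    "shtml", "shtm", "sht", "stm",
    "vbhtml", "ppthtml", "ssp", "jhtml",
]


def collect_files(path, extensions=DEFAULT_EXTENSIONS, recursive=False):
    """Return the files to process for a given file or folder path."""
    if os.path.isfile(path):
        return [path]  # assumption: a single file is taken as given
    wanted = {"." + ext.lower() for ext in extensions}
    found = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            # Compare the extension case-insensitively.
            if os.path.splitext(name)[1].lower() in wanted:
                found.append(os.path.join(root, name))
        if not recursive:
            break  # os.walk yields the top folder first; stop after it
    return sorted(found)


if __name__ == "__main__":
    print(collect_files(".", recursive=True))
```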

Notes

  • I tried Genshi, BeautifulSoup, and lxml. None of them kept the code unchanged, so the parser just skips HTML elements and the contents of some special elements (see above). This works in most cases.

