latexcodec

A lexer and codec to work with LaTeX code in Python.

Project description

Instead of using latexcodec, I encourage you to consider pylatexenc, which is far superior: https://github.com/phfaist/pylatexenc
Download: http://pypi.python.org/pypi/latexcodec/#downloads
Documentation: http://latexcodec.readthedocs.org/
Development: http://github.com/mcmtroffaes/latexcodec/

The codec provides a convenient way of going between text written in LaTeX and unicode. Since it is not a LaTeX compiler, it is more appropriate for short chunks of text, such as a paragraph or the values of a BibTeX entry, and it is not appropriate for a full LaTeX document. In particular, its behavior on the LaTeX commands that do not simply select characters is intended to allow the unicode representation to be understandable by a human reader, but is not canonical and may require hand tuning to produce the desired effect.
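
As a rough usage sketch: importing the package registers a codec named "latex", so conversion goes through the ordinary `bytes.decode` and `str.encode` methods. The wrapper function `latexcodec_demo` and the sample strings are illustrative choices, and the import is deferred into the function only so that the snippet still loads where the package is not installed.

```python
def latexcodec_demo() -> dict:
    """Run one decode and one encode; requires `pip install latexcodec`."""
    # Importing the package is what registers the "latex" codec.
    import latexcodec  # noqa: F401

    return {
        # LaTeX source (bytes) -> unicode text
        "decoded": b"\\'el\\`eve".decode("latex"),
        # unicode text -> LaTeX source (bytes)
        "encoded": "\u00e5ngstr\u00f6m".encode("latex"),
    }


if __name__ == "__main__":
    for direction, value in latexcodec_demo().items():
        print(direction, repr(value))
```

Note that the encoded form is a `bytes` object, since the default LaTeX input encoding is ascii.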

The encoder makes a best effort to replace unicode characters outside of the range used as LaTeX input (ascii by default) with a LaTeX command that selects the character. More technically, the unicode code point is replaced by a LaTeX command that selects a glyph that reasonably represents the code point. Unicode characters with special uses in LaTeX are replaced by their LaTeX equivalents. For example,

original text	encoded LaTeX
¥	\yen
ü	\"u
\N{NO-BREAK SPACE}	~
~	\textasciitilde
%	\%
#	\#
\textbf{x}	\textbf{x}
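
The mapping in the table above can be pictured as a per-character lookup. The following is a toy sketch only, not the library's implementation: `ENCODE_TABLE` and `encode_sketch` are hypothetical names, the table holds just the rows shown above, and the rule of adding a space after a control word that is followed by a letter is an assumption made so that the command name does not swallow the following text.

```python
# Hypothetical, deliberately tiny table built from the rows listed above.
ENCODE_TABLE = {
    "\u00a5": "\\yen",          # YEN SIGN
    "\u00fc": '\\"u',           # LATIN SMALL LETTER U WITH DIAERESIS
    "\u00a0": "~",              # NO-BREAK SPACE
    "~": "\\textasciitilde",
    "%": "\\%",
    "#": "\\#",
}


def encode_sketch(text: str) -> str:
    """Replace each character found in ENCODE_TABLE; keep everything else."""
    out = []
    for i, ch in enumerate(text):
        repl = ENCODE_TABLE.get(ch, ch)
        out.append(repl)
        next_ch = text[i + 1 : i + 2]
        # A control word (backslash followed only by letters) would absorb
        # a directly following letter, so separate the two with a space.
        is_control_word = repl.startswith("\\") and repl[1:].isalpha()
        if is_control_word and next_ch.isalpha():
            out.append(" ")
    return "".join(out)


if __name__ == "__main__":
    print(encode_sketch("100% of \u00a55"))
    print(encode_sketch("\\textbf{x}"))  # backslashes and braces pass through
```

A real encoder needs a far larger table and more care around spacing, which is what the codec provides.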

The decoder makes a best effort to replace LaTeX commands that select characters with the unicode for the character they are selecting. For example,

original LaTeX	decoded unicode
\yen	¥
\"u	ü
~	\N{NO-BREAK SPACE}
\textasciitilde	~
\%	%
\#	#
\textbf{x}	\textbf {x}
#	#

In addition, comments are dropped (including the final newline that marks the end of a comment), paragraphs are canonicalized into double newlines, and other newlines are left as is. Spacing after LaTeX commands is also canonicalized.

For example,

hi % bye
there\par world
\textbf     {awesome}

is decoded as

hi there

world
\textbf {awesome}
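
The three normalizations in this example (dropping comments, turning `\par` into a paragraph break, and fixing the spacing after commands) can be approximated with regular expressions. This is a minimal sketch under simplifying assumptions, not the library's lexer: `normalize_sketch` is a hypothetical helper, it treats any `%` not preceded by a backslash as a comment start, and it always leaves exactly one space after a control word, in line with the `\textbf{x}` row of the decoder table.

```python
import re


def normalize_sketch(latex: str) -> str:
    """Approximate the decoder's whitespace and comment handling."""
    # 1. Drop comments: an unescaped % up to and including the newline.
    text = re.sub(r"(?<!\\)%[^\n]*\n?", "", latex)
    # 2. Turn \par (and any blanks right after it) into a double newline.
    text = re.sub(r"\\par(?![A-Za-z])[ \t]*", "\n\n", text)
    # 3. Leave exactly one space after every remaining control word.
    text = re.sub(r"(\\[A-Za-z]+)[ \t]*", r"\1 ", text)
    return text


if __name__ == "__main__":
    sample = "hi % bye\nthere\\par world\n\\textbf     {awesome}"
    print(normalize_sketch(sample))
```

Applied to the input above, the sketch reproduces the decoded text shown; inputs such as `\\%` (an escaped backslash followed by a comment) would need a proper tokenizer.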

When decoding, LaTeX commands not directly selecting characters (for example, macros and formatting commands) are passed through unchanged. The same happens for LaTeX commands that select characters but are not yet recognized by the codec. Either case can result in a hybrid unicode string in which some characters are understood as literally the character and others as parts of unexpanded commands. Consequently, at times, backslashes will be left intact for denoting the start of a potentially unrecognized control sequence.
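
To make the pass-through behaviour concrete, here is a toy decoder, again only a sketch and not the library's implementation. `DECODE_TABLE`, `TOKEN` and `decode_sketch` are hypothetical names, the table contains only the rows from the decoder table above, and the sketch skips the spacing canonicalization described earlier. Tokens found in the table are replaced; every other token, including unknown control sequences, is copied unchanged.

```python
import re

# Hypothetical table holding only the commands from the decoder table above.
DECODE_TABLE = {
    "\\yen": "\u00a5",
    '\\"u': "\u00fc",
    "~": "\u00a0",
    "\\textasciitilde": "~",
    "\\%": "%",
    "\\#": "#",
}

# A token is a control word, a \" accent with its letter, another
# backslash-escaped character, or any single character.
TOKEN = re.compile(r'\\[A-Za-z]+|\\"[A-Za-z]|\\.|.', re.DOTALL)


def decode_sketch(latex: str) -> str:
    """Replace known tokens; pass unknown ones through untouched."""
    return "".join(DECODE_TABLE.get(tok, tok) for tok in TOKEN.findall(latex))


if __name__ == "__main__":
    # The result mixes decoded characters with an unexpanded command.
    print(decode_sketch('\\"uber \\textbf{x} costs 5\\%'))
```

The output of the last call is such a hybrid string: `\"u` and `\%` are decoded, while `\textbf` keeps its backslash.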

Given the numerous and changing packages providing such LaTeX commands, the codec will never be complete, and new translations of unrecognized unicode or unrecognized LaTeX symbols are always welcome.

Release history

Version	Release date
3.0.1 (this version)	Jun 17, 2025
3.0.0	Mar 6, 2024
2.0.1	Jun 23, 2020
2.0.0	Jan 14, 2020
1.0.7	May 3, 2019
1.0.6	Jan 18, 2019
1.0.5	Jun 16, 2017
1.0.4	Sep 21, 2016
1.0.3	Mar 26, 2016
1.0.2	Mar 1, 2016
1.0.1	Sep 24, 2014
1.0.0	Aug 5, 2014
0.3.2	Apr 17, 2014
0.3.1	Feb 5, 2014
0.3.0	Aug 19, 2013
0.2	Sep 28, 2012
0.1	May 26, 2011

Download files

Source Distribution

latexcodec-3.0.1.tar.gz (31.2 kB)

Uploaded Jun 17, 2025 Source

Built Distribution

latexcodec-3.0.1-py3-none-any.whl (18.5 kB)

Uploaded Jun 17, 2025 Python 3

File details

Details for the file latexcodec-3.0.1.tar.gz.

File metadata

Download URL: latexcodec-3.0.1.tar.gz
Upload date: Jun 17, 2025
Size: 31.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.0rc1

File hashes

Hashes for latexcodec-3.0.1.tar.gz
Algorithm	Hash digest
SHA256	`e78a6911cd72f9dec35031c6ec23584de6842bfbc4610a9678868d14cdfb0357`
MD5	`367fc595b5bd808721d858947e3b44d1`
BLAKE2b-256	`27dd4270b2c5e2ee49316c3859e62293bd2ea8e382339d63ab7bbe9f39c0ec3b`

File details

Details for the file latexcodec-3.0.1-py3-none-any.whl.

File metadata

Download URL: latexcodec-3.0.1-py3-none-any.whl
Upload date: Jun 17, 2025
Size: 18.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.0rc1

File hashes

Hashes for latexcodec-3.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a9eb8200bff693f0437a69581f7579eb6bca25c4193515c09900ce76451e452e`
MD5	`7f11aadb2f5f0ccb5c978bf9ab22a2d7`
BLAKE2b-256	`b54023569737873cc9637fd488606347e9dd92b9fa37ba4fcda1f98ee5219a97`
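
A downloaded file can be checked against the SHA256 digests listed above using only the standard library. This is a minimal sketch: `sha256_of` and `matches_digest` are hypothetical helper names, and the file path in the usage comment assumes the wheel was saved to the current directory.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected value, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the wheel is in the current directory:
# matches_digest(
#     "latexcodec-3.0.1-py3-none-any.whl",
#     "a9eb8200bff693f0437a69581f7579eb6bca25c4193515c09900ce76451e452e",
# )
```

Reading in chunks keeps memory use constant regardless of file size, which matters little for an 18.5 kB wheel but makes the helper reusable.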
