Thin wrapper for pandoc.
Project description
pypandoc provides a thin wrapper for pandoc, a universal document converter.
Installation
Install pandoc
Ubuntu/Debian: sudo apt-get install pandoc
Fedora/Red Hat: sudo yum install pandoc
Mac OS X with Homebrew: brew install pandoc
Machine with Haskell: cabal install pandoc
Windows: An installer is available from the pandoc download page
pip install pypandoc
To use pandoc filters, you must have the relevant filter installed on your machine
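Before converting, it can help to confirm that pandoc and any filter you plan to use are actually on the PATH. The following is a minimal sketch using only the standard library; the helper name missing_executables is hypothetical and not part of pypandoc:

```python
import shutil


def missing_executables(names):
    """Return the names from `names` that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


# Hypothetical usage: check pandoc plus one filter before converting.
missing = missing_executables(["pandoc", "pandoc-citeproc"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
```

The check only looks for an executable with the given name; it does not verify that the program found is a working pandoc or filter.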
Usage
The basic invocation looks like this: pypandoc.convert('input', 'output format'). pypandoc tries to infer the type of the input automatically. If it’s a file, it will load it. If you pass a string instead, you can define its format using the format parameter. The example below should clarify the usage:
import pypandoc
output = pypandoc.convert('somefile.md', 'rst')
# alternatively you could just pass some string to it and define its format
output = pypandoc.convert('#some title', 'rst', format='md')
# output == 'some title\r\n==========\r\n\r\n'
If you pass in a string (and not a filename), convert expects this string to be unicode or utf-8 encoded bytes. convert will always return a unicode string.
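If your input arrives as bytes in some other encoding, one option is to decode it yourself so that convert only ever sees unicode text. The following is a small sketch of that idea; to_unicode is a hypothetical helper, not a pypandoc function:

```python
def to_unicode(source, encoding="utf-8"):
    """Decode bytes to text; pass text through unchanged."""
    if isinstance(source, bytes):
        return source.decode(encoding)
    return source


# Bytes that were written as Latin-1 rather than UTF-8.
raw = "# Caf\u00e9".encode("latin-1")
text = to_unicode(raw, encoding="latin-1")

# Hypothetical usage with pypandoc installed:
#   import pypandoc
#   output = pypandoc.convert(text, 'rst', format='md')
```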
It’s also possible to directly let pandoc write the output to a file. This is the only way to convert to some output formats (e.g. odt, docx, epub, epub3). In that case convert() will return an empty string.
import pypandoc
output = pypandoc.convert('somefile.md', 'docx', outputfile="somefile.docx")
assert output == ""
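Since some formats can only be written to a file, a caller may want to derive an output path automatically. The sketch below does that for the formats named above; the mapping covers only those examples and is not an exhaustive list, and default_outputfile is a hypothetical helper:

```python
import os

# Only the file-only formats mentioned in the text; an assumption, not a complete list.
FILE_ONLY_EXTENSIONS = {
    "odt": ".odt",
    "docx": ".docx",
    "epub": ".epub",
    "epub3": ".epub",
}


def default_outputfile(source_path, to):
    """Return an output path next to the source for file-only formats, else None."""
    extension = FILE_ONLY_EXTENSIONS.get(to)
    if extension is None:
        return None
    root, _ = os.path.splitext(source_path)
    return root + extension


print(default_outputfile("somefile.md", "docx"))  # prints somefile.docx

# Hypothetical usage with pypandoc installed:
#   import pypandoc
#   pypandoc.convert('somefile.md', 'docx',
#                    outputfile=default_outputfile('somefile.md', 'docx'))
```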
In addition to format, it is possible to pass extra_args. That makes it possible to access various pandoc options easily.
output = pypandoc.convert(
'<h1>Primary Heading</h1>',
'md', format='html',
extra_args=['--atx-headers'])
# output == '# Primary Heading\r\n'
output = pypandoc.convert(
'# Primary Heading',
'html', format='md',
extra_args=['--base-header-level=2'])
# output == '<h2 id="primary-heading">Primary Heading</h2>\r\n'
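When several options are involved, building the extra_args list programmatically can be less error-prone than typing flags by hand. The following is a sketch under the assumption that each option maps to a --name or --name=value flag; build_extra_args is a hypothetical helper and does not check that pandoc actually knows the flag:

```python
def build_extra_args(**options):
    """Translate keyword options into pandoc-style command-line flags.

    True becomes a bare flag, False and None are skipped, and any other
    value is rendered as --name=value.
    """
    args = []
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            args.append(flag)
        elif value is not None and value is not False:
            args.append("{}={}".format(flag, value))
    return args


print(build_extra_args(atx_headers=True, base_header_level=2))
# prints ['--atx-headers', '--base-header-level=2']
```

The resulting list can be passed as extra_args in the calls shown above.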
pypandoc now supports easy addition of pandoc filters.
import pypandoc
filename = 'somefile.md'  # placeholder path to the input file
filters = ['pandoc-citeproc']
pdoc_args = ['--mathjax',
             '--smart']
output = pypandoc.convert(source=filename,
                          to='html5',
                          format='md',
                          extra_args=pdoc_args,
                          filters=filters)
Please pass any filters in as a list and not a string.
Please refer to pandoc -h and the official documentation for further details.
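A bare string is easy to pass by mistake where a list of filters is expected, and a string is itself iterable one character at a time. The following is a sketch of a guard you could apply before calling convert; as_filter_list is a hypothetical helper, not part of pypandoc:

```python
def as_filter_list(filters):
    """Wrap a single filter name in a list; copy any other iterable into a list."""
    if filters is None:
        return []
    if isinstance(filters, str):
        return [filters]
    return list(filters)


print(as_filter_list("pandoc-citeproc"))  # prints ['pandoc-citeproc']
```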
Getting Pandoc Version
Because it is sometimes useful to check which pandoc version is available on your system, pypandoc provides a utility for this. Example:
version = pypandoc.get_pandoc_version()
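To act on the version, for example to require a minimum pandoc release, compare it numerically rather than as a string (as strings, '1.9' sorts after '1.13'). The sketch below assumes the returned version is a dotted string of integers such as '1.13.2'; version_tuple is a hypothetical helper:

```python
def version_tuple(version):
    """Turn a dotted version string such as '1.13.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


print(version_tuple("1.13.2") >= (1, 13))  # prints True

# Hypothetical usage with pypandoc installed:
#   import pypandoc
#   if version_tuple(pypandoc.get_pandoc_version()) < (1, 13):
#       raise RuntimeError("pandoc 1.13 or newer is required")
```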
Contributing
Contributions are welcome. When opening a PR, please keep the following guidelines in mind:
Before implementing, please open an issue for discussion.
Make sure you have tests for the new logic.
Make sure your code passes flake8 pypandoc.py tests.py
Add yourself to the contributors list in README.md unless you are already there; in that case, update the description of your contributions.
Note that for citeproc tests to pass you’ll need to have pandoc-citeproc installed.
IMPORTANT! The Travis build is currently a bit broken. If you have any idea how to debug it, please see #55.
Contributors
Valentin Haenel - String conversion fix
Daniel Sanchez - Automatic parsing of input/output formats
Thomas G. - Python 3 support
Ben Jao Ming - Fail gracefully if pandoc is missing
Ross Crawford-d’Heureuse - Encode input in UTF-8 and add Django example
Michael Chow - Decode output in UTF-8
Janusz Skonieczny - Support Windows newlines and allow encoding to be specified
gabeos - Fix help parsing
Marc Abramowitz - Make setup.py fail hard if pandoc is missing, Travis, Dockerfile, PyPI badge, Tox, PEP-8, improved documentation
Daniel L. - Add extra_args example to README
Amy Guy - Exception handling for unicode errors
Florian Eßer - Allow Markdown extensions in output format
Philipp Wendler - Allow Markdown extensions in input format
Jan Schulz - Handling output to a file, Travis to work on newer version of Pandoc, return code checking, get_pandoc_version
Aaron Gonzales - Added better filter handling
David Lukes - Enabled input from non-plain-text files and made sure tests clean up template files correctly if they fail
License
pypandoc is available under the MIT license. See LICENSE for more details.
Download files
Source Distribution
File details
Details for the file pypandoc-0.9.9.tar.gz.
File metadata
- Download URL: pypandoc-0.9.9.tar.gz
- Upload date:
- Size: 8.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 82b14ae04bab3e8db4c49d47ccc2a63f3a0fb9369b3671d8faedf219fb297095
MD5 | cd9ba12f1e4540d94e2c328e79570e96
BLAKE2b-256 | 1991b0920d87fe3975dd7de3fbc165596127176a2aa333dc1b05dde1242c80be