A Python filter to manage acronyms

pandoc-acronyms - A Pandoc filter for managing acronyms

Acronyms? WTF?

There is a convention in careful writing to spell out an acronym in full at first use. This gets difficult when, for example, the text of a document is split over multiple files, which makes it hard for the authors to know where an acronym is first used. Such a task is best left to the computer, and that is what the pandoc-acronyms filter is for. Authors list acronyms in a data file and then reference them in the text.

HOWTO: pip install pandoc-acronyms

Write a data file that contains your acronyms:

{
	"aba": {
		"shortform": "ABA",
		"longform": "a better acronym"
	},
	"bba": {
		"shortform": "BBA",
		"longform": "beer brewing attitude"
	}
}

Then in the text, use the acronym in encoded form like [!bba]. The filter will recognize it. On first use it replaces the marker with "beer brewing attitude (BBA)". Any later use is replaced by "BBA". The filter prints a notice if the text uses an acronym that is not defined in the data file. Keys must start with a letter and may otherwise consist of letters, numbers, dashes, plus signs and underscores ("a-b", "ab", "a_b" or "c++" are valid, "++c" is not).
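The first-use behaviour and the key rule described above can be sketched in a few lines of Python. This is an illustration only, not the filter's actual code: the regular expression is an assumption derived from the key rule in the text, and the helper names is_valid_key and expand are hypothetical. Unlike the real filter, the sketch works on a plain string instead of the pandoc document tree and silently leaves undefined markers in place.

```python
import re

# Assumed key grammar, derived from the rule in the text: a letter first,
# then letters, digits, dashes, plus signs or underscores.
KEY = r"[A-Za-z][A-Za-z0-9_+\-]*"
MARKER = re.compile(r"\[!(" + KEY + r")\]")


def is_valid_key(key):
    """Return True if the key matches the assumed grammar."""
    return re.fullmatch(KEY, key) is not None


def expand(text, acronyms):
    """Replace [!key] markers: explained form on first use, short form later."""
    seen = set()

    def replace(match):
        key = match.group(1)
        entry = acronyms.get(key)
        if entry is None:
            # Undefined acronym: leave the marker untouched (illustrative choice).
            return match.group(0)
        if key in seen:
            return entry["shortform"]
        seen.add(key)
        return "{} ({})".format(entry["longform"], entry["shortform"])

    return MARKER.sub(replace, text)


acronyms = {
    "bba": {"shortform": "BBA", "longform": "beer brewing attitude"},
}
print(expand("A [!bba] helps. Keep your [!bba].", acronyms))
# prints: A beer brewing attitude (BBA) helps. Keep your BBA.
```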

The replacements will be made to the text and are independent of the selected output format. While tools such as pandoc-ac help users with generating acronym commands in LaTeX, pandoc-acronyms is designed to work directly on the document text.

Using acronyms in the input text

The most common way to write an acronym in the text is [!key]. To customize the output, the acronym specification can be made more specific:

  • [!+key] selects the plural form of the acronym.
  • [!^key] selects the uppercase form of the acronym. This only affects the long form, the abbreviated short form will not be changed.
  • [!+^key] selects the plural uppercase form. When both modifiers are combined, the plural marker must come first.

It is also possible to select which form should be inserted into the text (this can be combined with plural or uppercase selection):

  • [!key>] inserts the long form ("beer brewing attitude").
  • [!key<] inserts the short form ("BBA").
  • [!key!] inserts the explained form ("beer brewing attitude (BBA)").
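To make the marker syntax concrete, the following sketch parses a single marker into its parts. It is an assumption-laden illustration, not the filter's own parser: the regular expression, the Marker tuple and the parse_marker function are hypothetical names, and the sketch only decodes the marker without rendering plural or uppercase text.

```python
import re
from typing import NamedTuple, Optional

# Assumed grammar: optional "+" (plural), optional "^" (uppercase), the key,
# then an optional form selector: ">" long, "<" short, "!" explained.
MARKER = re.compile(
    r"\[!"
    r"(?P<plural>\+)?"
    r"(?P<upper>\^)?"
    r"(?P<key>[A-Za-z][A-Za-z0-9_+\-]*)"
    r"(?P<form>[<>!])?"
    r"\]"
)

FORMS = {">": "long", "<": "short", "!": "explained"}


class Marker(NamedTuple):
    key: str
    plural: bool
    uppercase: bool
    form: Optional[str]  # None means "decide by first use"


def parse_marker(text):
    """Parse one marker such as [!+^bba>]; return None if it is malformed."""
    match = MARKER.fullmatch(text)
    if match is None:
        return None
    return Marker(
        key=match.group("key"),
        plural=match.group("plural") is not None,
        uppercase=match.group("upper") is not None,
        form=FORMS.get(match.group("form")),
    )


print(parse_marker("[!+^bba>]"))
print(parse_marker("[!^+bba]"))  # wrong modifier order, so no match
```

Because the key must start with a letter, a marker that puts the uppercase modifier before the plural modifier does not match in this sketch, which mirrors the ordering rule stated in the list.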

Using the filter with pandoc

The filter mechanism is a built-in feature of pandoc. The filter is enabled with the --filter option when pandoc is invoked:

> pandoc --filter pandoc-acronyms document.md

Pandoc does not allow passing parameters to filters, but the acronym filter needs to load the acronyms from the data file. To work around this, parameters can be passed to the filter in environment variables that mirror its command line options:

> pandoc-acronyms --help
Usage: pandoc-acronyms [OPTIONS] [FORMAT]...

The pandoc-acronyms filter.

Options:
  -a, --acronyms TEXT           A file with acronym definitions in JSON
                                format.
  -v, --verbose / --no-verbose  Enable verbose output.
  -s, --suggest / --no-suggest  Suggest marking acronyms detected in the text.
  -e, --error / --no-error      Exit with an error if an undefined acronym is
                                used.
  -d, --debug / --no-debug      Enable debug output.
  --version                     Show the version and exit.
  --help                        Show this message and exit.

The environment variable PANDOC_ACRONYMS_ACRONYMS can be used to replace the --acronyms option. Similarly, the variable PANDOC_ACRONYMS_VERBOSE enables diagnostic output. All command line options of pandoc-acronyms can be controlled with environment variables the same way:

  • --acronyms: PANDOC_ACRONYMS_ACRONYMS
  • --verbose/--no-verbose: PANDOC_ACRONYMS_VERBOSE
  • --suggest/--no-suggest: PANDOC_ACRONYMS_SUGGEST
  • --error/--no-error: PANDOC_ACRONYMS_ERROR
  • --debug/--no-debug: PANDOC_ACRONYMS_DEBUG
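When pandoc is driven from a build script, the environment variable can be set for the pandoc process only. The sketch below shows one way to do this from Python. The function names build_invocation and run_pandoc and the file names document.md and acronyms.json are placeholders chosen for illustration; only the --filter option and the PANDOC_ACRONYMS_ACRONYMS variable come from the text above.

```python
import os
import subprocess


def build_invocation(document, acronyms_file, base_env=None):
    """Return the pandoc command and an environment that configures the filter."""
    # Copy the environment so the caller's own environment stays unchanged.
    env = dict(os.environ if base_env is None else base_env)
    env["PANDOC_ACRONYMS_ACRONYMS"] = acronyms_file
    command = ["pandoc", "--filter", "pandoc-acronyms", document]
    return command, env


def run_pandoc(document, acronyms_file):
    """Run pandoc with the filter; requires pandoc and pandoc-acronyms on PATH."""
    command, env = build_invocation(document, acronyms_file)
    return subprocess.run(command, env=env, check=True)
```

Calling run_pandoc("document.md", "acronyms.json") would then behave like the command line invocation shown earlier, with the data file supplied through the environment.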

Installation

The pandoc-acronyms program is released via the Python Package Index:

> pip install pandoc-acronyms
...

Alternatively, developers can clone the main repository and install using Python setuptools:

> python setup.py install
...

Once installed either way, the filter is available as a stand-alone program in the installation location used by Python.

Testing and debugging

The pandoc-acronyms code uses the standard Python unittest framework. Most tests are data-driven: they take regular Markdown files and JSON acronym dictionaries as input and check how the code handles them. To test the filter code as regular Python unit tests, the Markdown test input is first converted into the Pandoc "native JSON" format in memory and then fed to the filter code by the tests. This means the unit tests run stand-alone (without the need for Pandoc to invoke them as a filter), which makes the test code easy to debug.
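The idea of testing against an in-memory document can be sketched as follows. This is not taken from the project's test suite: to stay runnable without pandoc, the sketch uses a small hand-written document in pandoc's JSON layout instead of converting Markdown, the API version number is illustrative, and collect_strings is a hypothetical helper.

```python
import unittest

# Hand-written stand-in for a converted Markdown file ("A [!bba] helps.").
DOCUMENT = {
    "pandoc-api-version": [1, 22],
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "A"},
            {"t": "Space"},
            {"t": "Str", "c": "[!bba]"},
            {"t": "Space"},
            {"t": "Str", "c": "helps."},
        ]},
    ],
}


def collect_strings(node):
    """Recursively gather the text of every Str element in the document."""
    if isinstance(node, dict):
        if node.get("t") == "Str":
            return [node["c"]]
        return [s for value in node.values() for s in collect_strings(value)]
    if isinstance(node, list):
        return [s for item in node for s in collect_strings(item)]
    return []


class MarkerDetectionTest(unittest.TestCase):
    def test_marker_is_found_in_document(self):
        self.assertIn("[!bba]", collect_strings(DOCUMENT))
```

Saved to a file, a test like this runs with python -m unittest and can be stepped through in a debugger, since no pandoc process is involved.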

How to contribute

The Git repository for the pandoc acronym filter is hosted on GitLab. It uses the GitLab CI system to ensure quality, including for development branches and incoming merge requests. Deployment to PyPI is automated. Development branches and merge requests are deployed to the PyPI test instance as development packages. Commits to master are deployed to regular PyPI as development packages. Tagged versions on master are deployed to PyPI as stable releases.

To contribute, please submit a merge request. Your merge request should maintain or increase the test coverage.
