Small library for extracting plots used in scholarly communication.



$ pip install plotextractor


To extract plots from a tarball:

>>> from plotextractor import process_tarball
>>> plots = process_tarball('./1503.07589.tar.gz')
>>> print(plots[0])
{
    'captions': [
        u'Scans of twice the negative log-likelihood ratio $-2\\ln\\Lambda(\\mH)$ as functions of the Higgs boson mass \\mH\\ for the ATLAS and CMS combination of the \\hgg\\ (red), \\hZZllll\\ (blue), and combined (black) channels. The dashed curves show the results accounting for statistical uncertainties only, with all nuisance parameters associated with systematic uncertainties fixed to their best-fit values. The 1 and 2 standard deviation limits are indicated by the intersections of the horizontal lines at 1 and 4, respectively, with the log-likelihood scan curves.',
    ],
    'label': u'figure_LHC_combined_obs',
    'name': 'LHC_combined_obs_unblind_final',
    'original_url': './1503.07589.tar.gz_files/LHC_combined_obs_unblind_final.pdf',
    'url': './1503.07589.tar.gz_files/LHC_combined_obs_unblind_final.png',
}
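
Each entry in the returned list is a dictionary with the keys shown above, so the result can be post-processed with ordinary Python. The following is a minimal sketch, not part of plotextractor: summarize_plots and max_caption_length are hypothetical names introduced for illustration, and the sample record is a shortened copy of the documented output so that the snippet runs without the package or a tarball.

```python
# Sample record shaped like the documented output (caption shortened).
# In real use the list would come from:
#     from plotextractor import process_tarball
#     plots = process_tarball('./1503.07589.tar.gz')
plots = [
    {
        'captions': ['Scans of twice the negative log-likelihood ratio ...'],
        'label': 'figure_LHC_combined_obs',
        'name': 'LHC_combined_obs_unblind_final',
        'original_url': './1503.07589.tar.gz_files/LHC_combined_obs_unblind_final.pdf',
        'url': './1503.07589.tar.gz_files/LHC_combined_obs_unblind_final.png',
    },
]


def summarize_plots(plots, max_caption_length=60):
    """Return one (name, url, caption) tuple per plot (hypothetical helper)."""
    summary = []
    for plot in plots:
        # Join multiple captions; fall back to an empty string if none exist.
        caption = ' '.join(plot.get('captions', []))
        if len(caption) > max_caption_length:
            caption = caption[:max_caption_length].rstrip() + '...'
        summary.append((plot['name'], plot['url'], caption))
    return summary


for name, url, caption in summarize_plots(plots):
    print('{0}: {1}\n    {2}'.format(name, url, caption))
```

The helper reads 'captions' with a default because this sketch does not assume every record carries a caption; the other keys are taken directly from the example output.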


If you frequently encounter DelegateError exceptions, you may need to update your version of Ghostscript.




Version 0.2.2 (2019-10-24)

  • Try to preserve the ordering of figures.

Version 0.2.1 (2018-09-07)

  • Handle UTF-8 in paths inside the tarball.

Version 0.2.0 (2018-02-07)

  • Ignore hidden/metadata files in the tarball.
  • Handle relative paths in the tarball.

Version 0.1.6 (2016-12-01)

  • Sets the mtime for all members of the tarball to current time before unpacking.

Version 0.1.5 (2016-05-25)

  • Properly raises an exception when no TeX files are found in an archive.
  • More fixes to image path extraction and more robust image handling.

Version 0.1.4 (2016-03-22)

  • Fixes linking of images from TeX references when an image is referred to without its full relative folder path.

Version 0.1.3 (2016-03-17)

  • Properly supports cases where images are located in a nested folder inside the extracted tarball's root folder.

Version 0.1.2 (2015-12-08)

  • Adds wrapfigure support.
  • Catches problems with image conversions.
  • More robust handling of image rotations in TeX sources.
  • Removes unicode_literals usage.

Version 0.1.1 (2015-12-04)

  • Improves extraction from TeX file by reading files with encoding.

Version 0.1.0 (2015-10-19)

Download files

Files for plotextractor, version 1.0.1
Filename, size: plotextractor-1.0.1.tar.gz (10.9 MB)
File type: Source
Python version: None
