
Extract raw data from plot images

A Python 3 command-line utility to digitize plots in batch mode.

This utility is useful when you have many similar plots, such as EEG or ECG recordings. See the examples below.

For one-off use cases, you will find the web-based digitizer WebPlotDigitizer by Ankit Rohatgi much easier to use.

Installation

$ python3 -m pip install plotdigitizer
$ plotdigitizer --help

Preparing image

Crop the image so that only the axes and trajectories remain. I use the gthumb utility on Linux. You can also use ImageMagick or GIMP.

The following image is from MacFadden and Koshland, PNAS 1990, after trimming. You can also remove the top and right axes.

Trimmed image

Run

plotdigitizer ./figures/trimmed.png -p 0,0 -p 10,0 -p 0,1

We need at least three points (the -p option) to map the axes onto the image. In the example above, these are 0,0 (where the x-axis and y-axis intersect), 10,0 (a point on the x-axis), and 0,1 (a point on the y-axis). To map these points onto the image, you will be asked to click on each of them in the image. Click in the same order as the -p options, and as precisely as you can: any error in this step propagates to the extracted data. If your image does not contain 0,0, you have to provide four points: two on the x-axis and two on the y-axis.
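To illustrate why three non-collinear reference points are enough, here is a minimal sketch of how three (data point, pixel location) pairs determine a mapping from pixel coordinates to data coordinates. It is not taken from plotdigitizer's source: the function names `make_pixel_to_data` and `_solve3` are hypothetical, and the affine model is an assumption made for illustration. The numbers are borrowed from the batch-mode example in this document.

```python
def _solve3(m, v):
    """Solve the 3x3 linear system m @ u = v with Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("reference points must not lie on one line")
    solution = []
    for col in range(3):
        # Replace one column with the right-hand side, as Cramer's rule requires.
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = v[r]
        solution.append(det(mc) / d)
    return solution


def make_pixel_to_data(data_pts, pixel_pts):
    """Build an affine pixel -> data mapping from three point pairs.

    Hypothetical helper for illustration only; it assumes the mapping is
    affine (scale, offset, and possibly a slight rotation).
    """
    m = [[float(px), float(py), 1.0] for px, py in pixel_pts]
    ax = _solve3(m, [float(x) for x, _ in data_pts])
    ay = _solve3(m, [float(y) for _, y in data_pts])

    def to_data(px, py):
        return (ax[0] * px + ax[1] * py + ax[2],
                ay[0] * px + ay[1] * py + ay[2])

    return to_data


# Values passed to -p (data points) and -l (pixel locations) in the batch example.
to_data = make_pixel_to_data(
    data_pts=[(0, 0), (20, 0), (0, 1)],
    pixel_pts=[(22, 295), (142, 295), (22, 215)],
)
print(to_data(82, 255))  # approximately (10.0, 0.5)
```

A click that is off by a few pixels changes the coefficients of this mapping, which is why an imprecise reference point shifts or rescales every extracted value.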

The data-points will be dumped to a csv file specified by --output /path/to/file.csv.

If --plot output.png is passed, a plot of the extracted data-points will be saved to output.png. This requires matplotlib and is very useful when debugging or testing.

Notice the error near the right y-axis.

Using in batch mode

You can pass the coordinates of the points in the image at the command prompt (the -l option). This allows the utility to run in batch mode, without any need for the user to click on the image.

plotdigitizer ./figures/trimmed.png -p 0,0 -p 20,0 -p 0,1 -l 22,295 -l 142,295 -l 22,215 --plot output.png
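When many images share the same axes layout, the command above can be generated in a loop. The following is a rough sketch, assuming `plotdigitizer` is on your PATH and that every image uses the same reference points. The helper names `build_command` and `run_batch` and the image file names are hypothetical; only the `-p`, `-l`, and `--output` options described in this document are used.

```python
import shlex
import subprocess
from pathlib import Path


def build_command(image, data_points, pixel_points):
    """Assemble one plotdigitizer invocation as an argument list."""
    image = Path(image)
    cmd = ["plotdigitizer", str(image)]
    for x, y in data_points:
        cmd += ["-p", f"{x},{y}"]        # axis points in data units
    for u, v in pixel_points:
        cmd += ["-l", f"{u},{v}"]        # matching locations in the image
    # Write the CSV next to the image, e.g. figures/rec_01.csv.
    cmd += ["--output", str(image.with_suffix(".csv"))]
    return cmd


def run_batch(images, data_points, pixel_points, dry_run=True):
    """Print the commands (dry run) or execute them one by one."""
    for image in images:
        cmd = build_command(image, data_points, pixel_points)
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failure


# Hypothetical file names; replace them with your own recordings.
run_batch(
    ["figures/rec_01.png", "figures/rec_02.png"],
    data_points=[(0, 0), (20, 0), (0, 1)],
    pixel_points=[(22, 295), (142, 295), (22, 215)],
    dry_run=True,
)
```

Running with `dry_run=True` first lets you inspect the generated commands before anything is executed.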

How to find coordinates of axes points

In the example above, the point 0,0 is mapped to the coordinate 22,295, i.e., the data point 0,0 is on the 22nd row and 295th column of the image (assuming that the bottom left of the image is the first row, first column (0,0)). I have included a utility, plotdigitizer-locate (script plotdigitizer/locate.py), which you can use to find the coordinates of points.

$ plotdigitizer-locate figures/trimmed.png

or, by directly using the script:

$ python3 plotdigitizer/locate.py figures/trimmed.png

This command opens the image in a simple window. You can click on a point and its coordinate will be written on the image itself. Note them down.
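If you read pixel positions from a general-purpose image editor instead, keep in mind that such tools usually count the vertical coordinate from the top edge, whereas the convention stated above places (0,0) at the bottom left. A minimal sketch of the conversion follows; the helper name `top_left_to_bottom_left` and the image height of 300 pixels are hypothetical and used only for illustration.

```python
def top_left_to_bottom_left(x, y_from_top, image_height):
    """Flip a vertical pixel coordinate measured from the top edge.

    Coordinates are zero-based, so the bottom pixel of an image that is
    image_height pixels tall has y_from_top == image_height - 1.
    """
    if not 0 <= y_from_top < image_height:
        raise ValueError("y_from_top lies outside the image")
    return x, image_height - 1 - y_from_top


# A point 4 pixels below the top edge of a 300-pixel-tall image.
print(top_left_to_bottom_left(22, 4, 300))  # (22, 295)
```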

Examples

Base examples

plotdigitizer figures/graphs_1.png \
    -p 1,0 -p 6,0 -p 0,3 \
    -l 165,160 -l 599,160 -l 85,60 \
    --plot figures/graphs_1.result.png \
    --preprocess

original reconstructed

Light grids

plotdigitizer figures/ECGImage.png \
    -p 1,0 -p 5,0 -p 0,1 \
    -l 290,337 -l 1306,338 -l 106,83 \
    --plot figures/ECGImage.result.png

original reconstructed

With grids

plotdigitizer figures/graph_with_grid.png \
    -p 200,0 -p 1000,0 -p 0,50 \
    -l 269,69 -l 1789,69 -l 82,542 \
    --plot figures/graph_with_grid.result.png

original Image credit: Yang yi, Wang

reconstructed

Note that the legend was not removed from the original figure, and it has corrupted the detection beneath it.

Limitations

This application has the following limitations:

  • Only b/w images are supported for now. Color images will be converted to grayscale upon reading.
  • Each plot should have only one trajectory.
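The sketch below shows one common way a color pixel is reduced to a gray level, and why curves that differ only in color can become indistinguishable afterwards. The ITU-R BT.601 luma weights used here are an assumption: this document does not state which conversion plotdigitizer applies, and the function name `rgb_to_gray` is hypothetical.

```python
def rgb_to_gray(pixel):
    """Convert an (R, G, B) pixel with 0-255 channels to one gray level.

    Uses the ITU-R BT.601 luma weights; the tool's own conversion may differ.
    """
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


print(rgb_to_gray((255, 0, 0)))   # pure red  -> 76
print(rgb_to_gray((76, 76, 76)))  # dark gray -> 76
```

Both pixels end up with the same gray level, so a red trajectory and a dark gray one could no longer be told apart after conversion.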

Need help

Please open an issue and attach the sample plot.

Related Projects

  1. WebPlotDigitizer by Ankit Rohatgi is very versatile.
