Extract raw data from plot images
A Python 3 command-line utility to digitize plots in batch mode.
This utility is useful when you have many similar plots, such as EEG or ECG recordings. See the examples below.
For one-off use cases, you will find the web-based digitizer WebPlotDigitizer by Ankit Rohatgi much easier to use.
Installation
$ python3 -m pip install plotdigitizer
$ plotdigitizer --help
Preparing image
Crop the image so that only the axes and trajectories remain. I use the gthumb utility on Linux. You can also use imagemagick or gimp.
The following image is from MacFadden and Koshland, PNAS 1990 after trimming. One can also remove the top and right axes.
Run
plotdigitizer ./figures/trimmed.png -p 0,0 -p 10,0 -p 0,1
We need at least three points (-p option) to map the axes onto the image. In the example above, these are 0,0 (where the x-axis and y-axis intersect), 10,0 (a point on the x-axis) and 0,1 (a point on the y-axis). To map these points onto the image, you will be asked to click on them in the image. Make sure to click in the same order in which the points were given, and as precisely as you can; any error in this step will propagate to the extracted data. If you don't have 0,0 in your image, you have to provide four points: two on the x-axis and two on the y-axis.
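To make the role of the three points concrete, here is a minimal sketch of how three (pixel, data) pairs can define a pixel-to-data mapping. It fits a general affine transform with Cramer's rule and is only an illustration; it is not taken from plotdigitizer's source, and the names `fit_affine` and `to_data` are hypothetical. The pixel values are the ones used in the batch-mode example of this README.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, v):
    """Solve m @ x = v for a 3x3 system using Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("reference points are collinear; pick non-collinear points")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for r in range(3):
            replaced[r][col] = v[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(pixels, data):
    """Return a function mapping a pixel (px, py) to data coordinates (x, y).

    pixels and data are lists of three (a, b) pairs in the same order,
    mirroring the -l and -p arguments on the command line.
    """
    m = [[px, py, 1.0] for px, py in pixels]
    cx = solve3(m, [x for x, _ in data])
    cy = solve3(m, [y for _, y in data])

    def to_data(px, py):
        return (cx[0] * px + cx[1] * py + cx[2],
                cy[0] * px + cy[1] * py + cy[2])

    return to_data


if __name__ == "__main__":
    to_data = fit_affine(
        pixels=[(22, 295), (142, 295), (22, 215)],
        data=[(0, 0), (20, 0), (0, 1)],
    )
    print(to_data(82, 255))  # approximately (10.0, 0.5)
```

Because every extracted point passes through this mapping, a misplaced click of a few pixels shifts or rescales the whole trajectory, which is why precision in this step matters.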
The data points will be written to the CSV file specified by --output /path/to/file.csv.
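Once the file is written, the points can be loaded back for further analysis. The sketch below is deliberately tolerant because the exact layout of the output file is not described here: it assumes one point per line with at least two numeric fields, separated by commas or whitespace, and skips anything else (such as a header). The function name `parse_points` is hypothetical.

```python
import re


def parse_points(text):
    """Parse lines like 'x,y' or 'x y' into a list of (x, y) float pairs.

    Assumption: the first two fields of each line are x and y. Blank lines
    and lines whose first two fields are not numbers are skipped.
    """
    points = []
    for line in text.splitlines():
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if len(fields) < 2:
            continue
        try:
            points.append((float(fields[0]), float(fields[1])))
        except ValueError:
            continue  # e.g. a header row
    return points


if __name__ == "__main__":
    import sys

    # Usage: python3 parse_points.py /path/to/file.csv
    with open(sys.argv[1], encoding="utf-8") as f:
        for x, y in parse_points(f.read()):
            print(x, y)
```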
If --plot output.png is passed, a plot of the extracted data points will be saved to output.png. This requires matplotlib. Very useful when debugging/testing.
Notice the error near the right y-axis.
Using in batch mode
You can pass the pixel coordinates of the axis points at the command prompt with the -l option. This allows the utility to run in batch mode without requiring the user to click on the image.
plotdigitizer ./figures/trimmed.png -p 0,0 -p 20,0 -p 0,1 -l 22,295 -l 142,295 -l 22,215 --plot output.png
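When many plots share the same layout, the command above can be repeated over a directory of images. The following is a minimal sketch, assuming that every image has its axis points at the same pixel locations and that `plotdigitizer` is on the PATH. It uses only the -p, -l and --output options shown in this README; the helper names and the choice of writing each CSV next to its image are assumptions made for illustration.

```python
import subprocess
from pathlib import Path

# Data coordinates of the axis points and their pixel locations,
# copied from the command above.
AXIS_POINTS = ["0,0", "20,0", "0,1"]
PIXEL_LOCATIONS = ["22,295", "142,295", "22,215"]


def build_command(image, output_csv):
    """Build the plotdigitizer argument list for one image."""
    cmd = ["plotdigitizer", str(image)]
    for point in AXIS_POINTS:
        cmd += ["-p", point]
    for location in PIXEL_LOCATIONS:
        cmd += ["-l", location]
    cmd += ["--output", str(output_csv)]
    return cmd


def digitize_all(image_dir):
    """Run plotdigitizer on every PNG in image_dir, one CSV per image."""
    for image in sorted(Path(image_dir).glob("*.png")):
        subprocess.run(build_command(image, image.with_suffix(".csv")), check=True)


if __name__ == "__main__":
    digitize_all("./figures")
```

Passing the arguments as a list avoids shell quoting problems with unusual file names, and `check=True` stops the batch at the first image that fails.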
How to find coordinates of axes points
In the example above, point 0,0 is mapped to coordinate 22,295, i.e., the data point 0,0 is on the 22nd row and 295th column of the image (assuming that the bottom left of the image is the first row, first column (0,0)). I have included a utility, plotdigitizer-locate (script plotdigitizer/locate.py), which you can use to find the coordinates of points.
$ plotdigitizer-locate figures/trimmed.png
or, by directly using the script:
$ python3 plotdigitizer/locate.py figures/trimmed.png
This command opens the image in a simple window. You can click on a point and its coordinates will be written on the image itself. Note these down.
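Many image viewers and editors report pixel positions with the origin at the top left, whereas the convention described above places (0,0) at the bottom left. If you read coordinates from such a tool instead of plotdigitizer-locate, a conversion along these lines may be needed. This is a sketch under that assumption; the function name `to_bottom_left` is hypothetical, and you need to know the image height in pixels.

```python
def to_bottom_left(x, y_from_top, image_height):
    """Convert a top-left-origin pixel position to a bottom-left-origin one.

    Both conventions are 0-indexed, so the top row (y_from_top == 0)
    becomes image_height - 1 when counted from the bottom.
    """
    if not 0 <= y_from_top < image_height:
        raise ValueError("y_from_top must lie inside the image")
    return (x, image_height - 1 - y_from_top)


if __name__ == "__main__":
    # A point 4 pixels below the top edge of a 300-pixel-tall image.
    print(to_bottom_left(22, 4, 300))  # (22, 295)
```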
Examples
Base examples
plotdigitizer figures/graphs_1.png \
-p 1,0 -p 6,0 -p 0,3 \
-l 165,160 -l 599,160 -l 85,60 \
--plot figures/graphs_1.result.png \
--preprocess
Light grids
plotdigitizer figures/ECGImage.png \
-p 1,0 -p 5,0 -p 0,1 \
-l 290,337 -l 1306,338 -l 106,83 \
--plot figures/ECGImage.result.png
With grids
plotdigitizer figures/graph_with_grid.png \
-p 200,0 -p 1000,0 -p 0,50 \
-l 269,69 -l 1789,69 -l 82,542 \
--plot figures/graph_with_grid.result.png
Image credit: Yang yi, Wang
Note that the legend was not removed from the original figure, and it has disrupted the detection below it.
Limitations
This application has the following limitations:
- Only b/w images are supported for now. Color images will be converted to grayscale upon reading.
- Each plot should have only one trajectory.
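To give some intuition for both limitations, here is a deliberately naive sketch of trajectory extraction. It is not plotdigitizer's actual algorithm: it converts RGB to grayscale with the common ITU-R BT.601 luma weights and then keeps, for each image column, the row of the darkest pixel. All names are hypothetical. With such a column-wise approach, colour is discarded before detection and only one point per column survives, so a second trajectory (or a legend) in the same column is either lost or mistaken for the curve.

```python
def to_gray(r, g, b):
    """Convert one RGB pixel (0-255 each) to a gray level using BT.601 weights."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def trace_columns(gray, threshold=128):
    """Return {column: row} for the darkest pixel of each column.

    gray is a list of rows (top to bottom) of gray levels. Columns whose
    darkest pixel is not darker than threshold are treated as empty.
    If several pixels tie for darkest, the topmost one wins.
    """
    trace = {}
    for col in range(len(gray[0])):
        column = [row[col] for row in gray]
        darkest = min(range(len(column)), key=column.__getitem__)
        if column[darkest] < threshold:
            trace[col] = darkest
    return trace


if __name__ == "__main__":
    image = [
        [255, 0, 255],
        [0, 255, 255],
        [255, 255, 255],
    ]
    print(trace_columns(image))  # {0: 1, 1: 0}
```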
Need help
Open an issue and please attach the sample plot.
Related Projects
- WebPlotDigitizer by Ankit Rohatgi is very versatile.
Notes
- graphviz version 2.47.2 is broken for some xml files. See https://forum.graphviz.org/t/assert-sz-2-in-convertsptoroute/689. Please use a different version.