A toolkit for extracting particle data from microscopy images.
# ImageDataExtractor
ImageDataExtractor is a toolkit for automatically extracting microscopy images from scientific documents, and particle data from those images.
## Features
- Automatic detection and download of microscopy images from scientific articles
- HTML and XML document format support
- High-throughput capabilities
- Direct extraction from image files
- PNG, GIF, JPEG, TIFF image format support
## Installation
It is best to install ImageDataExtractor using pip, but it is also possible to directly install from source. See below for installation instructions.
__NOTE: The current version of IDE uses Tesseract 3. The source code can be downloaded [here](https://github.com/tesseract-ocr/tesseract/tree/3.05) and instructions on how to compile can be found [here](https://github.com/tesseract-ocr/tesseract/wiki/Compiling).__
NOTE: It is advised that all installations of ImageDataExtractor are run inside a virtual environment. Click [here](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/) for more information.
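Because the note above ties this version to Tesseract 3, it can be useful to check which Tesseract is on your `PATH` before installing. The following is a minimal sketch, not part of ImageDataExtractor: the helper names `parse_tesseract_major` and `installed_tesseract_major` are hypothetical, and the parsing assumes the first line of `tesseract --version` looks like `tesseract 3.05.02`.

```python
import re
import shutil
import subprocess


def parse_tesseract_major(version_text):
    """Return the major version found in `tesseract --version` output, or None."""
    # Assumed format: "tesseract 3.05.02" (some builds print "tesseract v5.0.0").
    match = re.search(r"tesseract\s+v?(\d+)\.", version_text)
    return int(match.group(1)) if match else None


def installed_tesseract_major():
    """Return the major version of the tesseract binary on PATH, or None if absent."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        ["tesseract", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some releases print the version to stderr
        universal_newlines=True,
    )
    return parse_tesseract_major(result.stdout)


if __name__ == "__main__":
    major = installed_tesseract_major()
    if major is None:
        print("tesseract was not found on PATH (or its version could not be parsed)")
    elif major != 3:
        print("found Tesseract %d, but Tesseract 3 is expected" % major)
    else:
        print("Tesseract 3 found")
```

Running the file as a script prints one of the three messages; the parsing is kept in its own function so it can be checked without Tesseract installed.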
### Installing with pip
To install with pip, run:

    pip install ImageDataExtractor

Then download the necessary data files to run ChemDataExtractor-IDE by running:

    cde data download
### Installing from source
#### Install ChemDataExtractor-IDE
In order to use ImageDataExtractor, first install the bespoke version of ChemDataExtractor, [ChemDataExtractor-IDE](https://github.com/edbeard/chemdataextractor-ide).

Clone the repository by running:

    git clone https://github.com/edbeard/chemdataextractor-ide.git

and install from inside the cloned directory with:

    python setup.py install

Then download the required machine learning models with:

    cde data download

See https://github.com/edbeard/chemdataextractor-ide for more details.
#### Install ImageDataExtractor
Now, to install ImageDataExtractor, clone the repository with:

    git clone https://github.com/ktm2/ImageDataExtractor.git

Then create a wheel file from inside the cloned directory by running:

    python setup.py bdist_wheel

If this fails, you may first need to run `pip install wheel`.

Then install using pip:

    pip install dist/ImageDataExtractor-0.0.1-py3-none-any.whl
## Running the code
__Full documentation on running the code can be found at [www.imagedataextractor.org](https://www.imagedataextractor.org).__

Open a Python interpreter and run:

    >>> import imagedataextractor as ide

Then run:

    >>> ide.extract_document('<path/to/document>')

to automatically identify and extract the images from a document. Full details on supported input and output formats can be found at our [website](https://www.imagedataextractor.org).
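For more than one document, the same call can be applied in a loop. The sketch below is an illustration under stated assumptions, not an API of the package: `extract_documents` is a hypothetical helper, the extraction function is passed in as an argument (for example `ide.extract_document`), and it is assumed that the function accepts a path string. What `ide.extract_document` returns is not described here, so the helper simply stores whatever comes back.

```python
from pathlib import Path


def extract_documents(doc_dir, extract, suffixes=(".html", ".xml")):
    """Apply `extract` to every matching document in `doc_dir`.

    Returns (results, failures): two dicts keyed by file name. `results`
    holds whatever `extract` returned; `failures` holds the raised exception.
    """
    results, failures = {}, {}
    for path in sorted(Path(doc_dir).iterdir()):
        # Skip sub-directories and files that are not HTML or XML documents.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            results[path.name] = extract(str(path))
        except Exception as exc:
            # Record the error and carry on, so one bad file does not stop the batch.
            failures[path.name] = exc
    return results, failures
```

With the package installed, a call would look like `results, failures = extract_documents('<path/to/folder>', ide.extract_document)`. Passing the extractor as a parameter keeps the loop independent of the library, so it can be tried with a stand-in function first.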