
A toolkit for extracting particle data from microscopy images.


# ImageDataExtractor

ImageDataExtractor is a toolkit for the automatic extraction of microscopy images.

## Features

  • Automatic detection and download of microscopy images from scientific articles

  • HTML and XML document format support

  • High-throughput capabilities

  • Direct extraction from image files

  • PNG, GIF, JPEG, TIFF image format support
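As a rough illustration of the format list above, the sketch below filters a set of file paths down to those whose extension suggests a supported image format. The helper name `supported_images` and the mapping from formats to filename suffixes are assumptions made for this example; ImageDataExtractor may identify formats differently.

```python
from pathlib import Path

# Assumed filename suffixes for the PNG, GIF, JPEG and TIFF formats listed above.
SUPPORTED_SUFFIXES = {".png", ".gif", ".jpg", ".jpeg", ".tif", ".tiff"}


def supported_images(paths):
    """Return the paths whose suffix (case-insensitive) is in SUPPORTED_SUFFIXES."""
    return [p for p in map(Path, paths) if p.suffix.lower() in SUPPORTED_SUFFIXES]
```

Filtering by suffix is only a cheap pre-check; it does not confirm that a file actually contains valid image data.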

## Installation

Installing with pip is recommended, but ImageDataExtractor can also be installed directly from source. Instructions for both methods follow.

__NOTE: The current version of IDE uses Tesseract 3. The source code can be downloaded [here](https://github.com/tesseract-ocr/tesseract/tree/3.05) and instructions on how to compile can be found [here](https://github.com/tesseract-ocr/tesseract/wiki/Compiling).__
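To check which Tesseract is on your `PATH` before installing, something like the following sketch may help. It assumes the `tesseract` executable reports its version via `tesseract --version` in a form such as `tesseract 3.05.02`; the function names are hypothetical and not part of ImageDataExtractor.

```python
import re
import shutil
import subprocess


def parse_tesseract_major(version_output):
    """Return the major version found in `tesseract --version` output, or None."""
    match = re.search(r"tesseract\s+v?(\d+)", version_output)
    return int(match.group(1)) if match else None


def installed_tesseract_major():
    """Return the major version of the tesseract on PATH, or None if not found."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        ["tesseract", "--version"], capture_output=True, text=True
    )
    # Some builds print the version to stderr rather than stdout, so check both.
    return parse_tesseract_major(result.stdout + result.stderr)
```

If `installed_tesseract_major()` returns anything other than `3`, the installed Tesseract probably does not match the version this release expects.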

NOTE: We advise running every installation of ImageDataExtractor inside a virtual environment. See the [Python packaging guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/) for more information.
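One way to create such an environment is with Python's standard-library `venv` module, sketched below. The function name `create_environment` and the directory name `ide-env` are placeholders chosen for this example.

```python
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its path."""
    env_dir = Path(env_dir)
    # with_pip=True bootstraps pip into the new environment.
    venv.create(str(env_dir), with_pip=with_pip)
    return env_dir


if __name__ == "__main__":
    print(create_environment("ide-env"))
```

On Unix-like systems the environment can then be activated with `source ide-env/bin/activate` before running the installation commands.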

### Installing with pip

To install with pip, run:

$ pip install ImageDataExtractor

Then download the data files that ChemDataExtractor-IDE needs by running:

$ cde data download

### Installing from source

#### Install ChemDataExtractor-IDE

To use ImageDataExtractor, first install the bespoke version of ChemDataExtractor, [ChemDataExtractor-IDE](https://github.com/edbeard/chemdataextractor-ide).

Clone the repository by running:

$ git clone https://github.com/edbeard/chemdataextractor-ide.git

and, from inside the cloned directory, install with:

$ python setup.py install

Then download the required machine learning models with:

$ cde data download

See https://github.com/edbeard/chemdataextractor-ide for more details.

#### Install ImageDataExtractor

Now to install ImageDataExtractor, clone the repository with:

$ git clone https://github.com/ktm2/ImageDataExtractor.git

Then, from inside the cloned directory, create a wheel file by running:

$ python setup.py bdist_wheel

If this fails, you may need to run `pip install wheel` first.

Then install using pip:

$ pip install dist/ImageDataExtractor-0.0.1-py3-none-any.whl

## Running the code

__Full documentation on running the code can be found at [www.imagedataextractor.org](https://www.imagedataextractor.org).__

Open a Python terminal and run:

>>> import imagedataextractor as ide

Then run:

>>> ide.extract_document(<path/to/document>)

to automatically identify and extract the images from a document. Full details on supported input and output formats can be found at our [website](https://www.imagedataextractor.org).
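For several documents, a simple loop around `extract_document` may be enough. The sketch below is not part of ImageDataExtractor: `extract_many` is a hypothetical helper that takes the extraction function as an argument, and it makes no assumption about what `extract_document` returns beyond storing the value.

```python
from pathlib import Path


def extract_many(paths, extract):
    """Apply `extract` to each path and return (results, failures) dictionaries."""
    results = {}
    failures = {}
    for path in paths:
        key = str(Path(path))
        try:
            results[key] = extract(key)
        except Exception as exc:
            # Record the error and continue so one bad file does not stop the batch.
            failures[key] = exc
    return results, failures
```

With the package installed, this could be called as `results, failures = extract_many(paths, ide.extract_document)`, after which `failures` lists the documents that need a closer look.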
