
HErbarium Specimen sheet PIpeline


[Hespi banner: https://raw.githubusercontent.com/rbturnbull/hespi/main/docs/images/hespi-banner.svg]


Hespi takes images of specimen sheets from herbaria and first detects the various components of the sheet.

[Figure: the Hespi pipeline]

Hespi first takes a specimen sheet and detects its various components using the Sheet-Component Model. Any full database label detected is then cropped and passed to the Label-Field Model, which detects the different textual fields written on the label. A Label Classifier is also used to determine the type of text written on the label. If the text is printed or typewritten, each field is given to an Optical Character Recognition (OCR) engine; if it is handwritten, each field is given to a Handwritten Text Recognition (HTR) engine. The recognized text is then corrected using a multimodal Large Language Model (LLM). Finally, the field results are post-processed before being written to an HTML report, a CSV file and text files.

The stages of the pipeline are explained in the pipeline documentation.
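As a rough illustration of how these stages fit together, here is a minimal Python sketch. The stage functions are passed in as parameters because the names used here are hypothetical, not hespi's actual internal API:

def process_sheet(sheet_image, detect_components, detect_fields,
                  classify_label, ocr, htr, correct_with_llm):
    """Illustrative control flow for one specimen sheet (not hespi's real code)."""
    results = []
    # Stage 1: the Sheet-Component Model detects the sheet's components;
    # here detect_components is assumed to return the cropped labels
    for label in detect_components(sheet_image):
        # Stage 2: the Label-Field Model finds the textual fields on the label
        fields = detect_fields(label)
        # Stage 3: the Label Classifier decides how the label is written
        writing = classify_label(label)
        for field in fields:
            # Stage 4: OCR for printed/typewritten text, HTR for handwriting
            recognise = ocr if writing in ("printed", "typewritten") else htr
            # Stage 5: a multimodal LLM corrects the recognised text
            results.append(correct_with_llm(field, recognise(field)))
    # Stage 6: post-processing and writing the HTML report, CSV and text
    # files happens over the collected results
    return results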

Installation

Install hespi using pip:

pip install hespi

The first time it runs, it will download the required model weights from the internet.

It is recommended that you also install Tesseract so that it can be used for the text recognition part of the pipeline.
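To check that the Tesseract binary is actually on your PATH before running the pipeline, a small standard-library check (illustrative, not part of hespi) is enough:

import shutil

# hespi can only use Tesseract if the binary is findable on the PATH
path = shutil.which("tesseract")
if path:
    print("Tesseract found at", path)
else:
    print("Tesseract not found - install it with your system package manager")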

To install the development version, see the documentation for contributing.

Usage

To run the pipeline, use the executable hespi and give it any number of images:

hespi image1.jpg image2.jpg
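If you prefer to run hespi from a script, the executable can be driven with the standard library; for example (a sketch assuming a directory of JPEG images called images/):

import subprocess
from pathlib import Path

# Gather the images and pass them to the hespi executable, as above
images = sorted(str(p) for p in Path("images").glob("*.jpg"))
subprocess.run(["hespi", *images], check=True)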

By default the output will go to a directory called hespi-output. You can set the output directory with the --output-dir argument:

hespi images/*.tif --output-dir ./hespi-output

The detected components and text fields will be cropped and stored in the output directory. The output directory will also contain a CSV file named hespi-results.csv with the text recognition results for any institutional labels found.
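Because hespi-results.csv is an ordinary CSV file, it can be inspected with the standard library. The columns depend on which fields were detected, so this sketch just prints whatever is there:

import csv
from pathlib import Path

# Read the results file that hespi writes into the output directory
with Path("hespi-output/hespi-results.csv").open(newline="") as f:
    reader = csv.DictReader(f)
    print("columns:", reader.fieldnames)
    for row in reader:  # one row per institutional label found
        print(row)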

By default hespi will use OpenAI’s gpt-4o large language model (LLM) in the pipeline to produce the final results. If you wish to use a different model from OpenAI or Anthropic, add it on the command line like this: --llm MODEL_NAME. You will need to provide an API key for the LLM: set the OPENAI_API_KEY environment variable for an OpenAI LLM or ANTHROPIC_API_KEY for Anthropic. You can also pass the API key to hespi with the --llm-api-key API_KEY argument.
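For example, to supply the key through the environment and select a specific model from a script (the model name and key below are placeholders):

import os
import subprocess

# Provide the API key via the environment instead of --llm-api-key
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder: use your real key

# Select the LLM used for the correction step with the --llm argument
subprocess.run(
    ["hespi", "sheet.jpg", "--llm", "gpt-4o", "--output-dir", "hespi-output"],
    check=True,
)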

More information on the command line arguments can be found in the Command Line Reference in the documentation.

There is another command line utility called hespi-tools which provides additional functionality. See the documentation for more information.

Training with custom data

To train the model with custom data, see the documentation.

Credits

Robert Turnbull, Emily Fitzgerald, Karen Thompson and Jo Birch from the University of Melbourne.

This research was supported by The University of Melbourne’s Research Computing Services and the Petascale Campus Initiative. The authors thank collaborators Niels Klazenga, Heroen Verbruggen, Nunzio Knerr, Noel Faux, Simon Mutch, Babak Shaban, Andrew Drinnan, Michael Bayly and Hannah Turnbull.

Plant reference data obtained from the Australian National Species List (auNSL), as of March 2024, using the:

  • Australian Plant Name Index (APNI)

  • Australian Bryophyte Name Index (AusMoss)

  • Australian Fungi Name Index (AFNI)

  • Australian Lichen Name Index (ALNI)

  • Australian Algae Name Index (AANI)

and the World Flora Online Taxonomic Backbone v.2023.12, accessed 13 June 2024.

This pipeline depends on YOLOv8, torchapp, and Microsoft’s TrOCR.

Logo derived from artwork by ka reemov.

See the documentation for the references in BibTeX format, or use the command:

hespi-tools bibtex

