Client for VoucherVisionGO API

VoucherVisionGO Client

This repository contains only the client component of VoucherVisionGO, a tool for automatic label data extraction from museum specimen images.

Purpose

This repository is designed for users who only need the client component without the full VoucherVisionGO codebase, allowing for:

  • Easier integration into existing projects
  • Smaller footprint
  • Focused functionality
  • Simple installation process

Information

VoucherVision is designed to transcribe museum specimen labels. Please see the VoucherVision GitHub for more information.

As of March 2025, the University of Michigan is allowing free access to VoucherVision. The API is hosted on-demand. It takes about 1 minute for the server to wake up, then subsequent calls are much faster. The API is parallelized and scalable, making this inference much faster than the regular VoucherVision deployment. The tradeoff is that you have less control over the transcription methods. VoucherVisionGO supports Google's LLM APIs for OCR and for parsing the unformatted text into JSON.
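Because the first call can take about a minute while the server wakes up, it may help to wrap that call in a simple retry loop. The sketch below is only an illustration: `call_with_warmup`, its default of four attempts, and the 20-second wait are assumptions made for this example and are not part of the client.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_warmup(
    request: Callable[[], T],
    attempts: int = 4,
    wait_seconds: float = 20.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying while the server may still be waking up.

    Hypothetical helper: the attempt count and wait time are illustrative
    defaults (three waits of 20 s cover roughly one minute).
    """
    last_error: Optional[BaseException] = None
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except Exception as error:  # a real caller would catch narrower errors
            last_error = error
            if attempt < attempts:
                sleep(wait_seconds)  # give the server time to finish starting
    raise RuntimeError(f"no response after {attempts} attempts") from last_error
```

Any zero-argument callable can be passed as `request`, for example a `lambda` that wraps a single `process_image` call.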

Available LLM Models

Note: Starting April 1, 2026, only the models marked Supported below will be available. Deprecated models may stop working before that date.

Model                         | Status        | Notes
gemini-1.5-pro                | ⛔ Deprecated | No longer supported
gemini-2.0-flash              | ⛔ Deprecated | No longer supported
gemini-2.5-flash              | ⛔ Deprecated | No longer supported
gemini-2.5-pro                | ⛔ Deprecated | No longer supported
gemini-3-pro-preview          | ⛔ Deprecated | No longer supported
gemini-3.1-flash-lite-preview | ✅ Supported  | Fast, unlimited usage; default
gemini-3-flash-preview        | ✅ Supported  | Fast with good quality
gemini-3.1-pro-preview        | ✅ Supported  | Highest quality; subject to rate limits

For the most up-to-date list of supported models, refer to the Google AI Gemini API documentation.

If you want pure speed, use only "flash" models for both tasks.
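If you select models in code, a small lookup can guard against passing a deprecated name. This is a sketch that simply copies the status table above into a dictionary; `MODEL_STATUS`, `supported_models`, and `fast_models` are hypothetical names, and the table on this page (or the server) remains the source of truth.

```python
# Status values copied from the model table on this page.
MODEL_STATUS = {
    "gemini-1.5-pro": "deprecated",
    "gemini-2.0-flash": "deprecated",
    "gemini-2.5-flash": "deprecated",
    "gemini-2.5-pro": "deprecated",
    "gemini-3-pro-preview": "deprecated",
    "gemini-3.1-flash-lite-preview": "supported",
    "gemini-3-flash-preview": "supported",
    "gemini-3.1-pro-preview": "supported",
}


def supported_models() -> list:
    """Return the model names marked as supported, in table order."""
    return [name for name, status in MODEL_STATUS.items() if status == "supported"]


def fast_models() -> list:
    """Return the supported models whose name contains "flash"."""
    return [name for name in supported_models() if "flash" in name]
```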

If you want to transcribe different fields, reach out and I can help you develop a prompt or upload your existing prompt to make it available on the API.

Requirements

  • Python 3.10 or higher
  • External dependencies (see installation options below)

Authentication

To use the API you need to apply for an authorization token. Go to the login page and submit your info. Copy the token and store it in a safe location. Never put the token directly into your code. Always use environment variables or secrets.
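One way to follow this advice is to read the token from an environment variable and fail early when it is missing. The variable name `VV_AUTH_TOKEN` and the helper `load_auth_token` are assumptions made for this sketch; use whatever name fits your environment or secrets manager.

```python
import os


def load_auth_token(var_name: str = "VV_AUTH_TOKEN") -> str:
    """Return the token stored in the environment variable `var_name`.

    `VV_AUTH_TOKEN` is a hypothetical name chosen for this example.
    """
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Environment variable {var_name} is not set; "
            "export it before running the client."
        )
    return token
```

The returned string can then be passed as the `auth_token` argument in the examples below.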

Installation

Choose one of the following installation methods:

Option 1: Install in your own Python environment from PyPI

Install

pip install vouchervision-go-client[full]

Upgrade

pip install --upgrade vouchervision-go-client[full]

Note: You may need to install these packages too:

pip install requests pandas termcolor tabulate tqdm

Option 2: Using pip (Install from source locally)

# Clone
git clone https://github.com/Gene-Weaver/VoucherVisionGO-client.git
cd VoucherVisionGO-client
# Create a virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Option 3: Using conda (Install from source locally)

# Clone
git clone https://github.com/Gene-Weaver/VoucherVisionGO-client.git
cd VoucherVisionGO-client
# Create a conda environment
conda create -n vvgo-client python=3.10
conda activate vvgo-client

# Install dependencies
pip install -r requirements.txt

Usage Guide (Option 1)

Programmatic Usage

You can also call the client functions from your own Python code after installing vouchervision-go-client from PyPI:

import os
from VoucherVision import process_vouchers

if __name__ == '__main__':
  auth_token = os.environ.get("your_auth_token") # Add auth token as an environment variable or secret

  process_vouchers(
    server="https://vouchervision-go-738307415303.us-central1.run.app/", 
    output_dir="./output", 
    prompt="SLTPvM_default_chromosome.yaml", 
    image="https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg", 
    llm_model="gemini-3.1-pro-preview",  # Specify the LLM model
    directory=None, 
    file_list=None, 
    verbose=True, 
    save_to_xlsx=True, 
    max_workers=4,
    auth_token=auth_token)  

  process_vouchers(
    server="https://vouchervision-go-738307415303.us-central1.run.app/", 
    output_dir="./output2", 
    prompt="SLTPvM_default_chromosome.yaml", 
    image=None, 
    llm_model=None, # Use the default LLM
    directory="D:/Dropbox/VoucherVisionGO/demo/images", 
    file_list=None, 
    verbose=True, 
    save_to_xlsx=True, 
    max_workers=4,
    auth_token=auth_token)  

To get the JSON packet for a single specimen record:

import os
from client import process_image, ordereddict_to_json, get_output_filename

if __name__ == '__main__':
  auth_token = os.environ.get("your_auth_token") # Add auth token as an environment variable or secret

  image_path = "https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg"
  output_dir = "./output"
  output_file, _ = get_output_filename(image_path, output_dir)  # returns (json_path, md_path)
  fname = os.path.basename(output_file).split(".")[0]

  result = process_image(fname=fname,
    server_url="https://vouchervision-go-738307415303.us-central1.run.app/", 
    image_path=image_path, 
    output_dir=output_dir, 
    verbose=True, 
    engines=["gemini-3-flash-preview"],
    prompt="SLTPvM_default_chromosome.yaml",
    auth_token=auth_token)

  # Convert to JSON string
  output_str = ordereddict_to_json(result, output_type="json")
  print(output_str)

  # Or keep it as a python dict
  output_dict = ordereddict_to_json(result, output_type="dict")
  print(output_dict)

Processing Images from URLs Programmatically

Use process_vouchers_urls when your images are hosted online and you want to process them by URL rather than downloading them first:

import os
from VoucherVision import process_vouchers_urls

if __name__ == '__main__':
  auth_token = os.environ.get("your_auth_token")

  # Process a single image URL
  process_vouchers_urls(
    server="https://vouchervision-go-738307415303.us-central1.run.app/",
    output_dir="./output_urls",
    image_url="https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg",
    prompt="SLTPvM_default.yaml",
    llm_model="gemini-3-flash-preview",
    verbose=True,
    save_to_xlsx=True,
    auth_token=auth_token)

  # Process a list of image URLs from a file (txt, csv, or xlsx — one URL per line/row)
  process_vouchers_urls(
    server="https://vouchervision-go-738307415303.us-central1.run.app/",
    output_dir="./output_urls_bulk",
    url_list="./demo/txt/url_list.txt",
    prompt="SLTPvM_default.yaml",
    llm_model="gemini-3-flash-preview",
    verbose=False,
    save_to_xlsx=True,
    max_workers=8,
    auth_token=auth_token)
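The text-file form of `url_list` expects one URL per line. If your URLs live in a Python list, a helper along these lines could write that file; `write_url_list` is a hypothetical name, and dropping duplicates and blank entries is a choice made for this sketch rather than something the client requires.

```python
from pathlib import Path
from typing import Iterable


def write_url_list(urls: Iterable[str], path: str) -> int:
    """Write unique, non-empty URLs to `path`, one per line.

    Returns the number of URLs written. Order of first appearance is kept.
    """
    seen = set()
    unique = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            unique.append(url)

    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)  # create folders if needed
    target.write_text("\n".join(unique) + "\n", encoding="utf-8")
    return len(unique)
```

The resulting path can be passed as `url_list` to `process_vouchers_urls`.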

Viewing prompts from the command line (PyPI install)

To see an overview of available prompts:

vv-prompts --server https://vouchervision-go-738307415303.us-central1.run.app/ --view --auth-token "your_auth_token"

To see the entire chosen prompt:

vv-prompts --server https://vouchervision-go-738307415303.us-central1.run.app/ --prompt "SLTPvM_default.yaml" --raw --auth-token "your_auth_token"

Running VoucherVision from the command line (PyPI install)

Process a single image

vouchervision --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg \
  --output-dir ./output \
  --prompt SLTPvM_default_chromosome.yaml \
  --verbose \
  --save-to-xlsx \
  --auth-token "your_auth_token"

Process a directory of images

vouchervision --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --directory ./demo/images \
  --output-dir ./output2 \
  --prompt SLTPvM_default_chromosome.yaml \
  --verbose \
  --save-to-xlsx \
  --max-workers 4 \
  --auth-token "your_auth_token"

Changing OCR engine

vouchervision --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg \
  --output-dir ./output3 \
  --engines "gemini-3-flash-preview" \
  --auth-token "your_auth_token"

ONLY produce OCR text

vouchervision --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg \
  --output-dir ./output3 \
  --engines "gemini-3-flash-preview" \
  --auth-token "your_auth_token" \
  --ocr-only

Usage Guide (Options 2 & 3)

The VoucherVisionGO client provides several ways to process specimen images through the VoucherVision API. Here are the main usage patterns:

Basic Command Structure

(Don't include the '<' or '>' in the actual commands)

python VoucherVision.py --server <SERVER_URL> \
                 --output-dir <OUTPUT_DIR> \
                 --image <SINGLE_IMAGE_PATH_OR_URL> OR --directory <DIRECTORY_PATH> OR --file-list <FILE_LIST_PATH> \
                 --verbose \
                 --save-to-xlsx \
                 --engines <ENGINE1> <ENGINE2> \
                 --prompt <PROMPT_FILE> \
                 --max-workers <NUM_WORKERS> \
                 --auth-token <YOUR_AUTH_TOKEN>

Required Arguments

The server url:

  • --server: URL of the VoucherVision API server

Authentication:

  • --auth-token: Your authentication token (obtained from the login page)

One of the following input options:

  • --image: Path to a single image file or URL
  • --directory: Path to a directory containing images
  • --file-list: Path to a file containing a list of image paths or URLs

The path to your local output folder:

  • --output-dir: Directory to save the output JSON results

Optional Arguments

  • --engines: OCR engine options. We recommend omitting this flag and using the defaults. (default: "gemini-1.5-pro gemini-2.0-flash"; both appear as deprecated in the model table above, so the server default may differ)
  • --prompt: Custom prompt file to use. We include a few for you to use. If you created a custom prompt, submit a pull request to add it to VoucherVisionGO or reach out and I can add it for you. (default: "SLTPvM_default.yaml")
  • --verbose: Print all output to the console. Only available for single-image calls; it is turned off automatically when processing images in bulk.
  • --save-to-xlsx: Save all results to an XLSX file in the output directory. Recommended over CSV to prevent Excel from auto-converting fields like dates.
  • --max-workers: Maximum number of parallel workers. If you are processing hundreds or thousands of images, increase this to 8, 16, or 32; otherwise omit it and use the default. (default: 4, max: 32)
  • --ocr-only: Run only the OCR portion of VoucherVision. This will return the same final JSON packet, but with an empty "formatted_json" field.
  • --notebook-mode: Run OCR only, skip the text label collage step, use the full image as input, and return OCR output formatted as Markdown. Useful for downstream document processing workflows.
  • --skip-label-collage: Skip the text label collage pre-processing step and send the full original image directly to OCR. Use this if your images are already cropped to the label or if the collage step produces poor results for your collection.
  • --gemini-api-key: (Optional) Provide your own Gemini API key obtained from Google AI Studio. When provided, API calls to Gemini are billed to your own Google account rather than the shared server key.
  • --include-cop90: Add Copernicus GLO-90 elevation data to results. When enabled, if decimalLatitude and decimalLongitude are present in the formatted JSON, the response will include a supplemental COP90 elevation value (in meters). This does not replace any verbatim elevation data from the label — it is purely supplemental.
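The guidance for --max-workers can be turned into a small helper when batch sizes vary. The cut-off points below (100, 500, and 1,000 images) are assumptions chosen for illustration; only the values 4, 8, 16, and 32 and the maximum of 32 come from the option description above.

```python
def choose_max_workers(n_images: int) -> int:
    """Suggest a --max-workers value for a batch of `n_images` images.

    The thresholds are illustrative guesses, not limits enforced by the client.
    """
    if n_images < 100:
        return 4   # documented default
    if n_images < 500:
        return 8
    if n_images < 1000:
        return 16
    return 32      # documented maximum
```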

View Available Prompts

View the prompts in a web GUI

List all prompts

In each pair below, the first command is for Linux/macOS and the second is for Windows (PowerShell).

curl -H "Authorization: Bearer your_auth_token" "https://vouchervision-go-738307415303.us-central1.run.app/prompts?format=text"
(curl -H "Authorization: Bearer your_auth_token" "https://vouchervision-go-738307415303.us-central1.run.app/prompts?format=text").Content

View a specific prompt

curl -H "Authorization: Bearer your_auth_token" "https://vouchervision-go-738307415303.us-central1.run.app/prompts?prompt=SLTPvM_default.yaml&format=text"
(curl -H "Authorization: Bearer your_auth_token" "https://vouchervision-go-738307415303.us-central1.run.app/prompts?prompt=SLTPvM_default.yaml&format=text").Content

Getting a specific prompt in JSON format (default)

curl -H "Authorization: Bearer your_auth_token" "https://vouchervision-go-738307415303.us-central1.run.app/prompts?prompt=SLTPvM_default.yaml"
(curl -H "Authorization: Bearer your_auth_token" "https://vouchervision-go-738307415303.us-central1.run.app/prompts?prompt=SLTPvM_default.yaml").Content
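The three curl variants above differ only in their query string. As a sketch, the same URLs can be assembled in Python with the standard library; `prompts_url` is a hypothetical helper, and the `/prompts` path and the `prompt` and `format` parameters are taken from the curl commands above.

```python
from typing import Optional
from urllib.parse import urlencode


def prompts_url(server: str, prompt: Optional[str] = None, fmt: Optional[str] = None) -> str:
    """Build the /prompts URL, optionally for one prompt and one output format."""
    params = {}
    if prompt:
        params["prompt"] = prompt
    if fmt:
        params["format"] = fmt
    base = server.rstrip("/") + "/prompts"
    query = urlencode(params)
    return f"{base}?{query}" if query else base
```

The request itself still needs the `Authorization: Bearer` header shown in the curl commands.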

Example Calls

Processing a Single Local Image

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "./demo/images/MICH_16205594_Poaceae_Jouvea_pilosa.jpg" \
  --output-dir "./results/single_image" \
  --verbose \
  --auth-token "your_auth_token"

Processing an Image from URL

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "https://swbiodiversity.org/imglib/h_seinet/seinet/KHD/KHD00041/KHD00041592_lg.jpg" \
  --output-dir "./results/url_image" \
  --verbose \
  --auth-token "your_auth_token"

Processing All Images in a Directory

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --directory "./demo/images" \
  --output-dir "./results/multiple_images" \
  --max-workers 4 \
  --auth-token "your_auth_token"

Processing Images from a CSV List

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --file-list "./demo/csv/file_list.csv" \
  --output-dir "./results/from_csv" \
  --max-workers 8 \
  --auth-token "your_auth_token"

Processing Images from a Text File List

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --file-list "./demo/txt/file_list.txt" \
  --output-dir "./results/from_txt" \
  --auth-token "your_auth_token"

Using a Custom Prompt

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "https://swbiodiversity.org/imglib/h_seinet/seinet/KHD/KHD00041/KHD00041592_lg.jpg" \
  --output-dir "./results/custom_prompt" \
  --prompt "SLTPvM_default_chromosome.yaml" \
  --verbose \
  --auth-token "your_auth_token"

Saving Results to XLSX

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --directory "./demo/images" \
  --output-dir "./results/with_xlsx" \
  --save-to-xlsx \
  --auth-token "your_auth_token"

Running in OCR-only mode

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --directory "./demo/images" \
  --output-dir "./results/ocr_only" \
  --save-to-xlsx \
  --auth-token "your_auth_token" \
  --ocr-only

Output

The client saves the following outputs:

  • Individual JSON files for each processed image in the specified output directory.
  • A consolidated XLSX file with all results if --save-to-xlsx option is used. First column will be the local filename or filename obtained from the URL. Using XLSX is strongly recommended over CSV to prevent Excel from auto-converting fields like dates and catalog numbers.
  • Terminal output with processing details if --verbose option is used.
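If you prefer to work with the individual JSON files rather than the XLSX, they can be gathered into a list of flat records. The sketch below assumes each saved file has the same layout as the example packet in this document (a `filename` key and a `formatted_json` object) and sits directly in the output directory; `load_results` is a hypothetical helper, not a client function.

```python
import json
from pathlib import Path


def load_results(output_dir: str) -> list:
    """Read every *.json file in `output_dir` into one flat record per image.

    Assumes the saved files follow the example packet layout shown in this
    document; files without a usable `formatted_json` yield only a filename.
    """
    records = []
    for json_path in sorted(Path(output_dir).glob("*.json")):
        with json_path.open(encoding="utf-8") as handle:
            packet = json.load(handle)
        row = {"filename": packet.get("filename", json_path.stem)}
        fields = packet.get("formatted_json")
        if isinstance(fields, dict):  # empty in OCR-only mode
            row.update(fields)
        records.append(row)
    return records
```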

An example of the JSON packet returned by the VVGO API

{
  "filename": "31234100396116",
  "ocr_info": {
    "gemini-1.5-pro": {
      "ocr_text": "EASTERN KENTUCKY UNIVERSITY\nHERBARIUM\n060934\n\nKentucky\nLetcher County\nDiapensiaceae\n*Galax aphylla* auct. non L.\nAbove falls.\n\nWhitesburg Q.; Bad Branch. 1.5 miles NE\nof Eolia.\n\nR. Hannan & L. R.\nPhillippe 2022                                      May 31, 1979\n\nIK\n3 1234 10039611 6\nEastern Kentucky University Herbarium\n\n\n*Galax aphylla*\n\n",
      "cost_in": 0.00077875,
      "cost_out": 0.00062,
      "total_cost": 0.00139875,
      "rates_in": 1.25,
      "rates_out": 5.0,
      "tokens_in": 623,
      "tokens_out": 124
    },
    "gemini-2.0-flash": {
      "ocr_text": "EASTERN\nKENTUCKY\nUNIVERSITY\nHERBARIUM\n060934\nINCH\nOPTIRECTILINEAR\nU.S.A.\nKentucky\nEKY\nLetcher County\nDiapensiaceae\nGalax aphylla auct. non L.\nAbove falls.\nWhitesburg Q.; Bad Branch. 1.5 miles NE\nof Eolia.\nR. Hannnan & L. R.\nPhillippe 2022\nMay. 31, 1979\nIK\n3 1234 10039611 6\nEastern Kentucky University Herbarium\n\n\nGalax aphylla\n\n",
      "cost_in": 0.0006815,
      "cost_out": 5.68e-05,
      "total_cost": 0.0007383,
      "rates_in": 0.1,
      "rates_out": 0.4,
      "tokens_in": 6815,
      "tokens_out": 142
    }
  },
  "parsing_info": {
    "model": "gemini-2-0-flash",
    "input": 2136,
    "output": 437,
    "cost_in": 0.0002136,
    "cost_out": 0.00017480000000000002
  },
  "ocr": "\ngemini-1.5-pro OCR:\nEASTERN KENTUCKY UNIVERSITY\nHERBARIUM\n060934\n\nKentucky\nLetcher County\nDiapensiaceae\n*Galax aphylla* auct. non L.\nAbove falls.\n\nWhitesburg Q.; Bad Branch. 1.5 miles NE\nof Eolia.\n\nR. Hannan & L. R.\nPhillippe 2022                                      May 31, 1979\n\nIK\n3 1234 10039611 6\nEastern Kentucky University Herbarium\n\n\n*Galax aphylla*\n\n\ngemini-2.0-flash OCR:\nEASTERN\nKENTUCKY\nUNIVERSITY\nHERBARIUM\n060934\nINCH\nOPTIRECTILINEAR\nU.S.A.\nKentucky\nEKY\nLetcher County\nDiapensiaceae\nGalax aphylla auct. non L.\nAbove falls.\nWhitesburg Q.; Bad Branch. 1.5 miles NE\nof Eolia.\nR. Hannnan & L. R.\nPhillippe 2022\nMay. 31, 1979\nIK\n3 1234 10039611 6\nEastern Kentucky University Herbarium\n\n\nGalax aphylla\n\n",
  "formatted_json": {
    "catalogNumber": "060934",
    "scientificName": "Galax aphylla",
    "genus": "Galax",
    "specificEpithet": "aphylla",
    "scientificNameAuthorship": "auct. non L.",
    "collectedBy": "R. Hannan & L. R. Phillippe",
    "collectorNumber": "2022",
    "identifiedBy": "IK",
    "identifiedDate": "",
    "identifiedConfidence": "",
    "identifiedRemarks": "",
    "identificationHistory": "",
    "verbatimCollectionDate": "May 31, 1979",
    "collectionDate": "1979-05-31",
    "collectionDateEnd": "",
    "habitat": "Above falls.",
    "chromosomeCount": "",
    "guardCell": "",
    "specimenDescription": "",
    "cultivated": "",
    "continent": "North america",
    "country": "Usa",
    "stateProvince": "Kentucky",
    "county": "Letcher County",
    "locality": "Whitesburg Q.; Bad Branch. 1.5 miles NE of Eolia.",
    "verbatimCoordinates": "",
    "decimalLatitude": "",
    "decimalLongitude": "",
    "minimumElevationInMeters": "",
    "maximumElevationInMeters": "",
    "elevationUnits": "",
    "additionalText": "EASTERN KENTUCKY UNIVERSITY\nHERBARIUM\nEastern Kentucky University Herbarium"
  }
}
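A packet like the one above can be summarized in a few lines. This sketch assumes the keys shown in the example (`ocr_info`, `parsing_info`, `formatted_json`) and adds the per-engine OCR `total_cost` values to the parsing `cost_in` and `cost_out`; `summarize_packet` is a hypothetical helper, and the short sample reuses only cost figures from the example.

```python
def summarize_packet(packet: dict) -> dict:
    """Return the engines used, the summed cost, and the non-empty fields."""
    ocr_info = packet.get("ocr_info") or {}
    ocr_cost = sum(engine.get("total_cost", 0.0) for engine in ocr_info.values())

    parsing = packet.get("parsing_info") or {}
    parsing_cost = parsing.get("cost_in", 0.0) + parsing.get("cost_out", 0.0)

    fields = packet.get("formatted_json")
    if not isinstance(fields, dict):  # empty in OCR-only mode
        fields = {}
    filled = {key: value for key, value in fields.items() if value != ""}

    return {
        "filename": packet.get("filename"),
        "ocr_engines": list(ocr_info),
        "total_cost": ocr_cost + parsing_cost,
        "filled_fields": filled,
    }


# Trimmed sample built from values in the example packet above.
sample = {
    "filename": "31234100396116",
    "ocr_info": {
        "gemini-1.5-pro": {"total_cost": 0.00139875},
        "gemini-2.0-flash": {"total_cost": 0.0007383},
    },
    "parsing_info": {"cost_in": 0.0002136, "cost_out": 0.00017480000000000002},
    "formatted_json": {"catalogNumber": "060934", "identifiedDate": ""},
}
summary = summarize_packet(sample)
print(summary["ocr_engines"], summary["filled_fields"])
```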

Advanced Usage

Using Different OCR Engines

Using two supported Gemini models together for OCR

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "./demo/images/MICH_16205594_Poaceae_Jouvea_pilosa.jpg" \
  --output-dir "./results/custom_engines" \
  --engines "gemini-3.1-pro-preview" "gemini-3-flash-preview" \
  --verbose \
  --auth-token "your_auth_token"

Using only 1 of the best Gemini models for OCR.

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "./demo/images/MICH_16205594_Poaceae_Jouvea_pilosa.jpg" \
  --output-dir "./results/custom_engines" \
  --engines "gemini-3-flash-preview" \
  --verbose \
  --auth-token "your_auth_token"

Using Different LLM Models

In addition to selecting OCR engines, you can specify which LLM model to use for parsing the OCR text into structured JSON data.

From the command line

# Specify a specific LLM model for processing
python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "./demo/images/MICH_16205594_Poaceae_Jouvea_pilosa.jpg" \
  --output-dir "./results/custom_llm" \
  --llm-model "gemini-3-flash-preview" \
  --verbose \
  --auth-token "your_auth_token"

From PyPI

import os
from VoucherVision import process_vouchers

auth_token = os.environ.get("your_auth_token")

process_vouchers(
  server="https://vouchervision-go-738307415303.us-central1.run.app/", 
  output_dir="./output", 
  prompt="SLTPvM_default.yaml", 
  image="https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg", 
  llm_model="gemini-3.1-pro-preview",  # Specify the LLM model
  verbose=True, 
  save_to_xlsx=True, 
  auth_token=auth_token
)

Using Your Own Gemini API Key

By default, all API calls to Gemini are made using the shared server key provided by the University of Michigan. If you have your own Gemini API key from Google AI Studio, you can supply it so that usage is billed to your own Google account. This is useful for users with high-volume needs or who want to use their own quota.

Never put your API key directly in your code. Always load it from an environment variable or a secrets manager.

From the command line

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "./demo/images/MICH_16205594_Poaceae_Jouvea_pilosa.jpg" \
  --output-dir "./results/own_key" \
  --gemini-api-key "your_gemini_api_key" \
  --verbose \
  --auth-token "your_auth_token"

From PyPI

import os
from VoucherVision import process_vouchers

auth_token = os.environ.get("your_auth_token")
gemini_api_key = os.environ.get("your_gemini_api_key")

process_vouchers(
  server="https://vouchervision-go-738307415303.us-central1.run.app/", 
  output_dir="./output", 
  prompt="SLTPvM_default.yaml", 
  image="https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg", 
  verbose=True, 
  save_to_xlsx=True, 
  auth_token=auth_token,
  gemini_api_key=gemini_api_key  # Optional: use your own Gemini API key
)

Single image with your own key

import os
from VoucherVision import process_image, ordereddict_to_json, get_output_filename

auth_token = os.environ.get("your_auth_token")
gemini_api_key = os.environ.get("your_gemini_api_key")

image_path = "https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg"
output_dir = "./output"
output_file, _ = get_output_filename(image_path, output_dir)
fname = os.path.basename(output_file).split(".")[0]

result = process_image(
  fname=fname,
  server_url="https://vouchervision-go-738307415303.us-central1.run.app/",
  image_path=image_path,
  output_dir=output_dir,
  verbose=True,
  engines=["gemini-3-flash-preview"],
  prompt="SLTPvM_default.yaml",
  auth_token=auth_token,
  gemini_api_key=gemini_api_key  # Optional
)

Using Notebook Mode

Notebook mode runs OCR only (no JSON parsing), skips the text label collage pre-processing step, sends the full original image to the OCR model, and returns the OCR output formatted as Markdown. This is useful when you want clean, structured text output for downstream document processing, note-taking tools, or when you need to inspect raw OCR quality.

When notebook mode is enabled, the formatted_json field in the response will be empty and the OCR result will appear in the formatted_md field as Markdown. A .md file will also be saved alongside the .json file in your output directory.

From the command line

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "./demo/images/MICH_16205594_Poaceae_Jouvea_pilosa.jpg" \
  --output-dir "./results/notebook" \
  --notebook-mode \
  --verbose \
  --auth-token "your_auth_token"

From PyPI

import os
from VoucherVision import process_vouchers

auth_token = os.environ.get("your_auth_token")

process_vouchers(
  server="https://vouchervision-go-738307415303.us-central1.run.app/", 
  output_dir="./output_notebook", 
  image="https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg", 
  notebook_mode=True,  # Returns OCR as Markdown, skips JSON parsing
  verbose=True, 
  auth_token=auth_token
)

Skipping the Label Collage Step

By default, the server runs a pre-processing step that detects and crops label regions from the image before passing them to OCR (the "text collage"). This improves accuracy for herbarium sheet images where the specimen and labels share the same image.

Use --skip-label-collage to bypass this step and send the full original image directly to OCR. This is useful when:

  • Your images are already tightly cropped to the label
  • The collage detection is producing poor results for your collection type
  • You want faster processing and your images are clean single-label shots

From the command line

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "./demo/images/MICH_16205594_Poaceae_Jouvea_pilosa.jpg" \
  --output-dir "./results/no_collage" \
  --skip-label-collage \
  --verbose \
  --auth-token "your_auth_token"

From PyPI

import os
from VoucherVision import process_vouchers

auth_token = os.environ.get("your_auth_token")

process_vouchers(
  server="https://vouchervision-go-738307415303.us-central1.run.app/", 
  output_dir="./output_no_collage", 
  image="https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg", 
  skip_label_collage=True,  # Skip collage, use full image
  verbose=True, 
  save_to_xlsx=True,
  auth_token=auth_token
)

Using World Flora Online (WFO) Validation

The --include-wfo flag enables taxonomic validation against the World Flora Online database. This feature validates plant names and provides additional taxonomic information in the results.

When WFO validation is enabled, the results will include a WFO_info field containing taxonomic validation data and any corrections or additional information from the World Flora Online database.

From the Command Line (Options 2 & 3)

Single image with WFO validation:

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "./demo/images/MICH_16205594_Poaceae_Jouvea_pilosa.jpg" \
  --output-dir "./results/with_wfo" \
  --include-wfo \
  --verbose \
  --auth-token "your_auth_token"

Directory processing with WFO validation:

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --directory "./demo/images" \
  --output-dir "./results/bulk_wfo" \
  --include-wfo \
  --max-workers 4 \
  --auth-token "your_auth_token"

Combining with custom prompt and LLM model:

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg" \
  --output-dir "./results/advanced_wfo" \
  --prompt "SLTPvM_default_chromosome.yaml" \
  --llm-model "gemini-3.1-pro-preview" \
  --include-wfo \
  --verbose \
  --auth-token "your_auth_token"

From PyPI (Option 1)

Command line with PyPI installation:

vouchervision --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg \
  --output-dir ./output \
  --include-wfo \
  --verbose \
  --auth-token "your_auth_token"

Programmatic usage with PyPI:

import os
from VoucherVision import process_vouchers

auth_token = os.environ.get("your_auth_token")

process_vouchers(
  server="https://vouchervision-go-738307415303.us-central1.run.app/", 
  output_dir="./output", 
  prompt="SLTPvM_default.yaml", 
  image="https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg", 
  llm_model="gemini-3.1-pro-preview",
  include_wfo=True,  # Enable WFO validation
  verbose=True, 
  save_to_xlsx=True, 
  auth_token=auth_token
)

Single image processing with WFO:

import os
from client import process_image, ordereddict_to_json, get_output_filename

auth_token = os.environ.get("your_auth_token")

image_path = "https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg"
output_dir = "./output"
output_file, _ = get_output_filename(image_path, output_dir)  # returns (json_path, md_path)
fname = os.path.basename(output_file).split(".")[0]

result = process_image(
  fname=fname,
  server_url="https://vouchervision-go-738307415303.us-central1.run.app/", 
  image_path=image_path, 
  output_dir=output_dir, 
  verbose=True, 
  engines=["gemini-3-flash-preview"],
  prompt="SLTPvM_default.yaml",
  include_wfo=True,  # Enable WFO validation
  auth_token=auth_token
)

# The result will now include WFO validation data in the WFO_info field
output_dict = ordereddict_to_json(result, output_type="dict")
print("WFO Validation Results:", output_dict.get('WFO_info', 'No WFO data'))

API Usage

Using form data:

curl -X POST "https://vouchervision-go-738307415303.us-central1.run.app/process" \
  -H "Authorization: Bearer your_auth_token" \
  -F "file=@image.jpg" \
  -F "include_wfo=true"

Using URL processing:

curl -X POST "https://vouchervision-go-738307415303.us-central1.run.app/process-url" \
  -H "Authorization: Bearer your_auth_token" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://example.com/specimen.jpg",
    "include_wfo": true,
    "prompt": "SLTPvM_default.yaml"
  }'
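For callers who would rather not shell out to curl, the same `/process-url` request can be assembled with the Python standard library. This is a sketch under the assumption that the endpoint accepts exactly the JSON body shown in the curl example; `build_process_url_request` is a hypothetical helper, and it only builds the request without sending it.

```python
import json
import urllib.request


def build_process_url_request(
    server: str,
    auth_token: str,
    image_url: str,
    prompt: str = "SLTPvM_default.yaml",
    include_wfo: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /process-url endpoint."""
    payload = {
        "image_url": image_url,
        "include_wfo": include_wfo,
        "prompt": prompt,
    }
    return urllib.request.Request(
        url=server.rstrip("/") + "/process-url",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response is expected to be a JSON packet like the example shown earlier.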

Using Copernicus GLO-90 Elevation Data

The --include-cop90 flag enriches results with elevation data from the Copernicus GLO-90 Digital Surface Model (90 m resolution), derived from the TanDEM-X mission (DLR/Airbus) and distributed by ESA via OpenTopography.

When enabled, if decimalLatitude and decimalLongitude are present in the formatted JSON, the response will include the COP90 elevation (in meters) for those coordinates. This is supplemental data — it does not replace any verbatim elevation transcribed from the specimen label.
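Since the elevation lookup depends on both coordinate fields being filled in, a client-side check can tell you which records are eligible. The helper below is a hypothetical sketch that only inspects the `formatted_json` dictionary; it does not reproduce whatever validation the server performs.

```python
def has_coordinates(formatted_json: dict) -> bool:
    """Return True when both decimalLatitude and decimalLongitude are non-empty."""

    def present(value) -> bool:
        # Treat None and blank strings as missing; keep numeric 0 as valid.
        return value is not None and str(value).strip() != ""

    return present(formatted_json.get("decimalLatitude")) and present(
        formatted_json.get("decimalLongitude")
    )
```

For the example packet in this document, both coordinate fields are empty strings, so this check returns False.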

Contains modified Copernicus data (2011–2015). © DLR e.V. 2010–2014 and © Airbus Defence and Space GmbH 2014–2018, provided under Copernicus by the European Union and ESA.

From the Command Line (Options 2 & 3)

Single image with COP90 elevation:

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "./demo/images/MICH_16205594_Poaceae_Jouvea_pilosa.jpg" \
  --output-dir "./results/with_cop90" \
  --include-cop90 \
  --verbose \
  --auth-token "your_auth_token"

Directory processing with COP90 elevation:

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --directory "./demo/images" \
  --output-dir "./results/bulk_cop90" \
  --include-cop90 \
  --max-workers 4 \
  --auth-token "your_auth_token"

Combining with WFO validation and COP90 elevation:

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image "https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg" \
  --output-dir "./results/wfo_cop90" \
  --include-wfo \
  --include-cop90 \
  --verbose \
  --auth-token "your_auth_token"

From PyPI (Option 1)

Command line with PyPI installation:

vouchervision --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --image https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg \
  --output-dir ./output \
  --include-cop90 \
  --verbose \
  --auth-token "your_auth_token"

Programmatic usage with PyPI:

import os
from VoucherVision import process_vouchers

auth_token = os.environ.get("your_auth_token")

process_vouchers(
  server="https://vouchervision-go-738307415303.us-central1.run.app/",
  output_dir="./output",
  prompt="SLTPvM_default.yaml",
  image="https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg",
  include_cop90=True,  # Add COP90 elevation data
  verbose=True,
  save_to_xlsx=True,
  auth_token=auth_token
)

Single image processing with COP90:

import os
from client import process_image, ordereddict_to_json, get_output_filename

auth_token = os.environ.get("your_auth_token")

image_path = "https://swbiodiversity.org/imglib/seinet/sernec/EKY/31234100396/31234100396116.jpg"
output_dir = "./output"
output_file, _ = get_output_filename(image_path, output_dir)
fname = os.path.basename(output_file).split(".")[0]

result = process_image(
  fname=fname,
  server_url="https://vouchervision-go-738307415303.us-central1.run.app/",
  image_path=image_path,
  output_dir=output_dir,
  verbose=True,
  engines=["gemini-3-flash-preview"],
  prompt="SLTPvM_default.yaml",
  include_cop90=True,  # Add COP90 elevation data
  auth_token=auth_token
)

output_dict = ordereddict_to_json(result, output_type="dict")
print("COP90 Elevation:", output_dict.get('COP90_info', 'No COP90 data'))

API Usage

Using form data:

curl -X POST "https://vouchervision-go-738307415303.us-central1.run.app/process" \
  -H "Authorization: Bearer your_auth_token" \
  -F "file=@image.jpg" \
  -F "use_cop90=true"

Using URL processing:

curl -X POST "https://vouchervision-go-738307415303.us-central1.run.app/process-url" \
  -H "Authorization: Bearer your_auth_token" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://example.com/specimen.jpg",
    "use_cop90": true,
    "prompt": "SLTPvM_default.yaml"
  }'

Processing Large Batches with Parallel Workers

For large datasets, you can adjust the number of parallel workers:

python VoucherVision.py --server https://vouchervision-go-738307415303.us-central1.run.app/ \
  --file-list "./demo/txt/file_list32.txt" \
  --output-dir "./results/parallel" \
  --max-workers 32 \
  --save-to-xlsx \
  --auth-token "your_auth_token"

Contributing

If you encounter any issues or have suggestions for improvements, please open an issue in the main repository VoucherVisionGO.
