Python extraction script for HVF report images

HVF Extraction Script

Python module for extracting data from Humphrey Visual Field (HVF) report images. It uses OCR (Tesseract) and image processing techniques (relying heavily on OpenCV) to convert report data into an object-oriented format for further processing.

Getting Started

Requirements

  • Python 3.6.7 or higher
  • TesserOCR
  • Regex
  • Pillow
  • OpenCV 4.2.0
  • FuzzyWuzzy
  • Fuzzysearch

Installation

Usage

Data is managed primarily through the Hvf_Object class, which contains the report metadata (name/ID, test date, field size, strategy, etc.) and the five data plots (raw sensitivity, total deviation value and percentile plots, and pattern deviation value and percentile plots). Plot data is stored as Hvf_Plot_Array objects (internally, 10x10 NumPy arrays), and individual plot elements are stored as either Hvf_Value or Hvf_Perc_Icon objects.

Importing and exporting data

Processing a single image:

from hvf_extraction_script import Hvf_Object
from hvf_extraction_script import File_Utils

# Path to the HVF report image (placeholder)
hvf_img_path = "path/to/image/file.PNG"
hvf_img = File_Utils.read_image_from_file(hvf_img_path)
hvf_obj = Hvf_Object.get_hvf_object_from_image(hvf_img)

Saving as a text file:

serialized_string = hvf_obj.serialize_to_json()
txt_file_path = "path/to/target/file/to/write"
File_Utils.write_string_to_file(serialized_string, txt_file_path)

Reinstantiating from a text file:

hvf_txt = File_Utils.read_text_from_file(txt_file_path)
hvf_obj = Hvf_Object.get_hvf_object_from_text(hvf_txt)

Export to spreadsheet (tab-separated values):

from hvf_extraction_script import Hvf_Export

# Takes in a dictionary of filename_string -> hvf_obj
# (hvf_obj1, hvf_obj2, hvf_obj3 are previously extracted Hvf_Object instances)
dict_of_hvf_objs = {"file1.PNG": hvf_obj1, "file2.PNG": hvf_obj2, "file3.PNG": hvf_obj3}
spreadsheet_string = Hvf_Export.export_hvf_list_to_spreadsheet(dict_of_hvf_objs)
File_Utils.write_string_to_file(spreadsheet_string, "output_spreadsheet.tsv")
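For a folder of report images, the filename -> hvf_obj dictionary can be built in a loop. The following is a minimal sketch and not part of the module: `build_hvf_dict` is a hypothetical helper, and the image reader and extractor are passed in as parameters so that the caller can supply `File_Utils.read_image_from_file` and `Hvf_Object.get_hvf_object_from_image` from the examples above. The accepted file extensions and the choice to record failures rather than stop are assumptions.

```python
import os


def build_hvf_dict(image_dir, read_image, extract,
                   extensions=(".png", ".jpg", ".jpeg")):
    """Hypothetical helper: map each image filename to its extracted object.

    read_image and extract are supplied by the caller, for example
    File_Utils.read_image_from_file and Hvf_Object.get_hvf_object_from_image.
    Returns (objects_by_filename, error_message_by_filename).
    """
    hvf_objs = {}
    failures = {}
    for filename in sorted(os.listdir(image_dir)):
        # Skip anything that does not look like an image file
        if not filename.lower().endswith(extensions):
            continue
        path = os.path.join(image_dir, filename)
        try:
            hvf_objs[filename] = extract(read_image(path))
        except Exception as err:
            # Assumption: one unreadable report should not abort the batch
            failures[filename] = str(err)
    return hvf_objs, failures
```

The first returned dictionary has the shape expected by `Hvf_Export.export_hvf_list_to_spreadsheet`; the second lists the files that could not be processed, so they can be reviewed by hand.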

Basic data usage: Structure of hvf_obj and underlying objects

Running Unit Tests

Single Image Testing:

Running a single image test extracts the data from a report image, pretty-prints the extracted data, and tests the serialization/deserialization procedures.

from hvf_extraction_script import Hvf_Test
from hvf_extraction_script import File_Utils

image_path = "path/to/image/file.PNG"
hvf_image = File_Utils.read_image_from_file(image_path)
Hvf_Test.test_single_image(hvf_image)

Unit Testing:

The module can run unit tests, but ships with no pre-loaded test cases. Unit tests are organized into named collections; each test compares the data extracted from an image against a reference text file. When a unit test image is added, the module (in its current state) generates the reference file purely from its own extraction, so the user must then manually check and correct that text file before it can serve as a valid reference. The image files and reference text files are stored under hvf_test_cases with corresponding names.
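The comparison of an extracted plot against its reference can be pictured as an element-by-element check. The sketch below is illustrative only and does not reproduce the module's internal code: `find_mismatches` is a hypothetical function, plain nested lists stand in for the 10x10 plot arrays, and the (row, column) index order is an assumption.

```python
def find_mismatches(expected, actual):
    """Return (row, col, expected_value, actual_value) for every differing cell."""
    if len(expected) != len(actual):
        raise ValueError("grids have different numbers of rows")
    mismatches = []
    for r, (exp_row, act_row) in enumerate(zip(expected, actual)):
        if len(exp_row) != len(act_row):
            raise ValueError(f"row {r} has different lengths")
        for c, (exp_val, act_val) in enumerate(zip(exp_row, act_row)):
            if exp_val != act_val:
                mismatches.append((r, c, exp_val, act_val))
    return mismatches


# Illustrative 10x10 grids with made-up values (not real report data)
reference = [[0] * 10 for _ in range(10)]
extracted = [row[:] for row in reference]
reference[5][2] = 24   # manually corrected reference value
extracted[5][2] = 21   # value produced by a (hypothetical) misread

for r, c, exp_val, act_val in find_mismatches(reference, extracted):
    print(f"Element ({r}, {c}) - expected {exp_val}, actual {act_val}")
```

An empty result corresponds to a full match for that plot; any entries are the individual cells that would need review.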

Adding unit tests:

from hvf_extraction_script import Hvf_Test

image_path = "path/to/image/file.PNG"
unit_test_name = "unit_test_name"
Hvf_Test.add_unit_test(image_path, unit_test_name)

# Then, manually correct reference text file under hvf_test_cases

Running unit tests:

Hvf_Test.test_unit_tests(unit_test_name)
...
[SYSTEM] ================================================================================
[SYSTEM] Starting test: v1_30
[SYSTEM] Test v1_30: FAILED ==============================
[SYSTEM] - Metadata: FULL MATCH
[SYSTEM] - Raw Value Plot MISMATCH COUNT: 1
[SYSTEM] --> Element (5, 2) - expected 24, actual 21
[SYSTEM] - Total Deviation Value Plot: FULL MATCH
[SYSTEM] - Pattern Deviation Value Plot: FULL MATCH
[SYSTEM] - Total Deviation Percentile Plot: FULL MATCH
[SYSTEM] - Pattern Deviation Percentile Plot: FULL MATCH
[SYSTEM] END Test v1_30 FAILURE REPORT =====================
[SYSTEM] ================================================================================
[SYSTEM] UNIT TEST AGGREGATE METRICS:
[SYSTEM] Total number of tests: 30
[SYSTEM] Average extraction time per report: 5868ms
[SYSTEM]
[SYSTEM] Total number of metadata fields: 510
[SYSTEM] Total number of metadata field errors: 16
[SYSTEM] Metadata field error rate: 0.031
[SYSTEM]
[SYSTEM] Total number of value data points: 5047
[SYSTEM] Total number of value data point errors: 44
[SYSTEM] Value data point error rate: 0.009
[SYSTEM]
[SYSTEM] Total number of percentile data points: 3453
[SYSTEM] Total number of percentile data point errors: 0
[SYSTEM] Percentile data point error rate: 0.0
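The error rates in the aggregate metrics are consistent with a simple ratio of errors to total data points. The short sketch below recomputes them from the counts printed above; `error_rate` is a hypothetical helper written for this illustration, not a function of the module, and the three-decimal rounding is chosen only to match the sample output.

```python
def error_rate(errors, total):
    """Fraction of data points that were extracted incorrectly."""
    if total == 0:
        return 0.0  # assumption: an empty category counts as error-free
    return errors / total


# (errors, total) pairs copied from the sample output above
counts = {
    "Metadata field": (16, 510),
    "Value data point": (44, 5047),
    "Percentile data point": (0, 3453),
}

for label, (errors, total) in counts.items():
    print(f"{label} error rate: {error_rate(errors, total):.3f}")
```

In this sample run, roughly 3% of metadata fields and under 1% of numeric plot values were misread, while the percentile icons were extracted without error.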

Authors

  • Murtaza Saifee, MD - Ophthalmology resident, UCSF

Validation

In progress

License

GPL License

Using/Contributing

This project was developed in the spirit of facilitating vision research. To that end, we encourage all to download, use, critique and improve upon the project. Collaboration requests are also welcomed.

Acknowledgements

  • PyImageSearch for excellent tutorials on image processing

