An image binarization library focussing on local adaptive thresholding

DoxaPy

Introduction

DoxaPy is an image binarization library focusing on local adaptive thresholding algorithms. In plain English, it turns a color or grayscale image into a black and white image.

Algorithms

  • Otsu - "A threshold selection method from gray-level histograms", 1979.
  • Bernsen - "Dynamic thresholding of gray-level images", 1986.
  • Niblack - "An Introduction to Digital Image Processing", 1986.
  • Sauvola - "Adaptive document image binarization", 1999.
  • Wolf - "Extraction and Recognition of Artificial Text in Multimedia Documents", 2003.
  • Gatos - "Adaptive degraded document image binarization", 2005. (Partial)
  • NICK - "Comparison of Niblack inspired Binarization methods for ancient documents", 2009.
  • AdOtsu - "A multi-scale framework for adaptive binarization of degraded document images", 2010.
  • Su - "Binarization of Historical Document Images Using the Local Maximum and Minimum", 2010.
  • T.R. Singh - "A New local Adaptive Thresholding Technique in Binarization", 2011.
  • Bataineh - "An adaptive local binarization method for document images based on a novel thresholding method and dynamic windows", 2011. (Unreproducible)
  • ISauvola - "ISauvola: Improved Sauvola's Algorithm for Document Image Binarization", 2016.
  • WAN - "Binarization of Document Image Using Optimum Threshold Modification", 2018.
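To make "local adaptive thresholding" concrete, here is a minimal NumPy sketch of the Sauvola rule, T = m * (1 + k * (s / R - 1)), where m and s are the mean and standard deviation of the window around each pixel. It is an illustration only and not DoxaPy's implementation: the function name `sauvola_naive`, the reflected border padding, and the defaults (`window=15`, `k=0.2`, `r=128`) are assumptions chosen for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sauvola_naive(gray, window=15, k=0.2, r=128.0):
    """Binarize an 8-bit grayscale array with the Sauvola threshold.

    Hypothetical helper for illustration; not part of DoxaPy.
    The threshold mean * (1 + k * (std / r - 1)) is computed per pixel
    over a window x window neighborhood (borders padded by reflection).
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    pad = window // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="reflect")

    # One (window x window) patch per pixel; simple but memory hungry.
    patches = sliding_window_view(padded, (window, window))
    mean = patches.mean(axis=(-2, -1))
    std = patches.std(axis=(-2, -1))

    threshold = mean * (1.0 + k * (std / r - 1.0))

    # Pixels at or below the local threshold become text (0),
    # everything else becomes background (255).
    return np.where(gray <= threshold, 0, 255).astype(np.uint8)
```

Because the threshold follows the local mean and contrast, a dark stroke is separated from a bright page even when the illumination changes across the image. A single global threshold, such as Otsu's, cannot adapt in that way.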

Optimizations

  • Shafait - "Efficient Implementation of Local Adaptive Thresholding Techniques Using Integral Images", 2008.
  • Petty - An algorithm for efficiently calculating the min and max of a local window. Unpublished, 2019.
  • Chan - "Memory-efficient and fast implementation of local adaptive binarization methods", 2019.
  • SIMD - SSE2, ARM NEON
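The integral image idea behind the Shafait optimization can be sketched in a few lines of NumPy. Two summed-area tables, one of pixel values and one of squared values, give each window's mean and standard deviation from four table lookups, so the cost per pixel no longer depends on the window size. This is a simplified illustration under stated assumptions, not DoxaPy's code: the name `local_mean_std` is hypothetical, and windows are clipped at the image border.

```python
import numpy as np


def local_mean_std(gray, window=15):
    """Per-pixel mean and standard deviation over a window x window box.

    Hypothetical helper for illustration; not part of DoxaPy.
    Windows that extend past the border are clipped, not padded.
    """
    g = gray.astype(np.float64)
    h, w = g.shape
    half = window // 2

    # Summed-area tables with a leading row and column of zeros.
    sat = np.zeros((h + 1, w + 1))
    sat_sq = np.zeros((h + 1, w + 1))
    sat[1:, 1:] = g.cumsum(axis=0).cumsum(axis=1)
    sat_sq[1:, 1:] = (g * g).cumsum(axis=0).cumsum(axis=1)

    # Clipped window bounds (end-exclusive) for every row and column.
    y0 = np.clip(np.arange(h) - half, 0, h)
    y1 = np.clip(np.arange(h) + half + 1, 0, h)
    x0 = np.clip(np.arange(w) - half, 0, w)
    x1 = np.clip(np.arange(w) + half + 1, 0, w)

    def box_sum(table):
        # Sum of each window from four corner lookups.
        return (table[y1][:, x1] - table[y0][:, x1]
                - table[y1][:, x0] + table[y0][:, x0])

    area = (y1 - y0)[:, None] * (x1 - x0)[None, :]
    mean = box_sum(sat) / area
    # Guard against tiny negative values caused by rounding.
    variance = np.maximum(box_sum(sat_sq) / area - mean * mean, 0.0)
    return mean, np.sqrt(variance)
```

The returned mean and standard deviation are the two statistics that Niblack-style thresholds (Niblack, Sauvola, Wolf, NICK) are built from.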

Performance Metrics

  • Overall Accuracy
  • F-Measure, Precision, Recall
  • Pseudo F-Measure, Precision, Recall - "Performance Evaluation Methodology for Historical Document Image Binarization", 2013.
  • Peak Signal-To-Noise Ratio (PSNR)
  • Negative Rate Metric (NRM)
  • Matthews Correlation Coefficient (MCC)
  • Distance-Reciprocal Distortion Measure (DRDM) - "An Objective Distortion Measure for Binary Document Images Based on Human Visual Perception", 2002.
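Most of the metrics above follow directly from the confusion matrix between a ground truth image and a binarized image. The sketch below computes them with their standard definitions. It is not DoxaPy's implementation: the function name `binarization_metrics` and the dictionary keys are hypothetical, text pixels are assumed to be 0 (black), and values are returned as fractions, which may differ from the scale DoxaPy reports. The pseudo metrics and DRDM need weighting schemes and are left out.

```python
import math
import numpy as np


def binarization_metrics(ground_truth, binary):
    """Confusion-matrix metrics for two same-shaped binary images.

    Hypothetical helper for illustration; not part of DoxaPy.
    Assumes text pixels are 0 and background pixels are non-zero.
    """
    gt_text = np.asarray(ground_truth) == 0
    out_text = np.asarray(binary) == 0

    tp = int(np.sum(gt_text & out_text))    # text found as text
    tn = int(np.sum(~gt_text & ~out_text))  # background kept as background
    fp = int(np.sum(~gt_text & out_text))   # background marked as text
    fn = int(np.sum(gt_text & ~out_text))   # text that was missed
    total = tp + tn + fp + fn

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)

    # With pixels normalized to {0, 1}, the MSE is the mismatch fraction.
    mse = (fp + fn) / total
    psnr = 10 * math.log10(1 / mse) if mse else math.inf

    # NRM averages the false negative and false positive rates.
    nr_fn = fn / (fn + tp) if fn + tp else 0.0
    nr_fp = fp / (fp + tn) if fp + tn else 0.0
    nrm = (nr_fn + nr_fp) / 2

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "psnr": psnr,
        "nrm": nrm,
        "mcc": mcc,
    }
```

Higher is better for accuracy, F-measure, PSNR, and MCC, while NRM measures error, so lower is better.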

Overview

DoxaPy uses the Δoxa Binarization Framework to quickly process images in Python. It comprises three major sets of algorithms: Color to Grayscale, Grayscale to Binary, and Performance Metrics. It can be used as a full DIBCO Metrics replacement that is significantly smaller, faster, and easier to integrate into existing projects.

Example

This short demo uses DoxaPy to read in a color image, convert it to binary, and then compare it to a Ground Truth image in order to calculate performance.

from PIL import Image
import numpy as np
import doxapy


def read_image(file, algorithm=doxapy.GrayscaleAlgorithms.MEAN):
    """Read an image.  If it's color, use one of our many Grayscale algorithms to convert it."""
    image = Image.open(file)

    # If already 8-bit grayscale (mode 'L'), do not convert it
    if image.mode == 'L':
        return np.array(image)

    # Read the color image, converting any other mode (palette, 1-bit, CMYK, ...) to RGB first
    rgb_image = np.array(image.convert('RGB') if image.mode not in ('RGB', 'RGBA') else image)

    # Use Doxa to convert the color image to grayscale
    return doxapy.to_grayscale(algorithm, rgb_image)


# Read our target image and convert it to grayscale
grayscale_image = read_image("2JohnC1V3.png")

# Convert the grayscale image to a binary image (algorithm parameters optional)
binary_image = doxapy.to_binary(doxapy.Binarization.Algorithms.SAUVOLA, grayscale_image, {"window": 75, "k": 0.2})

# Calculate the binarization performance using a Ground Truth image
groundtruth_image = read_image("2JohnC1V3-GroundTruth.png")
performance = doxapy.calculate_performance(groundtruth_image, binary_image)
print(performance)

# Display our resulting image
Image.fromarray(binary_image).show()

DoxaPy Notebook

For more details, open the DoxaPy Notebook for an interactive demo.

Building and Testing

DoxaPy supports 64-bit Linux, Windows, and macOS on Python 3.x. Starting with DoxaPy 0.9.4, Python 3.12 and above are supported with full ABI compatibility. As a result, new versions of DoxaPy will be published only for feature enhancements, not to add support for new Python versions.

Build from Project Root

# Clone the repository, then build and test from the Doxa project root
git clone --depth 1 https://github.com/brandonmpetty/Doxa.git
cd Doxa
cmake --preset python
cmake --build build-python --config Release
pip install -r Bindings/Python/requirements.txt
ctest --test-dir build-python -C Release

Local Package Build

python -m build

Local Wheel Build

pip wheel . --no-deps

License

CC0 - Brandon M. Petty, 2026

To the extent possible under law, the author(s) have dedicated all copyright and related and neighboring rights to this software to the public domain worldwide. This software is distributed without any warranty.

View Online

"Freely you have received; freely give." - Matt 10:8
