
Document Extraction Inference API Using a Pretrained DeepLabV3 Model


autocrop_kh

Automatic Document Segmentation and Cropping for Khmer IDs, Passports, and Documents

Autocrop_kh is a Python package for automatic document segmentation and cropping, with a focus on Khmer IDs, passports, and other documents. It uses a DeepLabV3 model trained on Khmer ID and passport document datasets to segment and extract documents from images.
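
To illustrate what "segment and extract" means, here is a minimal sketch of turning a binary segmentation mask into a crop. It is not the package's actual post-processing, which is not described here. The functions `mask_bounding_box` and `crop_with_mask` are hypothetical, and plain nested lists stand in for image arrays so the example has no dependencies.

```python
def mask_bounding_box(mask):
    """Return (top, left, bottom, right) of the nonzero region of a 2D mask,
    with bottom/right exclusive, or None if the mask is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop_with_mask(image, mask):
    """Crop a 2D image (list of rows) to the bounding box of the mask.
    Returns the image unchanged when the mask contains no document pixels."""
    box = mask_bounding_box(mask)
    if box is None:
        return image
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


# Toy 4x4 "image" whose pixel values are 0..15, with a 2x2 document region.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(crop_with_mask(image, mask))  # [[5, 6], [9, 10]]
```

An axis-aligned bounding box is the simplest possible choice. A real pipeline may instead fit a quadrilateral to the mask and correct perspective.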

License: Apache-2.0

Installation

Install from source

# clone the repo and enter it
git clone https://github.com/MetythornPenn/autocrop_kh.git
cd autocrop_kh

# install the library from source (editable mode)
pip install -e .

Install from PyPI

pip install autocrop-kh

Usage

Python Script

import cv2
from autocrop_kh import autocrop

# Download sample image from this url : "https://github.com/MetythornPenn/autocrop_kh/raw/main/sample/img-1.jpg"
# Model auto-download (default):
# The ONNX model is downloaded on first run from Hugging Face:
# "https://huggingface.co/metythorn/autocrop/resolve/main/autocrop_model_v2.onnx"

img_path = "sample/img-1.jpg"
model_path = None

extracted_document = autocrop(
    img_path=img_path,
    model_path=model_path,
    device='cuda:0',
    output_path="extracted_document.jpg"
)

print("Extracted document saved to extracted_document.jpg")
  • img_path: Path to the input image file.
  • model_path: Path to a local pre-trained ONNX model (.onnx only). If None, the model is downloaded automatically from Hugging Face.
  • device: Device to run inference on, cpu or cuda (for example cuda:0, as above). The default is cpu.
  • output_path: Optional. If set, the extracted image is saved to this path.
  • AUTOCROP_KH_MODEL_DIR: Optional environment variable that changes the download/cache directory.
  • AUTOCROP_KH_HF_REPO: Optional environment variable that changes the Hugging Face repo (default: metythorn/autocrop).
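
The parameters and environment variables above can be combined as in the following minimal sketch. The helpers `build_autocrop_kwargs` and `crop_to_folder`, and the `model_cache` and `outputs` directory names, are hypothetical and not part of autocrop_kh. The sketch assumes the cache directory variable is read when the model is first fetched, so it sets the variable before importing the package.

```python
import os
from pathlib import Path


def build_autocrop_kwargs(img_path, output_dir="outputs", device="cpu", model_path=None):
    """Hypothetical helper (not part of autocrop_kh): assemble keyword
    arguments for autocrop() using the parameter names documented above."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Name the result after the input file: img-1.jpg -> img-1_cropped.jpg
    src = Path(img_path)
    output_path = out_dir / f"{src.stem}_cropped{src.suffix}"
    return {
        "img_path": str(src),
        "model_path": model_path,  # None -> auto-download from Hugging Face
        "device": device,
        "output_path": str(output_path),
    }


def crop_to_folder(img_path, output_dir="outputs", device="cpu"):
    """Hypothetical wrapper: crop one image and return the saved file path."""
    # Assumption: the variable is read when the model is first fetched,
    # so it is set before the package is imported.
    os.environ.setdefault("AUTOCROP_KH_MODEL_DIR", "model_cache")
    from autocrop_kh import autocrop  # lazy import, after the env var is set

    kwargs = build_autocrop_kwargs(img_path, output_dir=output_dir, device=device)
    autocrop(**kwargs)
    return kwargs["output_path"]
```

Calling `crop_to_folder("sample/img-1.jpg")` would then run on the CPU and write the result under `outputs/`. Pass `device="cuda:0"` on a machine with a supported GPU.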

Result:

[Figure: two example image pairs, each with a left image and a right image]

Note: This model was trained on 25,000 samples, including open-source data and my custom synthetic data.
