Document Scanner SDK for document edge detection, border cropping, perspective correction and brightness adjustment

These details have not been verified by PyPI

Project links

Homepage

Project description

Python Document Scanner SDK

This project provides Python bindings for the Dynamsoft C/C++ Document Scanner SDK v1.x, enabling developers to quickly create document scanner applications for Windows and Linux desktop environments.

Note: This project is an unofficial, community-maintained Python wrapper for the Dynamsoft Document Normalizer SDK. For those seeking the most reliable and fully-supported solution, Dynamsoft offers an official Python package. Visit the Dynamsoft Capture Vision Bundle page on PyPI for more details.

About Dynamsoft Capture Vision Bundle

Activate the SDK with a 30-day FREE trial license.
Install the SDK via `pip install dynamsoft-capture-vision-bundle`.

Comparison Table

Feature	Unofficial Wrapper (Community)	Official Dynamsoft Capture Vision SDK
Support	Community-driven, best effort	Official support from Dynamsoft
Documentation	README only	Comprehensive Online Documentation
API Coverage	Limited	Full API coverage
Feature Updates	May lag behind the official SDK	First to receive new features
Compatibility	Limited testing across environments	Thoroughly tested across all supported environments
OS Support	Windows, Linux	Windows, Linux, macOS

Supported Python Versions

Python 3.x

Dependencies

Install the required dependencies using pip:

pip install opencv-python

Command-line Usage

Scan documents from images:

scandocument -f <file-name> -l <license-key>

Scan documents from a camera video stream:
```
scandocument -c 1 -l <license-key>
```
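
Both invocations above could be served by a single argument parser. The following is a minimal sketch, not the packaged `scandocument` script: the long option name `--camera`, its integer type, and the helper names `build_parser` and `choose_mode` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser that accepts the -f, -c and -l options shown above."""
    parser = argparse.ArgumentParser(
        description='Scan documents from an image file or a camera stream')
    parser.add_argument('-f', '--file', help='Path to the image file')
    # Assumption: -c takes an integer camera index; the long name is hypothetical.
    parser.add_argument('-c', '--camera', type=int, default=None,
                        help='Index of the camera to open')
    parser.add_argument('-l', '--license', default='', type=str,
                        help='Set a valid license key')
    return parser


def choose_mode(args):
    """Return 'file', 'camera', or None when neither source was given."""
    if args.file is not None:
        return 'file'
    if args.camera is not None:
        return 'camera'
    return None


if __name__ == '__main__':
    parsed = build_parser().parse_args()
    print(choose_mode(parsed))
```

Checking `is not None` rather than truthiness matters here, because camera index `0` is a valid value.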

Quick Start

Scan documents from an image file:

import argparse
import docscanner
import sys
import numpy as np
import cv2
import time

def showNormalizedImage(name, normalized_image):
    mat = docscanner.convertNormalizedImage2Mat(normalized_image)
    cv2.imshow(name, mat)
    return mat

def process_file(filename, scanner):
    image = cv2.imread(filename)
    if image is None:
        # cv2.imread returns None when the file is missing or unreadable
        print('Failed to load image: ' + filename)
        return

    normalized_image = None
    results = scanner.detectMat(image)
    for result in results:
        x1 = result.x1
        y1 = result.y1
        x2 = result.x2
        y2 = result.y2
        x3 = result.x3
        y3 = result.y3
        x4 = result.x4
        y4 = result.y4

        normalized_image = scanner.normalizeBuffer(image, x1, y1, x2, y2, x3, y3, x4, y4)
        showNormalizedImage("Normalized Image", normalized_image)
        cv2.drawContours(image, [np.intp([(x1, y1), (x2, y2), (x3, y3), (x4, y4)])], 0, (0, 255, 0), 2)

    cv2.imshow('Document Image', image)
    cv2.waitKey(0)

    # Only the last normalized document is saved; skip saving if none was detected
    if normalized_image is not None:
        normalized_image.save(str(time.time()) + '.png')
        print('Image saved')

def scandocument():
    """
    Command-line script for scanning documents from a given image
    """
    parser = argparse.ArgumentParser(description='Scan documents from an image file')
    parser.add_argument('-f', '--file', help='Path to the image file')
    parser.add_argument('-l', '--license', default='', type=str, help='Set a valid license key')
    args = parser.parse_args()
    # print(args)
    try:
        filename = args.file
        license = args.license
        
        if filename is None:
            parser.print_help()
            return
        
        # set license
        if license == '':
            docscanner.initLicense("LICENSE-KEY")
        else:
            docscanner.initLicense(license)
            
        # initialize document scanner
        scanner = docscanner.createInstance()
        ret = scanner.setParameters(docscanner.Templates.color)

        if filename is not None:
            process_file(filename, scanner)
            
    except Exception as err:
        print(err)
        sys.exit(1)

scandocument()
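
The eight coordinate assignments in the loop above can be collapsed into one call. This is a sketch under the assumption that each result exposes `x1` through `y4` as attributes, as in the script above; `quad_points` and `flat_coords` are hypothetical helper names, not part of `docscanner`.

```python
from types import SimpleNamespace


def quad_points(result):
    """Return the four corners of a detection result as (x, y) tuples."""
    return [(getattr(result, f'x{i}'), getattr(result, f'y{i}'))
            for i in range(1, 5)]


def flat_coords(result):
    """Return x1, y1, ..., x4, y4 as a flat list, in normalizeBuffer order."""
    return [value for point in quad_points(result) for value in point]


# Stand-in object for demonstration; real results come from scanner.detectMat().
demo = SimpleNamespace(x1=0, y1=0, x2=100, y2=0, x3=100, y3=50, x4=0, y4=50)
print(quad_points(demo))
print(flat_coords(demo))
```

With these helpers, the normalization call could be written as `scanner.normalizeBuffer(image, *flat_coords(result))` and the contour as `np.intp(quad_points(result))`.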

Scan documents from camera video stream:

import argparse
import docscanner
import sys
import numpy as np
import cv2
import time

g_results = None
g_normalized_images = []


def callback(results):
    global g_results
    g_results = results


def showNormalizedImage(name, normalized_image):
    mat = docscanner.convertNormalizedImage2Mat(normalized_image)
    cv2.imshow(name, mat)
    return mat


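def process_video(scanner):
    # Declared global because the list is reassigned inside this function;
    # without this, pressing "s" before "n" raises UnboundLocalError
    global g_normalized_images
    scanner.addAsyncListener(callback)

    cap = cv2.VideoCapture(0)
    while True:
        ret, image = cap.read()
        if not ret:
            # Stop when the camera cannot deliver a frame
            break

        ch = cv2.waitKey(1)
        if ch == 27:
            break
        elif ch == ord('n'):  # normalize image
            if g_results is not None:
                g_normalized_images = []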
                index = 0
                for result in g_results:
                    x1 = result.x1
                    y1 = result.y1
                    x2 = result.x2
                    y2 = result.y2
                    x3 = result.x3
                    y3 = result.y3
                    x4 = result.x4
                    y4 = result.y4

                    normalized_image = scanner.normalizeBuffer(
                        image, x1, y1, x2, y2, x3, y3, x4, y4)
                    g_normalized_images.append(
                        (str(index), normalized_image))
                    mat = showNormalizedImage(str(index), normalized_image)
                    index += 1
        elif ch == ord('s'):  # save image
            for data in g_normalized_images:
                # cv2.imwrite('images/' + str(time.time()) + '.png', image)
                cv2.destroyWindow(data[0])
                data[1].save(str(time.time()) + '.png')
                print('Image saved')

            g_normalized_images = []

        if image is not None:
            scanner.detectMatAsync(image)

        if g_results is not None:
            for result in g_results:
                x1 = result.x1
                y1 = result.y1
                x2 = result.x2
                y2 = result.y2
                x3 = result.x3
                y3 = result.y3
                x4 = result.x4
                y4 = result.y4

                cv2.drawContours(
                    image, [np.intp([(x1, y1), (x2, y2), (x3, y3), (x4, y4)])], 0, (0, 255, 0), 2)

        cv2.putText(image, 'Press "n" to normalize image',
                    (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
        cv2.putText(image, 'Press "s" to save image', (10, 60),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
        cv2.putText(image, 'Press "ESC" to exit', (10, 90),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
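        cv2.imshow('Document Scanner', image)

    # Release the camera, close the windows and stop the native worker thread
    cap.release()
    cv2.destroyAllWindows()
    scanner.clearAsyncListener()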


docscanner.initLicense(
    "LICENSE-KEY")

scanner = docscanner.createInstance()
ret = scanner.setParameters(docscanner.Templates.color)
process_video(scanner)
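
The script above passes results from the listener to the main loop through a plain global. Since the API section describes the listener as running on a native thread, an optional alternative is to guard the shared value with a lock. The class below is a sketch, not part of `docscanner`; `LatestResults` is a hypothetical name, and passing a bound method to `addAsyncListener` is an assumption that the source does not confirm.

```python
import threading


class LatestResults:
    """Hold the most recent detection results behind a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._results = None

    def update(self, results):
        # Intended to be registered as the async listener.
        with self._lock:
            self._results = results

    def get(self):
        # Intended to be called from the main display loop.
        with self._lock:
            return self._results


latest = LatestResults()
# Assumed usage, mirroring callback() above:
# scanner.addAsyncListener(latest.update)
worker = threading.Thread(target=latest.update, args=(['demo-result'],))
worker.start()
worker.join()
print(latest.get())
```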

API Methods

docscanner.initLicense('YOUR-LICENSE-KEY'): Set the license key.
```
docscanner.initLicense("LICENSE-KEY")
```
docscanner.createInstance(): Create a Document Scanner instance.
```
scanner = docscanner.createInstance()
```
detectFile(filename): Perform edge detection from an image file.
```
results = scanner.detectFile(<filename>)
```

detectMat(Mat image): Perform edge detection from an OpenCV Mat.

image = cv2.imread(<filename>)
results = scanner.detectMat(image)
for result in results:
    x1 = result.x1
    y1 = result.y1
    x2 = result.x2
    y2 = result.y2
    x3 = result.x3
    y3 = result.y3
    x4 = result.x4
    y4 = result.y4
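
When several quadrilaterals are returned, an application may want only the most prominent one. The sketch below ranks results by area using the shoelace formula. It assumes the `x1` through `y4` attributes shown above; choosing the largest quadrilateral is an illustrative heuristic rather than SDK behavior, and `quad_area` and `largest_document` are hypothetical names.

```python
from types import SimpleNamespace


def quad_area(result):
    """Area of the detected quadrilateral via the shoelace formula."""
    pts = [(result.x1, result.y1), (result.x2, result.y2),
           (result.x3, result.y3), (result.x4, result.y4)]
    total = 0
    for i in range(4):
        xa, ya = pts[i]
        xb, yb = pts[(i + 1) % 4]
        total += xa * yb - xb * ya
    return abs(total) / 2


def largest_document(results):
    """Return the result with the largest area, or None if nothing was found."""
    return max(results or [], key=quad_area, default=None)


# Stand-in results for demonstration; real ones come from scanner.detectMat().
small = SimpleNamespace(x1=0, y1=0, x2=2, y2=0, x3=2, y3=2, x4=0, y4=2)
big = SimpleNamespace(x1=0, y1=0, x2=10, y2=0, x3=10, y3=10, x4=0, y4=10)
print(quad_area(largest_document([small, big])))
```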

setParameters(Template): Select color, binary, or grayscale template.
```
scanner.setParameters(docscanner.Templates.color)
```
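
If the template is chosen at run time, for example from a command-line option, a small lookup can reject unknown names early. Only `docscanner.Templates.color` appears in this README; the attribute names `binary` and `grayscale` are inferred from the description above and should be treated as assumptions. `select_template` is a hypothetical helper.

```python
from types import SimpleNamespace

# Assumption: these names mirror attributes of docscanner.Templates.
TEMPLATE_NAMES = ('color', 'binary', 'grayscale')


def select_template(templates, name):
    """Look up a template by name on a Templates-like object."""
    if name not in TEMPLATE_NAMES:
        raise ValueError(
            f'unknown template {name!r}; expected one of {TEMPLATE_NAMES}')
    return getattr(templates, name)


# Stand-in for docscanner.Templates so the sketch runs without the SDK.
demo_templates = SimpleNamespace(color='color-template',
                                 binary='binary-template',
                                 grayscale='grayscale-template')
print(select_template(demo_templates, 'binary'))
```

With the SDK installed, the call would presumably read `scanner.setParameters(select_template(docscanner.Templates, 'color'))`.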
addAsyncListener(callback function): Start a native thread to run document scanning tasks asynchronously.

detectMatAsync(<opencv mat data>): Queue a document scanning task into the native thread.

import cv2
from time import sleep

def callback(results):
    for result in results:
        print(result.x1)
        print(result.y1)
        print(result.x2)
        print(result.y2)
        print(result.x3)
        print(result.y3)
        print(result.x4)
        print(result.y4)

image = cv2.imread(<filename>)
scanner.addAsyncListener(callback)
scanner.detectMatAsync(image)
# Give the native thread time to finish before the script exits
sleep(5)
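
A fixed sleep either waits longer than necessary or not long enough. One alternative is to block on a `threading.Event` that the listener sets when results arrive. The sketch below simulates delivery with a timer so that it runs without the SDK; `ResultWaiter` is a hypothetical helper, and registering its bound method with `addAsyncListener` is an assumption.

```python
import threading


class ResultWaiter:
    """Collect one batch of results and let the caller block until it arrives."""

    def __init__(self):
        self._event = threading.Event()
        self.results = None

    def callback(self, results):
        # Runs on whichever thread delivers the results.
        self.results = results
        self._event.set()

    def wait(self, timeout=5.0):
        """Return the results, or None if nothing arrived within `timeout` seconds."""
        if self._event.wait(timeout):
            return self.results
        return None


# Simulated delivery. With the SDK one would instead call
# scanner.addAsyncListener(waiter.callback) and scanner.detectMatAsync(image).
waiter = ResultWaiter()
threading.Timer(0.1, waiter.callback, args=(['quad'],)).start()
print(waiter.wait())
```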

normalizeBuffer(mat, x1, y1, x2, y2, x3, y3, x4, y4): Perform perspective correction from an OpenCV Mat.
```
normalized_image = scanner.normalizeBuffer(image, x1, y1, x2, y2, x3, y3, x4, y4)
```
normalizeFile(filename, x1, y1, x2, y2, x3, y3, x4, y4): Perform perspective correction from an image file.
```
normalized_image = scanner.normalizeFile(<filename>, x1, y1, x2, y2, x3, y3, x4, y4)
```
normalized_image.save(filename): Save the normalized image to a file.
```
normalized_image.save(<filename>)
```
normalized_image.recycle(): Release the memory of the normalized image.
clearAsyncListener(): Stop the native thread and clear the registered Python function.
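
The file-based methods above can be chained into one helper that detects, normalizes, saves and then recycles each document. This is a sketch using only the method names listed in this section. Whether `detectFile` returns an empty list or `None` when nothing is found is not stated here, so the code tolerates both. `scan_file`, `DemoScanner` and `DemoImage` are hypothetical names; the two demo classes are stand-ins so the example runs without the SDK or a license key.

```python
from types import SimpleNamespace


def scan_file(scanner, filename, output_prefix='normalized'):
    """Detect documents in `filename`, save each one, and return the paths."""
    saved = []
    results = scanner.detectFile(filename) or []
    for index, r in enumerate(results):
        normalized = scanner.normalizeFile(
            filename, r.x1, r.y1, r.x2, r.y2, r.x3, r.y3, r.x4, r.y4)
        try:
            path = f'{output_prefix}_{index}.png'
            normalized.save(path)
            saved.append(path)
        finally:
            # Release the native buffer even if saving fails.
            normalized.recycle()
    return saved


class DemoImage:
    """Stand-in for a normalized image; records calls instead of writing files."""

    def __init__(self, log):
        self.log = log

    def save(self, path):
        self.log.append(('save', path))

    def recycle(self):
        self.log.append(('recycle',))


class DemoScanner:
    """Stand-in for the object returned by docscanner.createInstance()."""

    def __init__(self):
        self.log = []

    def detectFile(self, filename):
        return [SimpleNamespace(x1=0, y1=0, x2=9, y2=0, x3=9, y3=9, x4=0, y4=9)]

    def normalizeFile(self, filename, *coords):
        self.log.append(('normalize', coords))
        return DemoImage(self.log)


demo_scanner = DemoScanner()
print(scan_file(demo_scanner, 'input.png'))
```

Placing `recycle()` in a `finally` block is a design choice for this sketch: it keeps native memory from accumulating when many files are processed in a loop.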

How to Build the Python Document Scanner Extension

Create a source distribution:
```
python setup.py sdist
```

Build and install in development mode with setuptools:

python setup_setuptools.py build
python setup_setuptools.py develop

Build wheel:

pip wheel . --verbose
# Or
python setup.py bdist_wheel

Release history

1.1.1 (this version): Oct 15, 2024
1.1.0: Aug 16, 2024
1.0.3: Nov 21, 2022
1.0.2: Oct 18, 2022
1.0.1: Sep 6, 2022
1.0.0: Sep 6, 2022

Download files

Source Distribution

document-scanner-sdk-1.1.1.tar.gz (22.0 MB)

Uploaded Oct 15, 2024 Source

Built Distribution

document_scanner_sdk-1.1.1-cp310-cp310-win_amd64.whl (8.0 MB)

Uploaded Oct 15, 2024 CPython 3.10 Windows x86-64

File details

Details for the file document-scanner-sdk-1.1.1.tar.gz.

File metadata

Download URL: document-scanner-sdk-1.1.1.tar.gz
Upload date: Oct 15, 2024
Size: 22.0 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for document-scanner-sdk-1.1.1.tar.gz
Algorithm	Hash digest
SHA256	`9c9895577b18129abcb3d68cfbc35c25e88501d6c46eff0410fc72ade057256c`
MD5	`d81c561b1914defb6177317bd631e6b5`
BLAKE2b-256	`693a4592ed53a0dc1cee1ef95fa072255bf078135fa374eb3bf1deab2ca900c9`
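
The SHA256 digest listed above can be checked locally after downloading the archive. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches` are hypothetical, and the file path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True when the file's SHA256 digest equals `expected_hex`."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, once the archive has been downloaded:
# matches('document-scanner-sdk-1.1.1.tar.gz',
#         '9c9895577b18129abcb3d68cfbc35c25e88501d6c46eff0410fc72ade057256c')
```

Reading in chunks keeps memory use flat, which is relevant for a 22.0 MB source archive.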

File details

Details for the file document_scanner_sdk-1.1.1-cp310-cp310-win_amd64.whl.

File metadata

Download URL: document_scanner_sdk-1.1.1-cp310-cp310-win_amd64.whl
Upload date: Oct 15, 2024
Size: 8.0 MB
Tags: CPython 3.10, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for document_scanner_sdk-1.1.1-cp310-cp310-win_amd64.whl
Algorithm	Hash digest
SHA256	`23ba23c00107020fca5830e0a7e9d99058669ad291cd54974bcb359350a07823`
MD5	`a86ee0778e3eb2c3292b52d45b39dc3c`
BLAKE2b-256	`d9194dfb8c82f74376201f189a0193b9c5297c62557051224b85507e6772dae9`