Document Scanner SDK for document edge detection, border cropping, perspective correction and brightness adjustment

Project description

Python Document Scanner SDK

This project is a Python binding for the Dynamsoft C/C++ Document Scanner SDK. It helps developers quickly build desktop document scanner applications in Python on Windows and Linux.

About Dynamsoft Document Scanner

Get a 30-day FREE trial license to activate the SDK.

Supported Python Versions

  • Python 3.x

Dependencies

pip install opencv-python

Command-line Usage

# Scan documents from images
$ scandocument -f <file-name> -l <license-key>

# Scan documents from camera video stream
$ scandocument -c 1 -l <license-key>
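
The two invocations above share the `-l` option and differ only in the input source. As a rough sketch of how a single entry point might tell them apart, the following uses only `argparse` from the standard library. The function name `parse_source` and the returned tuple are illustrative assumptions; the installed `scandocument` script may be implemented differently.

```python
import argparse


def parse_source(argv):
    """Return (mode, filename, license) for a scandocument-style command line.

    Hypothetical helper: the names and the return shape are not part of the SDK.
    """
    parser = argparse.ArgumentParser(
        description="Scan documents from an image file or a camera")
    # Exactly one input source must be given: a file path or the camera flag.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-f", "--file", help="Path to the image file")
    source.add_argument("-c", "--camera", type=int,
                        help="Set to 1 to scan from the camera video stream")
    parser.add_argument("-l", "--license", default="",
                        help="Set a valid license key")
    args = parser.parse_args(argv)

    if args.file is not None:
        return ("file", args.file, args.license)
    if args.camera == 1:
        return ("camera", None, args.license)
    # Any other camera value (for example 0) selects no source at all.
    parser.error("use -f <file-name> or -c 1")
```

Calling `parse_source(["-f", "page.png", "-l", "KEY"])` selects file mode, while `parse_source(["-c", "1"])` selects camera mode with an empty license string.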

Quick Start

  • Scan documents from an image file:

    import argparse
    import docscanner
    import sys
    import numpy as np
    import cv2
    import time
    
    def showNormalizedImage(name, normalized_image):
        mat = docscanner.convertNormalizedImage2Mat(normalized_image)
        cv2.imshow(name, mat)
        return mat
    
    def process_file(filename, scanner):
        image = cv2.imread(filename)
        if image is None:
            print('Failed to read ' + filename)
            return
        
        results = scanner.detectMat(image)
        normalized_image = None
        for result in results:
            x1 = result.x1
            y1 = result.y1
            x2 = result.x2
            y2 = result.y2
            x3 = result.x3
            y3 = result.y3
            x4 = result.x4
            y4 = result.y4
            
            normalized_image = scanner.normalizeBuffer(image, x1, y1, x2, y2, x3, y3, x4, y4)
            showNormalizedImage("Normalized Image", normalized_image)
            # np.int0 was removed in NumPy 2.0; an int32 array works with OpenCV
            cv2.drawContours(image, [np.array([(x1, y1), (x2, y2), (x3, y3), (x4, y4)], dtype=np.int32)], 0, (0, 255, 0), 2)
        
        cv2.imshow('Document Image', image)
        cv2.waitKey(0)
        
        # Save the last normalized image, if any document was detected
        if normalized_image is not None:
            normalized_image.save(str(time.time()) + '.png')
            print('Image saved')
    
    def scandocument():
        """
        Command-line script for scanning documents from a given image
        """
        parser = argparse.ArgumentParser(description='Scan documents from an image file')
        parser.add_argument('-f', '--file', help='Path to the image file')
        parser.add_argument('-l', '--license', default='', type=str, help='Set a valid license key')
        args = parser.parse_args()
        # print(args)
        try:
            filename = args.file
            license = args.license
            
            if filename is None:
                parser.print_help()
                return
            
            # set license
            if license == '':
                docscanner.initLicense("DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==")
            else:
                docscanner.initLicense(license)
                
            # initialize document scanner
            scanner = docscanner.createInstance()
            ret = scanner.setParameters(docscanner.Templates.color)
    
            if filename is not None:
                process_file(filename, scanner)
                
        except Exception as err:
            print(err)
            sys.exit(1)
    
    scandocument()
    


  • Scan documents from camera video stream:

    import argparse
    import docscanner
    import sys
    import numpy as np
    import cv2
    import time
    
    g_results = None
    g_normalized_images = []
    
    def callback(results):
        global g_results
        g_results = results
    
    def showNormalizedImage(name, normalized_image):
        mat = docscanner.convertNormalizedImage2Mat(normalized_image)
        cv2.imshow(name, mat)
        return mat
        
    def process_video(scanner):
        # Assigned below, so it must be declared global to avoid UnboundLocalError
        global g_normalized_images
        scanner.addAsyncListener(callback)
        
        cap = cv2.VideoCapture(0)
        while True:
            ret, image = cap.read()
            if not ret or image is None:
                # No frame available: stop instead of drawing on None
                break
            
            ch = cv2.waitKey(1)
            if ch == 27:
                break
            elif ch == ord('n'): # normalize image
                if g_results is not None:
                    g_normalized_images = []
                    index = 0
                    for result in g_results:
                        x1 = result.x1
                        y1 = result.y1
                        x2 = result.x2
                        y2 = result.y2
                        x3 = result.x3
                        y3 = result.y3
                        x4 = result.x4
                        y4 = result.y4
                        
                        normalized_image = scanner.normalizeBuffer(image, x1, y1, x2, y2, x3, y3, x4, y4)
                        g_normalized_images.append((str(index), normalized_image))
                        showNormalizedImage(str(index), normalized_image)
                        index += 1
            elif ch == ord('s'): # save image
                for data in g_normalized_images:
                    cv2.destroyWindow(data[0])
                    data[1].save(str(time.time()) + '.png')
                    print('Image saved')
                    
                g_normalized_images = []
                
            # Queue the current frame for asynchronous edge detection
            scanner.detectMatAsync(image)
            
            if g_results is not None:
                for result in g_results:
                    x1 = result.x1
                    y1 = result.y1
                    x2 = result.x2
                    y2 = result.y2
                    x3 = result.x3
                    y3 = result.y3
                    x4 = result.x4
                    y4 = result.y4
                    
                    # np.int0 was removed in NumPy 2.0; an int32 array works with OpenCV
                    cv2.drawContours(image, [np.array([(x1, y1), (x2, y2), (x3, y3), (x4, y4)], dtype=np.int32)], 0, (0, 255, 0), 2)
                
            cv2.putText(image, 'Press "n" to normalize image', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
            cv2.putText(image, 'Press "s" to save image', (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
            cv2.putText(image, 'Press "ESC" to exit', (10, 90), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
            cv2.imshow('Document Scanner', image)
        
        # Release the camera and close all preview windows
        cap.release()
        cv2.destroyAllWindows()
    
    def scandocument():
        """
        Command-line script for scanning documents from camera video stream.
        """
        parser = argparse.ArgumentParser(description='Scan documents from camera')
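        # type=bool would treat any non-empty string (even "0") as True, so parse the value explicitly
        parser.add_argument('-c', '--camera', default=False, type=lambda v: v.lower() in ('1', 'true', 'yes'), help='Set to 1 to scan from the camera')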
        parser.add_argument('-l', '--license', default='', type=str, help='Set a valid license key')
        args = parser.parse_args()
        # print(args)
        try:
            license = args.license
            camera = args.camera
            
            if camera is False:
                parser.print_help()
                return
            
            # set license
            if license == '':
                docscanner.initLicense("DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==")
            else:
                docscanner.initLicense(license)
                
            # initialize document scanner
            scanner = docscanner.createInstance()
            ret = scanner.setParameters(docscanner.Templates.color)
    
            if camera is True:
                process_video(scanner)
                
        except Exception as err:
            print(err)
            sys.exit(1)
    
    scandocument()
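
Both scripts above unpack the eight corner coordinates of each result by hand. The sketch below gathers them into a list of points and uses the shoelace formula to pick the largest detected quadrilateral, which could be useful when several candidates are returned. `Quad` is a hypothetical stand-in for a detection result so the snippet runs without the SDK. Only the attribute names `x1` to `y4` come from the examples above. The area calculation assumes the corners are listed in order around the perimeter, which the contour drawing suggests but this README does not state.

```python
from collections import namedtuple

# Hypothetical stand-in for one detection result; real results would come from
# scanner.detectMat() and are assumed to expose the same x1..y4 attributes.
Quad = namedtuple("Quad", "x1 y1 x2 y2 x3 y3 x4 y4")


def corners(result):
    """Return the four corners as a list of (x, y) tuples."""
    return [(result.x1, result.y1), (result.x2, result.y2),
            (result.x3, result.y3), (result.x4, result.y4)]


def quad_area(result):
    """Area of the quadrilateral via the shoelace formula.

    Assumes the corners are given in perimeter order.
    """
    pts = corners(result)
    total = 0
    # Pair each corner with the next one, wrapping around to the first.
    for (xa, ya), (xb, yb) in zip(pts, pts[1:] + pts[:1]):
        total += xa * yb - xb * ya
    return abs(total) / 2


def largest(results):
    """Pick the detection covering the largest area, or None if there is none."""
    return max(results, key=quad_area, default=None)
```

With real results, `largest(scanner.detectMat(image))` would return the biggest candidate, and `corners(...)` gives the point list that the contour-drawing step needs.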
    


Methods

  • docscanner.initLicense('YOUR-LICENSE-KEY') # set the license key

    docscanner.initLicense("DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==")
    
  • docscanner.createInstance() # create a Document Scanner instance

    scanner = docscanner.createInstance()
    
  • detectFile(filename) # do edge detection from an image file

    results = scanner.detectFile(<filename>)
    
  • detectMat(Mat image) # do edge detection from Mat

    image = cv2.imread(<filename>)
    results = scanner.detectMat(image)
    for result in results:
        x1 = result.x1
        y1 = result.y1
        x2 = result.x2
        y2 = result.y2
        x3 = result.x3
        y3 = result.y3
        x4 = result.x4
        y4 = result.y4
    
  • setParameters(Template) # Select color, binary or grayscale template

    scanner.setParameters(docscanner.Templates.color)
    
  • addAsyncListener(callback function) # start a native thread to run document scanning tasks

  • detectMatAsync(<opencv mat data>) # put a document scanning task into the native queue

    import time
    import cv2
    
    def callback(results):
        for result in results:
            print(result.x1)
            print(result.y1)
            print(result.x2)
            print(result.y2)
            print(result.x3)
            print(result.y3)
            print(result.x4)
            print(result.y4)
    
    image = cv2.imread(<filename>)
    scanner.addAsyncListener(callback)
    scanner.detectMatAsync(image)
    time.sleep(5) # give the native thread time to invoke the callback
    
  • normalizeBuffer(mat, x1, y1, x2, y2, x3, y3, x4, y4) # do perspective correction from Mat

    normalized_image = scanner.normalizeBuffer(image, x1, y1, x2, y2, x3, y3, x4, y4)
    
  • normalizeFile(filename, x1, y1, x2, y2, x3, y3, x4, y4) # do perspective correction from a file

    normalized_image = scanner.normalizeFile(<filename>, x1, y1, x2, y2, x3, y3, x4, y4)
    
  • normalized_image.save(filename) # save the normalized image to a file

    normalized_image.save(<filename>)
    
  • normalized_image.recycle() # release the memory of the normalized image
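
The asynchronous pair `addAsyncListener` / `detectMatAsync` listed above delivers results through a callback, and a fixed sleep is a blunt way to wait for it. A possible alternative is to signal completion with a `threading.Event`, sketched below. `ResultWaiter` and `FakeScanner` are hypothetical names. `FakeScanner` only imitates the two method names from this README so the snippet runs without the SDK, and it is an assumption that the real callback is invoked from a separate native thread.

```python
import threading


class ResultWaiter:
    """Collects one batch of results delivered by a callback on another thread."""

    def __init__(self):
        self._event = threading.Event()
        self._results = None

    def callback(self, results):
        # Store the results first, then wake up anyone blocked in wait().
        self._results = results
        self._event.set()

    def wait(self, timeout=5.0):
        """Block until the callback fires; return None if the timeout elapses."""
        if self._event.wait(timeout):
            return self._results
        return None


class FakeScanner:
    """Hypothetical stand-in that mimics the async method names of the SDK."""

    def addAsyncListener(self, callback):
        self._callback = callback

    def detectMatAsync(self, image):
        # Pretend detection happens on a worker thread and yields one result.
        worker = threading.Thread(target=self._callback, args=([image],), daemon=True)
        worker.start()
```

With the real SDK, `scanner.addAsyncListener(waiter.callback)` followed by `scanner.detectMatAsync(image)` and `waiter.wait()` would return as soon as results arrive instead of always pausing for the full interval, assuming the extension accepts a bound method as the listener.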

C/C++ API

To customize the Python API on top of the C/C++ SDK, refer to the online documentation.

How to Build the Python Document Scanner Extension

  • Create a source distribution:

    python setup.py sdist
    
  • Build and install in development mode with setuptools:

    python setup_setuptools.py build
    python setup_setuptools.py develop 
    
  • Build wheel:

    pip wheel . --verbose
    # Or
    python setup.py bdist_wheel
    

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

document-scanner-sdk-1.0.0.tar.gz (10.9 MB view details)

Uploaded Source

Built Distributions

document_scanner_sdk-1.0.0-cp310-cp310-win_amd64.whl (4.3 MB view details)

Uploaded CPython 3.10 Windows x86-64

document_scanner_sdk-1.0.0-cp310-cp310-manylinux_2_24_x86_64.whl (6.9 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.24+ x86-64

document_scanner_sdk-1.0.0-cp39-cp39-win_amd64.whl (4.3 MB view details)

Uploaded CPython 3.9 Windows x86-64

document_scanner_sdk-1.0.0-cp39-cp39-manylinux_2_24_x86_64.whl (6.9 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.24+ x86-64

document_scanner_sdk-1.0.0-cp38-cp38-win_amd64.whl (4.3 MB view details)

Uploaded CPython 3.8 Windows x86-64

document_scanner_sdk-1.0.0-cp38-cp38-manylinux_2_24_x86_64.whl (6.9 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.24+ x86-64

document_scanner_sdk-1.0.0-cp37-cp37m-win_amd64.whl (4.3 MB view details)

Uploaded CPython 3.7m Windows x86-64

document_scanner_sdk-1.0.0-cp37-cp37m-manylinux_2_24_x86_64.whl (6.9 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.24+ x86-64

document_scanner_sdk-1.0.0-cp36-cp36m-win_amd64.whl (4.3 MB view details)

Uploaded CPython 3.6m Windows x86-64

document_scanner_sdk-1.0.0-cp36-cp36m-manylinux_2_24_x86_64.whl (6.9 MB view details)

Uploaded CPython 3.6m manylinux: glibc 2.24+ x86-64
