Document Scanner SDK for document edge detection, border cropping, perspective correction and brightness adjustment

Python Document Scanner SDK

This project is a Python binding for the Dynamsoft C/C++ Document Scanner SDK. It helps developers quickly build desktop document scanner applications in Python on Windows and Linux.

About Dynamsoft Document Scanner

Get a 30-day FREE trial license to activate the SDK.

Supported Python Versions

  • Python 3.x

Dependencies

pip install opencv-python

Command-line Usage

# Scan documents from images
$ scandocument -f <file-name> -l <license-key>

# Scan documents from camera video stream
$ scandocument -c 1 -l <license-key>
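
The `-c 1` form above implies that the camera flag takes a value. As a minimal sketch, such flags could be parsed with `argparse` as shown below. The names `str_to_bool` and `build_parser` are hypothetical and not part of the package, and the actual `scandocument` entry point may be implemented differently.

```python
import argparse


def str_to_bool(value):
    """Interpret common truthy strings ('1', 'true', 'yes', 'on') as True."""
    return value.strip().lower() in ("1", "true", "yes", "on")


def build_parser():
    # Hypothetical parser mirroring the -f, -c and -l flags shown above.
    parser = argparse.ArgumentParser(
        description="Scan documents from an image file or a camera")
    parser.add_argument("-f", "--file", help="Path to the image file")
    parser.add_argument("-c", "--camera", default=False, type=str_to_bool,
                        help="Pass 1 to scan from the camera video stream")
    parser.add_argument("-l", "--license", default="", type=str,
                        help="A valid license key")
    return parser


args = build_parser().parse_args(["-c", "1", "-l", "MY-KEY"])
print(args.camera, args.file, args.license)
```

An explicit converter is used here because `type=bool` in `argparse` treats every non-empty string, including `"0"`, as `True`.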

Quick Start

  • Scan documents from an image file:

    import argparse
    import docscanner
    import sys
    import numpy as np
    import cv2
    import time
    
    def showNormalizedImage(name, normalized_image):
        mat = docscanner.convertNormalizedImage2Mat(normalized_image)
        cv2.imshow(name, mat)
        return mat
    
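    def process_file(filename, scanner):
        image = cv2.imread(filename)
        if image is None:
            raise ValueError('Failed to load image: ' + filename)
        
        results = scanner.detectMat(image)
        normalized_image = None
        for result in results:
            x1, y1 = result.x1, result.y1
            x2, y2 = result.x2, result.y2
            x3, y3 = result.x3, result.y3
            x4, y4 = result.x4, result.y4
            
            # perspective correction for the detected document
            normalized_image = scanner.normalizeBuffer(image, x1, y1, x2, y2, x3, y3, x4, y4)
            showNormalizedImage("Normalized Image", normalized_image)
            # np.int32 replaces np.int0, which was removed in NumPy 2.0
            cv2.drawContours(image, [np.int32([(x1, y1), (x2, y2), (x3, y3), (x4, y4)])], 0, (0, 255, 0), 2)
        
        cv2.imshow('Document Image', image)
        cv2.waitKey(0)
        
        # save the last normalized image, if any document was detected
        if normalized_image is not None:
            normalized_image.save(str(time.time()) + '.png')
            print('Image saved')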
    
    def scandocument():
        """
        Command-line script for scanning documents from a given image
        """
        parser = argparse.ArgumentParser(description='Scan documents from an image file')
        parser.add_argument('-f', '--file', help='Path to the image file')
        parser.add_argument('-l', '--license', default='', type=str, help='Set a valid license key')
        args = parser.parse_args()
        # print(args)
        try:
            filename = args.file
            license = args.license
            
            if filename is None:
                parser.print_help()
                return
            
            # set license
            if  license == '':
                docscanner.initLicense("DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==")
            else:
                docscanner.initLicense(license)
                
            # initialize document scanner
            scanner = docscanner.createInstance()
            ret = scanner.setParameters(docscanner.Templates.color)
    
            if filename is not None:
                process_file(filename, scanner)
                
        except Exception as err:
            print(err)
            sys.exit(1)
    
    scandocument()
    


  • Scan documents from camera video stream:

    import argparse
    import docscanner
    import sys
    import numpy as np
    import cv2
    import time
    
    g_results = None
    g_normalized_images = []
    
    def callback(results):
        global g_results
        g_results = results
    
    def showNormalizedImage(name, normalized_image):
        mat = docscanner.convertNormalizedImage2Mat(normalized_image)
        cv2.imshow(name, mat)
        return mat
        
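    def process_video(scanner):
        # g_normalized_images is reassigned below, so declare it global
        # to avoid an UnboundLocalError when 's' is pressed before 'n'
        global g_normalized_images
        scanner.addAsyncListener(callback)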
        
        cap = cv2.VideoCapture(0)
        while True:
            ret, image = cap.read()
            if not ret:
                # stop when the camera fails to deliver a frame
                break
            
            ch = cv2.waitKey(1)
            if ch == 27:
                break
            elif ch == ord('n'): # normalize image
                if g_results is not None:
                    g_normalized_images = []
                    index = 0
                    for result in g_results:
                        x1 = result.x1
                        y1 = result.y1
                        x2 = result.x2
                        y2 = result.y2
                        x3 = result.x3
                        y3 = result.y3
                        x4 = result.x4
                        y4 = result.y4
                        
                        normalized_image = scanner.normalizeBuffer(image, x1, y1, x2, y2, x3, y3, x4, y4)
                        g_normalized_images.append((str(index), normalized_image))
                        mat = showNormalizedImage(str(index), normalized_image)
                        index += 1
            elif ch == ord('s'): # save image
                for data in g_normalized_images:
                    # cv2.imwrite('images/' + str(time.time()) + '.png', image)
                    cv2.destroyWindow(data[0])
                    data[1].save(str(time.time()) + '.png')
                    print('Image saved')
                    
                g_normalized_images = []
                
            if image is not None:
                scanner.detectMatAsync(image)
            
            if g_results is not None:
                for result in g_results:
                    x1 = result.x1
                    y1 = result.y1
                    x2 = result.x2
                    y2 = result.y2
                    x3 = result.x3
                    y3 = result.y3
                    x4 = result.x4
                    y4 = result.y4
                    
                    # np.int32 replaces np.int0, which was removed in NumPy 2.0
                    cv2.drawContours(image, [np.int32([(x1, y1), (x2, y2), (x3, y3), (x4, y4)])], 0, (0, 255, 0), 2)
                
            cv2.putText(image, 'Press "n" to normalize image', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
            cv2.putText(image, 'Press "s" to save image', (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
            cv2.putText(image, 'Press "ESC" to exit', (10, 90), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)
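            cv2.imshow('Document Scanner', image)
        
        # release the camera and close all windows once the loop exits
        cap.release()
        cv2.destroyAllWindows()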
    
    def scandocument():
        """
        Command-line script for scanning documents from camera video stream.
        """
        parser = argparse.ArgumentParser(description='Scan documents from camera')
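        # type=bool would treat any non-empty string (even '0') as True, so parse the value explicitly
        parser.add_argument('-c', '--camera', default=False, type=lambda v: v.lower() in ('1', 'true', 'yes'), help='Pass 1 to scan documents from the camera')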
        parser.add_argument('-l', '--license', default='', type=str, help='Set a valid license key')
        args = parser.parse_args()
        # print(args)
        try:
            license = args.license
            camera = args.camera
            
            if camera is False:
                parser.print_help()
                return
            
            # set license
            if  license == '':
                docscanner.initLicense("DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==")
            else:
                docscanner.initLicense(license)
                
            # initialize document scanner
            scanner = docscanner.createInstance()
            ret = scanner.setParameters(docscanner.Templates.color)
    
            if camera is True:
                process_video(scanner)
                
        except Exception as err:
            print(err)
            sys.exit(1)
    
    scandocument()
    


Methods

  • docscanner.initLicense('YOUR-LICENSE-KEY') # set the license key

    docscanner.initLicense("DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==")
    
  • docscanner.createInstance() # create a Document Scanner instance

    scanner = docscanner.createInstance()
    
  • detectFile(filename) # do edge detection from an image file

    results = scanner.detectFile(<filename>)
    
  • detectMat(Mat image) # do edge detection from Mat

    image = cv2.imread(<filename>)
    results = scanner.detectMat(image)
    for result in results:
        x1 = result.x1
        y1 = result.y1
        x2 = result.x2
        y2 = result.y2
        x3 = result.x3
        y3 = result.y3
        x4 = result.x4
        y4 = result.y4
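
Each result carries the four corners of a detected document as `x1, y1` through `x4, y4`. A small helper can gather them into a list of points and estimate the area of the document, which may be convenient for drawing or for filtering out small detections. The sketch below uses a `namedtuple` stand-in for the SDK result object (an assumption, so the snippet runs without the SDK); `corners` and `quad_area` are hypothetical helper names, and the area calculation assumes the corners are reported in order around the perimeter.

```python
from collections import namedtuple

# Stand-in for an SDK detection result; only the eight coordinate
# attributes shown above are assumed.
Quad = namedtuple("Quad", "x1 y1 x2 y2 x3 y3 x4 y4")


def corners(result):
    """Return the four corners as (x, y) tuples, in the order reported."""
    return [(result.x1, result.y1), (result.x2, result.y2),
            (result.x3, result.y3), (result.x4, result.y4)]


def quad_area(result):
    """Area of the quadrilateral, computed with the shoelace formula."""
    pts = corners(result)
    total = 0
    # Pair each corner with the next one, wrapping around to the first.
    for (xa, ya), (xb, yb) in zip(pts, pts[1:] + pts[:1]):
        total += xa * yb - xb * ya
    return abs(total) / 2


result = Quad(10, 10, 110, 10, 110, 60, 10, 60)
print(corners(result))
print(quad_area(result))  # 5000.0 for this 100 x 50 rectangle
```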
    
  • setParameters(Template) # Select color, binary or grayscale template

    scanner.setParameters(docscanner.Templates.color)
    
  • addAsyncListener(callback function) # start a native thread to run document scanning tasks

  • detectMatAsync(<opencv mat data>) # put a document scanning task into the native queue

    def callback(results):
        for result in results:
            print(result.x1)
            print(result.y1)
            print(result.x2)
            print(result.y2)
            print(result.x3)
            print(result.y3)
            print(result.x4)
            print(result.y4)
                                                        
    import time
    import cv2
    image = cv2.imread(<filename>)
    scanner.addAsyncListener(callback)
    scanner.detectMatAsync(image)
    time.sleep(5)  # give the native thread time to invoke the callback
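
The fixed sleep above simply gives the native thread time to call back. One alternative is to wait on a `threading.Event` that the callback sets, with a timeout as a safety net. The sketch below uses a `FakeScanner` class as a stand-in for the real scanner (an assumption, so the example runs without the SDK); it mimics only the `addAsyncListener` and `detectMatAsync` calls listed above and invokes the callback from a worker thread.

```python
import threading


class FakeScanner:
    """Stand-in for the SDK scanner: calls the listener from a worker thread."""

    def __init__(self):
        self._callback = None

    def addAsyncListener(self, callback):
        self._callback = callback

    def detectMatAsync(self, image):
        # Pretend detection found nothing and report it asynchronously.
        threading.Thread(target=self._callback, args=([],), daemon=True).start()


done = threading.Event()
latest = {}


def callback(results):
    latest["results"] = results
    done.set()  # wake up the waiting thread


scanner = FakeScanner()
scanner.addAsyncListener(callback)
scanner.detectMatAsync(image=None)

# Wait up to 5 seconds instead of sleeping unconditionally.
finished = done.wait(timeout=5)
print(finished, latest.get("results"))
```

With the real SDK the same pattern should apply as long as the callback is invoked as a normal Python function, but that behavior is not confirmed here.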
    
  • normalizeBuffer(mat, x1, y1, x2, y2, x3, y3, x4, y4) # do perspective correction from Mat

    normalized_image = scanner.normalizeBuffer(image, x1, y1, x2, y2, x3, y3, x4, y4)
    
  • normalizeFile(filename, x1, y1, x2, y2, x3, y3, x4, y4) # do perspective correction from a file

    normalized_image = scanner.normalizeFile(<filename>, x1, y1, x2, y2, x3, y3, x4, y4)
    
  • normalized_image.save(filename) # save the normalized image to a file

    normalized_image.save(<filename>)
    
  • normalized_image.recycle() # release the memory of the normalized image

  • clearAsyncListener() # stop the native thread and clear the registered Python function
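
Because `recycle()` releases the memory held by a normalized image, it can help to guarantee the call even when saving raises an exception. A minimal sketch using `contextlib.contextmanager` is shown below; `recycling` is a hypothetical helper, and `FakeNormalizedImage` is a stand-in that only mimics the `save()` and `recycle()` methods listed above.

```python
from contextlib import contextmanager


class FakeNormalizedImage:
    """Stand-in for the SDK normalized image; records calls instead of doing work."""

    def __init__(self):
        self.saved_to = None
        self.recycled = False

    def save(self, filename):
        self.saved_to = filename

    def recycle(self):
        self.recycled = True


@contextmanager
def recycling(normalized_image):
    """Yield the image and always call recycle() afterwards."""
    try:
        yield normalized_image
    finally:
        normalized_image.recycle()


image = FakeNormalizedImage()
with recycling(image) as img:
    img.save("output.png")

print(image.saved_to, image.recycled)
```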

C/C++ API

To customize the Python API on top of the C/C++ SDK, refer to the online documentation.

How to Build the Python Document Scanner Extension

  • Create a source distribution:

    python setup.py sdist
    
  • Build and install in development mode with setuptools:

    python setup_setuptools.py build
    python setup_setuptools.py develop 
    
  • Build wheel:

    pip wheel . --verbose
    # Or
    python setup.py bdist_wheel
    
