
A SilverSalts Python Project

SilverSalts project
=======================

This project provides a Python API for accessing SilverSalts online services.

----

**************************
Updates since last version
**************************

- 10/22/2017 - Added multi-language support (eng, deu, fra, spa, jpn, chi_tra, chi_sim, ita, por, nld, hin) and a new option: oem.
- 10/09/2017 - Added a new option: use_cache, default True. If it is True and a cached result exists, the request is free of charge.

***************
API
***************

===================================================================================
ocr(spec, user, secret, host, protocol)
===================================================================================

spec: A dictionary of options for the OCR process. Supported keys:

- data: The input data, usually the bytes read from a file.

- input_scheme: A string naming the scheme of the input data. Supported: raw

- output_scheme: A string naming the scheme of the output data. Supported: hocr, pdf

- use_cache: A boolean indicating whether to use cached results. Default: True. If a cached result is used, there is no charge.

- psm: An integer giving the Tesseract page segmentation mode (psm) value, e.g. 12

- oem: An integer giving the Tesseract OCR engine mode (oem) value, e.g. 3

- lang: A list of strings naming the languages, e.g. ['eng']

(The following options are considered only when output_scheme is pdf.)

- text_visible: A boolean indicating whether the recognized text is visible.

- orig_visible: A boolean indicating whether the original content is visible.

- text_color: A list of 3 floats, each ranging from 0 to 1, giving the RGB value of the desired text color, e.g. [1, 0, 0] for red.

- text_color_reflects_cl: An integer, either 1 or -1, indicating how the color of the text (if visible) correlates with the recognition confidence level. If 1, higher confidence means a darker color; if -1, higher confidence means a brighter color.

user: email of the registered user

secret: secret of the registered user (available on dashboard page after registration)

host: server url, default: api.silversalts.com

protocol: http or https, default: https
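
The option list above can be checked before a request is sent. The following is a minimal sketch, not part of the silversalts package: ``build_spec`` and the two constants are hypothetical names introduced here for illustration. It assumes only what the list states, and it chooses to reject the pdf-only options for other output schemes rather than let them be silently ignored.

```python
# Hypothetical helper for illustration; it is NOT provided by silversalts.
SUPPORTED_OUTPUT_SCHEMES = {"hocr", "pdf"}
PDF_ONLY_KEYS = {"text_visible", "orig_visible", "text_color", "text_color_reflects_cl"}


def build_spec(data, output_scheme="pdf", lang=("eng",), use_cache=True, **options):
    """Assemble a spec dictionary and reject values the option list rules out."""
    if output_scheme not in SUPPORTED_OUTPUT_SCHEMES:
        raise ValueError("output_scheme must be 'hocr' or 'pdf'")

    # Design choice of this sketch: fail loudly on options that only apply to pdf.
    pdf_only = PDF_ONLY_KEYS.intersection(options)
    if output_scheme != "pdf" and pdf_only:
        raise ValueError("only valid for pdf output: %s" % sorted(pdf_only))

    color = options.get("text_color")
    if color is not None:
        if len(color) != 3 or not all(0 <= c <= 1 for c in color):
            raise ValueError("text_color must be 3 floats between 0 and 1")

    if options.get("text_color_reflects_cl", 1) not in (1, -1):
        raise ValueError("text_color_reflects_cl must be 1 or -1")

    spec = {
        "data": data,
        "input_scheme": "raw",  # the only supported input scheme
        "output_scheme": output_scheme,
        "use_cache": use_cache,
        "lang": list(lang),
    }
    spec.update(options)  # psm, oem and the pdf-only options pass through as given
    return spec
```

The returned dictionary has the same shape as the spec passed to ocr, so it could be used as the first argument in place of a hand-written dictionary.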


============
Examples
============

from silversalts.api import ocr

with open('input.pdf', 'rb') as i:
    with open('output.pdf', 'wb') as o:
        spec = {
            'data': i.read(),
            # currently the only supported value for input_scheme
            'input_scheme': 'raw',
            # output in pdf, or alternatively hocr
            'output_scheme': 'pdf',
            # use cached results (if cache is used, no charge)
            'use_cache': True,
            # tesseract psm value
            'psm': 12,
            # tesseract oem value
            'oem': 3,
            # language, array of language strings
            'lang': ['eng'],
            # the following are considered only when the output_scheme is pdf
            # hide the original content so it's easier to examine the newly ocr-ed content
            'orig_visible': False,
            # display the ocr-ed text so we can examine the results
            'text_visible': True,
            # r, g, b, each ranging 0 to 1
            'text_color': (1, 0.5, 1),
            # 1 : the more confident, the darker
            # -1 : the more confident, the brighter
            'text_color_reflects_cl': 1,
        }
        o.write(ocr(
            spec,
            'you@email.com',
            'your_secret_string',
            # optional
            'api.silversalts.com',
            # optional
            'https'
        ))
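
When output_scheme is hocr, the result is an hOCR document (HTML with OCR annotations) rather than a PDF. The sketch below shows one way to pull the recognized words out of such a document with the standard library. ``HocrWordParser`` and ``parse_hocr_words`` are hypothetical names, not part of the silversalts package. The sketch assumes the service returns Tesseract-style hOCR, in which each word is a ``span`` of class ``ocrx_word`` whose ``title`` attribute carries ``bbox`` and ``x_wconf`` properties. The sample string is hand-written for illustration and is not real service output.

```python
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collect text, bounding box and confidence for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # the word span currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and "ocrx_word" in (attrs.get("class") or "").split():
            self._current = {"text": "", "bbox": None, "conf": None}
            # The title looks like: "bbox 36 92 96 116; x_wconf 95"
            for prop in (attrs.get("title") or "").split(";"):
                parts = prop.split()
                if parts and parts[0] == "bbox":
                    self._current["bbox"] = tuple(int(v) for v in parts[1:5])
                elif parts and parts[0] == "x_wconf":
                    self._current["conf"] = float(parts[1])

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._current is not None:
            self.words.append(self._current)
            self._current = None


def parse_hocr_words(hocr):
    """Return a list of word dictionaries from an hOCR str or bytes value."""
    if isinstance(hocr, bytes):
        hocr = hocr.decode("utf-8")  # assumption: the output is UTF-8 encoded
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hand-written sample in Tesseract's hOCR style, for illustration only.
sample = (
    '<span class="ocr_line" title="bbox 36 92 580 122">'
    '<span class="ocrx_word" title="bbox 36 92 96 116; x_wconf 95">Hello</span> '
    '<span class="ocrx_word" title="bbox 109 92 179 116; x_wconf 91">world</span>'
    '</span>'
)
words = parse_hocr_words(sample)
```

In practice the argument would be the value returned by ocr for a spec whose output_scheme is 'hocr'. The example above writes that return value to a file opened in binary mode, which is why the sketch accepts bytes as well as str.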

Release history

- 0.1.4 (this version)
- 0.1.3
- 0.1.2
- 0.1.1
- 0.1.0

Download files

- silversalts-0.1.4.tar.gz (5.7 kB), source distribution, uploaded Oct 23, 2017
