kraken

OCR/HTR engine for all the languages

Project description

Description

kraken is a turn-key OCR system optimized for historical and non-Latin script material.

kraken’s main features are:

Fully trainable layout analysis and character recognition

Right-to-Left, BiDi, and Top-to-Bottom script support

ALTO, PageXML, abbyXML, and hOCR output

Word bounding boxes and character cuts

Multi-script recognition support

Public repository of model files

Lightweight model files

Variable recognition network architectures

Installation

When using a recent version of pip, all dependencies are installed from binary wheel packages, so installing build-essential or your distribution's equivalent is often unnecessary. kraken runs only on Linux and Mac OS X; Windows is not supported.
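
Because only Linux and Mac OS X are supported, a setup script can check the platform before attempting an install. The following is a minimal sketch, not part of kraken: the function name kraken_platform_supported is hypothetical, and it assumes that uname -s reports Linux or Darwin on the supported systems.

```shell
# Succeed if the given kernel name (default: this machine's) is one that
# kraken supports. The function name is hypothetical, not a kraken command.
kraken_platform_supported() {
    case "${1:-$(uname -s)}" in
        Linux|Darwin) return 0 ;;   # Linux, or Mac OS X (Darwin kernel)
        *) return 1 ;;              # anything else, including Windows shells
    esac
}
```

Called without arguments it tests the current machine; passing a name explicitly is only meant for trying the function out.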

Install the latest development version through conda:

$ wget https://raw.githubusercontent.com/mittagessen/kraken/master/environment.yml
$ conda env create -f environment.yml

or:

$ wget https://raw.githubusercontent.com/mittagessen/kraken/master/environment_cuda.yml
$ conda env create -f environment_cuda.yml

for CUDA acceleration with the appropriate hardware.

It is also possible to install the latest stable release from PyPI:

$ pip install kraken

Finally, you will need a model to perform the actual character recognition. To download the default model for printed English text and place it in the kraken directory for the current user:

$ kraken get 10.5281/zenodo.2577813

A list of libre models available in the central repository can be retrieved by running:

$ kraken list

Quickstart

To recognize text in an image using the default parameters, including the prerequisite steps of binarization and page segmentation:

$ kraken -i image.tif image.txt binarize segment ocr

To binarize a single image using the nlbin algorithm:

$ kraken -i image.tif bw.png binarize

To segment an image (binarized or not) with the new baseline segmenter:

$ kraken -i image.tif lines.json segment -bl

To segment and OCR an image using the default model(s):

$ kraken -i image.tif image.txt segment -bl ocr
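
The same segment-and-OCR command can be repeated over a directory of scans with a small shell function. This is a sketch under stated assumptions rather than a kraken feature: the function name ocr_directory and the KRAKEN override variable are hypothetical, the inputs are assumed to end in .tif, and each output is written next to its image with a .txt extension.

```shell
# Run the Quickstart "segment -bl ocr" pipeline on every .tif in a directory.
# KRAKEN is a hypothetical override (not a kraken option) that swaps the
# command, e.g. KRAKEN=echo to print the calls instead of running them.
ocr_directory() {
    dir="$1"
    for img in "$dir"/*.tif; do
        # Skip the unexpanded pattern when the directory has no .tif files.
        [ -e "$img" ] || continue
        # Write the text output next to the image: page.tif -> page.txt
        "${KRAKEN:-kraken}" -i "$img" "${img%.tif}.txt" segment -bl ocr
    done
}
```

Running KRAKEN=echo ocr_directory scans first is a cheap way to inspect the generated commands before starting a long recognition job.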

All subcommands and options are documented. Use the --help option to get more information.

Documentation

Have a look at the docs

Funding

kraken is developed at the École Pratique des Hautes Études, Université PSL.

Release history

7.0.0b6 pre-release

Mar 4, 2026

7.0.0b5 pre-release

Feb 25, 2026

7.0.0b4 pre-release

Feb 23, 2026

7.0.0b3 pre-release

Feb 16, 2026

7.0.0.0b2 pre-release

Feb 14, 2026

7.0.0.0b1 pre-release

Feb 13, 2026

6.0.3

Dec 13, 2025

6.0.2

Sep 25, 2025

6.0.0

Aug 29, 2025

5.3.0

Nov 21, 2024

5.2.9

Aug 27, 2024

5.2.8

Jul 24, 2024

5.2.7

Jul 8, 2024

5.2.6

Jul 3, 2024

5.2.5

May 23, 2024

5.2.4

May 9, 2024

5.2.3

May 5, 2024

5.2.2

Apr 30, 2024

5.2.1

Apr 22, 2024

5.2.0

Apr 20, 2024

5.0.0 yanked

Mar 28, 2024

Reason this release was yanked:

broken polygonization

4.3.13

Jul 19, 2023

4.3.12

May 12, 2023

4.3.11

Apr 21, 2023

4.3.10

Apr 17, 2023

4.3.9

Mar 20, 2023

4.3.7

Mar 6, 2023

4.3.6

Feb 23, 2023

4.3.5

Feb 21, 2023

4.3.4

Feb 20, 2023

4.3.3

Feb 14, 2023

4.3.2

Feb 14, 2023

4.3.1

Feb 14, 2023

4.3.0

Feb 13, 2023

4.2.0

Aug 29, 2022

4.1.2

Apr 12, 2022

4.1.1

Apr 11, 2022

4.1.0

Apr 5, 2022

4.0.0

Feb 22, 2022

3.0.13

Apr 12, 2022

3.0.9

Feb 22, 2022

3.0.8

Feb 3, 2022

3.0.7

Jan 24, 2022

3.0.6

Nov 7, 2021

3.0.4

Jul 28, 2021

3.0.2

Jun 28, 2021

3.0.1

Jun 25, 2021

3.0.0.0b25 pre-release

May 26, 2021

3.0.0.0b24 pre-release

Apr 22, 2021

3.0.0.0b23 pre-release

Mar 18, 2021

This version

3.0.0.0b22 pre-release

Feb 22, 2021

3.0.0.0b21 pre-release

Feb 17, 2021

3.0.0.0b20 pre-release

Feb 11, 2021

3.0.0.0b20.dev7 pre-release

Feb 3, 2021

3.0.0.0b19 pre-release

Jan 7, 2021

2.0.8 yanked

Nov 25, 2019

2.0.5 yanked

May 14, 2019

2.0.4 yanked

May 14, 2019

2.0.3 yanked

May 14, 2019

2.0.2 yanked

May 14, 2019

2.0.1 yanked

Feb 25, 2019

2.0.0 yanked

Feb 25, 2019

1.0.1 yanked

Dec 11, 2018

1.0.0 yanked

Dec 10, 2018

0.9.16 yanked

Apr 20, 2018

0.9.15 yanked

Apr 18, 2018

0.9.14 yanked

Apr 18, 2018

0.9.13 yanked

Apr 17, 2018

0.9.12 yanked

Apr 17, 2018

0.9.11 yanked

Apr 17, 2018

0.9.10 yanked

Mar 2, 2018

0.9.9 yanked

Mar 2, 2018

0.9.8 yanked

Jan 8, 2018

0.9.7 yanked

Nov 1, 2017

0.9.6 yanked

Oct 31, 2017

Reason this release was yanked:

conflict with modern install tools

0.9.4 yanked

Jul 19, 2017

0.9.3 yanked

Jul 18, 2017

0.9.2 yanked

May 15, 2017

0.9.0 yanked

Nov 3, 2016

0.7.6 yanked

Jan 18, 2016

0.7.5 yanked

Dec 14, 2015

0.7.4 yanked

Nov 25, 2015

0.7.3 yanked

Nov 3, 2015

0.7.2 yanked

Oct 18, 2015

0.7.1 yanked

Sep 22, 2015

0.7.0 yanked

Sep 17, 2015

0.6.3 yanked

Sep 14, 2015

0.6.2 yanked

Sep 12, 2015

0.6.2.dev1 pre-release yanked

Sep 12, 2015

0.5.0 yanked

Sep 7, 2015

0.4.7 yanked

Aug 26, 2015

0.4.6 yanked

Aug 13, 2015

0.4.5 yanked

Jul 16, 2015

0.4.5.dev1 pre-release yanked

Aug 13, 2015

0.4.4 yanked

Jun 16, 2015

0.4.3 yanked

Jun 14, 2015

0.4.2 yanked

May 30, 2015

0.4.1 yanked

May 26, 2015

0.3.4 yanked

May 23, 2015

0.3.3 yanked

May 22, 2015

0.3.1.post11 yanked

May 22, 2015

0.3.1.post10 yanked

May 22, 2015

0.3.1 yanked

Apr 23, 2015

0.2.5 yanked

Apr 16, 2015

0.2.4 yanked

Apr 16, 2015

0.2.3 yanked

Apr 16, 2015

0.2.2 yanked

Apr 5, 2015

0.1.0 yanked

Mar 30, 2015

Reason this release was yanked:

conflict with modern install tools

0.1-dev pre-release yanked

Dec 16, 2013

Download files

Download the file for your platform.

Source Distribution

kraken-3.0.0.0b22.tar.gz (11.1 MB)

Uploaded Feb 22, 2021 Source

Built Distribution

kraken-3.0.0.0b22-py3-none-any.whl (5.5 MB)

Uploaded Feb 22, 2021 Python 3

File details

Details for the file kraken-3.0.0.0b22.tar.gz.

File metadata

Download URL: kraken-3.0.0.0b22.tar.gz
Upload date: Feb 22, 2021
Size: 11.1 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/53.0.0 requests-toolbelt/0.9.1 tqdm/4.57.0 CPython/3.7.10

File hashes

Hashes for kraken-3.0.0.0b22.tar.gz
Algorithm	Hash digest
SHA256	`b48821819cd3143d59538103598d85899dc8c7151782bc660ae76db2e9b6e608`
MD5	`c31bce548a548b8459a2e8abf9272297`
BLAKE2b-256	`4d672f237a78154af8bfaefcf20d2bdc345a8f474bca53681e0a8a2432f26144`

File details

Details for the file kraken-3.0.0.0b22-py3-none-any.whl.

File metadata

Download URL: kraken-3.0.0.0b22-py3-none-any.whl
Upload date: Feb 22, 2021
Size: 5.5 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/53.0.0 requests-toolbelt/0.9.1 tqdm/4.57.0 CPython/3.7.10

File hashes

Hashes for kraken-3.0.0.0b22-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6aa1083eb1aabfbc116fe91f8c900d5d9ed40347eaf086f177e9581e35413020`
MD5	`d8bff07fe015f13a214e7c93e6f5872c`
BLAKE2b-256	`f9231477ee1a7e09bf279e82081d0de3b085b144e5e06fe8cb45b8212b51f6e6`
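
A downloaded file can be checked against the SHA256 digests listed above before installing it. The following is a minimal sketch: the function names sha256_of and verify_sha256 are hypothetical helpers, and it assumes that either sha256sum (common on Linux) or shasum (shipped with Mac OS X) is on the PATH.

```shell
# Print the SHA256 hex digest of a file, using whichever tool is available.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1
    else
        shasum -a 256 "$1" | cut -d' ' -f1
    fi
}

# Succeed only if the file's digest equals the expected hex string.
verify_sha256() {
    [ "$(sha256_of "$1")" = "$2" ]
}
```

For the source distribution, the call would be verify_sha256 kraken-3.0.0.0b22.tar.gz followed by the SHA256 value from its table; a non-zero exit status means the file does not match.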

kraken 3.0.0.0b22
