kraken·PyPI

OCR/HTR engine for all the languages

These details have not been verified by PyPI

Project links

Homepage

Project description

Description

kraken is a turn-key OCR system optimized for historical and non-Latin script material.

kraken’s main features are:

Fully trainable layout analysis, reading order, and character recognition

Right-to-Left, BiDi, and Top-to-Bottom script support

ALTO, PageXML, abbyyXML, and hOCR output

Word bounding boxes and character cuts

Multi-script recognition support

Public repository of model files

Variable recognition network architecture

Installation

kraken runs only on Linux and macOS. Windows is not supported.
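
A minimal sketch of a pre-install check for the platform restriction above. The helper name `is_supported_platform` is hypothetical and not part of kraken; it only relies on the standard library's `platform.system()`, which reports macOS as "Darwin".

```python
import platform

# Platforms the README names as supported; platform.system() reports macOS as "Darwin".
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}


def is_supported_platform(system=None):
    """Return True if the given (or current) OS is one the README lists as supported."""
    name = platform.system() if system is None else system
    return name in SUPPORTED_SYSTEMS


if __name__ == "__main__":
    print(f"{platform.system()}: supported={is_supported_platform()}")
```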

The latest stable release can be installed from PyPI:

$ pip install kraken

For direct PDF and multi-image TIFF/JPEG2000 support, install the pdf extras from PyPI:

$ pip install kraken[pdf]

or install pyvips manually with pip:

$ pip install pyvips
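
To see whether the optional dependency is already present before installing anything, a check along these lines may help. It is a sketch only: `has_module` and `pdf_support_available` are hypothetical helper names, and it assumes that the presence of the `pyvips` module is what enables the extra input formats.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def pdf_support_available():
    # Assumption: an importable pyvips is what enables PDF/multi-image input.
    return has_module("pyvips")


if __name__ == "__main__":
    print("pyvips found" if pdf_support_available() else "pyvips missing")
```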

Conda environment files are provided for the seamless installation of the main branch as well:

$ git clone https://github.com/mittagessen/kraken.git
$ cd kraken
$ conda env create -f environment.yml

or:

$ git clone https://github.com/mittagessen/kraken.git
$ cd kraken
$ conda env create -f environment_cuda.yml

for CUDA acceleration with the appropriate hardware.
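
The choice between the two environment files can be scripted. The sketch below is a heuristic, not something kraken provides: `pick_environment_file` is a hypothetical name, and treating an `nvidia-smi` binary on the PATH as a sign of usable CUDA hardware is an assumption that may not hold on every system.

```python
import shutil


def pick_environment_file(cuda_available=None):
    """Pick between the two conda environment files named in the README."""
    if cuda_available is None:
        # Heuristic only: nvidia-smi on PATH suggests an NVIDIA driver is installed.
        cuda_available = shutil.which("nvidia-smi") is not None
    return "environment_cuda.yml" if cuda_available else "environment.yml"


if __name__ == "__main__":
    print(f"conda env create -f {pick_environment_file()}")
```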

Finally, you’ll have to scrounge up a model to do the actual recognition of characters. To download the default model for printed French text and place it in the kraken directory for the current user:

$ kraken get 10.5281/zenodo.10592716

A list of libre models available in the central repository can be retrieved by running:

$ kraken list

Quickstart

To recognize text in an image using the default parameters, including the prerequisite steps of binarization and page segmentation:

$ kraken -i image.tif image.txt binarize segment ocr

To binarize a single image using the nlbin algorithm:

$ kraken -i image.tif bw.png binarize

To segment an image (binarized or not) with the new baseline segmenter:

$ kraken -i image.tif lines.json segment -bl

To segment and OCR an image using the default model(s):

$ kraken -i image.tif image.txt segment -bl ocr -m catmus-print-fondue-large.mlmodel

All subcommands and options are documented. Use the help option to get more information.
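
For more than a handful of images, the segment-and-recognize call from the Quickstart can be wrapped in a small script. This is a sketch under stated assumptions: `build_ocr_command` and `run_ocr` are hypothetical helpers, only the options shown in the Quickstart (`-i`, `segment -bl`, `ocr -m`) are used, and writing the text next to the image with a `.txt` extension is a choice made here, not a kraken default.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(image, model, output=None):
    """Build the argv for the baseline-segment-and-recognize call from the Quickstart."""
    image = Path(image)
    # Assumed convention: output has the image's name with a .txt extension.
    output = Path(output) if output is not None else image.with_suffix(".txt")
    return ["kraken", "-i", str(image), str(output),
            "segment", "-bl", "ocr", "-m", str(model)]


def run_ocr(images, model):
    """Run kraken once per image; raises if the kraken executable is not on PATH."""
    if shutil.which("kraken") is None:
        raise RuntimeError("kraken executable not found; install it with pip first")
    for image in images:
        # check=True stops the loop at the first image kraken fails on.
        subprocess.run(build_ocr_command(image, model), check=True)
```

Something like `run_ocr(sorted(Path("scans").glob("*.tif")), "catmus-print-fondue-large.mlmodel")` would then process a directory, where `scans` is a placeholder directory name.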

Documentation

Have a look at the docs.

Funding

kraken is developed at the École Pratique des Hautes Études, Université PSL.

This project was partially funded through the RESILIENCE project, funded from the European Union’s Horizon 2020 Framework Programme for Research and Innovation.

This work benefited from state aid managed by the Agence Nationale de la Recherche under the Programme d’Investissements d’Avenir, reference ANR-21-ESRE-0005 (Biblissima+).

Project details

Release history

This version

5.3.0

Nov 21, 2024

5.2.9

Aug 27, 2024

5.2.8

Jul 24, 2024

5.2.7

Jul 8, 2024

5.2.6

Jul 3, 2024

5.2.5

May 23, 2024

5.2.4

May 9, 2024

5.2.3

May 5, 2024

5.2.2

Apr 30, 2024

5.2.1

Apr 22, 2024

5.2.0

Apr 20, 2024

5.0.0 yanked

Mar 28, 2024

Reason this release was yanked:

broken polygonization

4.3.13

Jul 19, 2023

4.3.12

May 12, 2023

4.3.11

Apr 21, 2023

4.3.10

Apr 17, 2023

4.3.9

Mar 20, 2023

4.3.7

Mar 6, 2023

4.3.6

Feb 23, 2023

4.3.5

Feb 21, 2023

4.3.4

Feb 20, 2023

4.3.3

Feb 14, 2023

4.3.2

Feb 14, 2023

4.3.1

Feb 14, 2023

4.3.0

Feb 13, 2023

4.2.0

Aug 29, 2022

4.1.2

Apr 12, 2022

4.1.1

Apr 11, 2022

4.1.0

Apr 5, 2022

4.0.0

Feb 22, 2022

3.0.13

Apr 12, 2022

3.0.9

Feb 22, 2022

3.0.8

Feb 3, 2022

3.0.7

Jan 24, 2022

3.0.6

Nov 7, 2021

3.0.4

Jul 28, 2021

3.0.2

Jun 28, 2021

3.0.1

Jun 25, 2021

3.0.0.0b25 pre-release

May 26, 2021

3.0.0.0b24 pre-release

Apr 22, 2021

3.0.0.0b23 pre-release

Mar 18, 2021

3.0.0.0b22 pre-release

Feb 22, 2021

3.0.0.0b21 pre-release

Feb 17, 2021

3.0.0.0b20 pre-release

Feb 11, 2021

3.0.0.0b20.dev7 pre-release

Feb 3, 2021

3.0.0.0b19 pre-release

Jan 7, 2021

2.0.8

Nov 25, 2019

2.0.5

May 14, 2019

2.0.4

May 14, 2019

2.0.3

May 14, 2019

2.0.2

May 14, 2019

2.0.1

Feb 25, 2019

2.0.0

Feb 25, 2019

1.0.1

Dec 11, 2018

1.0.0

Dec 10, 2018

0.9.16 yanked

Apr 20, 2018

0.9.15 yanked

Apr 18, 2018

0.9.14 yanked

Apr 18, 2018

0.9.13 yanked

Apr 17, 2018

0.9.12 yanked

Apr 17, 2018

0.9.11 yanked

Apr 17, 2018

0.9.10 yanked

Mar 2, 2018

0.9.9 yanked

Mar 2, 2018

0.9.8 yanked

Jan 8, 2018

0.9.7 yanked

Nov 1, 2017

0.9.6 yanked

Oct 31, 2017

Reason this release was yanked:

conflict with modern install tools

0.9.4 yanked

Jul 19, 2017

0.9.3 yanked

Jul 18, 2017

0.9.2 yanked

May 15, 2017

0.9.0 yanked

Nov 3, 2016

0.7.6 yanked

Jan 18, 2016

0.7.5 yanked

Dec 14, 2015

0.7.4 yanked

Nov 25, 2015

0.7.3 yanked

Nov 3, 2015

0.7.2 yanked

Oct 18, 2015

0.7.1 yanked

Sep 22, 2015

0.7.0 yanked

Sep 17, 2015

0.6.3 yanked

Sep 14, 2015

0.6.2 yanked

Sep 12, 2015

0.6.2.dev1 pre-release yanked

Sep 12, 2015

0.5.0 yanked

Sep 7, 2015

0.4.7 yanked

Aug 26, 2015

0.4.6 yanked

Aug 13, 2015

0.4.5 yanked

Jul 16, 2015

0.4.5.dev1 pre-release yanked

Aug 13, 2015

0.4.4 yanked

Jun 16, 2015

0.4.3 yanked

Jun 14, 2015

0.4.2 yanked

May 30, 2015

0.4.1 yanked

May 26, 2015

0.3.4 yanked

May 23, 2015

0.3.3 yanked

May 22, 2015

0.3.1.post11 yanked

May 22, 2015

0.3.1.post10 yanked

May 22, 2015

0.3.1 yanked

Apr 23, 2015

0.2.5 yanked

Apr 16, 2015

0.2.4 yanked

Apr 16, 2015

0.2.3 yanked

Apr 16, 2015

0.2.2 yanked

Apr 5, 2015

0.1.0 yanked

Mar 30, 2015

Reason this release was yanked:

conflict with modern install tools

0.1-dev pre-release yanked

Dec 16, 2013

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kraken-5.3.0.tar.gz (12.8 MB)

Uploaded Nov 21, 2024 Source

Built Distribution

kraken-5.3.0-py3-none-any.whl (5.0 MB)

Uploaded Nov 21, 2024 Python 3

File details

Details for the file kraken-5.3.0.tar.gz.

File metadata

Download URL: kraken-5.3.0.tar.gz
Upload date: Nov 21, 2024
Size: 12.8 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for kraken-5.3.0.tar.gz
Algorithm	Hash digest
SHA256	`6d92c8436bd4642a2f9af306732a54655160a6e51b0d3a3b023a5f17f5360409`
MD5	`3d6f4f1869c87c2634661d0a7674d565`
BLAKE2b-256	`2eb9d09ae3f08c53f189697c585a4e4c7691322421ed169581fe20923ba99725`
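
A downloaded file can be checked against the digests listed above with Python's standard `hashlib`. The sketch below is one way to do it; `file_digest` and `matches` are hypothetical helper names, and the label "blake2b-256" is mapped here to BLAKE2b with a 32-byte digest.

```python
import hashlib
import hmac


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    if algorithm == "blake2b-256":
        # BLAKE2b-256 is BLAKE2b configured for a 32-byte (256-bit) digest.
        hasher = hashlib.blake2b(digest_size=32)
    else:
        hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path, expected_hex, algorithm="sha256"):
    """Compare the file's digest with the published hex digest (case-insensitive)."""
    return hmac.compare_digest(file_digest(path, algorithm), expected_hex.lower())
```

For the source distribution, `matches("kraken-5.3.0.tar.gz", "6d92c8436bd4642a2f9af306732a54655160a6e51b0d3a3b023a5f17f5360409")` should return True if the download is intact.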

File details

Details for the file kraken-5.3.0-py3-none-any.whl.

File metadata

Download URL: kraken-5.3.0-py3-none-any.whl
Upload date: Nov 21, 2024
Size: 5.0 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for kraken-5.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e46af09c8b5c68e6a5b50b0ab4224bd96534be3c91c54d54e41ddc5dd924be55`
MD5	`c3acced142b7b6cda8c0c83aeb65ac98`
BLAKE2b-256	`ca5d1932a4ac7f67ad8734ebb3e4b38d652a0f8b2b60b1f8a1ba6ddb2d2a7459`
