Unofficial API client for the Transkribus project

Transkribus API Client

transkribus-client provides a Python 3.6+ API client to interact with Transkribus.

Authentication

Most API endpoints require authentication with a Transkribus account. To authenticate, pass your email and password to the client:

from getpass import getpass
from transkribus import TranskribusAPI
api = TranskribusAPI()
# login() is called on the instance; getpass() prompts for the password without echoing it
api.login('user@example.com', getpass())

Alternatively, use the options_from_env helper, which builds the client options from environment variables:

from transkribus import TranskribusAPI, options_from_env
api = TranskribusAPI(**options_from_env())

You can define the following environment variables:

TRANSKRIBUS_API_URL : Base URL of the Transkribus API. Defaults to https://transkribus.eu/TrpServer/rest.

TRANSKRIBUS_EMAIL : Email address of the user to authenticate with.

TRANSKRIBUS_PASSWORD : Password of the user to authenticate with.
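To illustrate how these variables could be turned into client options, here is a minimal sketch of a helper similar in spirit to options_from_env. It is not the library's implementation: the function name read_transkribus_env and the returned keys (api_base_url, email, password) are assumptions made for this example, so check the package source before relying on them.

```python
import os

# Default taken from the TRANSKRIBUS_API_URL description above.
DEFAULT_API_URL = "https://transkribus.eu/TrpServer/rest"


def read_transkribus_env(environ=None):
    """Collect Transkribus settings from environment variables.

    Hypothetical helper: the key names in the returned dict are
    illustrative and may not match what options_from_env returns.
    """
    environ = os.environ if environ is None else environ
    options = {"api_base_url": environ.get("TRANSKRIBUS_API_URL", DEFAULT_API_URL)}
    # Only include credentials that are actually set and non-empty.
    for key, variable in (
        ("email", "TRANSKRIBUS_EMAIL"),
        ("password", "TRANSKRIBUS_PASSWORD"),
    ):
        value = environ.get(variable)
        if value:
            options[key] = value
    return options
```

Passing a plain dict instead of os.environ, as in read_transkribus_env({"TRANSKRIBUS_EMAIL": "user@example.com"}), makes the behaviour easy to check without touching the real environment.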

Usage

Browsing from collections to transcripts

from transkribus.api import TranskribusAPI, options_from_env
from transkribus.models import Collection
api = TranskribusAPI(**options_from_env())
for collection_data in api.list_collections():
    for document in Collection(collection_data).get_documents(api):
        for page in document.get_pages(api):
            print(str(page.get_transcript()))

Exporting a collection

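# Reuses `api` and `Collection` from the previous example.
# COLLECTION_ID is a placeholder for the ID of the collection to export.
collection = Collection(COLLECTION_ID)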
export_job = collection.export(api)
export_job.wait_for_result(api)
export_job.download_result('path/to/export.zip')
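Once the export has been downloaded, the archive can be inspected with the standard library. The sketch below assumes the result is a regular ZIP file, as the .zip suffix in the example suggests. The function name list_export_members is hypothetical, and the internal layout of a Transkribus export is not described here, so the code only lists whatever file names it finds.

```python
import zipfile


def list_export_members(zip_path):
    """Return the sorted file names stored in an export archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Skip directory entries, which end with a slash.
        return sorted(
            name for name in archive.namelist() if not name.endswith("/")
        )
```

For example, list_export_members('path/to/export.zip') would return the paths of the files contained in the archive downloaded above.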

Parsing a PageXML file

from transkribus.pagexml import PageXmlPage
for region in PageXmlPage('/path/to/transcript.xml').page.text_regions:
    for line in region.lines:
        print(line.text)
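For readers who want to see what such a traversal involves, here is a minimal sketch that extracts line texts from a PAGE XML document using only the standard library. It does not use or reproduce PageXmlPage. The function name iter_line_texts and the inline sample document are made up for this example, and the namespace is assumed to be the 2013-07-15 PAGE schema; files written against another schema version use a different namespace URI.

```python
import xml.etree.ElementTree as ET

# Namespace of the 2013-07-15 PAGE schema (assumption: adjust for other versions).
PAGE_NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

# Tiny hand-written sample, not real Transkribus output.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="page1.jpg" imageWidth="100" imageHeight="100">
    <TextRegion id="r1">
      <TextLine id="r1l1"><TextEquiv><Unicode>First line</Unicode></TextEquiv></TextLine>
      <TextLine id="r1l2"><TextEquiv><Unicode>Second line</Unicode></TextEquiv></TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def iter_line_texts(xml_text):
    """Yield the text of every TextLine, region by region."""
    root = ET.fromstring(xml_text)
    for region in root.iterfind("pc:Page/pc:TextRegion", PAGE_NS):
        for line in region.iterfind("pc:TextLine", PAGE_NS):
            # The transcription of a line lives in TextEquiv/Unicode.
            unicode_node = line.find("pc:TextEquiv/pc:Unicode", PAGE_NS)
            if unicode_node is not None and unicode_node.text:
                yield unicode_node.text


for text in iter_line_texts(SAMPLE):
    print(text)
```

Lines without a transcription are skipped rather than yielded as empty strings, which keeps the output limited to actual text.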

Contributing

Issues and patches are welcome! Here are some tips to help you get started coding.

Unit tests

We use tox for unit tests. You can install and run it like so:

pip install tox
tox

Linting

We use pre-commit with black to automatically format the Python source code of this project.

To be efficient, you should run pre-commit before committing (hence the name…).

To do that, run once:

pip install pre-commit
pre-commit install

The linting workflow will now run on modified files before committing, and will fix issues for you.

To run the full workflow on all files, use pre-commit run -a.
