🔧 Tools to automate your document understanding tasks.

Document Tools

This package contains tools to automate your document understanding tasks by leveraging the power of 🤗 Datasets and 🤗 Transformers.

With this package, you can (or, for items marked 🚧, will be able to):

  • 🚧 Create a dataset from a collection of documents.
  • Transform a dataset to a format that is suitable for training a model.
  • 🚧 Train a model on a dataset.
  • 🚧 Evaluate the performance of a model on a dataset of documents.
  • 🚧 Export a model to a format that is suitable for inference.

Features

This project is under active development and is currently in the alpha stage, so it is not ready for production use. If you find a bug or have a suggestion, please let us know by opening an issue or a pull request.

Featured models

Usage

A minimal example to get started:

from datasets import load_dataset
from document_tools import tokenize_dataset

# Load a dataset from 🤗 Hub
dataset = load_dataset("deeptools-ai/test-document-invoice", split="train")

# Tokenize the dataset
tokenized_dataset = tokenize_dataset(dataset, target_model="layoutlmv3")
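To give a feel for what "a format that is suitable for training a model" can involve, here is a minimal sketch of one preprocessing step that LayoutLM-family models need: rescaling pixel-space bounding boxes to the 0-1000 range. This is an illustration only. The helper `normalize_box`, the page size, and the sample words are hypothetical, and the README does not state whether `tokenize_dataset` performs this step internally.

```python
from typing import List, Sequence


def normalize_box(box: Sequence[float], width: float, height: float) -> List[int]:
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 range.

    Hypothetical helper for illustration; not part of document_tools.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    x0, y0, x1, y1 = box
    scaled = [
        1000 * x0 / width,
        1000 * y0 / height,
        1000 * x1 / width,
        1000 * y1 / height,
    ]
    # Clamp to guard against OCR boxes that spill slightly past the page edge.
    return [min(1000, max(0, int(v))) for v in scaled]


# Hypothetical invoice page of 1700 x 2200 pixels with two OCR words.
words = ["Invoice", "Total"]
boxes = [(170, 220, 510, 275), (850, 1980, 1020, 2035)]
normalized = [normalize_box(b, 1700, 2200) for b in boxes]
print(list(zip(words, normalized)))
```

Working in a resolution-independent coordinate space means scans of different sizes produce comparable layout features, which is why this step usually comes before tokenization.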

For more information, please see the documentation.

Credits

This package was created with Cookiecutter and the waynerv/cookiecutter-pypackage project template.
