🔧 Tools to automate your document understanding tasks.
Document Tools
This package contains tools to automate your document understanding tasks by leveraging the power of 🤗 Datasets and 🤗 Transformers.
With this package, you can (or will be able to):
- 🚧 Create a dataset from a collection of documents.
- ✅ Transform a dataset to a format that is suitable for training a model.
- 🚧 Train a model on a dataset.
- 🚧 Evaluate the performance of a model on a dataset of documents.
- 🚧 Export a model to a format that is suitable for inference.
Features
This project is under active development and is currently in alpha. It is not ready for production use. If you find a bug or have a suggestion, please open an issue or a pull request.
Featured models
- ❌ DiT
- ✅ LayoutLMv2
- ✅ LayoutLMv3
- ✅ LayoutXLM
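The status list above can be mirrored in code to fail early when a model is not yet available. This is only a sketch: `SUPPORTED_MODELS` and `check_target_model` are hypothetical names and not part of this package. The key spellings other than `"layoutlmv3"` are assumptions, and so is reading ❌ as "not supported yet".

```python
# Support status copied from the "Featured models" list.
# Keys other than "layoutlmv3" are assumed spellings, not confirmed names.
SUPPORTED_MODELS = {
    "dit": False,
    "layoutlmv2": True,
    "layoutlmv3": True,
    "layoutxlm": True,
}


def check_target_model(name: str) -> str:
    """Return the normalized model key, or raise if unknown or unsupported."""
    key = name.strip().lower()
    if key not in SUPPORTED_MODELS:
        raise ValueError(f"unknown model: {name!r}")
    if not SUPPORTED_MODELS[key]:
        raise ValueError(f"model not supported yet: {name!r}")
    return key


print(check_target_model("LayoutLMv3"))  # layoutlmv3
```

A check like this could run before any tokenization call, so that a typo or an unsupported model is reported before a dataset is downloaded.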
Usage
A minimal example to get started:
from datasets import load_dataset
from document_tools import tokenize_dataset
# Load a dataset from 🤗 Hub
dataset = load_dataset("deeptools-ai/test-document-invoice", split="train")
# Tokenize the dataset
tokenized_dataset = tokenize_dataset(dataset, target_model="layoutlmv3")
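For background, LayoutLM-family models expect each word's bounding box on a 0 to 1000 integer scale relative to the page size. The sketch below shows that normalization in plain Python. Whether `tokenize_dataset` performs this step internally is not stated here, and `normalize_bbox` is a hypothetical helper, not a function from this package.

```python
from typing import List, Sequence


def normalize_bbox(bbox: Sequence[float], width: float, height: float) -> List[int]:
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 integer range."""
    if width <= 0 or height <= 0:
        raise ValueError("page width and height must be positive")
    x0, y0, x1, y1 = bbox
    scaled = [
        1000 * x0 / width,
        1000 * y0 / height,
        1000 * x1 / width,
        1000 * y1 / height,
    ]
    # Clamp so OCR boxes that spill slightly past the page edge stay in range.
    return [min(1000, max(0, int(v))) for v in scaled]


# Hypothetical example: one OCR word on an 800 x 1000 pixel page.
word_box = (80, 100, 400, 150)
print(normalize_bbox(word_box, width=800, height=1000))  # [100, 100, 500, 150]
```

Clamping is a design choice in this sketch: it keeps slightly out-of-page OCR boxes usable instead of rejecting the whole document.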
For more information, please see the documentation.
Credits
This package was created with Cookiecutter and the waynerv/cookiecutter-pypackage project template.