Dataset building and processing tools for deepdoctection

deepdoctection-datasets

Categories, datasets, and a set of dataset instances for training models supported by deepdoctection.

Overview

dd-datasets is a package that provides comprehensive dataset management capabilities for Document AI tasks.

It includes:

  • datasets: Built-in dataset definitions and dataflow builders for popular document understanding datasets.
  • instances: Pre-defined dataset instances for common document understanding tasks such as object detection, text classification, and named entity recognition.

Installation

uv pip install dd-datasets

To use all datasets, including those that require the XML parsing library lxml (the quotes keep shells such as zsh from interpreting the square brackets):

uv pip install "dd-datasets[full]"
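
After installing, it can be useful to confirm which variant ended up in the environment. The following is a minimal sketch using only the Python standard library. It looks up the distribution by the name used in the install command and checks whether lxml is importable. The helper name installation_report is hypothetical and is not part of dd-datasets.

```python
from importlib import metadata, util


def installation_report(dist_name="dd-datasets", optional_modules=("lxml",)):
    """Report the installed version of a distribution and which optional modules are importable.

    Hypothetical helper for illustration; it does not call any dd-datasets API.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # the distribution is not installed in this environment

    # find_spec returns None when a top-level module cannot be located
    available = {name: util.find_spec(name) is not None for name in optional_modules}
    return {"distribution": dist_name, "version": version, "optional": available}


if __name__ == "__main__":
    print(installation_report())
```

A version of None means the package is not installed in the active environment. If lxml is reported as unavailable, the datasets that depend on XML parsing will likely need the full extra.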

License

Apache License 2.0

Author

Dr. Janis Meyer
