Dataset building and processing tools for deepdoctection

deepdoctection-datasets

Categories, datasets, and a number of dataset instances for training models supported by deepdoctection.

Overview

dd-datasets is a package that provides comprehensive dataset management capabilities for Document AI tasks.

It includes:

  • datasets: Built-in dataset definitions and dataflow builders for popular document understanding datasets.
  • instances: Pre-defined dataset instances for common document understanding tasks such as object detection, text classification, and named entity recognition.

Installation

uv pip install dd-datasets

To use all datasets, including those that require the XML parsing library lxml:

uv pip install "dd-datasets[full]"
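
After installing, one way to confirm that the distribution is present in the active environment is to query its metadata with the standard library. This is a minimal sketch: the helper name `installed_version` is hypothetical, and the snippet deliberately avoids importing the package itself, since the import name is not stated here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "dd-datasets" is the distribution name used in the install commands above.
    found = installed_version("dd-datasets")
    print(f"dd-datasets {found}" if found else "dd-datasets is not installed")
```

The lookup uses the distribution name passed to the installer, not a module name, so it works regardless of how the package is imported.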

License

Apache License 2.0

Author

Dr. Janis Meyer

Download files

Source Distribution

dd_datasets-1.2.11.tar.gz (35.3 kB)

Uploaded Source

Built Distribution

dd_datasets-1.2.11-py3-none-any.whl (56.3 kB)

Uploaded Python 3

File details

Details for the file dd_datasets-1.2.11.tar.gz.

File metadata

  • Download URL: dd_datasets-1.2.11.tar.gz
  • Upload date:
  • Size: 35.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for dd_datasets-1.2.11.tar.gz
Algorithm Hash digest
SHA256 8afaf5f9830b0717b84c0937b19d4b2e7962623bb00c8493ecee8f732bdefe1c
MD5 84992c6ff2f5c93276282c455313b25c
BLAKE2b-256 441274430c43268b16fb2cda0570dd87dedc64b57aaeb79fa176f917940d5688
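
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the helper names `sha256_of_file` and `matches_expected`, and the assumption that the archive sits in the current directory, are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for dd_datasets-1.2.11.tar.gz.
EXPECTED_SHA256 = "8afaf5f9830b0717b84c0937b19d4b2e7962623bb00c8493ecee8f732bdefe1c"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location of the downloaded source distribution.
    archive = Path("dd_datasets-1.2.11.tar.gz")
    if archive.exists():
        print("OK" if matches_expected(archive) else "MISMATCH")
    else:
        print(f"{archive} not found")
```

Reading in chunks keeps memory use constant regardless of file size. The same helper can be applied to the wheel by passing its own SHA256 digest as `expected`.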

File details

Details for the file dd_datasets-1.2.11-py3-none-any.whl.

File metadata

  • Download URL: dd_datasets-1.2.11-py3-none-any.whl
  • Upload date:
  • Size: 56.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for dd_datasets-1.2.11-py3-none-any.whl
Algorithm Hash digest
SHA256 69edfc088be872f5586016df1579caa3f348ea6f02b7aa83011252421e234cfb
MD5 847a351b0aceef832038bd6eafb7e42a
BLAKE2b-256 e2b601663e2c8142d282b2d964bfb3fd39f5d2171448bd5dabd65aac50bdda3c
