Smart text extraction from PDF documents

EDS-PDF

EDS-PDF provides a modular framework to extract text from PDF documents.

You can use it out of the box, or extend it to fit your use case.

Getting started

Install the library with pip:

$ pip install edspdf
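
To confirm that the installation succeeded, you can query the installed distribution's metadata. The snippet below is a minimal sketch using only the Python standard library; `installed_version` is a hypothetical helper name, not part of EDS-PDF, and it assumes the distribution name matches the one used in the pip command above (`edspdf`).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if edspdf is not installed.
print(installed_version("edspdf"))
```

This reads package metadata only and does not import the library, so it works even if an optional dependency is missing.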

Visit the documentation for more information!

Citation

If you use EDS-PDF, please cite us as below.

@software{edspdf,
  author  = {Dura, Basile and Wajsburt, Perceval and Calliger, Alice and Gérardin, Christel and Bey, Romain},
  license = {BSD-3-Clause},
  title   = {{EDS-PDF: Smart text extraction from PDF documents}},
  url     = {https://github.com/aphp/edspdf}
}

Acknowledgement

We would like to thank Assistance Publique – Hôpitaux de Paris and the AP-HP Foundation for funding this project.
