Information extraction and named-entity recognition for indexing PDFs
pdfner
Install NLP tools
- Download language-specific model data in spaCy
$ python -m spacy download en
- Download Stanford CoreNLP from https://stanfordnlp.github.io/CoreNLP/download.html and extract to {project root}/pdfner/tests/tools
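Once both steps are done, a quick check can confirm that the tools are where the project expects them. The following is a minimal sketch, not part of pdfner: the function name `check_nlp_tools` is hypothetical, and the jar file pattern `stanford-corenlp-*.jar` is an assumption about how the CoreNLP archive unpacks.

```python
import importlib.util
from pathlib import Path


def check_nlp_tools(project_root):
    """Report which NLP prerequisites appear to be in place (hypothetical helper)."""
    # Location named in the install step above.
    tools_dir = Path(project_root) / "pdfner" / "tests" / "tools"

    # spaCy counts as installed if Python can locate the package.
    spacy_installed = importlib.util.find_spec("spacy") is not None

    # Assumption: the extracted CoreNLP folder contains stanford-corenlp-*.jar files.
    corenlp_jars = []
    if tools_dir.is_dir():
        corenlp_jars = sorted(tools_dir.glob("**/stanford-corenlp-*.jar"))

    return {
        "spacy_installed": spacy_installed,
        "corenlp_found": bool(corenlp_jars),
        "tools_dir": tools_dir,
    }


if __name__ == "__main__":
    # Run from the project root.
    for name, value in check_nlp_tools(".").items():
        print(f"{name}: {value}")
```

This only checks that the files are present; it does not verify that the spaCy model data was downloaded or that CoreNLP starts.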
Install OCRmyPDF
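The project gives no further detail for this step. As a rough sketch, the installation can be verified from Python by looking for the `ocrmypdf` command on the PATH and running it on a file. The helper names `build_ocr_command` and `run_ocr` and the file names are hypothetical; the only assumption about OCRmyPDF itself is its basic command-line form, `ocrmypdf input.pdf output.pdf`.

```python
import shutil
import subprocess


def build_ocr_command(input_pdf, output_pdf):
    """Build the basic OCRmyPDF invocation: ocrmypdf INPUT OUTPUT."""
    return ["ocrmypdf", str(input_pdf), str(output_pdf)]


def run_ocr(input_pdf, output_pdf):
    """Run OCRmyPDF on one file, failing early if it is not installed."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf not found on PATH; install OCRmyPDF first")
    # check=True raises CalledProcessError if OCRmyPDF exits with an error.
    subprocess.run(build_ocr_command(input_pdf, output_pdf), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    run_ocr("scanned.pdf", "scanned_ocr.pdf")
```

How pdfner itself calls OCRmyPDF is not described here, so this should be read as an installation check rather than as the project's own pipeline.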