document-reader
A Python document reader that extracts fields using regular expressions.
Installation
pip install document-reader
Usage
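from document_reader import Document, Field

# Path to the PDF to read.
doc = Document("pdf_file.pdf")

# Register each field to extract: a name, a regular expression, and a page number.
doc.register_fields(
    Field(name="contract", regex=r"\d+/.*?/\d+", page=0),
    Field(name="nup", regex=r"\d{5}\.\d{6}/\d{4}-\d{2}", page=1),
)

# Open the document and print the result.
data = doc.open()
print(data)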
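To see what the two patterns in the usage example match, they can be tried against plain strings with Python's standard `re` module, without the library. This is only a sketch: the sample texts and the `first_match` helper are invented for illustration, and it does not show how document-reader applies the patterns internally.

```python
import re

# Patterns copied from the usage example.
CONTRACT_RE = re.compile(r"\d+/.*?/\d+")
NUP_RE = re.compile(r"\d{5}\.\d{6}/\d{4}-\d{2}")

# Hypothetical page text, invented for this illustration.
sample_page_0 = "Contract no. 12/ABC/2023 signed by both parties."
sample_page_1 = "NUP: 12345.678901/2023-45"


def first_match(pattern, text):
    """Return the first substring matching pattern, or None if there is none."""
    match = pattern.search(text)
    return match.group(0) if match else None


contract = first_match(CONTRACT_RE, sample_page_0)
nup = first_match(NUP_RE, sample_page_1)

print(contract)  # 12/ABC/2023
print(nup)       # 12345.678901/2023-45
```

The `.*?` in the contract pattern is lazy, so it stops at the first following slash that is itself followed by digits. On free-form text this can capture more than intended, so it is worth testing the pattern on real page content.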
Download files
Source Distribution
document_reader-0.0.4.tar.gz (3.4 kB)
Built Distribution
document_reader-0.0.4-py3-none-any.whl (3.5 kB)
File details
Details for the file document_reader-0.0.4.tar.gz.
File metadata
- Download URL: document_reader-0.0.4.tar.gz
- Upload date:
- Size: 3.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ad5e36f0486e3040f7e66272d3e2836a6f0c346befff26a66e7030ff52422100 |
| MD5 | 4ea3de4115a25cb00a2567b150c1af6b |
| BLAKE2b-256 | 593bd236c936b73306876e88fd549e25ccd938b507fc6467b19c7e130681cebf |
File details
Details for the file document_reader-0.0.4-py3-none-any.whl.
File metadata
- Download URL: document_reader-0.0.4-py3-none-any.whl
- Upload date:
- Size: 3.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 87cb3b46b39b3f70b4c03c95be9ee7ee38033a37bc993c28d0560d1bdcddf121 |
| MD5 | 74068b2d924274426715191c0ebee251 |
| BLAKE2b-256 | 783959b7309bacaf508eb792f4f94d8fb9491d8edc78f903cf90d9ecea9de2ff |