A Python document reader that extracts fields using regular expressions
document-reader
A Python document reader that extracts fields using regular expressions.
Installation
pip install document-reader
Usage
from document_reader import Document, Field

doc = Document("pdf_file.pdf")

# Each Field pairs a name with a regex and the page to search.
doc.register_fields(
    Field(name="contract", regex=r"\d+/.*?/\d+", page=0),
    Field(name="nup", regex=r"\d{5}\.\d{6}/\d{4}-\d{2}", page=1),
)

# open() returns the extracted data.
data = doc.open()
print(data)
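To see what the two patterns in the usage example would capture, the sketch below runs them with the standard library `re` module only. It does not call document-reader, and how the library applies the patterns internally is not shown in this description. The sample strings and the `first_match` helper are hypothetical, invented purely for illustration.

```python
import re

# Patterns copied verbatim from the usage example.
CONTRACT_RE = re.compile(r"\d+/.*?/\d+")
NUP_RE = re.compile(r"\d{5}\.\d{6}/\d{4}-\d{2}")

# Hypothetical page text, made up for this illustration.
sample_page_0 = "Contract 12/SEGES/2023 signed by both parties."
sample_page_1 = "NUP: 12345.678901/2023-45"


def first_match(pattern, text):
    """Return the first substring matching pattern, or None if there is none."""
    match = pattern.search(text)
    return match.group(0) if match else None


contract = first_match(CONTRACT_RE, sample_page_0)
nup = first_match(NUP_RE, sample_page_1)
print({"contract": contract, "nup": nup})
```

The lazy `.*?` in the contract pattern stops at the first following slash that is itself followed by digits, so it is worth testing patterns like this against real page text before registering them as fields.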
Download files
Source Distribution
document_reader-0.0.6.tar.gz (3.5 kB)
Built Distribution
File details
Details for the file document_reader-0.0.6.tar.gz.
File metadata
- Download URL: document_reader-0.0.6.tar.gz
- Upload date:
- Size: 3.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ef2e47bce44e514875a79c24a93fbebaa23693a034f53df2e14ad8d26007f830 |
| MD5 | 9cfb8f1204e66896e9b9cd62d1fd2279 |
| BLAKE2b-256 | de656cba17d93d49f7f39a849c950a38e5011cce09e19ff5389a12d0a56a968d |
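A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only `hashlib` from the standard library. The local file path and the helper names `sha256_of_file` and `matches_expected` are assumptions made for this example, not part of the package.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for document_reader-0.0.6.tar.gz.
EXPECTED_SHA256 = "ef2e47bce44e514875a79c24a93fbebaa23693a034f53df2e14ad8d26007f830"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare a file's SHA256 digest with the expected hex string."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location of the downloaded archive; adjust as needed.
    archive = Path("document_reader-0.0.6.tar.gz")
    if archive.exists():
        print("OK" if matches_expected(archive) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same helper works for the wheel if its own SHA256 digest is passed as `expected`.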
File details
Details for the file document_reader-0.0.6-py3-none-any.whl.
File metadata
- Download URL: document_reader-0.0.6-py3-none-any.whl
- Upload date:
- Size: 3.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ff7d0888b159f3b1a5e9b6f0f4909b8557722f5b3acc79fe48beb2920577347e |
| MD5 | 83d7218f14d92deedd58984cb2b057e9 |
| BLAKE2b-256 | de61c05a68fcfc42a34b52f976995c7ad5824c80b52a36e6f9407286e1000606 |