Extracts email metadata and text from a PDF file
pdf2mbox
A command-line utility and Python package for converting emails stored in PDF files to MBOX format.
Installation
pip install pdf2mbox
Usage
# from the command line
% python -m pdf2mbox --help
usage: pdf2mbox.py [-h] [--version] [--overwrite] [--csv [CSV]]
                   pdf_file [mbox_file]

Generates an mbox from a PDF containing emails

positional arguments:
  pdf_file         PDF file provided as input
  mbox_file        Mbox file generated as output

optional arguments:
  -h, --help       show this help message and exit
  --version, -v    show program's version number and exit
  --overwrite, -o  overwrite MBOX file if it exists
  --csv [CSV]      generate CSV file output
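# from within python
from pdf2mbox import pdf2mbox

pdf_file = "emails.pdf"    # placeholder path to the input PDF
mbox_file = "emails.mbox"  # placeholder path for the output mbox
pe = pdf2mbox(pdf_file, mbox_file)  # pe contains a dict of emails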
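Once an mbox file exists, it can be inspected with Python's standard-library `mailbox` module. The following is a minimal sketch and not part of pdf2mbox: the helper names `list_subjects` and `write_sample_mbox` are hypothetical, and the sample messages are made up so that the example runs without a PDF or the package installed. It assumes only that the output is a conventional mbox file.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def list_subjects(mbox_path):
    """Return the Subject header of every message in an mbox file."""
    # create=False raises an error instead of silently creating an empty file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [msg["Subject"] for msg in box]
    finally:
        box.close()


def write_sample_mbox(mbox_path):
    """Write a two-message mbox (stand-in for real pdf2mbox output)."""
    box = mailbox.mbox(mbox_path)
    try:
        for subject in ("First message", "Second message"):
            msg = EmailMessage()
            msg["From"] = "sender@example.org"
            msg["To"] = "recipient@example.org"
            msg["Subject"] = subject
            msg.set_content("Body text.")
            box.add(msg)
        box.flush()
    finally:
        box.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.mbox")
        write_sample_mbox(path)
        for subject in list_subjects(path):
            print(subject)
```

To check a file produced by the command line or by `pdf2mbox(pdf_file, mbox_file)`, pass its path to `list_subjects` in place of the sample file.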
Notes
- The initial development of this package was funded in part by The Mellon Foundation’s “Email Archives: Building Capacity and Community” program.
Download files
Source Distribution
pdf2mbox-0.3.3.tar.gz (3.7 kB)
Built Distribution
pdf2mbox-0.3.3-py3-none-any.whl (4.0 kB)
File details
Details for the file pdf2mbox-0.3.3.tar.gz.
File metadata
- Download URL: pdf2mbox-0.3.3.tar.gz
- Upload date:
- Size: 3.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.7.1 importlib_metadata/4.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0f916980c8133949270f49d89b527df687d9ca64359cc4eb774c6dbb03a8c405 |
| MD5 | 4dbd6c99a59284beb7828a332bdbc07e |
| BLAKE2b-256 | 2f7a84bd8d1f1685bbad8bbb13f1251759c363223a2304535dbc1f757c07834f |
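A downloaded file can be checked against the published digests with the standard-library `hashlib` module. This is a minimal sketch: the helper names `file_digest` and `matches_published` are hypothetical, and the local path assumes the source distribution was saved to the current directory under its original name.

```python
import hashlib
import os

# SHA256 digest published for pdf2mbox-0.3.3.tar.gz in the listing above.
PUBLISHED_SHA256 = "0f916980c8133949270f49d89b527df687d9ca64359cc4eb774c6dbb03a8c405"


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published(path, expected_hex, algorithm="sha256"):
    """Return True if the file's digest equals the expected hex string."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()


if __name__ == "__main__":
    local_path = "pdf2mbox-0.3.3.tar.gz"  # assumed download location
    if os.path.exists(local_path):
        print(matches_published(local_path, PUBLISHED_SHA256))
    else:
        print("file not found:", local_path)
```

The same helper works for the MD5 row by passing `algorithm="md5"`; SHA256 is the stronger check of the two.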
File details
Details for the file pdf2mbox-0.3.3-py3-none-any.whl.
File metadata
- Download URL: pdf2mbox-0.3.3-py3-none-any.whl
- Upload date:
- Size: 4.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.7.1 importlib_metadata/4.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 12ecd0aad6fba17453dd40e5a405925eba0d39b002c8a2c8639280cecb693390 |
| MD5 | 1838505d05d57b99ac7b6e0ee70f1603 |
| BLAKE2b-256 | 4b9ea2184e4098970f34977df3a22136ac731746df6e5b71acd4fa961b00c5f6 |