A tool for preserving email in multiple preservation formats.
Mailbagit
A tool for creating and managing Mailbags, a package for preserving email in multiple formats. It contains an open specification for mailbags, as well as the mailbagit and mailbagit-gui tools for packaging email exports into mailbags.
mailbagit can be used to convert native email formats, such as PST, MSG, EML, and MBOX, into PDF, HTML, WARC, and other formats and combine them into stable packages for preservation.
Installation
pip install mailbagit
- To install PST dependencies: pip install mailbagit[pst]
- To install mailbagit-gui: pip install mailbagit[gui]
Docker setup
You can also run mailbagit using a Docker image.
docker pull ualbanyarchives/mailbagit
wget https://raw.githubusercontent.com/UAlbanyArchives/mailbagit/main/docker-compose.yml
docker compose run mailbagit
mailbagit -v
Quick start
Examples:
MSG files to PDF, EML, and WARC
mailbagit path/to/messages -i msg --derivatives eml pdf warc --mailbag_name my_mailbag
MBOX to PDF and plain text
mailbagit path/to/mbox_dir -i mbox -d txt pdf-chrome -m my_mailbag -r
PST to PDF, MBOX, EML, and WARC
mailbagit path/to/export.pst -i pst -d mbox eml pdf warc -m my_mailbag
EML to PDF and WARC in another directory
mailbagit path/to/messages -i eml -d pdf warc -m /path/to/my_mailbag
See the documentation for more details.
Arguments
The arguments listed below can be entered on the command line when using mailbagit or entered in the mailbagit-gui fields.
Mandatory Arguments
- path: A path to email to be packaged into a mailbag. This can be a single file or a directory containing a number of email exports.
- -m --mailbag: A new directory for the mailbag, such as /path/to/my_mailbag, or just my_mailbag to use the same location as the source email. Must be a valid directory or file name and must not already exist.
- -i --input: File format to use as input for a mailbag. Argument takes a single input, e.g. -i imap or -i pst
- -d --derivatives: Specifies a single derivative format or a list of derivative formats that mailbagit will create and package into the mailbag. Argument takes multiple inputs, e.g. -d eml pdf warc
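When mailbagit is driven from a script, the four mandatory arguments above can be assembled into an argument list before the tool is called. The following is a minimal sketch, not part of mailbagit itself: build_mailbagit_command is a hypothetical helper name, and it only uses the short flags listed above.

```python
import shlex


def build_mailbagit_command(path, mailbag, input_format, derivatives):
    """Assemble an argument list for the mailbagit CLI (hypothetical helper)."""
    if not derivatives:
        raise ValueError("at least one derivative format is required")
    # -d accepts several values, so the derivative formats are unpacked in place.
    return [
        "mailbagit",
        str(path),
        "-i", input_format,
        "-d", *derivatives,
        "-m", str(mailbag),
    ]


command = build_mailbagit_command(
    "path/to/export.pst", "my_mailbag", "pst", ["mbox", "eml", "pdf", "warc"]
)
print(shlex.join(command))
# prints: mailbagit path/to/export.pst -i pst -d mbox eml pdf warc -m my_mailbag
```

Passing the resulting list to subprocess.run(command, check=True) would run the tool. Keeping the arguments as a list avoids shell quoting problems with paths that contain spaces.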
Mailbagit Optional Arguments
- -v --version: Reports the version number and exits.
- -r --dry-run: Performs a test run that will not alter any files other than writing an error report. When this flag is used, mailbagit parses all the email it is provided and formats derivatives as much as it can without writing anything to disk. If there are any errors or warnings, this will create an error report with an errors.csv listing all issues as well as a full stack trace in a .txt file.
- -k --keep: Keeps the source files as-is and copies them into a mailbag instead of moving them.
- --css: Path to a CSS file to override the included CSS when creating PDF or HTML derivatives. Argument takes a single file path as input.
- -c --compress: Compresses the mailbag as a ZIP, TAR, or TAR.GZ, e.g. -c zip or -c tar.gz
- -f --companion_files: Allows for companion metadata files to be packaged alongside email export files. When this option is used, mailbagit will recursively include all the files in the directory provided into a mailbag.
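The errors.csv written by a dry run can be inspected with a few lines of Python. This is a rough sketch under stated assumptions: the report's location, its UTF-8 encoding, and the presence of a header row are guesses rather than documented behavior, and read_error_report is a hypothetical helper name. No column names are assumed; rows are keyed by whatever the first line of the file contains.

```python
import csv
from pathlib import Path


def read_error_report(csv_path):
    """Return the rows of an errors.csv as dicts keyed by its first row."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Assumed location; adjust to wherever the dry run wrote its report.
report = Path("errors.csv")
if report.exists():
    rows = read_error_report(report)
    print(f"{len(rows)} issue(s) reported")
    for row in rows:
        print(row)
```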
Bagit-python arguments
Mailbagit also accepts most bagit-python arguments. Thus, you can provide arguments like --processes 2 or arguments to add metadata such as --source-organization University at Albany, SUNY.
The only bagit-python arguments that mailbagit does not support are -log, -quiet, -validate, -fast, and -completeness_only.
If you would like to validate your mailbag, mailbagit comes with bagit-python installed. Thus, you can run:
bagit.py --validate /path/to/mailbag
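To illustrate part of what validation involves, the sketch below recomputes payload checksums and compares them with the bag's manifest using only the Python standard library. It is not a replacement for bagit.py --validate: it assumes the mailbag contains a manifest-sha256.txt, it ignores tag manifests and percent-encoded file names, and check_sha256_manifest is a hypothetical helper name.

```python
import hashlib
from pathlib import Path


def check_sha256_manifest(bag_dir):
    """Return payload paths that are missing or whose SHA-256 differs from the manifest."""
    bag_dir = Path(bag_dir)
    problems = []
    manifest = bag_dir / "manifest-sha256.txt"
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum> <path relative to the bag root>".
        expected, rel_path = line.split(None, 1)
        rel_path = rel_path.strip()
        payload = bag_dir / rel_path
        if not payload.is_file():
            problems.append(rel_path)
            continue
        # Reads the whole file into memory; fine for a sketch, not for huge files.
        actual = hashlib.sha256(payload.read_bytes()).hexdigest()
        if actual != expected.lower():
            problems.append(rel_path)
    return problems


bag = Path("/path/to/mailbag")
if (bag / "manifest-sha256.txt").is_file():
    print(check_sha256_manifest(bag) or "all payload checksums match")
```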
Development setup
git clone git@github.com:UAlbanyArchives/mailbagit.git
cd mailbagit
git switch develop
pip install -e .
Development with docker
- This runs the dev docker image with the code installed in editable mode. You can then make code changes and run them directly with mailbagit.
- Assumes you have a directory with email data in ./sampleData. You can change this directory name in line 7 of docker-compose-dev.yml.
docker pull ualbanyarchives/mailbagit:dev
git clone git@github.com:UAlbanyArchives/mailbagit.git
cd mailbagit
git switch develop
docker-compose -f docker-compose-dev.yml run mailbagit
mailbagit -v
License
Kudos
This project was made possible by funding from the University of Illinois's Email Archives: Building Capacity and Community Project.
We owe a lot to the hard work that goes towards developing and maintaining the libraries mailbagit uses to parse email formats and make bags. We'd like to thank the awesome projects without which mailbagit wouldn't be possible.
We'd also like to thank the RATOM project, whose documentation was super helpful in guiding us through some roadblocks.