Lightweight Archivematica

a3m

a3m is a lightweight version of Archivematica focused on AIP creation. It has no external dependencies, no integration with access systems, no search capabilities and no graphical interface. It is ideal for workloads where you would typically use multiple Archivematica pipelines and implement additional workflows elsewhere.

Status

Experimental: a3m is still being refined. See open and closed issues.

Usage

You can install a3m via PyPI:

pip install a3m

However, it is preferable to run a3m via our Docker image because it includes all the dependencies needed (unar, 7z, ffmpeg, clamav, etc.).

gRPC server

The following example shows how to set up a gRPC server and a client sharing the same network using Docker. Alternatively, see our screencast.

Create a virtual network:

docker network create a3m-network

The following command will run the gRPC server in detached mode listening locally on port 7000:

docker run --rm --detach --name a3m --network a3m-network -p 7000:7000 docker.pkg.github.com/artefactual-labs/a3m/a3m:main
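
Because the container starts in detached mode, a script may want to wait until the server accepts connections before submitting a transfer. The following is a minimal sketch using only the Python standard library. `wait_for_port` is a hypothetical helper, not part of a3m, and it only checks that a TCP connection can be opened on the published port, not that the gRPC service itself is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: give up at the deadline, otherwise retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the `docker run` command above, a call such as `wait_for_port("127.0.0.1", 7000)` would be the natural check, since 7000 is the port it publishes.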

Submit a transfer with the gRPC client, e.g.:

docker run --rm --network a3m-network --entrypoint=python docker.pkg.github.com/artefactual-labs/a3m/a3m:main -m a3m.server.rpc.client submit --wait --address=a3m:7000 https://github.com/artefactual/archivematica-sampledata/raw/master/SampleTransfers/ZippedDirectoryTransfers/DemoTransferCSV.zip

Using our service definition, it is possible to generate client-side code in multiple programming languages. See gRPC concepts for more.

Don't forget to clean up before leaving!

docker stop a3m
docker network remove a3m-network

Embedded API

Python developers should be able to implement new solutions embedding a3m as a library. See #42 for more.

import a3m

runner = a3m.Runner()
runner.submit_package("https://...", wait=True)

Development

Local development work on a3m is possible, but we also provide an environment based on Docker Compose with all the tools and dependencies installed, so you don't have to run them locally.

Docker Compose

Try the following if you feel comfortable using our Makefile:

make create-volume build bootstrap restart

Otherwise, follow these steps:

# Create the external data volume
mkdir -p hack/compose-volume
docker volume create --opt type=none --opt o=bind --opt device="$(pwd)/hack/compose-volume" a3m-pipeline-data

# Build service
env COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose build

# Bring the service up
docker-compose up -d a3m

You're ready to submit a transfer:

# Submit a transfer
docker-compose run --rm --entrypoint sh a3m -c "python -m a3m.server.rpc.client submit --wait --address=a3m:7000 https://github.com/artefactual/archivematica-sampledata/raw/master/SampleTransfers/ZippedDirectoryTransfers/DemoTransferCSV.zip"

# Find the AIP generated
find hack/compose-volume -name "*.7z";
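
The same search can be done from Python, for example in a script that post-processes the generated AIPs. This is a minimal sketch that assumes, as the find command above does, that AIPs are written as .7z files somewhere under hack/compose-volume. `find_aips` is a hypothetical helper, not part of a3m.

```python
from pathlib import Path
from typing import List


def find_aips(root: str) -> List[Path]:
    """Return every regular file ending in .7z below root, sorted by path."""
    return sorted(p for p in Path(root).rglob("*.7z") if p.is_file())
```

Calling `find_aips("hack/compose-volume")` from the repository root should list the same files as the find command.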

Container-free workflow

Be aware that a3m has application dependencies that need to be available in the system path. The Docker image bundles them all; in this workflow you have to install them yourself.

a3m needs Python 3.8 or newer. For an Ubuntu/Debian Linux environment:

sudo apt install -y python3.8 python3.8-venv python3.8-dev

The following external tools are used to process files in a3m and must be installed on your system. For an Ubuntu/Debian Linux environment:

Siegfried

wget -qO - https://bintray.com/user/downloadSubjectPublicKey?username=bintray | sudo apt-key add - 

echo "deb http://dl.bintray.com/siegfried/debian wheezy main" | sudo tee -a /etc/apt/sources.list 

sudo apt-get update && sudo apt-get install siegfried

unar

sudo apt-get install unar

ffmpeg (ffprobe)

sudo apt-get install ffmpeg

ExifTool

wget https://packages.archivematica.org/1.11.x/ubuntu-externals/pool/main/libi/libimage-exiftool-perl/libimage-exiftool-perl_10.10-2~14.04_all.deb

sudo dpkg -i libimage-exiftool-perl_10.10-2~14.04_all.deb

MediaInfo

sudo apt-get install mediainfo

Sleuthkit (fiwalk)

sudo apt-get install sleuthkit

Jhove

Install the dependencies first:

sudo apt-get install ca-certificates-java java-common openjdk-8-jre-headless

wget https://packages.archivematica.org/1.11.x/ubuntu-externals/pool/main/j/jhove/jhove_1.20.1-6~18.04_all.deb

sudo dpkg -i jhove_1.20.1-6~18.04_all.deb

7-Zip

sudo apt-get install p7zip-full

atool

sudo apt-get install atool

test

sudo apt-get install coreutils

Check that /usr/bin is present in your system path (echo $PATH) and that each tool is available from there (which [toolname]).
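
The check can also be scripted. The sketch below uses `shutil.which` from the Python standard library to report tools that are not on the path. The executable names in `ASSUMED_TOOLS` are assumptions inferred from the packages listed above (for instance, Siegfried is assumed to install `sf` and 7-Zip `7z`), and `missing_tools` is a hypothetical helper, not part of a3m.

```python
import shutil
from typing import Iterable, List

# Assumed executable names for the tools listed above; adjust for your system.
ASSUMED_TOOLS = (
    "sf", "unar", "ffprobe", "exiftool", "mediainfo",
    "fiwalk", "jhove", "7z", "atool", "test",
)


def missing_tools(tools: Iterable[str] = ASSUMED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from `missing_tools()` suggests every assumed executable was found; anything it returns is worth checking by hand with which.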

Check out this repository:

git clone --depth 1 https://github.com/artefactual-labs/a3m.git

Then follow these steps:

# Create virtual environment and activate it
virtualenv --python=python3.8 .venv
source .venv/bin/activate

# Install the dependencies
pip install -r requirements-dev.txt

# Run the tests:
pytest

# Run a3m server
python -m a3m

Start a new transfer:

$ python -m a3m.server.rpc.client submit --wait https://github.com/artefactual/archivematica-sampledata/raw/master/SampleTransfers/ZippedDirectoryTransfers/DemoTransferCSV.zip
Submitting...
Transfer created: 0f667867-800a-466f-856f-fea5980f1d97
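
When the client is driven from a script, the transfer identifier can be pulled out of its output. This is a minimal sketch that assumes the output keeps the "Transfer created: <uuid>" line shown above, which may differ between versions. `parse_transfer_id` is a hypothetical helper, not part of a3m.

```python
import re
import uuid
from typing import Optional

# Matches the "Transfer created: <uuid>" line printed by the client.
_TRANSFER_RE = re.compile(r"^Transfer created: ([0-9a-fA-F-]{36})\s*$", re.MULTILINE)


def parse_transfer_id(output: str) -> Optional[uuid.UUID]:
    """Return the transfer UUID found in the client output, or None."""
    match = _TRANSFER_RE.search(output)
    if match is None:
        return None
    try:
        return uuid.UUID(match.group(1))
    except ValueError:
        # The captured text had the right length but was not a valid UUID.
        return None
```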

You can find both the database and the shared directory under ~/.local/share/a3m/.

Other things you can do:

Python debugging with pdb

Stop a3m if it's already running:

docker-compose stop a3m

Introduce a breakpoint in the code. Breakpoints can be used anywhere, including client modules.

breakpoint()  # Add this!
important_code()

Run a3m as follows:

docker-compose run --rm --publish=52000:7000 a3m

The debugger should activate as your breakpoint is reached. Use commands to control the debugger, e.g. help.

Enable the debug mode

a3m comes with a pre-configured logger that hides events of level INFO or lower. INFO output is too verbose for everyday use, so only WARNING and higher are shown by default.

Set the A3M_DEBUG environment variable to see all events. The variable can be injected in several ways, e.g.:

docker-compose run --rm -e A3M_DEBUG=yes --publish=52000:7000 a3m

The logging configuration lives in a3m.settings.common.
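
To illustrate the idea, here is a minimal sketch of a logging configuration that switches between WARNING and DEBUG based on an environment variable. It is not the content of a3m.settings.common. The set of accepted values in `_TRUTHY` is an assumption (the text above only shows A3M_DEBUG=yes), and `debug_enabled` and `build_logging_config` are hypothetical names.

```python
import logging
import logging.config
import os

# Assumption: these strings count as "enabled"; only "yes" appears in the text above.
_TRUTHY = {"1", "yes", "true", "on"}


def debug_enabled(environ=os.environ) -> bool:
    """Return True when A3M_DEBUG is set to one of the assumed truthy values."""
    return environ.get("A3M_DEBUG", "").strip().lower() in _TRUTHY


def build_logging_config(debug: bool) -> dict:
    """Return a dictConfig mapping: DEBUG when debug is set, WARNING otherwise."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {"plain": {"format": "%(levelname)s %(name)s %(message)s"}},
        "handlers": {"console": {"class": "logging.StreamHandler", "formatter": "plain"}},
        "root": {"handlers": ["console"], "level": "DEBUG" if debug else "WARNING"},
    }
```

It would be applied with `logging.config.dictConfig(build_logging_config(debug_enabled()))`.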
