Python library and CLI tool to download Archivematica METS files

mets-retriever

About

CLI tool for bulk downloading Archivematica METS files.

Unlike Gloria, the flat-coated retriever mix, mets-retriever is all about fetching.

(Image: Gloria, the flat-coated retriever mix)

Usage

After installing with pip (see below), use the retrieve-mets command.

retrieve-mets has two subcommands, fetch-all and fetch-one. Both subcommands have some common arguments:

  • METS files are fetched to a directory specified with the --output-dir argument. If one is not provided, a mets_files directory will be created in the current directory and METS files will be written there.
  • Storage Service credentials must be provided with the --ss-url and --ss-api-key arguments for both subcommands. If omitted, they default to the values used by the Archivematica Docker development environment.
  • If the --sidecar flag is passed, a sidecar txt file will be written alongside each METS file in the output directory with additional metadata about the AIP not found in the METS file, namely, the storage location UUID and the UUIDs of any AIP replicas.
Usage: retrieve-mets [OPTIONS] COMMAND [ARGS]...

  METS Retriever CLI tool

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  fetch-all  Fetch all METS files not already retrieved.
  fetch-one  Fetch single METS file, even if it's already been retrieved.

fetch-all

To fetch all AIP METS files that have not already been retrieved, use the fetch-all subcommand. E.g.:

retrieve-mets fetch-all

Once a METS file is fetched, its UUID is stored in a local SQLite database so that it will not be fetched again on subsequent runs.
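The skip-if-already-fetched behaviour can be illustrated with a small sketch. This is not mets-retriever's actual code: the table name fetched_mets, the column aip_uuid, and the helper functions are hypothetical, and the real database schema may differ. It only shows the general idea of recording UUIDs in SQLite and filtering them out on later runs.

```python
import sqlite3
from typing import Iterable, List


def open_tracking_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a database with a hypothetical table of fetched AIP UUIDs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fetched_mets (aip_uuid TEXT PRIMARY KEY)"
    )
    return conn


def select_unfetched(conn: sqlite3.Connection, aip_uuids: Iterable[str]) -> List[str]:
    """Return only the UUIDs that have no row in the tracking table yet."""
    already = {row[0] for row in conn.execute("SELECT aip_uuid FROM fetched_mets")}
    return [aip_uuid for aip_uuid in aip_uuids if aip_uuid not in already]


def mark_fetched(conn: sqlite3.Connection, aip_uuid: str) -> None:
    """Record a UUID once its METS file has been written."""
    conn.execute(
        "INSERT OR IGNORE INTO fetched_mets (aip_uuid) VALUES (?)", (aip_uuid,)
    )
    conn.commit()


# Simulate two consecutive runs against the same set of AIPs.
conn = open_tracking_db()
available = [
    "68ee3c66-d90a-4b9a-a33c-2e4e6d339ff7",
    "00000000-0000-0000-0000-000000000001",  # made-up UUID for illustration
]

first_run = select_unfetched(conn, available)
for aip_uuid in first_run:
    mark_fetched(conn, aip_uuid)  # the real tool would download the METS here

second_run = select_unfetched(conn, available)
print(first_run)
print(second_run)
```

In this sketch the first run selects both AIPs and the second run selects none, which mirrors why repeated fetch-all runs only download new METS files.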

This command accepts several optional arguments:

Usage: retrieve-mets fetch-all [OPTIONS]

  Fetch all METS files not already retrieved.

Options:
  --ss-url TEXT         Storage Service host URL  [default:
                        http://127.0.0.1:62081; required]
  --ss-api-key TEXT     Storage Service API key  [default: test; required]
  --output-dir TEXT     Path to output directory  [default: mets_files;
                        required]
  --sidecar             Write sidecar file for each METS with Storage Location
                        and AIP replica UUIDs
  --with-replicas-only  Only retrieve METS for an AIP if a replica has also
                        been stored
  --help                Show this message and exit.
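For scheduled or scripted runs, it can help to assemble the fetch-all invocation in code. The following is a minimal sketch that uses only the options listed above; the helper name build_fetch_all_command, the example URL, the API key placeholder, and the output path are assumptions for illustration, not part of mets-retriever.

```python
import shlex
from typing import List, Optional


def build_fetch_all_command(
    ss_url: str,
    ss_api_key: str,
    output_dir: Optional[str] = None,
    sidecar: bool = False,
    with_replicas_only: bool = False,
) -> List[str]:
    """Assemble an argument list for `retrieve-mets fetch-all` (hypothetical helper)."""
    command = [
        "retrieve-mets",
        "fetch-all",
        "--ss-url",
        ss_url,
        "--ss-api-key",
        ss_api_key,
    ]
    if output_dir is not None:
        # When omitted, the CLI falls back to its default, mets_files.
        command += ["--output-dir", output_dir]
    if sidecar:
        command.append("--sidecar")
    if with_replicas_only:
        command.append("--with-replicas-only")
    return command


example_command = build_fetch_all_command(
    ss_url="https://ss.example.org",  # placeholder Storage Service URL
    ss_api_key="REPLACE_ME",          # placeholder API key
    output_dir="/data/mets",          # placeholder output path
    sidecar=True,
    with_replicas_only=True,
)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in example_command))
```

The resulting list can be passed to subprocess.run(example_command, check=True) once retrieve-mets is installed and the placeholders are replaced with real values.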

fetch-one

To fetch (or re-fetch) a single AIP METS file, use the fetch-one subcommand. E.g.:

retrieve-mets fetch-one 68ee3c66-d90a-4b9a-a33c-2e4e6d339ff7

This command accepts several optional arguments:

Usage: retrieve-mets fetch-one [OPTIONS] AIP_UUID

  Fetch single METS file, even if it's already been retrieved.

Options:
  --ss-url TEXT      Storage Service host URL  [default:
                     http://127.0.0.1:62081; required]
  --ss-api-key TEXT  Storage Service API key  [default: test; required]
  --output-dir TEXT  Path to output directory  [default: mets_files; required]
  --sidecar          Write sidecar file for each METS with Storage Location
                     and AIP replica UUIDs
  --help             Show this message and exit.
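Because fetch-one takes the AIP UUID as a positional argument, a script may want to check that the value is a well-formed UUID before invoking the tool. The sketch below does this with Python's standard uuid module; the helper name build_fetch_one_command and the placeholder credentials are assumptions, and whether mets-retriever performs its own validation is not stated here.

```python
import uuid
from typing import List


def build_fetch_one_command(aip_uuid: str, ss_url: str, ss_api_key: str) -> List[str]:
    """Assemble an argument list for `retrieve-mets fetch-one` (hypothetical helper).

    Raises ValueError if aip_uuid is not a well-formed UUID.
    """
    # uuid.UUID raises ValueError on malformed input and normalises
    # the text to the canonical lowercase, hyphenated form.
    canonical = str(uuid.UUID(aip_uuid))
    return [
        "retrieve-mets",
        "fetch-one",
        "--ss-url",
        ss_url,
        "--ss-api-key",
        ss_api_key,
        canonical,
    ]


command = build_fetch_one_command(
    "68ee3c66-d90a-4b9a-a33c-2e4e6d339ff7",
    ss_url="https://ss.example.org",  # placeholder Storage Service URL
    ss_api_key="REPLACE_ME",          # placeholder API key
)
print(command)

try:
    build_fetch_one_command("not-a-uuid", "https://ss.example.org", "REPLACE_ME")
    rejected = False
except ValueError:
    rejected = True  # malformed UUIDs are caught before the CLI is called
```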

Install

Install mets-retriever package

mets-retriever requires Python 3.6+.

Via PyPI

pip install mets-retriever

Manually

Download this repo:

git clone https://github.com/artefactual-labs/mets-retriever.git

Change into the cloned directory and install:

cd mets-retriever/
pip install .

Development

Installation

For development, it may be useful to install mets-retriever with pip install -e ., which will apply changes made to the source code immediately.

Testing

To run all tests with tox:

tox

Or run tests directly with pytest:

pip install -r requirements/test.txt
pytest

Publishing to PyPI

This repository contains a Makefile with commands to aid in building packages and publishing to PyPI.

To check that the package is valid:

make package-check

To upload the package to PyPI (this requires PyPI credentials and being listed as a collaborator on the auditmatica project):

make package-upload

To clean up package distribution files:

make clean
