Skip to main content

A wrapper library for interacting with the MPS ingest system for IIIF resources

Project description

IIIF LTS Library

This is a Python library which facilitates interacting with the Harvard LTS (Library Technology Services) media ingest solution, which takes images and metadata and serves them via IIIF at scale. IIIFingest helps other applications manage ingest credentials, upload images to S3 for ingest, create IIIF manifests, create asset ingest requests, generate JWT tokens for signing ingest requests, and track the status of ingest jobs.

Getting Started

Requirements

Requires libmagic to be installed on the machine running IIIFingest for handling file object uploads. On Mac, this can be installed with

brew install libmagic

On Ubuntu, it can be installed with

apt install libmagic-dev

Installation

To install via pypi.org:

pip install IIIFingest

Or from source:

pip install git+https://github.com/Harvard-ATG/lts-iiif-ingest-service.git

Requirements

  • Python 3.7+

Using the library

import boto3
import logging
from IIIFingest.auth import Credentials
from IIIFingest.client import Client

# Enable debug logging
logging.basicConfig()
logging.getLogger('IIIFingest').setLevel(logging.DEBUG)

# Configure ingest API auth credentials
jwt_creds = Credentials(
    issuer="atissuer",
    kid="atissuerdefault",
    private_key_path="path/to/private.key"
)

# Configure ingest API client
client = Client(
    account="ataccount",
    space="atspace",
    namespace="at",
    environment="qa",
    asset_prefix="myapp",
    jwt_creds=jwt_creds,
    boto_session=boto3.Session(),
)

# Define images to ingest
images = [{
    "label": "Test Image", 
    "filepath": "tests/images/mcihtest1.tif"
}]

# Define manifest-level metadata
manifest_level_metadata = {
    "labels": ["Test Manifest"],
    "summary": "A test manifest for ingest",
    "rights": "http://creativecommons.org/licenses/by-sa/3.0/",
}

# Call client methods as needed
assets = client.upload(images)
manifest = client.create_manifest(manifest_level_metadata=manifest_level_metadata, assets=assets)
result = client.ingest(manifest=manifest, assets=assets)
status = client.jobstatus(result["job_id"])

Authentication

The ingest API requires JWT tokens for authentication and authorization. The credentials needed to generate tokens are provided by LTS at registration time and can then be used with this library.

The Credentials constructor can be configured as follows:

  • issuer: Identifies the app or service issuing tokens.
  • kid: Identifies the key used for signing tokens. Ingest permissions are associated with the key.
  • private_key_path: Path to the private key provided by LTS for the given issuer.
  • private_key_string: The private key value as a string.
  • expiration: The length of time in seconds for which the token should be valid (default: 3600).

Note that the private_key_path and private_key_string options are mutually exclusive.

Client Configuration

The Client constructor may be configured with the following options:

  • account: The account identifier.
  • space: The space within the account.
  • namespace: The NRS namespace for manifest and asset URLs.
  • agent: Name of the agent that created the assets.
  • environment: Ingest API environment: dev, qa, or prod.
  • asset_prefix: Optional prefix to use for image asset IDs (e.g. the application name).
  • jwt_creds: A Credentials instance for generating JWT tokens.
  • boto_session: A boto3.session.Session instance with permission to upload images to the S3 ingest bucket.

Notes:

  • LTS will provide the account, space, namespace, and agent values.
  • LTS will provide the AWS credentials needed to upload images to S3. It's up to you how S3 credentials are managed, the only requirement is that a boto session is provided to the library.
  • To make requests to non-prod environments (dev or qa), the client must be on VPN or the IP must be whitelisted. If the requests are coming from a cloud account, make sure to whitelist the IP range.

Documentation & References

See the following LTS documentation for more details:

Development

Getting setup

Clone repo:

$ git clone git@github.com:Harvard-ATG/lts-iiif-ingest-service.git
$ cd lts-iiif-ingest-service

Setup python environment:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -e ".[dev]"

Note that this will install all packages needed for the library as well as linting and pre-commit hooks.

Install pre-commit hooks:

This installs pre-commit git hooks to automatically lint and format code before committing. See the .pre-commit-config.yaml for specifics.

$ pre-commit install

Testing

To run unit tests:

$ pytest tests/unit

To run functional tests (tests which hit the dev bucket via AWS cli):

$ pytest tests/functional

Note:

  • To run functional test you will need to provide a TEST_AWS_PROFILE in your .env file
  • You can specify a specific function via pytest tests/unit/test_bucket.py::<functionname>

PyPi release

// VERSION = 1.0.4.1, 1.0.5, etc
$ pip install twine
$ python3 -m build # uses PyPA `build` tool: https://packaging.python.org/en/latest/tutorials/packaging-projects/
$ tar tzf dist/IIIFingest-{VERSION}.tar.gz
$ twine check dist/*
$ twine upload dist/IIIFingest-{VERSION}*

If you are using an API key with PyPi, your username is __token__. You can create a $HOME/.pypirc file to avoid needing to copy & paste that token (see docs).

Managing auth credentials

There are two types of auth credentials that need to be managed:

  1. Ingest API credentials.
  2. AWS S3 credentials.

Note that the suggestions here pertain to local development only. For production environments, it's recommended to use environment variables injected via SSM or other secret management techniques.

Ingest API credentials:

For local development, you may find it convenient to store credentials and configuration in a folder named auth either in this repository or separately (auth is git ignored by default). The directory structure should look something like this:

auth
├── dev
│   ├── issuers
│   │   ├── atapp1.json
│   │   ├── atapp2.json
│   │   └── atapp3.json
│   └── keys
│       ├── atapp1
│       │   └── atapp1default
│       │       ├── private.key
│       │       └── public.key
│       ├── atapp2
│       │   └── atapp2default
│       │       ├── private.key
│       │       └── public.key
│       └── atapp3
│           └── atapp3efault
│               ├── private.key
│               └── public.key
├── qa
└── prod

With this approach, you can set the path in the Credentials constructor:

private_key_path = "auth/{env}/keys/{issuer}/{issuer}default/private.key"

AWS S3 credentials:

For local development, you may choose to create a profile in ~/.aws/credentials and reference that profile when creating the boto3.Session or set environment variables that boto3 can automatically load.

License

Apache 2.0, see LICENSE for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

IIIFingest-1.3.0.tar.gz (20.6 kB view details)

Uploaded Source

Built Distribution

IIIFingest-1.3.0-py3-none-any.whl (19.5 kB view details)

Uploaded Python 3

File details

Details for the file IIIFingest-1.3.0.tar.gz.

File metadata

  • Download URL: IIIFingest-1.3.0.tar.gz
  • Upload date:
  • Size: 20.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.28.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.63.1 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.10.0

File hashes

Hashes for IIIFingest-1.3.0.tar.gz
Algorithm Hash digest
SHA256 93166043094d35f09d3b8588cdac33f042172c51c6975e60823763147a04c98c
MD5 b69f5860dcb8b7cf4f0fc95060492dbb
BLAKE2b-256 4a9f8494d8a2442dbe7dbe85e3e5f73834fdf9d6c366d2159d8582d0b43947ef

See more details on using hashes here.

File details

Details for the file IIIFingest-1.3.0-py3-none-any.whl.

File metadata

  • Download URL: IIIFingest-1.3.0-py3-none-any.whl
  • Upload date:
  • Size: 19.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.28.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.63.1 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.10.0

File hashes

Hashes for IIIFingest-1.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 bf7cf2e08319666b6fd0b9a784f2b62bbd7a95cb095d734679f851b942d70f47
MD5 18a222ff0bc03a9763ec97ef792ed675
BLAKE2b-256 a883027dad1c3a1881708a18ea73df4a636b844509bdcd1d9919451464878fdb

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page