Skip to main content

Campaign management scripts for remote data access

Project description

License Documentation

Documentation

Documentation is hosted at readthedocs. See Installation for installation instructions. It can be installed by pip3 install hpc-campaign but needs to be configured before first use.

hpc-campaign

HPC Campaign Management is a set of Python scripts for creating small metadata files about large datasets in one or more locations, which can be shared among project users, and which refer back to the real data.

A Campaign Archive file can contain

  • metadata of ADIOS2 BP5 datasets
  • metadata of HDF5 files
  • images (stored inside the campaign archive, or reference to a remote image)
  • text files (stored compressed in campaign archive, or reference to a remote text file)

Each dataset can have multiple replicas, in multiple host/directory locations. Datasets are assigned a unique identifier at first insert and then all history of replicas can be tracked by the unqiue identifier. Special locations can be designated as archives, meaning they are not directly accessible but first have to be restored to some location.

Campaign management requires an I/O solution to support

  • extracting metadata from datasets
  • handling the metadata file (.ACA) as a supported file format
  • understand that the data is remote and support remote data access for actual data operations.

Currently, this toolkit is being developed for the ADIOS I/O framework <https://adios2.readthedocs.io/>_, however, the intention is to make this toolkit extendible for other file formats and I/O libraries.

hpc_campaign manager

Script to create/update/delete a campaign archive.

This is the code to create a campaign archive of small files that can be shared with users and which contains references to the actual data in large files.

Example 1: add existing files on a resource

hpc_campaign manager myproject/mycampaign_001.aca create
hpc_campaign manager myproject/mycampaign_001.aca dataset file1.bp restart/restart.bp`

Example 2: add existing files on a resource and encrypt them with a keyfile hpc_campaign manager myproject/mycampaign_001.aca -k keyfile dataset file1.bp restart/restart.bp

Example 2: create a new campaign file and add datasets pointing to an S3 bucket hpc_campaign manager mys3campaigns/shot001.aca create dataset --hostname SERVERNAME --s3_bucket /example-bucket --s3_datetime "2024-10-22 10:20:15 -0400" file1.bp file2.bp file3.bp

Note: use SERVERNAME in the host configuration file to specify on a local resource to how to connect to it

hpc_campaign genkey

Script to generate a key file used to encrypt files added into a campaign archive.

Anyone with the keyfile can process a campaign archive but without it one can only list the content of the archive itself (list of files, host, folders and creation times).

The encoding key in the file is stored as plain text by default. Optionally, a password can be used to encrypt the key itself. Only those who has the keyfile and know the password, can process the campaign archive.

Example 1: Generate a keyfile. hpc_campaign genkey generate keyfile

Example 2: Generate a password-encrypted keyfile hpc_campaign genkey generate -p keyfile

Example 3: Verify a password encrypted keyfile. Asks for the password and decrypts the key in memory to verify the key/password combo is valid. hpc_campaign genkey verify keyfile

Example 4: Print keyfile information. It does not ask for password. hpc_campaign genkey info keyfile

hpc_campaign connector

SSH tunnel and port forwarding using paramiko.

This is a service that needs to be started on the local machine, so that ADIOS can ask for connections to remote hosts, as specified in the remote host configuration. Additionally, this scripts loads the available keyfiles, and serves to ADIOS on demand.

Example: hpc_campaign connector -c ~/.config/hpc-campaign/hosts.yaml -p 30000

Note that ADIOS currently looks for this service on fixed port 30000

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hpc_campaign-0.6.0.tar.gz (53.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hpc_campaign-0.6.0-py3-none-any.whl (45.2 kB view details)

Uploaded Python 3

File details

Details for the file hpc_campaign-0.6.0.tar.gz.

File metadata

  • Download URL: hpc_campaign-0.6.0.tar.gz
  • Upload date:
  • Size: 53.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for hpc_campaign-0.6.0.tar.gz
Algorithm Hash digest
SHA256 29ffdaada39680f1555cbaf59b0ffae40f69ebba400900df5916390918b40539
MD5 3d50f4050abf8bc3c25bfc9dff74cdaf
BLAKE2b-256 e5b6e4bcf48c0a3e71b266f4c6be367e19a7a52af139e5408b6a71c8c067db1d

See more details on using hashes here.

File details

Details for the file hpc_campaign-0.6.0-py3-none-any.whl.

File metadata

  • Download URL: hpc_campaign-0.6.0-py3-none-any.whl
  • Upload date:
  • Size: 45.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for hpc_campaign-0.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 589aac417a9119051292fccb605a940bea80a402a56511730867ec40353c4b84
MD5 60aa1a82d350e1aef8ac5b118fda6db4
BLAKE2b-256 0a1b3d963c2a3ad07ba5bc2675f293a237ea39dc3d97c11c3f5ff31af96a38c3

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page