Skip to main content

Campaign management scripts for remote data access

Project description

License Documentation

Documentation

Documentation is hosted at readthedocs. See Installation for installation instructions. It can be installed by pip3 install hpc-campaign but needs to be configured before first use.

hpc-campaign

HPC Campaign Management is a set of Python scripts for creating small metadata files about large datasets in one or more locations, which can be shared among project users, and which refer back to the real data.

A Campaign Archive file can contain

  • metadata of ADIOS2 BP5 datasets
  • metadata of HDF5 files
  • images (stored inside the campaign archive, or reference to a remote image)
  • text files (stored compressed in campaign archive, or reference to a remote text file)

Each dataset can have multiple replicas, in multiple host/directory locations. Datasets are assigned a unique identifier at first insert and then all history of replicas can be tracked by the unqiue identifier. Special locations can be designated as archives, meaning they are not directly accessible but first have to be restored to some location.

Campaign management requires an I/O solution to support

  • extracting metadata from datasets
  • handling the metadata file (.ACA) as a supported file format
  • understand that the data is remote and support remote data access for actual data operations.

Currently, this toolkit is being developed for the ADIOS I/O framework <https://adios2.readthedocs.io/>_, however, the intention is to make this toolkit extendible for other file formats and I/O libraries.

hpc_campaign manager

Script to create/update/delete a campaign archive.

This is the code to create a campaign archive of small files that can be shared with users and which contains references to the actual data in large files.

Example 1: add existing files on a resource

hpc_campaign manager myproject/mycampaign_001.aca create
hpc_campaign manager myproject/mycampaign_001.aca dataset file1.bp restart/restart.bp`

Example 2: add existing files on a resource and encrypt them with a keyfile hpc_campaign manager myproject/mycampaign_001.aca -k keyfile dataset file1.bp restart/restart.bp

Example 2: create a new campaign file and add datasets pointing to an S3 bucket hpc_campaign manager mys3campaigns/shot001.aca create dataset --hostname SERVERNAME --s3_bucket /example-bucket --s3_datetime "2024-10-22 10:20:15 -0400" file1.bp file2.bp file3.bp

Note: use SERVERNAME in the host configuration file to specify on a local resource to how to connect to it

hpc_campaign genkey

Script to generate a key file used to encrypt files added into a campaign archive.

Anyone with the keyfile can process a campaign archive but without it one can only list the content of the archive itself (list of files, host, folders and creation times).

The encoding key in the file is stored as plain text by default. Optionally, a password can be used to encrypt the key itself. Only those who has the keyfile and know the password, can process the campaign archive.

Example 1: Generate a keyfile. hpc_campaign genkey generate keyfile

Example 2: Generate a password-encrypted keyfile hpc_campaign genkey generate -p keyfile

Example 3: Verify a password encrypted keyfile. Asks for the password and decrypts the key in memory to verify the key/password combo is valid. hpc_campaign genkey verify keyfile

Example 4: Print keyfile information. It does not ask for password. hpc_campaign genkey info keyfile

hpc_campaign connector

SSH tunnel and port forwarding using paramiko.

This is a service that needs to be started on the local machine, so that ADIOS can ask for connections to remote hosts, as specified in the remote host configuration. Additionally, this scripts loads the available keyfiles, and serves to ADIOS on demand.

Example: hpc_campaign connector -c ~/.config/hpc-campaign/hosts.yaml -p 30000

Note that ADIOS currently looks for this service on fixed port 30000

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hpc_campaign-0.7.0.tar.gz (128.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hpc_campaign-0.7.0-py3-none-any.whl (56.1 kB view details)

Uploaded Python 3

File details

Details for the file hpc_campaign-0.7.0.tar.gz.

File metadata

  • Download URL: hpc_campaign-0.7.0.tar.gz
  • Upload date:
  • Size: 128.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for hpc_campaign-0.7.0.tar.gz
Algorithm Hash digest
SHA256 6b85f184f545d05a2e298024741c07712a555a156ae13142013d9f81fb093846
MD5 768729853f928ed0f922a953910169a5
BLAKE2b-256 ce48cfca2b1649eb24b9b7a1c0e491011a1c19b142f34d03ddb286a2de660137

See more details on using hashes here.

File details

Details for the file hpc_campaign-0.7.0-py3-none-any.whl.

File metadata

  • Download URL: hpc_campaign-0.7.0-py3-none-any.whl
  • Upload date:
  • Size: 56.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for hpc_campaign-0.7.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6ab4f74c36a45441e9f23821705bb72f954e56b60ec02f82a33dd871eb77a8bc
MD5 e09540379f5b92563828628edc879c7f
BLAKE2b-256 2ec0a1857b8a55b96e1e72e6333a63fd68c2884e1d554817e2c7a6609cc70e39

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page