Intelligent S3 synchronization for HCA Atlas data

HCA Smart-Sync

Intelligent S3 data synchronization for HCA Atlas source datasets and integrated objects.

Features

  • Smart synchronization - Only uploads files with changed checksums
  • SHA256 verification - File checksums are computed with SHA256 to confirm data integrity
  • Manifest generation - Automatic upload manifests
  • Progress tracking - Real-time upload progress
  • Environment support - Separate prod and dev buckets
  • Dry run mode - Preview changes before uploading
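
The smart-synchronization idea in the list above can be illustrated with a minimal sketch. This is not the tool's actual implementation: the helper names `sha256_of` and `plan_uploads`, and the assumption that remote checksums are available as a plain dictionary keyed by file name, are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_uploads(
    local_dir: Path, remote_checksums: Dict[str, str], force: bool = False
) -> List[str]:
    """List file names whose checksum differs from the remote copy.

    Hypothetical helper: remote_checksums maps file name -> hex SHA256.
    With force=True every local file is selected, changed or not.
    """
    to_upload = []
    for path in sorted(local_dir.iterdir()):
        if not path.is_file():
            continue
        if force or remote_checksums.get(path.name) != sha256_of(path):
            to_upload.append(path.name)
    return to_upload
```

In this sketch, a dry run would print the list returned by `plan_uploads` and stop, while a real run would go on to upload each listed file.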

Requirements

  • Python 3.10+
  • AWS CLI configured with appropriate profiles
  • S3 access to HCA Atlas buckets

Installation

Install using pipx (recommended):

pipx install hca-smart-sync

Quick Start

First Time Setup

Configure default settings (optional but recommended):

hca-smart-sync config init
# Enter your default AWS profile and atlas name

Basic Usage

# Sync source datasets for gut atlas
hca-smart-sync sync gut-v1 source-datasets --profile my-profile

# Sync integrated objects for immune atlas
hca-smart-sync sync immune-v1 integrated-objects --profile my-profile

# Dry run to preview changes
hca-smart-sync sync gut-v1 source-datasets --profile my-profile --dry-run

Using Config Defaults

Once you've configured defaults, you can omit the atlas and the --profile option:

# File type only (uses config for atlas and profile)
hca-smart-sync sync source-datasets

# Or integrated objects
hca-smart-sync sync integrated-objects

# Override config atlas, use config profile
hca-smart-sync sync immune-v1 source-datasets

Flexible Argument Order

The sync command accepts its positional arguments in two forms:

# Atlas first, then file type
hca-smart-sync sync gut-v1 source-datasets

# File type first (uses config for atlas)
hca-smart-sync sync source-datasets

Note: File type is always required - you must specify either source-datasets or integrated-objects.
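
One way to picture the two accepted forms is the following sketch of positional-argument resolution. It is an assumption about how such a rule could work, not the tool's real parser; `resolve_sync_args` and `FILE_TYPES` are hypothetical names.

```python
from typing import List, Optional, Tuple

# The two file types named in this README.
FILE_TYPES = ("source-datasets", "integrated-objects")


def resolve_sync_args(
    positionals: List[str], default_atlas: Optional[str] = None
) -> Tuple[str, str]:
    """Return (atlas, file_type) from one or two positional arguments.

    Two arguments: atlas first, then file type.
    One argument: file type only; the atlas falls back to default_atlas.
    """
    if len(positionals) == 2:
        atlas, file_type = positionals
    elif len(positionals) == 1:
        atlas, file_type = default_atlas, positionals[0]
    else:
        raise ValueError("expected: [ATLAS] FILE_TYPE")
    if file_type not in FILE_TYPES:
        raise ValueError(f"file type must be one of {FILE_TYPES}")
    if atlas is None:
        raise ValueError("no atlas given and no default configured")
    return atlas, file_type
```

Under this sketch, a lone argument is always treated as the file type, which is why the file type can never be omitted.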

Available Options

  • --profile TEXT - AWS profile to use (uses config default if not specified)
  • --dry-run - Preview changes without uploading
  • --verbose - Show detailed output
  • --force - Force upload even if file content is unchanged
  • --local-path TEXT - Custom local directory (defaults to current directory)

Getting Help

# Show all available commands
hca-smart-sync --help

# Show sync command options
hca-smart-sync sync --help

# Show version
hca-smart-sync --version

Download files

Source Distribution

hca_smart_sync-0.3.0.tar.gz (22.1 kB)

Built Distribution

hca_smart_sync-0.3.0-py3-none-any.whl (24.6 kB)

File details

Details for the file hca_smart_sync-0.3.0.tar.gz.

File metadata

  • Download URL: hca_smart_sync-0.3.0.tar.gz
  • Size: 22.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for hca_smart_sync-0.3.0.tar.gz
Algorithm    Hash digest
SHA256       79b8b5b66724582d7d3041419100dcdd1c242fb45335d84f91c5964d4e05162d
MD5          c3ac97b63c230e145adede6f041f5a01
BLAKE2b-256  0d8aed6cb6ec603e208a5100bc287f96dc8817d36acf187f0f2b80d642812304
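
A downloaded copy of the source distribution can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the helper name `matches_sha256` and the assumption that the archive sits in the current directory are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for hca_smart_sync-0.3.0.tar.gz.
EXPECTED_SHA256 = "79b8b5b66724582d7d3041419100dcdd1c242fb45335d84f91c5964d4e05162d"


def matches_sha256(path: Path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals expected_hex."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


if __name__ == "__main__":
    archive = Path("hca_smart_sync-0.3.0.tar.gz")  # assumed download location
    if archive.exists():
        print("OK" if matches_sha256(archive, EXPECTED_SHA256) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same helper works for the wheel by passing its file name and its own SHA256 digest.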

Provenance

The following attestation bundles were made for hca_smart_sync-0.3.0.tar.gz:

Publisher: publish-smart-sync.yml on clevercanary/hca-ingest-tools

File details

Details for the file hca_smart_sync-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: hca_smart_sync-0.3.0-py3-none-any.whl
  • Size: 24.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for hca_smart_sync-0.3.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       f6d99b2d18259740bce795094fb08f840f0637bbc33c4c0f824a208e1789eb0a
MD5          922782598c722b9b450d174d6f4de4fe
BLAKE2b-256  7f75e0df864d2041626782f24db94192ebb2781e7eb20c42cd4d43f749bf95c6

Provenance

The following attestation bundles were made for hca_smart_sync-0.3.0-py3-none-any.whl:

Publisher: publish-smart-sync.yml on clevercanary/hca-ingest-tools
