Intelligent S3 synchronization for HCA Atlas data

HCA Smart-Sync

Intelligent S3 data synchronization for HCA Atlas source datasets and integrated objects.

Features

  • Smart synchronization - Uploads only files whose checksums have changed
  • SHA256 verification - Checks data integrity with SHA256 checksums
  • Manifest generation - Automatic upload manifests
  • Progress tracking - Real-time upload progress
  • Environment support - Separate prod and dev buckets
  • Dry run mode - Preview changes before uploading
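
The checksum-based selection described in the feature list can be illustrated with a minimal sketch. This is not the tool's actual implementation: the function names `sha256_of` and `files_to_upload`, and the `remote_checksums` mapping from file name to hex digest, are assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_upload(
    local_dir: Path,
    remote_checksums: dict[str, str],  # hypothetical: file name -> remote hex digest
    force: bool = False,
) -> list[Path]:
    """Select local files that are new or whose checksum differs from the remote one."""
    selected = []
    for path in sorted(local_dir.iterdir()):
        if not path.is_file():
            continue
        # With force=True every file is selected, mirroring the idea behind --force.
        if force or remote_checksums.get(path.name) != sha256_of(path):
            selected.append(path)
    return selected
```

A dry run, in this sketch, would simply print the returned list instead of uploading the files.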

Requirements

  • Python 3.10+
  • AWS CLI configured with appropriate profiles
  • S3 access to HCA Atlas buckets

Installation

Install using pipx (recommended):

pipx install hca-smart-sync

Quick Start

First Time Setup

Configure default settings (optional but recommended):

hca-smart-sync config init
# Enter your default AWS profile and atlas name

Basic Usage

# Sync source datasets for gut atlas
hca-smart-sync sync gut-v1 source-datasets --profile my-profile

# Sync integrated objects for immune atlas
hca-smart-sync sync immune-v1 integrated-objects --profile my-profile

# Dry run to preview changes
hca-smart-sync sync gut-v1 source-datasets --profile my-profile --dry-run

Using Config Defaults

Once you've configured defaults, you can omit the atlas and the profile:

# File type only (uses config for atlas and profile)
hca-smart-sync sync source-datasets

# Or integrated objects
hca-smart-sync sync integrated-objects

# Override config atlas, use config profile
hca-smart-sync sync immune-v1 source-datasets

Flexible Argument Order

The sync command accepts its positional arguments in two forms:

# Atlas first, then file type
hca-smart-sync sync gut-v1 source-datasets

# File type first (uses config for atlas)
hca-smart-sync sync source-datasets

Note: File type is always required - you must specify either source-datasets or integrated-objects.
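
One way to picture the two accepted forms is the sketch below. It is an assumption about how such resolution could work, not the tool's own code: `resolve_sync_args` and `config_atlas` are hypothetical names, and only the two file types named above are taken from this document.

```python
from typing import List, Optional, Tuple

# The two file types documented above.
FILE_TYPES = {"source-datasets", "integrated-objects"}


def resolve_sync_args(
    args: List[str], config_atlas: Optional[str] = None
) -> Tuple[str, str]:
    """Return (atlas, file_type) from one or two positional arguments."""
    if len(args) == 1 and args[0] in FILE_TYPES:
        # File type only: the atlas must come from the configured default.
        if config_atlas is None:
            raise ValueError("no atlas given and no default atlas configured")
        return config_atlas, args[0]
    if len(args) == 2 and args[1] in FILE_TYPES:
        # Atlas first, then file type: the explicit atlas overrides the config.
        return args[0], args[1]
    raise ValueError("expected [ATLAS] FILE_TYPE, with FILE_TYPE one of: "
                     + ", ".join(sorted(FILE_TYPES)))
```

For example, `resolve_sync_args(["source-datasets"], config_atlas="gut-v1")` and `resolve_sync_args(["gut-v1", "source-datasets"])` resolve to the same pair, while `resolve_sync_args(["gut-v1"])` raises because the file type is missing.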

Available Options

  • --profile TEXT - AWS profile to use (uses config default if not specified)
  • --dry-run - Preview changes without uploading
  • --verbose - Show detailed output
  • --force - Force upload even if file content is unchanged
  • --local-path TEXT - Custom local directory (defaults to current directory)
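
When calling the tool from a script, the options above can be assembled programmatically. The following is a small sketch that uses only the flags listed in this section; the helper name `build_sync_command` is hypothetical and not part of the package.

```python
from typing import List, Optional


def build_sync_command(
    file_type: str,
    atlas: Optional[str] = None,
    profile: Optional[str] = None,
    dry_run: bool = False,
    verbose: bool = False,
    force: bool = False,
    local_path: Optional[str] = None,
) -> List[str]:
    """Build an argument list for `hca-smart-sync sync` from the documented options."""
    command = ["hca-smart-sync", "sync"]
    if atlas:
        command.append(atlas)  # omitted when relying on the config default
    command.append(file_type)
    if profile:
        command += ["--profile", profile]
    if dry_run:
        command.append("--dry-run")
    if verbose:
        command.append("--verbose")
    if force:
        command.append("--force")
    if local_path:
        command += ["--local-path", local_path]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Building the dry-run variant first and inspecting its output before running the real upload is a reasonable precaution.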

Getting Help

# Show all available commands
hca-smart-sync --help

# Show sync command options
hca-smart-sync sync --help

# Show version
hca-smart-sync --version
