s3bolt

High-performance S3 file copy tool — concurrent, async, built in Rust.

Copy S3 objects between buckets and prefixes at maximum throughput. Uses server-side copy, adaptive concurrency, and async I/O. Available as a Rust crate, a CLI tool, and a Python package.

Why s3bolt?

| Feature | aws s3 cp | s3bolt |
|---|---|---|
| Concurrency | ~10 concurrent transfers | Up to 1024 with adaptive AIMD |
| Large files (>5 GiB) | Multipart upload (downloads data) | Multipart server-side copy (zero data transfer) |
| Throttle handling | Fixed retry | Adaptive backoff (halves concurrency on 503) |
| Resume | No | Checkpoint file for resumable copies |
| Filtering | Basic --exclude | Glob, regex, size range, date range |
| Cross-account | Single profile | Separate source/dest profiles |

Installation

CLI (Rust)

cargo install s3bolt

Python

pip install s3bolt

From source

git clone https://github.com/cykruss/s3bolt.git
cd s3bolt
cargo build --release
# Binary at target/release/s3bolt

Quick start

CLI

# Copy a single file
s3bolt s3://src-bucket/data/file.parquet s3://dst-bucket/data/file.parquet

# Recursive prefix copy
s3bolt -r s3://src-bucket/data/2024/ s3://dst-bucket/archive/2024/

# Sync (only copy new/changed objects)
s3bolt -r --sync s3://data-lake/raw/ s3://data-lake/curated/

# Filter by pattern
s3bolt -r --include "**/*.parquet" --exclude "_tmp/**" s3://src/ s3://dst/

# Cross-account with different AWS profiles
s3bolt -r --source-profile prod --dest-profile analytics s3://prod/ s3://analytics/

# High concurrency
s3bolt -r -j 512 s3://src/ s3://dst/

# Dry run (see what would be copied)
s3bolt -r --dry-run s3://src/ s3://dst/

# Resumable copy with checkpoint
s3bolt -r --checkpoint /tmp/copy.ckpt s3://src/ s3://dst/
# If interrupted, resume with:
s3bolt -r --checkpoint /tmp/copy.ckpt --resume s3://src/ s3://dst/

Python

from s3bolt import S3CopyEngine

engine = S3CopyEngine(source_profile="prod", dest_profile="analytics")

result = engine.copy(
    "s3://src-bucket/data/",
    "s3://dst-bucket/data/",
    recursive=True,
    include=["**/*.parquet"],
    max_concurrent=512,
)

print(f"Copied {result['copied_objects']} objects ({result['copied_bytes']} bytes)")
print(f"Skipped {result['skipped_objects']} | Failed {result['failed_objects']}")
print(f"Duration: {result['duration_secs']:.1f}s")
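
In a scheduled job it can be useful to treat any failed object as a job failure. The sketch below assumes the result behaves like the dictionary used in the example above; the keys are taken from that example, while the helper name `check_result` is hypothetical and not part of s3bolt.

```python
def check_result(result: dict) -> str:
    """Return a one-line summary, or raise if any object failed to copy.

    `result` is assumed to be the mapping returned by engine.copy(),
    with the keys shown in the example above.
    """
    failed = result["failed_objects"]
    if failed:
        raise RuntimeError(f"{failed} object(s) failed to copy")
    return (
        f"copied={result['copied_objects']} "
        f"skipped={result['skipped_objects']} "
        f"bytes={result['copied_bytes']}"
    )


# Stand-in dictionary with the same keys, so the sketch runs without AWS access.
sample = {
    "copied_objects": 3,
    "copied_bytes": 4096,
    "skipped_objects": 1,
    "failed_objects": 0,
    "duration_secs": 0.8,
}
print(check_result(sample))
```

Raising an exception makes a wrapping script exit non-zero, which most schedulers report as a failed run.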

Rust

use std::sync::Arc;
use s3bolt::config::{CopyConfig, ConcurrencyConfig, FilterConfig};
use s3bolt::engine::orchestrator;
use s3bolt::progress::reporter::ProgressState;
use s3bolt::types::S3Uri;

#[tokio::main]
async fn main() -> s3bolt::error::Result<()> {
    let config = CopyConfig {
        source: S3Uri::parse("s3://src-bucket/prefix/")?,
        destination: S3Uri::parse("s3://dst-bucket/prefix/")?,
        recursive: true,
        sync_mode: false,
        dry_run: false,
        verify: false,
        filters: FilterConfig::default(),
        concurrency: ConcurrencyConfig::default(),
        checkpoint_path: None,
        resume: false,
        storage_class: None,
        sse: None,
        preserve_metadata: false,
        source_profile: None,
        dest_profile: None,
    };

    let progress = Arc::new(ProgressState::default());
    let manifest = orchestrator::run(config, progress).await?;
    println!("Copied {} objects", manifest.copied_objects);
    Ok(())
}

Architecture

ListObjectsV2 (async paginator stream)
       │
       ▼
[bounded channel, cap=10,000]  ← backpressure: listing pauses when full
       │
       ▼
Filter stage (glob, regex, size, date)
       │
       ▼
[tokio::sync::Semaphore, permits=N]  ← adaptive concurrency (AIMD)
       │
       ▼
Worker tasks (spawn per object)
  ├── CopyObject (≤ 5 GiB) ─── server-side, zero data transfer
  └── UploadPartCopy (> 5 GiB) ─ parallel multipart, server-side
       │
       ▼
Progress reporter + checkpoint writer
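
The backpressure and permit stages in the diagram can be illustrated with a small, self-contained Python sketch. This is not s3bolt's Rust code: it uses `asyncio.Queue` in place of the bounded channel and `asyncio.Semaphore` in place of the Tokio semaphore, leaves out the filter stage, and the names `run_pipeline`, `queue_cap`, and `max_in_flight` are made up for illustration.

```python
import asyncio


async def run_pipeline(keys, queue_cap=4, max_in_flight=2):
    """Lister -> bounded queue -> copy tasks gated by a semaphore."""
    queue = asyncio.Queue(maxsize=queue_cap)     # put() waits while the queue is full
    permits = asyncio.Semaphore(max_in_flight)   # caps concurrent "copies"
    copied = []
    peak_depth = 0

    async def lister():
        nonlocal peak_depth
        for key in keys:
            await queue.put(key)                 # backpressure: blocks when full
            peak_depth = max(peak_depth, queue.qsize())
        await queue.put(None)                    # sentinel: listing finished

    async def copy_one(key):
        try:
            await asyncio.sleep(0.001)           # stand-in for a CopyObject request
            copied.append(key)
        finally:
            permits.release()                    # free the permit for the next object

    async def dispatcher():
        tasks = []
        while True:
            key = await queue.get()
            if key is None:
                break
            await permits.acquire()              # wait for a permit before spawning
            tasks.append(asyncio.create_task(copy_one(key)))
        await asyncio.gather(*tasks)

    await asyncio.gather(lister(), dispatcher())
    return copied, peak_depth


copied, peak = asyncio.run(run_pipeline([f"obj-{i}" for i in range(20)]))
print(len(copied), peak)
```

Because the dispatcher acquires a permit before spawning each task, the number of live tasks never exceeds `max_in_flight`, and the lister stalls as soon as the queue holds `queue_cap` keys.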

Performance design

  • Tokio async runtime: tens of thousands of concurrent I/O tasks share a small thread pool, so there is no per-task OS thread overhead.
  • Server-side copy: CopyObject and UploadPartCopy move data within S3's network. The client only sends metadata requests (~50-200 ms latency each).
  • Adaptive concurrency (AIMD): starts at the configured limit (default 256), ramps up on sustained success, and halves on S3 503 SlowDown responses. This keeps the request rate within S3's per-prefix limits (3,500 PUT and 5,500 GET requests per second) automatically.
  • Bounded backpressure: a 10,000-item channel sits between the lister and the copy workers. If workers fall behind, listing pauses, so memory stays bounded (~2-3 MiB for the queue).
  • Multipart for large files: objects > 5 GiB are split into 256 MiB parts, each copied server-side in parallel. On failure, the multipart upload is aborted automatically.
  • Zero unnecessary data transfer: data never flows through the client for same-region copies. Pure metadata orchestration.
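
To make the multipart arithmetic concrete, here is a rough Python sketch of how an object might be divided into byte ranges. The 5 GiB threshold and 256 MiB part size come from the list above; the function names `plan_copy` and `copy_source_range` are hypothetical, and s3bolt's actual planning logic may differ.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

SINGLE_COPY_LIMIT = 5 * GIB   # objects up to this size use a single CopyObject
PART_SIZE = 256 * MIB         # part size stated in the list above


def plan_copy(size_bytes: int, part_size: int = PART_SIZE):
    """Return [] for a single CopyObject, else inclusive (start, end) byte ranges."""
    if size_bytes <= SINGLE_COPY_LIMIT:
        return []
    ranges = []
    start = 0
    while start < size_bytes:
        end = min(start + part_size, size_bytes) - 1   # last part may be shorter
        ranges.append((start, end))
        start = end + 1
    return ranges


def copy_source_range(start: int, end: int) -> str:
    """Format a range as the CopySourceRange value used by UploadPartCopy."""
    return f"bytes={start}-{end}"


parts = plan_copy(6 * GIB)
print(len(parts), copy_source_range(*parts[0]))
```

S3 allows at most 10,000 parts per multipart upload, so a fixed 256 MiB part size covers objects up to roughly 2.4 TiB; larger objects would need bigger parts, which this sketch does not handle.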

CLI reference

s3bolt [OPTIONS] <SOURCE> <DESTINATION>

Arguments:
  <SOURCE>        Source S3 URI (s3://bucket/key or s3://bucket/prefix/)
  <DESTINATION>   Destination S3 URI

Copy options:
  -r, --recursive              Recursively copy all objects under prefix
      --sync                   Only copy new/changed objects
      --dry-run                List objects without copying
      --verify                 Verify ETag after copy
      --storage-class <CLASS>  Override storage class

Filtering:
      --include <GLOB>         Include keys matching glob (repeatable)
      --exclude <GLOB>         Exclude keys matching glob (repeatable)
      --key-regex <REGEX>      Include keys matching regex
      --min-size <BYTES>       Minimum object size
      --max-size <BYTES>       Maximum object size

Concurrency:
  -j, --concurrency <N>       Max concurrent copies [default: 256]
      --no-adaptive            Disable adaptive concurrency

Resume:
      --checkpoint <FILE>      Checkpoint file for resume support
      --resume                 Resume from existing checkpoint

Credentials:
      --source-profile <NAME>  AWS profile for source
      --dest-profile <NAME>    AWS profile for destination

Output:
  -v, --verbose                Debug logging
  -q, --quiet                  Errors only
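
The `-j` and `--no-adaptive` options control the AIMD behaviour described under Performance design. As an illustration of the idea only, the Python sketch below raises a limit by one after a run of successes and halves it on a throttle signal. The class name `AimdLimit`, the `successes_per_step` parameter, and the additive step of 1 are assumptions; the start value of 256 and ceiling of 1024 are the figures quoted in this README.

```python
class AimdLimit:
    """Additive-increase / multiplicative-decrease concurrency limit."""

    def __init__(self, start=256, floor=1, ceiling=1024, successes_per_step=100):
        self.limit = start
        self.floor = floor
        self.ceiling = ceiling
        self.successes_per_step = successes_per_step  # assumed ramp-up threshold
        self._streak = 0

    def on_success(self):
        """Count a successful copy; add 1 to the limit after a full streak."""
        self._streak += 1
        if self._streak >= self.successes_per_step:
            self._streak = 0
            self.limit = min(self.ceiling, self.limit + 1)

    def on_throttle(self):
        """React to a 503 SlowDown: reset the streak and halve the limit."""
        self._streak = 0
        self.limit = max(self.floor, self.limit // 2)


ctl = AimdLimit(start=256, successes_per_step=3)
for _ in range(3):
    ctl.on_success()
print(ctl.limit)   # 257 after one full streak of successes
ctl.on_throttle()
print(ctl.limit)   # 128 after halving
```

Halving on throttle backs off quickly when S3 pushes back, while the slow additive climb probes for spare capacity without overshooting again immediately.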

Prerequisites

  • AWS credentials configured via any standard method (env vars, ~/.aws/credentials, IAM role, SSO)
  • The executing role must have s3:GetObject + s3:ListBucket on source and s3:PutObject on destination
  • For multipart copies: s3:AbortMultipartUpload on destination

Minimal IAM policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::source-bucket",
        "arn:aws:s3:::source-bucket/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:AbortMultipartUpload"],
      "Resource": [
        "arn:aws:s3:::dest-bucket",
        "arn:aws:s3:::dest-bucket/*"
      ]
    }
  ]
}
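
When the same policy is needed for several bucket pairs, it can be generated rather than edited by hand. The Python sketch below reproduces the JSON above for arbitrary bucket names; `minimal_policy` is a hypothetical helper, not something s3bolt provides.

```python
import json


def minimal_policy(source_bucket: str, dest_bucket: str) -> dict:
    """Build the minimal IAM policy shown above for the given bucket names."""

    def arns(bucket: str) -> list:
        # Bucket ARN (for s3:ListBucket) plus object ARN (for object actions).
        return [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": arns(source_bucket),
            },
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:AbortMultipartUpload"],
                "Resource": arns(dest_bucket),
            },
        ],
    }


print(json.dumps(minimal_policy("source-bucket", "dest-bucket"), indent=2))
```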

Development

# Clone
git clone https://github.com/cykruss/s3bolt.git
cd s3bolt

# Rust tests
cargo test --no-default-features

# Clippy
cargo clippy --no-default-features --lib -- -D warnings

# Build Python package (dev mode)
python -m venv .venv
source .venv/bin/activate
pip install maturin pytest
maturin develop --release
pytest tests/ -v

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

MIT
