s3bolt
High-performance S3 file copy tool — concurrent, async, built in Rust.
Copy S3 objects between buckets and prefixes at maximum throughput. Uses server-side copy, adaptive concurrency, and async I/O. Available as a Rust crate, a CLI tool, and a Python package.
Why s3bolt?
| | aws s3 cp | s3bolt |
|---|---|---|
| Concurrency | ~10 concurrent transfers | Up to 1024 with adaptive AIMD |
| Large files (>5 GiB) | Multipart upload (downloads data) | Multipart server-side copy (zero data transfer) |
| Throttle handling | Fixed retry | Adaptive backoff (halves concurrency on 503) |
| Resume | No | Checkpoint file for resumable copies |
| Filtering | Basic `--exclude` | Glob, regex, size range, date range |
| Cross-account | Single profile | Separate source/dest profiles |
Installation
CLI (Rust)
cargo install s3bolt
Python
pip install s3bolt
From source
git clone https://github.com/cykruss/s3bolt.git
cd s3bolt
cargo build --release
# Binary at target/release/s3bolt
Quick start
CLI
# Copy a single file
s3bolt s3://src-bucket/data/file.parquet s3://dst-bucket/data/file.parquet
# Recursive prefix copy
s3bolt -r s3://src-bucket/data/2024/ s3://dst-bucket/archive/2024/
# Sync (only copy new/changed objects)
s3bolt -r --sync s3://data-lake/raw/ s3://data-lake/curated/
# Filter by pattern
s3bolt -r --include "**/*.parquet" --exclude "_tmp/**" s3://src/ s3://dst/
# Cross-account with different AWS profiles
s3bolt -r --source-profile prod --dest-profile analytics s3://prod/ s3://analytics/
# High concurrency
s3bolt -r -j 512 s3://src/ s3://dst/
# Dry run (see what would be copied)
s3bolt -r --dry-run s3://src/ s3://dst/
# Resumable copy with checkpoint
s3bolt -r --checkpoint /tmp/copy.ckpt s3://src/ s3://dst/
# If interrupted, resume with:
s3bolt -r --checkpoint /tmp/copy.ckpt --resume s3://src/ s3://dst/
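When the checkpoint workflow above is driven from a script, the choice between a first run and a resumed run can be automated. The following is a minimal sketch, not part of s3bolt: `build_copy_command` is a hypothetical helper, and it assumes that the presence of the checkpoint file is a good enough signal that an earlier run was interrupted. It only uses flags that appear in the examples above.

```python
import os


def build_copy_command(source, dest, checkpoint_path):
    """Build an s3bolt argv list for a recursive, checkpointed copy.

    Hypothetical helper: adds --resume only when the checkpoint file
    already exists (assumed to mean a previous run was interrupted).
    """
    cmd = ["s3bolt", "-r", "--checkpoint", checkpoint_path]
    if os.path.exists(checkpoint_path):
        cmd.append("--resume")
    cmd += [source, dest]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, so that a non-zero exit status raises an exception in the calling script.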
Python
from s3bolt import S3CopyEngine
engine = S3CopyEngine(source_profile="prod", dest_profile="analytics")
result = engine.copy(
    "s3://src-bucket/data/",
    "s3://dst-bucket/data/",
    recursive=True,
    include=["**/*.parquet"],
    max_concurrent=512,
)
print(f"Copied {result['copied_objects']} objects ({result['copied_bytes']} bytes)")
print(f"Skipped {result['skipped_objects']} | Failed {result['failed_objects']}")
print(f"Duration: {result['duration_secs']:.1f}s")
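The result fields printed above are enough to derive simple throughput figures. Here is a small sketch, assuming the result behaves like a mapping with exactly the keys used in the example; `summarize` is a hypothetical helper, not part of the package.

```python
def summarize(result):
    """Derive throughput figures from a result mapping.

    Assumes the keys shown in the example above: copied_objects,
    copied_bytes, failed_objects and duration_secs.
    """
    secs = max(result["duration_secs"], 1e-9)  # guard against a zero duration
    return {
        "objects_per_sec": result["copied_objects"] / secs,
        "mib_per_sec": result["copied_bytes"] / (1024 * 1024) / secs,
        "ok": result["failed_objects"] == 0,
    }
```

Because copies are server-side, the MiB/s figure describes how fast data moved inside S3, not traffic through the client.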
Rust
use std::sync::Arc;
use s3bolt::config::{CopyConfig, ConcurrencyConfig, FilterConfig};
use s3bolt::engine::orchestrator;
use s3bolt::progress::reporter::ProgressState;
use s3bolt::types::S3Uri;

#[tokio::main]
async fn main() -> s3bolt::error::Result<()> {
    let config = CopyConfig {
        source: S3Uri::parse("s3://src-bucket/prefix/")?,
        destination: S3Uri::parse("s3://dst-bucket/prefix/")?,
        recursive: true,
        sync_mode: false,
        dry_run: false,
        verify: false,
        filters: FilterConfig::default(),
        concurrency: ConcurrencyConfig::default(),
        checkpoint_path: None,
        resume: false,
        storage_class: None,
        sse: None,
        preserve_metadata: false,
        source_profile: None,
        dest_profile: None,
    };

    let progress = Arc::new(ProgressState::default());
    let manifest = orchestrator::run(config, progress).await?;
    println!("Copied {} objects", manifest.copied_objects);
    Ok(())
}
Architecture
ListObjectsV2 (async paginator stream)
│
▼
[bounded channel, cap=10,000] ← backpressure: listing pauses when full
│
▼
Filter stage (glob, regex, size, date)
│
▼
[tokio::sync::Semaphore, permits=N] ← adaptive concurrency (AIMD)
│
▼
Worker tasks (spawn per object)
├── CopyObject (≤ 5 GiB) ─── server-side, zero data transfer
└── UploadPartCopy (> 5 GiB) ─ parallel multipart, server-side
│
▼
Progress reporter + checkpoint writer
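The shape of this pipeline can be illustrated in a few lines of Python. This is a sketch of the pattern only, not s3bolt's Rust implementation: `asyncio.Queue` stands in for the bounded channel, `asyncio.Semaphore` for the Tokio semaphore, and `run_pipeline` and `copy_one` are hypothetical names. The filter stage and the adaptive resizing of the permit count are left out.

```python
import asyncio


async def run_pipeline(keys, copy_one, queue_cap=10_000, permits=4):
    """Lister -> bounded queue -> semaphore-gated workers."""
    queue = asyncio.Queue(maxsize=queue_cap)  # bounded: put() waits when full
    sem = asyncio.Semaphore(permits)          # caps the number of in-flight copies
    results = []

    async def lister():
        for key in keys:
            await queue.put(key)              # backpressure: blocks while the queue is full
        await queue.put(None)                 # sentinel: listing is finished

    async def worker(key):
        try:
            results.append(await copy_one(key))
        finally:
            sem.release()                     # free the permit even if the copy fails

    async def dispatcher():
        tasks = []
        while True:
            key = await queue.get()
            if key is None:
                break
            await sem.acquire()               # wait for a free permit before spawning
            tasks.append(asyncio.create_task(worker(key)))
        await asyncio.gather(*tasks)

    await asyncio.gather(lister(), dispatcher())
    return results
```

Acquiring the permit before spawning the task is what keeps the number of live worker tasks bounded: when all permits are taken, the dispatcher stops draining the queue, the queue fills, and the lister pauses.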
Performance design
- Tokio async runtime: tens of thousands of concurrent I/O tasks run on a small thread pool, with no per-task OS thread overhead.
- Server-side copy: `CopyObject` and `UploadPartCopy` move data within S3's network. The client only sends metadata requests (~50-200 ms latency each).
- Adaptive concurrency (AIMD): starts at the configured limit (default 256), ramps up on sustained success, and halves on S3 503 SlowDown responses. This keeps the request rate near S3's per-prefix limits (3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second) without manual tuning.
- Bounded backpressure: a 10,000-item channel sits between the lister and the copy workers. If workers fall behind, listing pauses. Memory stays bounded (~2-3 MiB for the queue).
- Multipart for large files: objects > 5 GiB are split into 256 MiB parts, each copied server-side in parallel. On failure, the multipart upload is aborted automatically.
- Zero unnecessary data transfer: data never flows through the client for same-region copies. The client performs pure metadata orchestration.
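The AIMD rule described above can be sketched as a small state machine. The default of 256, the ceiling of 1024 and the halving on throttling come from this README; the additive step of 1, the success window of 100 and the floor of 1 are illustrative assumptions, and `AimdLimit` is a hypothetical name rather than s3bolt's actual type.

```python
class AimdLimit:
    """Additive-increase / multiplicative-decrease concurrency limit."""

    def __init__(self, start=256, floor=1, ceiling=1024, window=100):
        self.limit = start
        self.floor = floor
        self.ceiling = ceiling
        self.window = window  # assumed: successes needed before raising the limit
        self._streak = 0

    def on_success(self):
        """Count a successful copy; raise the limit by one after a full window."""
        self._streak += 1
        if self._streak >= self.window:
            self._streak = 0
            self.limit = min(self.ceiling, self.limit + 1)

    def on_throttle(self):
        """Halve the limit when S3 answers 503 SlowDown."""
        self._streak = 0
        self.limit = max(self.floor, self.limit // 2)
```

Slow growth combined with fast backoff is what lets the limit settle just below the rate at which S3 starts throttling, instead of oscillating around it.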
CLI reference
s3bolt [OPTIONS] <SOURCE> <DESTINATION>

Arguments:
  <SOURCE>       Source S3 URI (s3://bucket/key or s3://bucket/prefix/)
  <DESTINATION>  Destination S3 URI

Copy options:
  -r, --recursive              Recursively copy all objects under prefix
      --sync                   Only copy new/changed objects
      --dry-run                List objects without copying
      --verify                 Verify ETag after copy
      --storage-class <CLASS>  Override storage class

Filtering:
      --include <GLOB>         Include keys matching glob (repeatable)
      --exclude <GLOB>         Exclude keys matching glob (repeatable)
      --key-regex <REGEX>      Include keys matching regex
      --min-size <BYTES>       Minimum object size
      --max-size <BYTES>       Maximum object size

Concurrency:
  -j, --concurrency <N>        Max concurrent copies [default: 256]
      --no-adaptive            Disable adaptive concurrency

Resume:
      --checkpoint <FILE>      Checkpoint file for resume support
      --resume                 Resume from existing checkpoint

Credentials:
      --source-profile <NAME>  AWS profile for source
      --dest-profile <NAME>    AWS profile for destination

Output:
  -v, --verbose                Debug logging
  -q, --quiet                  Errors only
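To make the filtering options concrete, here is one way such filters could combine. This is a sketch under stated assumptions, not s3bolt's actual logic: it assumes a key must match at least one include glob (when any are given), must match no exclude glob, must match the regex, and that size bounds are inclusive. `key_passes` is a hypothetical name, and Python's `fnmatch` only approximates glob matching (its `*` also matches `/`).

```python
import re
from fnmatch import fnmatchcase


def key_passes(key, size, include=(), exclude=(), key_regex=None,
               min_size=None, max_size=None):
    """Return True if an object key and size pass every configured filter."""
    if include and not any(fnmatchcase(key, g) for g in include):
        return False  # include globs given, none matched
    if any(fnmatchcase(key, g) for g in exclude):
        return False  # an exclude glob matched
    if key_regex is not None and not re.search(key_regex, key):
        return False
    if min_size is not None and size < min_size:
        return False
    if max_size is not None and size > max_size:
        return False
    return True
```

Running with `--dry-run` remains the reliable way to check what a given combination of filters selects.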
Prerequisites
- AWS credentials configured via any standard method (env vars, `~/.aws/credentials`, IAM role, SSO)
- The executing role must have `s3:GetObject` + `s3:ListBucket` on source and `s3:PutObject` on destination
- For multipart copies: `s3:AbortMultipartUpload` on destination
Minimal IAM policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::source-bucket",
        "arn:aws:s3:::source-bucket/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:AbortMultipartUpload"],
      "Resource": [
        "arn:aws:s3:::dest-bucket",
        "arn:aws:s3:::dest-bucket/*"
      ]
    }
  ]
}
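If the policy has to be produced for many bucket pairs, the same document can be generated rather than edited by hand. A minimal sketch, with `minimal_policy` as a hypothetical helper that reproduces exactly the statements shown above:

```python
import json


def minimal_policy(source_bucket, dest_bucket):
    """Return the minimal policy above as a dict for the given bucket names."""

    def arns(bucket):
        # Bucket-level ARN (for ListBucket) plus object-level ARN (for object actions).
        return [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": arns(source_bucket),
            },
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:AbortMultipartUpload"],
                "Resource": arns(dest_bucket),
            },
        ],
    }
```

`json.dumps(minimal_policy("source-bucket", "dest-bucket"), indent=2)` yields the JSON text to paste into an IAM policy.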
Development
# Clone
git clone https://github.com/cykruss/s3bolt.git
cd s3bolt
# Rust tests
cargo test --no-default-features
# Clippy
cargo clippy --no-default-features --lib -- -D warnings
# Build Python package (dev mode)
python -m venv .venv
source .venv/bin/activate
pip install maturin pytest
maturin develop --release
pytest tests/ -v
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
License