s3syncy

Cross-platform, multithreaded S3 file synchronisation daemon.

Features

  • Continuous sync — watches directories for changes in real time (via watchdog) and runs periodic full scans as a safety net.
  • Daemon controls — start in background and control with stop, pause, resume, reload, daemon-status.
  • Multithreaded — configurable thread pool for parallel uploads/downloads.
  • Bandwidth throttling — token-bucket rate limiter (upload & download independently).
  • Resource-friendly — chunked streaming (no full-file buffering), optional soft memory cap, bounded thread pool.
  • Configurable — single config.yaml controls everything (S3 target, threads, bandwidth, conflict strategy, integrity, logging).
  • Gitignore-style exclusions — the .syncignore file uses the same pattern syntax as .gitignore.
  • Auto-reload — config and exclusion files are reloaded automatically on change.
  • Searchable local index — SQLite metadata database with full-text search on file paths and folder-prefix listing.
  • Conflict resolution — local_wins, remote_wins, newest_wins, or skip, with an optional .bak backup before overwriting.
  • Remote delete self-heal — if an object is deleted directly from S3 but still exists locally, the daemon restores it on the next scan.
  • Integrity checks — post-upload hash verification (MD5 via S3 ETag, or SHA256). Configurable reaction: warn, retry, or delete_remote.
  • Cross-platform — macOS, Linux, Windows (Python 3.10+).
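
The bandwidth throttling feature above names a token-bucket rate limiter. The following is a minimal sketch of that general technique, not s3syncy's actual implementation: the class name `TokenBucket`, its parameters, and the choice of bytes per second as the unit are all assumptions made for illustration (the config expresses limits in `mbps`, and no conversion is attempted here).

```python
import threading
import time


class TokenBucket:
    """Hypothetical token-bucket limiter; one token equals one byte."""

    def __init__(self, rate_bytes_per_s, capacity_bytes,
                 clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate_bytes_per_s)
        self.capacity = float(capacity_bytes)
        self.tokens = self.capacity      # start with a full bucket
        self._clock = clock              # injectable for testing
        self._sleep = sleep
        self._last = clock()
        self._lock = threading.Lock()    # shared by all transfer threads

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        now = self._clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self._last) * self.rate)
        self._last = now

    def consume(self, nbytes):
        """Block until nbytes tokens are available; return seconds waited."""
        if self.rate <= 0:               # mirrors "0 = unlimited" in the config
            return 0.0
        nbytes = min(nbytes, self.capacity)  # never wait for more than a full bucket
        waited = 0.0
        with self._lock:
            while True:
                self._refill()
                if self.tokens >= nbytes:
                    self.tokens -= nbytes
                    return waited
                delay = (nbytes - self.tokens) / self.rate
                self._sleep(delay)
                waited += delay
```

A transfer loop would call `consume(len(chunk))` before sending each chunk. Using two independent buckets, one for uploads and one for downloads, would match the "upload & download independently" behaviour described in the feature list.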

Quick Start

# Install from PyPI
pip install s3syncy

# Initialize configuration
s3syncy init

# Edit config.yaml with your S3 bucket and sync directories
# Then run:
s3syncy start -c config.yaml --background

# Check status
s3syncy status -c config.yaml

CLI Commands

Command                                                  Description
s3syncy start -c config.yaml                             Start the sync daemon
s3syncy start -c config.yaml --background                Start daemon in background
s3syncy stop -c config.yaml                              Stop background daemon
s3syncy pause -c config.yaml                             Pause syncing (daemon stays alive)
s3syncy resume -c config.yaml                            Resume syncing after pause
s3syncy reload -c config.yaml                            Reload config + exclusions immediately
s3syncy daemon-status -c config.yaml                     Show daemon PID/running/state info
s3syncy search "report" -c config.yaml                   Search the index for files matching "report"
s3syncy ls "photos/2024" -c config.yaml                  List synced files under a path prefix
s3syncy pull "docs/file.pdf" ./local.pdf -c config.yaml  Download a single file from S3
s3syncy status -c config.yaml                            Show index statistics (total files, synced count, total size)
s3syncy init                                             Create starter config.yaml and .syncignore
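
The search, ls, and status commands above read from the local SQLite index. As a rough illustration of what such queries could look like, here is a sketch using the standard `sqlite3` module. The table name `files`, its columns, and the function names are hypothetical, and plain `LIKE` matching stands in for the full-text search the project describes.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) a hypothetical index with one row per synced key."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " key TEXT PRIMARY KEY,"
        " size INTEGER NOT NULL,"
        " synced INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def _escape_like(text):
    # Escape LIKE wildcards so user input is matched literally.
    return text.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search(conn, term):
    """Substring match on the key (simplified stand-in for full-text search)."""
    pattern = f"%{_escape_like(term)}%"
    rows = conn.execute(
        "SELECT key FROM files WHERE key LIKE ? ESCAPE '\\' ORDER BY key",
        (pattern,),
    )
    return [r[0] for r in rows]


def list_prefix(conn, prefix):
    """List keys under a folder prefix such as 'photos/2024'."""
    pattern = f"{_escape_like(prefix.rstrip('/'))}/%"
    rows = conn.execute(
        "SELECT key FROM files WHERE key LIKE ? ESCAPE '\\' ORDER BY key",
        (pattern,),
    )
    return [r[0] for r in rows]


def stats(conn):
    """Totals comparable to what the status command is described as reporting."""
    total, synced, size = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(synced), 0), COALESCE(SUM(size), 0)"
        " FROM files"
    ).fetchone()
    return {"total_files": total, "synced": synced, "total_size": size}
```

Note that SQLite's `LIKE` is case-insensitive for ASCII characters by default, so this sketch would match "Report" as well as "report".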

Configuration

See config.yaml for full documentation. Key settings:

sync_dirs:
  - ~/Documents/sync
  - ~/Desktop/uploads

s3:
  bucket: "my-bucket"
  prefix: "backups"
  region: "us-east-1"

threads: 4
scan_interval_seconds: 300

bandwidth:
  upload_limit_mbps: 10    # 0 = unlimited
  download_limit_mbps: 0

conflict:
  strategy: "newest_wins"  # local_wins | remote_wins | newest_wins | skip
  backup_before_overwrite: true

integrity:
  enabled: true
  algorithm: "md5"         # md5 | sha256
  on_failure: "warn"       # warn | retry | delete_remote

When multiple sync_dirs are configured, one daemon handles all of them.
S3 keys are namespaced per root (for example Documents/file.txt, uploads-2/file.txt) to avoid collisions.
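
The exact namespacing rule is not spelled out here, so the following is only a guess at one scheme that would produce names like `uploads-2`: use each root's last path component and append a numeric suffix on collision. The functions `assign_namespaces` and `s3_key` are hypothetical, and whether the configured `prefix` is prepended to the key is also an assumption.

```python
from pathlib import PurePosixPath


def assign_namespaces(roots):
    """Map each sync root to a unique namespace based on its directory name."""
    taken = set()
    mapping = {}
    for root in roots:
        base = PurePosixPath(root).name or "root"
        candidate, n = base, 1
        # On collision, try base-2, base-3, ... until the name is free.
        while candidate in taken:
            n += 1
            candidate = f"{base}-{n}"
        taken.add(candidate)
        mapping[root] = candidate
    return mapping


def s3_key(prefix, namespace, relative_path):
    """Join prefix, namespace, and a POSIX-style relative path into an S3 key."""
    parts = (prefix.strip("/"), namespace, relative_path.strip("/"))
    return "/".join(p for p in parts if p)
```

For example, two roots that both end in `uploads` would be assigned `uploads` and `uploads-2` in the order they are listed, so reordering `sync_dirs` would change the mapping under this scheme.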

.syncignore

Works exactly like .gitignore:

# OS junk
.DS_Store
Thumbs.db

# Build artefacts
node_modules/
__pycache__/
*.pyc

# Secrets
.env
*.pem
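
For readers unfamiliar with how such patterns are evaluated, here is a deliberately simplified matcher built on the standard `fnmatch` module. It covers only the pattern kinds used in the example above (plain names, `*` globs, and trailing-slash directory patterns); negation with `!`, anchored patterns, and `**` are not handled, so it is a teaching sketch rather than a full gitignore implementation. The function names are hypothetical.

```python
import fnmatch


def load_patterns(text):
    """Return non-empty, non-comment lines from a .syncignore-style file."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(rel_path, patterns):
    """Check a POSIX-style relative path against the simplified patterns."""
    parts = rel_path.split("/")
    for pat in patterns:
        if pat.endswith("/"):
            # Directory pattern: only parent directories can match, not the file name.
            if any(fnmatch.fnmatchcase(p, pat[:-1]) for p in parts[:-1]):
                return True
        elif any(fnmatch.fnmatchcase(p, pat) for p in parts):
            # Slash-free pattern: may match any path component at any depth.
            return True
    return False
```

A production matcher would more likely rely on a dedicated gitignore-pattern library; which one s3syncy uses, if any, is not stated here.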

Signals (Unix)

  • SIGINT / SIGTERM — graceful shutdown (finish in-flight transfers, close index).
  • SIGHUP — reload config and exclusions.
  • SIGUSR1 — pause syncing.
  • SIGUSR2 — resume syncing.
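
The signal behaviour listed above could be wired up with Python's standard `signal` module roughly as follows. This is a sketch of a common pattern (handlers only set flags that a main loop polls), not the project's actual code; the class `DaemonControl` and its attribute names are hypothetical.

```python
import signal
import threading


class DaemonControl:
    """Hypothetical holder for the flags a daemon main loop would poll."""

    def __init__(self):
        self.stop = threading.Event()
        self.paused = threading.Event()
        self.reload_requested = threading.Event()

    def handlers(self):
        """Map signal names to callbacks that only flip flags."""
        return {
            "SIGINT": lambda signum, frame: self.stop.set(),
            "SIGTERM": lambda signum, frame: self.stop.set(),
            "SIGHUP": lambda signum, frame: self.reload_requested.set(),
            "SIGUSR1": lambda signum, frame: self.paused.set(),
            "SIGUSR2": lambda signum, frame: self.paused.clear(),
        }

    def install(self):
        """Register handlers; must be called from the main thread."""
        for name, callback in self.handlers().items():
            # SIGHUP, SIGUSR1, and SIGUSR2 do not exist on Windows, so skip them there.
            signum = getattr(signal, name, None)
            if signum is not None:
                signal.signal(signum, callback)
```

Keeping handlers this small leaves the slow work (finishing in-flight transfers, closing the index, re-reading config) to the main loop, which can check `stop`, `paused`, and `reload_requested` at safe points.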

Architecture

┌─────────────┐     events      ┌─────────────┐    ThreadPool    ┌──────────┐
│  watchdog   │ ──────────────▸ │   watcher   │ ──────────────▸ │  engine  │
│  (OS-level) │   debounced     │  (handler)  │   submit tasks   │ (upload/ │
└─────────────┘                 └──────┬──────┘                  │ download)│
                                       │                         └────┬─────┘
                          periodic     │                              │
                          full scan    ▼                              ▼
                                ┌─────────────┐              ┌──────────────┐
                                │   daemon    │              │   S3 (boto3) │
                                │ (main loop) │              │  + throttle  │
                                └─────────────┘              │  + integrity │
                                       │                     └──────────────┘
                                       ▼
                                ┌─────────────┐
                                │   SQLite    │
                                │   index     │
                                └─────────────┘
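
The "debounced" edge and the "submit tasks" edge in the diagram can be illustrated with a small sketch: file events are coalesced per path until a quiet period has passed, then each ready path is handed to a thread pool. The `Debouncer` class, the `dispatch` helper, and the one-second default are assumptions for illustration and do not reflect the project's internals.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class Debouncer:
    """Coalesce bursts of events; a path is ready after a quiet period."""

    def __init__(self, quiet_seconds=1.0, clock=time.monotonic):
        self.quiet = quiet_seconds
        self._clock = clock              # injectable for testing
        self._last_seen = {}             # path -> time of most recent event
        self._lock = threading.Lock()    # events arrive on the watcher thread

    def record(self, path):
        """Note an event for path, resetting its quiet-period timer."""
        with self._lock:
            self._last_seen[path] = self._clock()

    def pop_ready(self):
        """Remove and return paths with no events for at least quiet_seconds."""
        with self._lock:
            now = self._clock()
            ready = sorted(p for p, t in self._last_seen.items()
                           if now - t >= self.quiet)
            for p in ready:
                del self._last_seen[p]
            return ready


def dispatch(debouncer, pool, task):
    """Submit one task per ready path and return the resulting futures."""
    return [pool.submit(task, path) for path in debouncer.pop_ready()]
```

In a main loop, `dispatch` would be called periodically with a `ThreadPoolExecutor(max_workers=threads)`, where `threads` corresponds to the configured pool size, so a file saved several times in quick succession results in a single transfer.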

License

MIT
