Ubuntu mirror synchronisation with global deduplication

Ubuntu/Debian mirror synchronisation with intelligent deduplication using hardlinks.

Features

  • Downloads from upstream and hardlinks duplicate files (same SHA256 hash) to save bandwidth and disk space
  • Supports multiple mirrors with global deduplication across all mirrors
  • Uses curl for duplicate files, rsync for unique files and metadata
  • Configurable via YAML files
  • Systemd timer support for automated synchronisation
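The hardlink idea behind the first two features can be sketched as follows. This is a minimal illustration, not mirror-dedupe's actual implementation: the function names `sha256_of` and `dedupe_tree` are hypothetical, and the sketch assumes all files live on one filesystem, since hardlinks cannot cross filesystem boundaries.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dedupe_tree(root):
    """Replace duplicate regular files under root with hardlinks.

    Hypothetical helper: returns the number of files that were relinked.
    """
    first_seen = {}  # digest -> first path seen with that content
    relinked = 0
    candidates = sorted(
        p for p in Path(root).rglob("*") if p.is_file() and not p.is_symlink()
    )
    for path in candidates:
        digest = sha256_of(path)
        original = first_seen.setdefault(digest, path)
        if original is path:
            continue  # first file with this content, keep it
        if os.path.samefile(original, path):
            continue  # already a hardlink to the same inode
        # Link under a temporary name, then atomically swap it into place
        # so the path never disappears while the tree is being served.
        tmp = path.with_name(path.name + ".dedupe-tmp")
        os.link(original, tmp)
        os.replace(tmp, path)
        relinked += 1
    return relinked
```

Calling `dedupe_tree` on a directory that holds two mirrors would leave one copy of each distinct file on disk, with every other path pointing at the same inode. Running it a second time relinks nothing, because the duplicates already share an inode.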

Installation

Option 1: Debian/Ubuntu Package (Recommended)

Download the latest .deb package from GitHub Releases. The commands below use v0.2.0; substitute the current release version:

wget https://github.com/munger/mirror-dedupe/releases/download/v0.2.0/mirror-dedupe_0.2.0-1_all.deb
sudo dpkg -i mirror-dedupe_0.2.0-1_all.deb

This includes systemd integration, man pages, and proper package management.

Option 2: PyPI (All Linux Distributions)

pip install mirror-dedupe

Then install the systemd files manually by running install.sh from a checkout of the source repository (see Option 3):

sudo ./install.sh --pip

Option 3: From Source

git clone https://github.com/munger/mirror-dedupe.git
cd mirror-dedupe
sudo ./install.sh

Configuration

Configuration files are located in /etc/mirror-dedupe/:

  • mirror-dedupe.conf - Global settings
  • repos-available/ - Available repository configurations
  • repos-enabled/ - Enabled repositories (symlinks to repos-available)

Adding a Repository

Use the scanner to auto-generate configuration:

mirror-dedupe-scan --name grafana --dest grafana https://apt.grafana.com

Then enable it:

cd /etc/mirror-dedupe/repos-enabled
sudo ln -s ../repos-available/grafana.conf .
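The repos-available/repos-enabled layout can also be managed from a script. The sketch below assumes only the directory names and the .conf suffix shown above; `enable_repo` and `enabled_repos` are hypothetical helpers and are not part of mirror-dedupe.

```python
from pathlib import Path


def enable_repo(config_root, name):
    """Symlink repos-available/<name>.conf into repos-enabled/.

    Uses a relative target, like `ln -s ../repos-available/<name>.conf .`
    """
    config_root = Path(config_root)
    available = config_root / "repos-available" / f"{name}.conf"
    if not available.is_file():
        raise FileNotFoundError(available)
    link = config_root / "repos-enabled" / available.name
    if not link.is_symlink():
        link.symlink_to(Path("..") / "repos-available" / available.name)
    return link


def enabled_repos(config_root):
    """Return the names of enabled repositories, skipping dangling links."""
    enabled_dir = Path(config_root) / "repos-enabled"
    return sorted(
        p.stem for p in enabled_dir.iterdir() if p.is_symlink() and p.exists()
    )
```

For a real installation, `config_root` would be /etc/mirror-dedupe, and the script would need permission to write there.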

Pre-configured Repositories

The package includes pre-configured repositories:

  • ubuntu - Ubuntu main archive
  • ubuntu-ports - Ubuntu ARM/RISC-V ports
  • grafana - Grafana packages
  • influxdb - InfluxDB packages
  • docker - Docker packages
  • kubernetes - Kubernetes packages

Usage

# Sync all mirrors
mirror-dedupe

# Sync specific mirror
mirror-dedupe --mirror ubuntu

# Dry run
mirror-dedupe --dry-run

# Dedupe only (no sync)
mirror-dedupe --dedupe-only
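After a sync or a dedupe-only run, it can be useful to check how much space the hardlinks save. The following sketch walks a mirror tree and counts the bytes that would be needed if every path had its own copy. The function name `hardlink_savings` is hypothetical, and the mirror location is whatever destination your configuration uses.

```python
import os
import stat
from collections import defaultdict


def hardlink_savings(root):
    """Return bytes saved by hardlinks among regular files under root.

    Each inode is stored once on disk, so every extra path that points
    to it saves one copy of the file's size.
    """
    paths_per_inode = defaultdict(int)
    size_of = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks and special files
            key = (st.st_dev, st.st_ino)
            paths_per_inode[key] += 1
            size_of[key] = st.st_size
    return sum(size_of[key] * (count - 1) for key, count in paths_per_inode.items())
```

Only links found inside `root` are counted, so a file that is also linked from outside the tree contributes nothing extra to the total.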

Systemd Integration

If you installed the Debian package, the systemd units are already configured. Otherwise, enable the timer and check its status:

sudo systemctl enable --now mirror-dedupe.timer
sudo systemctl status mirror-dedupe.timer

View logs:

journalctl -u mirror-dedupe.service

Nginx Configuration

See nginx/mirror.conf for an example nginx configuration.

License

MIT License - see LICENSE file for details.

Author

Tim Hosking <tim@mungerware.com>

https://github.com/munger/mirror-dedupe
