Ubuntu mirror synchronisation with global deduplication

Synchronises Ubuntu/Debian package mirrors and deduplicates identical files with hardlinks.

Features

  • Downloads from upstream and hardlinks duplicate files (same SHA256 hash) to save bandwidth and disk space
  • Supports multiple mirrors with global deduplication across all mirrors
  • Uses curl for duplicate files, rsync for unique files and metadata
  • Configurable via YAML files
  • Systemd timer support for automated synchronisation
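The hardlink idea behind the first two features can be illustrated with a minimal Python sketch. This is not the project's implementation: the function names sha256_of and dedupe_tree are hypothetical, and the sketch assumes all files live on one filesystem, since hardlinks cannot cross filesystems.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large packages are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dedupe_tree(root: Path) -> int:
    """Replace duplicate regular files under root with hardlinks.

    Hypothetical helper for illustration; returns the number of files replaced.
    """
    first_seen = {}  # SHA256 hex digest -> first path seen with that content
    replaced = 0
    candidates = sorted(
        p for p in root.rglob("*") if p.is_file() and not p.is_symlink()
    )
    for path in candidates:
        original = first_seen.setdefault(sha256_of(path), path)
        # Skip the first copy itself and files already sharing its inode.
        if original is path or original.samefile(path):
            continue
        # Link under a temporary name, then rename over the duplicate,
        # so the path never disappears part-way through.
        tmp = path.with_name(path.name + ".dedupe-tmp")
        os.link(original, tmp)
        os.replace(tmp, path)
        replaced += 1
    return replaced
```

Calling dedupe_tree on a directory that holds several mirrors would give deduplication across all of them, because the digest table is shared for the whole walk.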

Installation

Option 1: Debian/Ubuntu Package (Recommended)

Download the latest .deb package from GitHub Releases:

wget https://github.com/munger/mirror-dedupe/releases/download/v0.2.0/mirror-dedupe_0.2.0-1_all.deb
sudo dpkg -i mirror-dedupe_0.2.0-1_all.deb

The package includes systemd integration and man pages, and can be upgraded or removed through the system package manager.

Option 2: PyPI (All Linux Distributions)

pip install mirror-dedupe

Then install the systemd files manually, using install.sh from a checkout of the source repository (see Option 3):

sudo ./install.sh --pip

Option 3: From Source

git clone https://github.com/munger/mirror-dedupe.git
cd mirror-dedupe
sudo ./install.sh

Configuration

Configuration files are located in /etc/mirror-dedupe/:

  • mirror-dedupe.conf - Global settings
  • repos-available/ - Available repository configurations
  • repos-enabled/ - Enabled repositories (symlinks to repos-available)
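To make the available/enabled layout concrete, here is a small Python sketch that lists which repositories are enabled by looking at the symlinks in repos-enabled. The function name enabled_repos is hypothetical, and the assumption that every repository file ends in .conf is taken from the grafana.conf example in this README; the tool itself may discover repositories differently.

```python
from pathlib import Path
from typing import List


def enabled_repos(config_dir: Path) -> List[str]:
    """Return the names of repositories linked into repos-enabled.

    Hypothetical helper: assumes one <name>.conf entry per repository.
    """
    enabled_dir = config_dir / "repos-enabled"
    names = []
    for entry in sorted(enabled_dir.glob("*.conf")):
        # exists() follows symlinks, so dangling links are skipped.
        if entry.exists():
            names.append(entry.stem)
    return names
```

For the default layout this would be called as enabled_repos(Path("/etc/mirror-dedupe")).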

Adding a Repository

Use the scanner to auto-generate configuration for a repository, for example:

mirror-dedupe-scan --name grafana --dest grafana https://apt.grafana.com

See config/repos-available/README.md for the full mirror-dedupe-scan reference and ready-made commands for all of the packaged example repositories.

Then test and enable it using the CLI:

# Test that the upstream and GPG key URL (if configured) are reachable
mirror-dedupe --test grafana

# If the test looks good, activate the repository
mirror-dedupe --activate grafana

If you prefer, you can still enable it manually with a symlink:

cd /etc/mirror-dedupe/repos-enabled
ln -s ../repos-available/grafana.conf .
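If you script the manual step, the same relative symlink can be created from Python. This is a sketch only: enable_repo is a hypothetical helper, not part of mirror-dedupe, and it performs none of the checks that mirror-dedupe --test does.

```python
import os
from pathlib import Path


def enable_repo(config_dir: Path, name: str) -> Path:
    """Create repos-enabled/<name>.conf as a relative symlink.

    Hypothetical helper mirroring `ln -s ../repos-available/<name>.conf .`
    """
    available = config_dir / "repos-available" / f"{name}.conf"
    if not available.is_file():
        raise FileNotFoundError(available)
    link = config_dir / "repos-enabled" / f"{name}.conf"
    link.parent.mkdir(parents=True, exist_ok=True)
    # Leave an existing symlink alone; a regular file at this path
    # makes os.symlink raise FileExistsError rather than be overwritten.
    if not link.is_symlink():
        os.symlink(os.path.join("..", "repos-available", available.name), link)
    return link
```

The relative target keeps the link valid if the whole configuration directory is copied elsewhere, for example when testing with --config.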

Advanced: Alternative config directories

By default, both tools use /etc/mirror-dedupe. You can override this with --config, e.g. for testing or multiple instances:

mirror-dedupe --config /tmp/mirror-test --test grafana

Pre-configured Repositories

The package includes pre-configured repositories:

  • ubuntu - Ubuntu main archive (noble)
  • ubuntu-ports - Ubuntu ports archive (noble)
  • ubuntu-cloud - Ubuntu Cloud Archive (selected OpenStack tracks on noble)
  • debian - Debian stable archive (bookworm)
  • docker - Docker packages for Ubuntu noble
  • grafana - Grafana APT repository
  • influxdb - InfluxData repository for Debian/Ubuntu
  • kubernetes - Kubernetes packages from apt.kubernetes.io
  • nginx - Official NGINX packages for Ubuntu
  • nodesource-node22 - Node.js 22.x LTS from NodeSource
  • postgresql - PostgreSQL APT repository (noble-pgdg)

Usage

# Sync all mirrors
mirror-dedupe

# Sync specific mirror
mirror-dedupe --mirror ubuntu

# Dry run
mirror-dedupe --dry-run

# Dedupe only (no sync)
mirror-dedupe --dedupe-only
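When calling the CLI from another script (a cron wrapper or a monitoring job, say), the commands above can be assembled programmatically. The sketch below uses only the flags shown in this README (--config, --mirror, --dry-run, --dedupe-only). The helper names build_command and run_sync are hypothetical, and whether the tool accepts these flags in combination is an assumption.

```python
import subprocess
from typing import List, Optional


def build_command(
    mirror: Optional[str] = None,
    dry_run: bool = False,
    dedupe_only: bool = False,
    config: Optional[str] = None,
) -> List[str]:
    """Build a mirror-dedupe argument list from the flags documented above."""
    cmd = ["mirror-dedupe"]
    if config:
        cmd += ["--config", config]
    if mirror:
        cmd += ["--mirror", mirror]
    if dry_run:
        cmd.append("--dry-run")
    if dedupe_only:
        cmd.append("--dedupe-only")
    return cmd


def run_sync(**options) -> int:
    """Run mirror-dedupe and return its exit status (needs the tool installed)."""
    return subprocess.run(build_command(**options), check=False).returncode
```

Passing the arguments as a list avoids shell quoting problems if a mirror name or config path contains spaces.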

Systemd Integration

If you installed the Debian package, systemd is already configured. Otherwise, enable the timer and check its status:

sudo systemctl enable --now mirror-dedupe.timer
sudo systemctl status mirror-dedupe.timer

View logs:

journalctl -u mirror-dedupe.service

Nginx Configuration

See nginx/mirror.conf for an example nginx configuration.

License

MIT License - see LICENSE file for details.

Author

Tim Hosking <tim@mungerware.com>

https://github.com/munger/mirror-dedupe
