Download all crates on https://crates.io

CratesMirror

About

Download all crates on crates.io

Requirements

  • Python >= 2.7.9
  • requests
  • GitPython
  • At least 4 GB of free disk space for hosting local crates

Installation

# pip install cratesmirror

Usage

$ cratesmirror -h

usage: cratesmirror [-h] [-i INDEX] [-w CRATES] [-d DBPATH] [-f LOGFILE]
                    [-c CHECKDB] [-v]

optional arguments:
  -h, --help            show this help message and exit
  -i, --index INDEX     registry index directory (default: /srv/git/index)
  -w, --crates CRATES   crates directory (default: /srv/www/crates)
  -d, --dbpath DBPATH   database file path (default: None)
  -f, --logfile LOGFILE
                        log file path (default: None)
  -c, --checkdb CHECKDB
                        check database for missing crates (default: False)
  -v, --verbose

Available environment variables: HTTP_PROXY, HTTPS_PROXY, CRATES_DL, CRATES_API


Examples:
# Download all crates only
$ cratesmirror -d /var/lib/crates/crates.db -f /var/log/crates/debug.log

# Find missing crates, then update the repository
$ cratesmirror --checkdb -d /var/lib/crates/crates.db -f /var/log/crates/debug.log

# Update repo and commit custom settings
$ CRATES_DL='https://crates.mirrors.ustc.edu.cn/api/v1/crates' \
      cratesmirror -d /var/lib/crates/crates.db -f /var/log/crates/debug.log

# Use a proxy
$ HTTPS_PROXY='https://127.0.0.1:8081' \
      cratesmirror -d /var/lib/crates/crates.db -f /var/log/crates/debug.log

Or use it in a script:

from cratesmirror import CratesMirror

indexdir = '/srv/git/index'
cratesdir = '/srv/www/crates'
config = {'dl': 'https://crates.mirrors.ustc.edu.cn/api/v1/crates',
          'api': 'https://crates.io'}
# By default, the database file is saved at os.getcwd()/crates.db
dbpath = '/var/lib/cratesmirror/crates.db'

with CratesMirror(indexdir, cratesdir, config=config, dbpath=dbpath) as mirror:
    mirror.update_repo()

# With a proxy
proxies = {
  "http": "http://10.10.1.10:3128",
  "https": "http://10.10.1.10:1080",
}
with CratesMirror(indexdir, cratesdir, config=config, proxy=proxies, dbpath=dbpath) as mirror:
    mirror.update_repo()
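
The command-line tool reads HTTP_PROXY, HTTPS_PROXY, CRATES_DL and CRATES_API from the environment, and a script can do something similar before constructing CratesMirror. The following is a minimal sketch, assuming the 'dl' and 'api' config keys and the proxy dictionary shape shown above. settings_from_env is a hypothetical helper written for illustration; it is not part of cratesmirror.

```python
import os


def settings_from_env(environ=None):
    """Build (config, proxies) from environment-style variables.

    Hypothetical helper for illustration; not part of cratesmirror.
    """
    env = os.environ if environ is None else environ

    # Only include the keys that are actually set.
    config = {}
    if env.get('CRATES_DL'):
        config['dl'] = env['CRATES_DL']
    if env.get('CRATES_API'):
        config['api'] = env['CRATES_API']

    proxies = {}
    if env.get('HTTP_PROXY'):
        proxies['http'] = env['HTTP_PROXY']
    if env.get('HTTPS_PROXY'):
        proxies['https'] = env['HTTPS_PROXY']

    return config, proxies


# Pass a plain dict instead of os.environ to try it out.
config, proxies = settings_from_env({
    'CRATES_DL': 'https://crates.mirrors.ustc.edu.cn/api/v1/crates',
    'HTTPS_PROXY': 'http://10.10.1.10:1080',
})
print(config)
print(proxies)
```

The two results could then be passed as config=config and proxy=proxies. Whether CratesMirror accepts an empty dictionary for either argument is not confirmed here, so a script may want to pass them only when they are non-empty.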

Note

  • By default, the script will:
    • assume that the registry index directory is located at /srv/git/index and that crates are saved at /srv/www/crates
    • save each downloaded crate as <CratesDir>/{name}/{name}-{version}.crate
    • save the database file at os.getcwd()/crates.db
  • If the environment variable CRATES_DL or CRATES_API is set, its value will be saved to <IndexDir>/config.json and the change will be committed automatically.
  • After the first run, all you need to do is run this script periodically, using a crontab-like tool or a systemd timer, to stay in sync with upstream.
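
The default path layout described in the notes above can be sketched as follows. The sketch assumes the index follows the crates.io registry format, in which each index file holds one JSON object per line with 'name' and 'vers' fields. crate_path and missing_crates are hypothetical helpers for illustration, not functions provided by cratesmirror.

```python
import json
import os


def crate_path(cratesdir, name, version):
    """Return <CratesDir>/{name}/{name}-{version}.crate."""
    filename = '{0}-{1}.crate'.format(name, version)
    return os.path.join(cratesdir, name, filename)


def missing_crates(index_lines, cratesdir):
    """Yield (name, version) for index entries with no local .crate file.

    Assumes one JSON object per line with 'name' and 'vers' keys.
    """
    for line in index_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        path = crate_path(cratesdir, entry['name'], entry['vers'])
        if not os.path.isfile(path):
            yield entry['name'], entry['vers']


print(crate_path('/srv/www/crates', 'libc', '0.2.9'))
```

This only checks the file layout on disk. The tool's own --checkdb option works against its database, which this sketch does not model.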

ChangeLog

1.1.3

Bugfix

  • Pick up all new crates in the modified index file, not only the latest

1.1.2

Bugfix

  • Replace None with default value in <RegistryDir>/config.json

1.1.1

Miscellaneous

  • Add changelog

1.1.0

Improvement

  • Always download crates using multithreading

1.0.4

Feature

  • Add the -c/--checkdb option, which lets users check the database for missing crates

1.0.3

Improvement

  • When <CratesDir> is empty, download all crates in a multithreaded way

1.0.2

  • Naive crawler

Download Files

cratesmirror-1.1.3.tar.gz (11.4 kB), source distribution, uploaded Apr 14, 2016
