Download all crates on https://crates.io
CratesMirror
About
Download all crates on crates.io
Requirement
Installation
# pip install cratesmirror
Usage
$ cratesmirror -h
usage: cratesmirror [-h] [-i INDEX] [-w CRATES] [-d DBPATH] [-f LOGFILE]
                    [-c CHECKDB] [-v]

optional arguments:
  -h, --help            show this help message and exit
  -i, --index INDEX     registry index directory (default: /srv/git/index)
  -w, --crates CRATES   crates directory (default: /srv/www/crates)
  -d, --dbpath DBPATH   database file path (default: None)
  -f, --logfile LOGFILE
                        log file path (default: None)
  -c, --checkdb CHECKDB
                        check database for missing crates (default: False)
  -v, --verbose
Available environment variables: HTTP_PROXY, HTTPS_PROXY, CRATES_DL, CRATES_API
Examples:
# Download all crates only
$ cratesmirror -d /var/lib/crates/crates.db -f /var/log/crates/debug.log
# Find out missing crates, then update the repository
$ cratesmirror --checkdb -d /var/lib/crates/crates.db -f /var/log/crates/debug.log
# Update repo and commit custom settings
$ CRATES_DL='https://crates.mirrors.ustc.edu.cn/api/v1/crates' \
    cratesmirror -d /var/lib/crates/crates.db -f /var/log/crates/debug.log
# Using proxy
$ HTTPS_PROXY='https://127.0.0.1:8081' \
    cratesmirror -d /var/lib/crates/crates.db -f /var/log/crates/debug.log
Or use it in a script:
from cratesmirror import CratesMirror

indexdir = '/srv/git/index'
cratesdir = '/srv/www/crates'
config = {'dl': 'https://crates.mirrors.ustc.edu.cn/api/v1/crates',
          'api': 'https://crates.io'}
# By default, it will be saved at os.getcwd()/crates.db
dbpath = '/var/lib/cratesmirror/crates.db'

with CratesMirror(indexdir, cratesdir, config=config, dbpath=dbpath) as mirror:
    mirror.update_repo()

# with proxy
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}
with CratesMirror(indexdir, cratesdir, config=config, proxy=proxies, dbpath=dbpath) as mirror:
    mirror.update_repo()
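When calling the library from a script, the `config` dictionary could also be built from the same environment variables the command line honours. The following is only a sketch: `config_from_env` is a hypothetical helper, not part of cratesmirror, and the mapping of `CRATES_DL` to the `'dl'` key and `CRATES_API` to the `'api'` key is an assumption inferred from the examples above.

```python
import os


def config_from_env(environ=None):
    """Build a config dict from CRATES_DL / CRATES_API (hypothetical helper).

    Assumption: CRATES_DL corresponds to the 'dl' key and CRATES_API to the
    'api' key, mirroring the config dict used in the script example.
    Variables that are unset or empty are left out of the result.
    """
    environ = os.environ if environ is None else environ
    mapping = {'dl': 'CRATES_DL', 'api': 'CRATES_API'}
    return {key: environ[var] for key, var in mapping.items() if environ.get(var)}
```

The result can then be passed as `config=config_from_env()` when constructing `CratesMirror`; an empty dictionary means neither variable was set.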
Note
- By default, the script will:
  - assume that the registry index directory is located at /srv/git/index, and that crates are saved at /srv/www/crates
  - save each downloaded crate as <CratesDir>/{name}/{name}-{version}.crate
  - save the database file at os.getcwd()/crates.db
- If the environment variable CRATES_DL or CRATES_API is set, its value will be saved at <IndexDir>/config.json and the changes will be committed automatically.
- After the first run, all you need to do is run this script periodically, using crontab-like tools or systemd timers, to sync with upstream.
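The on-disk layout described above can be sketched in a few lines of Python, along with a rough way to list crates that appear in the index but are missing from the crates directory. This is an illustration only, not the code cratesmirror itself runs: `crate_path`, `iter_index_entries` and `find_missing` are hypothetical names, and the sketch assumes the index follows the crates.io layout, where every file other than `config.json` holds one JSON object per line with `name` and `vers` fields.

```python
import json
import os


def crate_path(cratesdir, name, version):
    """Return <CratesDir>/{name}/{name}-{version}.crate."""
    return os.path.join(cratesdir, name, '{0}-{1}.crate'.format(name, version))


def iter_index_entries(indexdir):
    """Yield (name, version) pairs from a crates.io-style index checkout.

    Assumption: every non-hidden file except config.json is an index file
    containing one JSON object per line.
    """
    for root, dirs, files in os.walk(indexdir):
        # Skip hidden directories such as .git
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        for filename in files:
            if filename == 'config.json' or filename.startswith('.'):
                continue
            with open(os.path.join(root, filename), encoding='utf-8') as fh:
                for line in fh:
                    line = line.strip()
                    if line:
                        entry = json.loads(line)
                        yield entry['name'], entry['vers']


def find_missing(indexdir, cratesdir):
    """List (name, version) pairs that have no .crate file on disk."""
    return [(name, version)
            for name, version in iter_index_entries(indexdir)
            if not os.path.exists(crate_path(cratesdir, name, version))]
```

Calling `find_missing('/srv/git/index', '/srv/www/crates')` would return the versions still to be downloaded under the default directories; the tool's own `--checkdb` option consults its database rather than the filesystem, so the two may not agree exactly.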
ChangeLog
1.1.1
Miscellaneous
Add changelog
1.1.0
Improvement
Always download crates using multithreading
1.0.4
Feature
Add -c/--checkdb option, enabling users to check the database for missing crates
1.0.3
Improvement
When <CratesDir> is empty, download all crates in a multithreaded way
1.0.2
Naive crawler