Download all crates on https://crates.io
CratesMirror
About
Download all crates on crates.io
Requirement
Installation
# pip install cratesmirror
Usage
$ cratesmirror -h
usage: cratesmirror [-h] [-i INDEX] [-w CRATES] [-d DBPATH] [-f LOGFILE]
                    [-c CHECKDB] [-v]

optional arguments:
  -h, --help            show this help message and exit
  -i, --index INDEX     registry index directory (default: /srv/git/index)
  -w, --crates CRATES   crates directory (default: /srv/www/crates)
  -d, --dbpath DBPATH   database file path (default: None)
  -f, --logfile LOGFILE
                        log file path (default: None)
  -c, --checkdb CHECKDB
                        check database for missing crates (default: False)
  -v, --verbose
Available environment variables: HTTP_PROXY, HTTPS_PROXY, CRATES_DL, CRATES_API
Examples:
# Download all crates only
$ cratesmirror -d /var/lib/crates/crates.db -f /var/log/crates/debug.log
# Find out missing crates, then update the repository
$ cratesmirror --checkdb -d /var/lib/crates/crates.db -f /var/log/crates/debug.log
# Update repo and commit custom settings
$ CRATES_DL='https://crates.mirrors.ustc.edu.cn/api/v1/crates' \
cratesmirror -d /var/lib/crates/crates.db -f /var/log/crates/debug.log
# Using proxy
$ HTTPS_PROXY='https://127.0.0.1:8081' \
cratesmirror -d /var/lib/crates/crates.db -f /var/log/crates/debug.log
Or use it in a script
from cratesmirror import CratesMirror

indexdir = '/srv/git/index'
cratesdir = '/srv/www/crates'
config = {'dl': 'https://crates.mirrors.ustc.edu.cn/api/v1/crates',
          'api': 'https://crates.io'}
# By default, the database is saved at os.getcwd()/crates.db
dbpath = '/var/lib/cratesmirror/crates.db'

with CratesMirror(indexdir, cratesdir, config=config, dbpath=dbpath) as mirror:
    mirror.update_repo()

# With a proxy
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}
with CratesMirror(indexdir, cratesdir, config=config, proxy=proxies, dbpath=dbpath) as mirror:
    mirror.update_repo()
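When calling the library from a script, the environment variables listed for the command line tool can be read explicitly and turned into the `config` and `proxies` dictionaries used above. The following is a minimal sketch, not the package's own logic: the helper name `settings_from_env` is hypothetical, and the mapping of `CRATES_DL` to the `'dl'` key and `CRATES_API` to the `'api'` key is an assumption based on the variable names.

```python
import os


def settings_from_env(environ=None):
    """Build (config, proxies) dicts from the documented environment variables.

    Hypothetical helper: the variable-to-key mapping below is an assumption,
    not something taken from the cratesmirror source code.
    """
    env = os.environ if environ is None else environ

    # Assumed mapping: CRATES_DL -> 'dl', CRATES_API -> 'api'.
    config = {}
    if env.get("CRATES_DL"):
        config["dl"] = env["CRATES_DL"]
    if env.get("CRATES_API"):
        config["api"] = env["CRATES_API"]

    # Same shape as the proxies dict in the script example above.
    proxies = {}
    if env.get("HTTP_PROXY"):
        proxies["http"] = env["HTTP_PROXY"]
    if env.get("HTTPS_PROXY"):
        proxies["https"] = env["HTTPS_PROXY"]

    return config, proxies
```

The two results could then be passed as `config=config` and `proxy=proxies` to `CratesMirror`, as in the example above. How the class treats an empty dictionary is not documented here, so it may be safer to pass these arguments only when they are non-empty.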
Note
- By default, the script will:
  - assume that the registry index directory is located at /srv/git/index, and that crates are saved at /srv/www/crates
  - save each downloaded crate as <CratesDir>/{name}/{name}-{version}.crate
  - save the database file at os.getcwd()/crates.db
- If the environment variable CRATES_DL or CRATES_API is set, its value will be saved in <IndexDir>/config.json and the change will be committed automatically.
- After the first run, all you need to do is run this script periodically, using a crontab-like tool or a systemd timer, to stay in sync with upstream.
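The file layout and the config.json contents described in the notes above can be sketched as follows. This is an illustration under stated assumptions rather than the package's implementation: the function names `crate_path` and `render_index_config` are hypothetical, and the config.json shape (a JSON object with `dl` and `api` keys) follows the usual Cargo registry index format. Committing the change to the index repository is not shown.

```python
import json
import os


def crate_path(crates_dir, name, version):
    """Return <CratesDir>/{name}/{name}-{version}.crate for one release."""
    return os.path.join(crates_dir, name, f"{name}-{version}.crate")


def render_index_config(dl, api):
    """Return config.json text with the 'dl' and 'api' keys of a registry index.

    Hypothetical helper; the real tool also commits the file, which is
    omitted here.
    """
    return json.dumps({"dl": dl, "api": api}, indent=2)


# Example values taken from this document's defaults and examples.
path = crate_path("/srv/www/crates", "serde", "1.0.0")
config_text = render_index_config(
    "https://crates.mirrors.ustc.edu.cn/api/v1/crates",
    "https://crates.io",
)
```

Here `serde` and `1.0.0` are only placeholder values for a crate name and version.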
ChangeLog
1.1.3
Bugfix
Pick up all new crates in the modified index file, not only the latest
1.1.2
Bugfix
Replace None with the default value in <RegistryDir>/config.json
1.1.1
Miscellaneous
Add changelog
1.1.0
Improvement
Always download crates using multithreading
1.0.4
Feature
Add the -c/--checkdb option, enabling users to check the database for missing crates
1.0.3
Improvement
When <CratesDir> is empty, download all crates in a multithreaded way
1.0.2
Naive crawler
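The 1.1.3 entry above concerns index files that gained more than one release between two runs. In the crates.io registry index, each crate has one file holding one JSON object per line, one line per published version, with fields such as `name` and `vers`. A minimal sketch of picking up every new version, not only the last line, might look like this. The function names and the `already_downloaded` set are hypothetical, the sample data is made up, and the real tool tracks its state in its database rather than in a Python set.

```python
import json


def versions_in_index_file(text):
    """Return (name, version) pairs for every line of one index file."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        entries.append((record["name"], record["vers"]))
    return entries


def new_versions(index_text, already_downloaded):
    """Return every listed release that is not yet in already_downloaded."""
    return [
        entry
        for entry in versions_in_index_file(index_text)
        if entry not in already_downloaded
    ]


# Made-up index file content for a crate called "demo".
sample = "\n".join([
    '{"name": "demo", "vers": "0.1.0", "cksum": "00", "yanked": false}',
    '{"name": "demo", "vers": "0.1.1", "cksum": "11", "yanked": false}',
    '{"name": "demo", "vers": "0.2.0", "cksum": "22", "yanked": false}',
])
pending = new_versions(sample, {("demo", "0.1.0")})
```

With the sample above, `pending` holds both 0.1.1 and 0.2.0, whereas looking only at the last line would have missed 0.1.1.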