Send every URI in a page to the Wayback Machine's Save Page Now service from the CLI.
wbsv
wbsv ("Wabisavi", "わびさび", stands for "WayBack machine SavepageNow") is a CLI tool for archiving webpages on the Wayback Machine. It saves a webpage and every URI found in that page to the Wayback Machine.
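As a rough illustration of the idea, the sketch below collects the URIs referenced by a page and builds one Save Page Now request URL for each. It is not wbsv's actual code: `LinkCollector`, `extract_uris`, and `save_request_url` are hypothetical names, and the `https://web.archive.org/save/` prefix is assumed to be the endpoint that triggers a capture.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin

# Assumption: a GET request to this prefix plus the target URL asks
# the Wayback Machine to capture that URL.
SAVE_ENDPOINT = "https://web.archive.org/save/"


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes from any tag."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.found.append(value)


def extract_uris(html, base_url):
    """Return sorted, de-duplicated absolute http(s) URIs found in html."""
    parser = LinkCollector()
    parser.feed(html)
    uris = set()
    for raw in parser.found:
        # Resolve relative links and drop "#fragment" parts.
        absolute, _fragment = urldefrag(urljoin(base_url, raw))
        if absolute.startswith(("http://", "https://")):
            uris.add(absolute)
    return sorted(uris)


def save_request_url(uri):
    """Build the (assumed) Save Page Now URL that would archive uri."""
    return SAVE_ENDPOINT + uri
```

Fetching the page and sending the requests is left out on purpose, so the sketch can be run without network access.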
Try now
You can try this tool on Google Cloud Shell. First, run:
$ sudo python3 -m pip install -e .
Install
$ pip install wbsv # Python3.0+
Run & Examples
Help
$ wbsv -h
wbsv 0.1.6
CLI tool for save webpage on Wayback Machine forever.
Save webpage and one's all URI(s) on Wayback Machine.
Usage:
wbsv [options] <url1> <url2> ... <urln>
Args:
<urls> Saving pages in order.
no arg Launch Interactive mode.
(To quit interactive mode,
type "end", "exit", "exit()",
"break", "bye", ":q" or "finish".)
Options:
-h, --help Show help and exit.
-v, --version Show version and exit.
-r, --retry <times> Set a retry limit on failed save.
-t, --only-target Save just target webpage(s).
-L, --level <depth> Set maximum recursion depth.
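For readers who want to script something similar, the option list above could be mirrored with `argparse` roughly as follows. This is a sketch, not wbsv's real parser: the program name `wbsv-sketch` and the default values for `--retry` and `--level` are assumptions, since the help text does not state them.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented wbsv options."""
    parser = argparse.ArgumentParser(
        prog="wbsv-sketch",  # hypothetical name, to avoid confusion with wbsv
        description="Save webpages and the URIs they contain on the Wayback Machine.",
    )
    # Zero URLs would correspond to interactive mode in the real tool.
    parser.add_argument("urls", nargs="*", help="Pages to save, in order.")
    parser.add_argument("-v", "--version", action="version",
                        version="%(prog)s (sketch)")
    # The defaults below are assumptions, not documented values.
    parser.add_argument("-r", "--retry", type=int, default=3, metavar="times",
                        help="Retry limit on a failed save.")
    parser.add_argument("-t", "--only-target", action="store_true",
                        help="Save just the target webpage(s).")
    parser.add_argument("-L", "--level", type=int, default=1, metavar="depth",
                        help="Maximum recursion depth.")
    return parser
```

Calling `build_parser().parse_args(["-L2", "https://example.com"])` gives a namespace with `level` set to 2 and one entry in `urls`.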
Interactive mode
$ wbsv
[[Input a target url (ex: https://google.com)]]
>>> https://www.u.tsukuba.ac.jp
[+]Now: https://www.u.tsukuba.ac.jp
[+]60 URI(s) found.
[01/60]: <NOW> https://web.archive.org/web/20200412020015/https://www.u.tsukuba.ac.jp/password/
[02/60]: <FAIL> https://www.u.tsukuba.ac.jp/info_lit/tebiki.html
[03/60]: <NOW> https://web.archive.org/web/20200412020026/https://www.u.tsukuba.ac.jp/account/
...
[58/60]: <NOW> https://web.archive.org/web/20200412022608/https://www.u.tsukuba.ac.jp/phishing/
[59/60]: <FAIL> https://www.u.tsukuba.ac.jp/wordpress/wp-content/uploads/note_usingcomputerrooms.png
[60/60]: <NOW> https://web.archive.org/web/20200412022640/https://www.u.tsukuba.ac.jp/
[+]FIN!: https://www.u.tsukuba.ac.jp
[+]ALL: 60 SAVE: 57 FAIL: 3
[+]To exit, use CTRL+C or type 'end'
[[Input a target url (ex: https://google.com)]]
>>> exit
[+]End.
$
From command-line arguments
$ wbsv https://tsumanne.net https://tsumanne.net/ct
[+]Now: https://tsumanne.net
[+]4 URI(s) found.
[1/4]: <NOW> https://web.archive.org/web/20200412022931/https://tsumanne.net/si/
[2/4]: <NOW> https://web.archive.org/web/20200412022935/https://tsumanne.net/
[3/4]: <NOW> https://web.archive.org/web/20200412022938/https://tsumanne.net/my/
[4/4]: <NOW> https://web.archive.org/web/20200412022949/https://tsumanne.net/ct/
[+]FIN!: https://tsumanne.net
[+]ALL: 4 SAVE: 4 FAIL: 0
[+]Now: https://tsumanne.net/ct
[+]3 URI(s) found.
[1/3]: <NOW> https://web.archive.org/web/20200412022958/https://tsumanne.net/
[2/3]: <NOW> https://web.archive.org/web/20200412023000/https://tsumanne.net/oa_login.php
[3/3]: <NOW> https://web.archive.org/web/20200412023012/https://tsumanne.net/ct/?cat=&of=25
[+]FIN!: https://tsumanne.net/ct
[+]ALL: 3 SAVE: 3 FAIL: 0
$
Search links recursively
$ wbsv -L2 https://programming-place.net/ppp/contents/c/index.html
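A depth limit like this can be pictured as a breadth-first walk over links. The sketch below is an assumption about how such a limit might behave, not a description of wbsv's internals: `collect_recursively` and `get_links` are hypothetical names, and depth 1 is taken to mean "the start page plus the links found on it".

```python
from collections import deque


def collect_recursively(start_url, get_links, max_depth):
    """Return URLs reachable from start_url within max_depth link hops.

    get_links is any callable that maps a URL to the list of URLs
    found on that page (for example, a fetch-and-parse function).
    """
    seen = {start_url}
    order = [start_url]
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # pages at the depth limit are kept but not expanded
        for link in get_links(url):
            if link not in seen:  # skip links already scheduled
                seen.add(link)
                order.append(link)
                queue.append((link, depth + 1))
    return order
```

Passing the link lookup in as a function keeps the traversal independent of how pages are fetched, which also makes it easy to try out with a small in-memory link table.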
Increase the retry limit
$ wbsv https://tsumanne.net --retry 10
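A retry limit of this kind is commonly implemented as a small loop around the save call. The sketch below shows one way to do that under stated assumptions: `save_with_retry` and `save_once` are hypothetical names, the limit counts extra attempts after the first failure, and the fixed delay between attempts is an illustrative choice rather than wbsv's documented behavior.

```python
import time


def save_with_retry(uri, save_once, retries, delay=1.0, sleep=time.sleep):
    """Try save_once(uri) until it succeeds or the retry limit is reached.

    save_once should return True on success. Returns (ok, attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            if save_once(uri):
                return True, attempts
        except OSError:
            pass  # treat network-level errors like a failed save
        if attempts > retries:
            return False, attempts  # first try plus `retries` retries used up
        sleep(delay)  # wait before the next attempt
```

The `sleep` parameter exists only so the waiting can be replaced with a no-op when experimenting.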
VERSION
wbsv 0.1.6
LICENSE
MIT
Author
eggplants (haruna)