Throw all URIs in a page onto the Wayback Machine from the CLI.

wbsv

wbsv (short for "WayBack machine SavepageNow") is…

A CLI tool for saving web pages on the Wayback Machine forever. It saves a web page, together with every URI found in that page, on the Wayback Machine.
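
To make "every URI found in a web page" concrete, here is a minimal sketch using only the Python standard library. It is not wbsv's actual implementation: the names `LinkCollector`, `collect_uris`, and `save_request_url` are hypothetical, and using the public Save Page Now address `https://web.archive.org/save/<uri>` is an assumption about how such a tool could submit a page.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect absolute http(s) URIs from href/src attributes (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.uris = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page URL and drop "#fragment".
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                if absolute.startswith(("http://", "https://")):
                    self.uris.add(absolute)


def collect_uris(html, base_url):
    """Return the page URL plus every http(s) URI referenced in the HTML, sorted."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return sorted(parser.uris | {base_url})


def save_request_url(uri):
    # Assumption: requesting this address asks the Wayback Machine to capture `uri`.
    return "https://web.archive.org/save/" + uri
```

Feeding the fetched HTML of a page to `collect_uris` and then requesting `save_request_url(uri)` for each result would approximate the behaviour described above; non-HTTP links such as `mailto:` are skipped in this sketch.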

Try now

You can try this tool on Google Cloud Shell. (First, run sudo python3 -m pip install -e . in the shell.)

Install

$ python -m pip install wbsv

Run & Examples

Help

$ wbsv -V
wbsv 0.2.3
$ wbsv -h
usage: wbsv [-h] [-V] [-r cnt] [-t] [-L lv] [url [url ...]]

CLI tool for save webpage on Wayback Machine forever.
Save webpage and one's all URI(s) on Wayback Machine.

positional arguments:
  url                  Saving pages in order.

optional arguments:
  -h, --help           show this help message and exit
  -V, --version        Show version and exit
  -r cnt, --retry cnt  Set a retry limit on failed save.
  -t, --only_target    Save just target webpage(s).
  -L lv, --level lv    Set maximum recursion depth.

additional information:
    If you don't give the URL,
    interactive mode will be launched.
    (To quit interactive mode,
     type "end", "exit", "exit()",
     "break", "bye", ":q" or "finish".)

Interactive mode

$ wbsv
[[Input a target url (ex: https://google.com)]]
>>> https://www.u.tsukuba.ac.jp
[+]Now: https://www.u.tsukuba.ac.jp
[+]60 URI(s) found.
[01/60]: <NOW> https://web.archive.org/web/20200412020015/https://www.u.tsukuba.ac.jp/password/
[02/60]: <FAIL> https://www.u.tsukuba.ac.jp/info_lit/tebiki.html
[03/60]: <NOW> https://web.archive.org/web/20200412020026/https://www.u.tsukuba.ac.jp/account/
...
[58/60]: <NOW> https://web.archive.org/web/20200412022608/https://www.u.tsukuba.ac.jp/phishing/
[59/60]: <FAIL> https://www.u.tsukuba.ac.jp/wordpress/wp-content/uploads/note_usingcomputerrooms.png
[60/60]: <NOW> https://web.archive.org/web/20200412022640/https://www.u.tsukuba.ac.jp/
[+]FIN!: https://www.u.tsukuba.ac.jp
[+]ALL: 60 SAVE: 57 FAIL: 3
[+]To exit, use CTRL+C or type 'end'
[[Input a target url (ex: https://google.com)]]
>>> exit
[+]End.
$
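
The prompt loop shown above can be sketched as follows. This is an illustration rather than wbsv's source: `interactive_loop`, `handle_url`, and `read_line` are hypothetical names, and only the quit words come from the help text.

```python
# Quit words as listed in the "additional information" part of the help text.
QUIT_WORDS = {"end", "exit", "exit()", "break", "bye", ":q", "finish"}


def interactive_loop(handle_url, read_line=input):
    """Call handle_url for each entered URL until a quit word, EOF, or CTRL+C.

    read_line is injectable so the loop can be exercised without a terminal.
    Returns the number of URLs that were handled.
    """
    handled = 0
    while True:
        try:
            line = read_line(">>> ").strip()
        except (EOFError, KeyboardInterrupt):
            break
        if line in QUIT_WORDS:
            break
        if line:  # ignore empty input and prompt again
            handle_url(line)
            handled += 1
    return handled
```

Calling `interactive_loop(print)` in a terminal would echo each URL typed at the prompt until one of the quit words is entered.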

From command-line arguments

$ wbsv https://tsumanne.net https://tsumanne.net/ct
[+]Now: https://tsumanne.net
[+]4 URI(s) found.
[1/4]: <NOW> https://web.archive.org/web/20200412022931/https://tsumanne.net/si/
[2/4]: <NOW> https://web.archive.org/web/20200412022935/https://tsumanne.net/
[3/4]: <NOW> https://web.archive.org/web/20200412022938/https://tsumanne.net/my/
[4/4]: <NOW> https://web.archive.org/web/20200412022949/https://tsumanne.net/ct/
[+]FIN!: https://tsumanne.net
[+]ALL: 4 SAVE: 4 FAIL: 0
[+]Now: https://tsumanne.net/ct
[+]3 URI(s) found.
[1/3]: <NOW> https://web.archive.org/web/20200412022958/https://tsumanne.net/
[2/3]: <NOW> https://web.archive.org/web/20200412023000/https://tsumanne.net/oa_login.php
[3/3]: <NOW> https://web.archive.org/web/20200412023012/https://tsumanne.net/ct/?cat=&of=25
[+]FIN!: https://tsumanne.net/ct
[+]ALL: 3 SAVE: 3 FAIL: 0
$

Search links recursively

$ wbsv https://programming-place.net/ppp/contents/c/index.html -L2
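
A depth limit like `-L2` can be pictured as a breadth-first walk that stops expanding pages once the limit is reached. The sketch below rests on assumptions: `crawl` and `get_links` are hypothetical names, and treating the target page as depth 0 is a guess, since the help text does not say how wbsv counts levels.

```python
from collections import deque


def crawl(start_url, max_depth, get_links):
    """Return every URI reachable from start_url within max_depth link hops.

    get_links(url) must return an iterable of URIs found on that page.
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # pages at the depth limit are kept but not expanded
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen
```

The `seen` set prevents a page from being visited twice, which keeps the walk finite even when pages link back to each other.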

Increase the retry limit

$ wbsv https://tsumanne.net --retry 10
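
One way to read a retry limit is "try once, then retry up to N more times before reporting a failure". The following sketch illustrates that reading only: `save_with_retry`, `save`, and `delay` are hypothetical, and wbsv may count attempts or wait between them differently.

```python
import time


def save_with_retry(save, uri, retry_limit, delay=0.0):
    """Call save(uri) until it succeeds or retry_limit retries are used up.

    Returns whatever save() returns on success, or None if every attempt
    raised OSError (network errors from urllib are OSError subclasses).
    """
    attempts = retry_limit + 1  # the first try plus the allowed retries
    for attempt in range(attempts):
        try:
            return save(uri)
        except OSError:
            if attempt == attempts - 1:
                return None  # out of retries: report the failure
            time.sleep(delay)  # wait before the next attempt
```

A `None` result here would correspond to a `<FAIL>` line in the output shown earlier, and any other result to a saved page.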

VERSION

wbsv 0.2.3

LICENSE

MIT

Author

eggplants (haruna)
