Skip to main content

Throw all URIs in a page on to Wayback Machine from CLI.

Project description

wbsv

PyPI version Codacy Badge Maintainability MIT License Downloads Downloads Downloads

wbsv(stands for "WayBack machine SavepageNow") is…

CLI tool for saving webpage on Wayback Machine forever. Enables you to save all URIs in a webpage forever on Wayback Machine.

Install

pip install wbsv

DEMO

demo.gif

Run & Examples

Help

$ wbsv -h
usage: wbsv [-h] [-r times] [-t] [-l level] [-V] [url [url ...]]

CLI tool for save webpage on Wayback Machine forever.
Save webpage and one 's all URI(s) on Wayback Machine.

positional arguments:
  url                   Saving pages in order.

optional arguments:
  -h, --help            show this help message and exit
  -r times, --retry times
                        Set a retry limit on failed save.(>=0
  -t, --only_target     Save just target webpage(s).
  -l level, --level level
                        Set maximum recursion depth. (>0)
  -V, --version         show program's version number and exit

If you don't give the URL,
interactive mode will be launched.
(To quit interactive mode,
type "end", "exit", "exit()",
"break", "bye", ":q" or "finish".)

Interactive mode

$ wbsv
[[Input a target url (ex: https://google.com)]]
>>> https://www.u.tsukuba.ac.jp
[+]Target: ['https://www.u.tsukuba.ac.jp']
[+]61 URI(s) found.
[01/60]: <NOW> https://web.archive.org/web/20200412020015/https://www.u.tsukuba.ac.jp/password/
[02/60]: <FAIL> https://www.u.tsukuba.ac.jp/info_lit/tebiki.html
[03/60]: <NOW> https://web.archive.org/web/20200412020026/https://www.u.tsukuba.ac.jp/account/
...
[58/60]: <NOW> https://web.archive.org/web/20200412022608/https://www.u.tsukuba.ac.jp/phishing/
[59/60]: <FAIL> https://www.u.tsukuba.ac.jp/wordpress/wp-content/uploads/note_usingcomputerrooms.png
[60/60]: <NOW> https://web.archive.org/web/20200412022640/https://www.u.tsukuba.ac.jp/
[+]FIN!: ['https://www.u.tsukuba.ac.jp']
[+]ALL: 60, SAVE: 57, PAST: 0, FAIL: 3
>>>

From stdin

$ wbsv https://tsumanne.net
[+]Target: ['https://tsumanne.net']
[+]4 URI(s) found.
[1/4]: <NOW> https://web.archive.org/web/20200412022931/https://tsumanne.net/si/
[2/4]: <NOW> https://web.archive.org/web/20200412022935/https://tsumanne.net/
[3/4]: <NOW> https://web.archive.org/web/20200412022938/https://tsumanne.net/my/
[4/4]: <NOW> https://web.archive.org/web/20200412022949/https://tsumanne.net/ct/
[+]FIN!: ['https://tsumanne.net']
[+]ALL: 4, SAVE: 4, PAST: 0, FAIL: 0
$

Search links recurcively

wbsv https://programming-place.net/ppp/contents/c/index.html -l 2

Increase limit of retry

wbsv https://tsumanne.net -r 10

LISENCE

MIT

Author

eggplants (haruna)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for wbsv, version 0.4.1
Filename, size File type Python version Upload date Hashes
Filename, size wbsv-0.4.1-py3-none-any.whl (6.9 kB) File type Wheel Python version py3 Upload date Hashes View
Filename, size wbsv-0.4.1.tar.gz (6.4 kB) File type Source Python version None Upload date Hashes View

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring DigiCert DigiCert EV certificate Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page