Skip to main content

Throw all URIs in a page on to Wayback Machine savepagenow from CLI.

Project description

wbsv

PyPI version Codacy Badge Maintainability MIT License Downloads Downloads Downloads

  • wbsv("Wabisavi", "わびさび", stands for "WayBack machine SavepageNow") is...

    • CLI tool for saving webpage on Wayback Machine forever.
    • Enables you to save all URIs in a webpage forever on Wayback Machine.
  • You can try this tool on Google Cloud Shell. (First, sudo pip3 install -e .)

Open in Cloud Shell

DEMO

demo.gif

Install

$ pip install wbsv # Python3.0+

Run & Examples

Help

$ wbsv -h
wbsv 0.0.9
CLI tool for save webpage on Wayback Machine forever.
Save webpage and one's all URI(s) on Wayback Machine.

Usage:
    wbsv [options] <url1> <url2> ... <urln>

Args:
    <urls>                      Saving pages in order.
    no arg                      Launch Interactive mode.
                                (To quit interactive mode,
                                 type "end", "exit", "exit()",
                                 "break", "bye", ":q" or "finish".)

Options:
    -h, --help                  Show help and exit.
    -v, --version               Show version and exit.
    -r, --retry <times>         Set a retry limit on failed save.
        --only-target           Save just target webpage(s).
    -l, --log-to-file <file>    Write STDOUT to a specified file.
    -L, --recursive <level>     Set maximum recursion depth.
        --only-page             Get only URIs of type web page.

Interactive mode

$ wbsv
[[Input a target url (ex: https://google.com)]]
>>> https://tsukuba.ac.jp
[!]Now: https://tsukuba.ac.jp
[!]class 'urllib.error.URLError'
[!]urlopen error [Errno -2] Name or service not known
[!]traceback object at 0x7eff0d207188
[[Input a target url (ex: https://google.com)]]
>>> https://www.u.tsukuba.ac.jp
[+]Now: https://www.u.tsukuba.ac.jp
87 URI(s) found.
[01]: <NOW> https://web.archive.org/web/20200123135244/https://www.u.tsukuba.ac.jp/20180622terminals/
[02]: <NOW> https://web.archive.org/web/20200123135247/https://www.u.tsukuba.ac.jp/
[03]: <NOW> https://web.archive.org/web/20200123135250/https://www.u.tsukuba.ac.jp/anti-virus/
...
[85]: <NOW> https://web.archive.org/web/20200123140917/https://www.u.tsukuba.ac.jp/snapshot/
[86]: <FAIL> https://www.u.tsukuba.ac.jp/wp-json/oembed/1.0/embed?url=https%3A%2F%2Fwww.u.tsukuba.ac.jp%2F&format=xml
[87]: <FAIL> https://www.u.tsukuba.ac.jp/info_lit/tebiki.html
[+]FIN!: https://www.u.tsukuba.ac.jp
[+]ALL: 87 SAVE: 61 FAIL: 21
[+]To exit, use CTRL+C or type 'end'
[[Input a target url (ex: https://google.com)]]
>>> exit
$

From stdin

$ wbsv https://tsumanne.net https://tsumanne.net/ct
[+]Now: https://tsumanne.net
9 URI(s) found.
[1]: <NOW> https://web.archive.org/web/20200123194439/https://tsumanne.net
...
[9]: <FAIL> https://tsumanne.net/src/iphone.png
[+]FIN!: https://tsumanne.net
[+]ALL: 9 SAVE: 5 FAIL: 4
[+]Now: https://tsumanne.net/ct
7 URI(s) found.
[1]: <NOW> https://web.archive.org/web/20200123194602/https://tsumanne.net/ct/?cat=&of=25
...
[7]: <FAIL> https://tsumanne.net/src/site.js
[+]FIN!: https://tsumanne.net/ct
[+]ALL: 7 SAVE: 5 FAIL: 2
$

Increase limit of retry

$ wbsv https://tsumanne.net --retry 10

VERSION

wbsv 0.0.9

LISENCE

MIT

Author

eggplants (haruna)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for wbsv, version 0.0.9
Filename, size File type Python version Upload date Hashes
Filename, size wbsv-0.0.9-py3-none-any.whl (8.5 kB) File type Wheel Python version py3 Upload date Hashes View hashes
Filename, size wbsv-0.0.9.tar.gz (6.0 kB) File type Source Python version None Upload date Hashes View hashes

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page