savepagenow

A simple Python wrapper for archive.org's "Save Page Now" capturing service

These details have not been verified by PyPI

Project links

Development Status
- 5 - Production/Stable
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language

Project description

A simple Python wrapper for archive.org’s “Save Page Now” capturing service

Installation

$ pipenv install savepagenow

Python Usage

Import it.

>>> import savepagenow

Capture a URL.

>>> archive_url = savepagenow.capture("http://www.example.com/")

See where it’s stored.

>>> print(archive_url)
https://web.archive.org/web/20161018203554/http://www.example.com/
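
The returned URL carries the capture time as a 14-digit timestamp (YYYYMMDDhhmmss, understood to be UTC) followed by the original address. Here is a minimal sketch of pulling those two pieces apart. The helper name parse_archive_url and the regular expression are illustrative assumptions, not part of savepagenow.

```python
import re
from datetime import datetime

# Assumed URL layout: https://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK_PATTERN = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_archive_url(archive_url):
    """Hypothetical helper: split an archive URL into (capture time, original URL)."""
    match = WAYBACK_PATTERN.match(archive_url)
    if match is None:
        raise ValueError("not a recognized Wayback Machine URL: %r" % archive_url)
    timestamp, original_url = match.groups()
    # The result is a naive datetime; no timezone is attached.
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S"), original_url


captured_at, original = parse_archive_url(
    "https://web.archive.org/web/20161018203554/http://www.example.com/"
)
print(captured_at.isoformat(), original)
# prints: 2016-10-18T20:35:54 http://www.example.com/
```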

If a URL has been recently cached, archive.org may return the URL to that page rather than conduct a new capture. When that happens, the capture method will raise a CachedPage exception.

This is likely to happen if you request the same URL twice within a few seconds.

>>> savepagenow.capture("http://www.example.com/")
'https://web.archive.org/web/20161019062637/http://www.example.com/'
>>> savepagenow.capture("http://www.example.com/")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "savepagenow/__init__.py", line 36, in capture
    archive_url
savepagenow.exceptions.CachedPage: archive.org returned a cached version of this page: https://web.archive.org/web/20161019062637/http://www.example.com/
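
One way to handle the exception shown in the traceback above is a small wrapper that returns None when archive.org serves a cached copy. The sketch below uses stand-ins (a local CachedPage class and a fake_capture function) so it runs without network access; the wrapper name capture_if_fresh is hypothetical.

```python
class CachedPage(Exception):
    """Stand-in for savepagenow.exceptions.CachedPage so this sketch runs offline."""


_already_requested = set()


def fake_capture(url):
    """Stand-in for savepagenow.capture: a repeat request for a URL raises CachedPage."""
    if url in _already_requested:
        raise CachedPage("archive.org returned a cached version of this page")
    _already_requested.add(url)
    return "https://web.archive.org/web/20161019062637/" + url


def capture_if_fresh(url, capture=fake_capture, cached_error=CachedPage):
    """Hypothetical wrapper: return the archive URL, or None if a cached copy was served."""
    try:
        return capture(url)
    except cached_error:
        return None


first = capture_if_fresh("http://www.example.com/")
second = capture_if_fresh("http://www.example.com/")
print(first)   # the archive URL from the first, fresh capture
print(second)  # None, because the repeat request hit the cache
```

With the real library installed, the same wrapper should work by passing savepagenow.capture and savepagenow.exceptions.CachedPage as the capture and cached_error arguments.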

You can craft your code to catch that exception yourself, or use the built-in capture_or_cache method, which will return the URL provided by archive.org along with a boolean indicating if it is a fresh capture (True) or from the cache (False).

>>> savepagenow.capture_or_cache("http://www.example.com/")
('https://web.archive.org/web/20161019062832/http://www.example.com/', True)
>>> savepagenow.capture_or_cache("http://www.example.com/")
('https://web.archive.org/web/20161019062832/http://www.example.com/', False)

Because the method returns a tuple, you can unpack both values in a single line:

>>> url, captured = savepagenow.capture_or_cache("http://www.example.com/")
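
The unpacked boolean makes it easy to sort a batch of results into fresh captures and cached copies. This is a rough sketch under stated assumptions: fake_capture_or_cache stands in for savepagenow.capture_or_cache so the code runs offline, and split_by_freshness is a hypothetical helper.

```python
def fake_capture_or_cache(url):
    """Stand-in for savepagenow.capture_or_cache; returns (archive_url, is_fresh)."""
    # For the demo, URLs ending in "/cached" pretend to come from the cache.
    is_fresh = not url.endswith("/cached")
    return "https://web.archive.org/web/20161019062832/" + url, is_fresh


def split_by_freshness(urls, capture_or_cache=fake_capture_or_cache):
    """Hypothetical helper: return (fresh_archive_urls, cached_archive_urls)."""
    fresh, cached = [], []
    for url in urls:
        archive_url, captured = capture_or_cache(url)
        if captured:
            fresh.append(archive_url)
        else:
            cached.append(archive_url)
    return fresh, cached


fresh, cached = split_by_freshness(
    ["http://www.example.com/", "http://www.example.com/cached"]
)
print(len(fresh), len(cached))
```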

Command-line usage

The Python library is also installed as a command-line interface. You can run it from your terminal like so:

$ savepagenow http://www.example.com/

The command has the same options as the Python API, which you can learn about from its help output.

$ savepagenow --help
Usage: savepagenow [OPTIONS] URL

  Archives the provided URL using the archive.org Wayback Machine.

  Raises a CachedPage exception if archive.org declines to conduct a new
  capture and returns a previous snapshot instead.

Options:
  -ua, --user-agent TEXT  User-Agent header for the web request
  -c, --accept-cache      Accept and return cached URL
  --help                  Show this message and exit.
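
If you want to drive the command from a script, the options in the help output above map onto an argument list. The sketch below only builds that list and prints it; build_command is a hypothetical helper, and nothing is executed.

```python
import shlex


def build_command(url, user_agent=None, accept_cache=False):
    """Hypothetical helper: build the argument list for the savepagenow command."""
    args = ["savepagenow", url]
    if user_agent is not None:
        # --user-agent takes a TEXT value, per the help output.
        args += ["--user-agent", user_agent]
    if accept_cache:
        # --accept-cache is a flag with no value.
        args.append("--accept-cache")
    return args


command = build_command(
    "http://www.example.com/",
    user_agent="my user agent here",
    accept_cache=True,
)
# Quote each part so the printed line is safe to paste into a shell.
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list could be handed to subprocess.run if the command is installed on your PATH.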

Customizing the user agent

In an effort to be transparent and polite to the Internet Archive, all requests made by savepagenow carry a custom user agent that identifies itself as "savepagenow (https://github.com/pastpages/savepagenow)".

You can override this default with the optional user agent argument, available in both the Python API and the command line.

Here’s how to do it in Python:

>>> savepagenow.capture("http://www.example.com/", user_agent="my user agent here")

And here’s how to do it from the command line:

$ savepagenow http://www.example.com/ --user-agent "my user agent here"
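
If every call in a script should carry the same user agent, one option is to bind the keyword once with functools.partial. This is a sketch, not the library's recommended approach: fake_capture stands in for savepagenow.capture so the code runs offline, and it records the user agent it receives.

```python
from functools import partial

DEFAULT_USER_AGENT = "savepagenow (https://github.com/pastpages/savepagenow)"
seen_user_agents = []  # records what the stand-in was called with


def fake_capture(url, user_agent=DEFAULT_USER_AGENT):
    """Stand-in for savepagenow.capture so the sketch runs offline."""
    seen_user_agents.append(user_agent)
    return "https://web.archive.org/web/20161019062637/" + url


# Placeholder string; substitute one that identifies your project.
MY_USER_AGENT = "my user agent here"

# Bind the keyword once so every later call carries it.
capture = partial(fake_capture, user_agent=MY_USER_AGENT)

capture("http://www.example.com/")
capture("http://www.example.org/")
print(seen_user_agents)
```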

Release history

- 1.3.1 (Jan 19, 2026)
- 1.3.0 (Jul 1, 2023)
- 1.2.3 (Jun 26, 2022)
- 1.2.2 (Apr 3, 2022)
- 1.2.1 (Mar 28, 2022)
- 1.2.0 (Mar 28, 2022)
- 1.1.1 (Dec 27, 2020)
- 1.1.0 (Sep 8, 2020)
- 1.0.1 (Jul 12, 2020), this version
- 1.0.0 (Jul 12, 2020)
- 0.0.13 (Jan 21, 2019)
- 0.0.12 (Nov 25, 2018)
- 0.0.11 (Nov 25, 2018)
- 0.0.10 (Jun 28, 2018)
- 0.0.9 (May 14, 2018)
- 0.0.8 (Oct 24, 2016)
- 0.0.7 (Oct 22, 2016)
- 0.0.6 (Oct 22, 2016)
- 0.0.5 (Oct 19, 2016)
- 0.0.4 (Oct 19, 2016)
- 0.0.2 (Oct 18, 2016)
- 0.0.1 (Oct 18, 2016)
- 0.0.0 (Oct 18, 2016)

Download files

Source Distribution

savepagenow-1.0.1.tar.gz (5.5 kB)

Uploaded Jul 12, 2020 Source

Built Distribution

savepagenow-1.0.1-py2.py3-none-any.whl (6.1 kB)

Uploaded Jul 12, 2020 Python 2, Python 3

File details

Details for the file savepagenow-1.0.1.tar.gz.

File metadata

Download URL: savepagenow-1.0.1.tar.gz
Upload date: Jul 12, 2020
Size: 5.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.1 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.3

File hashes

Hashes for savepagenow-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`5da911ec11866335b2e8d89ceec5260b88431932cf40077ad59b3404dc5b8134`
MD5	`69068cf2ba8519013ed17c273ef93f88`
BLAKE2b-256	`af7eaf76b4362bf30706b21b0b44927b70eeb4a57ed6c38a2550f0c038d4f199`
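
A published digest like the SHA256 value above can be checked against a downloaded file with the standard library's hashlib. The sketch below is a generic pattern, not something savepagenow or PyPI provides; the helper names are hypothetical, and the demo hashes a small temporary file rather than the real archive.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Hypothetical helper: compare a file's digest with a published hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a throwaway file so the sketch runs without downloading anything.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

demo_ok = matches_published_hash(demo_path, hashlib.sha256(b"hello").hexdigest())
os.remove(demo_path)
print(demo_ok)
```

For the real source distribution, you would pass the path to your downloaded savepagenow-1.0.1.tar.gz together with the SHA256 value from the table.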

File details

Details for the file savepagenow-1.0.1-py2.py3-none-any.whl.

File metadata

Download URL: savepagenow-1.0.1-py2.py3-none-any.whl
Upload date: Jul 12, 2020
Size: 6.1 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.1 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.3

File hashes

Hashes for savepagenow-1.0.1-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`68bf11df1b71554bd889e3068fbc70f32f658bc12918a0fefa062c302a0f1dfd`
MD5	`f03bc280efead3fb6e87c0e7add0e9af`
BLAKE2b-256	`c4aceffc03d61b786b081dd6b1268ba7a4ed0b69f004f90a32ea98f5f1444359`
