Python API to Internet Archive Wayback Machine
Project description
Wayback is a Python API to the Internet Archive’s Wayback Machine. It gives you tools to search for and load mementos (historical copies of web pages).
The Internet Archive maintains an official “internetarchive” Python package, but it does not focus on the Wayback Machine. Instead, it is mainly concerned with the APIs and tools that manage the Internet Archive as a whole: managing items and collections. These are how e-books, audio recordings, movies, and other content in the Internet Archive are managed. It doesn’t, however, provide particularly good tools for finding or loading historical captures of specific URLs (i.e. the part of the Internet Archive called the “Wayback Machine”). That’s what this package does.
- Documentation:
  - Current Release: https://wayback.readthedocs.io/en/stable/
  - Development: https://wayback.readthedocs.io/en/latest/
Installation & Basic Usage
Install via pip on the command line:
$ pip install wayback
Then, in a Python script, import it and create a client:
import wayback
client = wayback.WaybackClient()
Finally, search for all the mementos of nasa.gov before 1999 and download them:
from datetime import date

for record in client.search('http://nasa.gov', to_date=date(1999, 1, 1)):
    memento = client.get_memento(record)
For a more in-depth tutorial and the complete API reference, read the full documentation at https://wayback.readthedocs.io/
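The search-and-download loop above can be wrapped in a small helper that caps how many mementos are fetched, which may be useful when a URL has many captures. This is only a sketch: `download_mementos` and its `max_count` parameter are hypothetical names, not part of the package. It relies only on the `search(url, to_date=...)` and `get_memento(record)` calls shown above, and it assumes `search` returns an iterable of records.

```python
from datetime import date
from itertools import islice


def download_mementos(client, url: str, before: date, max_count: int = 5) -> list:
    """Return up to max_count mementos of url captured before the given date.

    client is assumed to behave like wayback.WaybackClient; only its
    search() and get_memento() methods are used. This helper is
    illustrative and is not provided by the wayback package.
    """
    records = client.search(url, to_date=before)
    # islice stops consuming the search results after max_count records,
    # so no more mementos are requested than were asked for.
    return [client.get_memento(record) for record in islice(records, max_count)]
```

With the package installed, a call might look like `download_mementos(wayback.WaybackClient(), 'http://nasa.gov', date(1999, 1, 1))`. Passing the client in as an argument also makes it possible to try the helper with a stand-in object, without any network access.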
Code of Conduct
This repository falls under EDGI’s Code of Conduct. Please take a moment to review it before commenting on or creating issues and pull requests.
Contributors
Thanks to the following people for their contributions and help on this package! See our contributing guidelines to find out how you can help.
Dan Allan (Code, Tests, Documentation, Reviews)
Rob Brackett (Code, Tests, Documentation, Reviews)
David Gilman (Documentation)
Will Sackfield (Code, Tests)
Ed Summers (Code, Tests)
Lion Szlagowski (Code, Tests)
License & Copyright
Copyright (C) 2019-2023 Environmental Data and Governance Initiative (EDGI)
This program is free software: you can redistribute it and/or modify it under the terms of the 3-Clause BSD License. See the LICENSE file for details.
Download files
File details
Details for the file wayback-0.4.5.tar.gz.
File metadata
- Download URL: wayback-0.4.5.tar.gz
- Upload date:
- Size: 72.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8f260a38ad8e317d9089fde86049151705a1bb4674e85b3da3a9f0204c5d4f55
MD5 | 30033b560c0ccd94ae12bf0aec623080
BLAKE2b-256 | 5e2c50cf23834aea1de728967a535ce94f65b432ba95b36110aee3b1dd3204f2
File details
Details for the file wayback-0.4.5-py3-none-any.whl.
File metadata
- Download URL: wayback-0.4.5-py3-none-any.whl
- Upload date:
- Size: 41.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4de2a65c99045ea99255f37ef2deeca445d94c612cf6fb7e78d46c9bb6b36e55
MD5 | 355a58184e08df99baefccbd0a0910b6
BLAKE2b-256 | 040c3ac76773d81beb57696097eefe98d90137d263e57bb6a6406b9238b273a8