Python API to Internet Archive Wayback Machine

Project description

Wayback is a Python API to the Internet Archive’s Wayback Machine. It gives you tools to search for and load mementos (historical copies of web pages).

The Internet Archive maintains an official “internetarchive” Python package, but it does not focus on the Wayback Machine. Instead, it is mainly concerned with the APIs and tools that manage the Internet Archive as a whole: items and collections, which are how e-books, audio recordings, movies, and other content in the Internet Archive are organized. It does not, however, provide particularly good tools for finding or loading historical captures of specific URLs (i.e. the part of the Internet Archive called the “Wayback Machine”). That’s what this package does.

Installation & Basic Usage

Install via pip on the command line:

$ pip install wayback

Then, in a Python script, import it and create a client:

import wayback
client = wayback.WaybackClient()

Finally, search for all the mementos of nasa.gov before 1999 and download them:

from datetime import date

for record in client.search('http://nasa.gov', to_date=date(1999, 1, 1)):
    memento = client.get_memento(record)  # download the archived capture
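
Downloading every capture can take a while, so it may help to cap how many mementos are fetched. The following is a minimal sketch, not part of the package: `collect_mementos` and `demo` are hypothetical helper names, and the `limit` parameter is an assumption added for illustration. It relies only on the `search` and `get_memento` calls shown above, plus the `timestamp` field of each search record and the `memento_url` attribute of each memento.

```python
from datetime import date


def collect_mementos(client, url, before, limit=5):
    """Return up to `limit` (timestamp, memento) pairs for captures of `url`.

    `client` is expected to behave like wayback.WaybackClient: it needs a
    `search(url, to_date=...)` method that yields records and a
    `get_memento(record)` method that loads one capture.
    """
    results = []
    for record in client.search(url, to_date=before):
        # Stop before fetching anything beyond the requested number of captures.
        if len(results) >= limit:
            break
        results.append((record.timestamp, client.get_memento(record)))
    return results


def demo():
    """Run against the live Wayback Machine (hypothetical helper).

    Requires `pip install wayback` and network access, so it is not
    called automatically.
    """
    import wayback

    client = wayback.WaybackClient()
    for timestamp, memento in collect_mementos(
        client, "http://nasa.gov", date(1999, 1, 1), limit=3
    ):
        print(timestamp, memento.memento_url)
```

Passing the client in as an argument keeps the helper easy to exercise with a stand-in object, without touching the network.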

Read the full documentation at https://wayback.readthedocs.io/ for a more in-depth tutorial and a complete API reference.

Code of Conduct

This repository falls under EDGI’s Code of Conduct. Please take a moment to review it before commenting on or creating issues and pull requests.

Contributors

Thanks to the following people for their contributions and help on this package! See our contributing guidelines to find out how you can help.


Download files

Source Distribution

wayback-0.5.0.tar.gz (884.7 kB)

Uploaded Source

Built Distribution

wayback-0.5.0-py3-none-any.whl (32.2 kB)

Uploaded Python 3

File details

Details for the file wayback-0.5.0.tar.gz.

File metadata

  • Download URL: wayback-0.5.0.tar.gz
  • Upload date:
  • Size: 884.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for wayback-0.5.0.tar.gz:

Algorithm    Hash digest
SHA256       329e68fd7456b1c17649e7815c5df26456ec77883c6b577d1b2de8fbce4b93a5
MD5          15f5d8e5d33a401cbc3ec3184ec09dd0
BLAKE2b-256  700badc81965586a08d96d7a6e2922da950bd7b969b259eb581aa1dba3ca5147
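
A downloaded file can be checked against the digests listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `file_digest_hex` and `matches_published` are hypothetical, and the file path in the trailing comment assumes the archive was saved to the current directory.

```python
import hashlib


def file_digest_hex(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks and return the hex digest as a string."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published(path, expected_hex, algorithm="sha256"):
    """Return True if the file's digest equals the published hex digest."""
    return file_digest_hex(path, algorithm) == expected_hex.strip().lower()


# Example call (assumes the sdist sits in the current directory):
# matches_published(
#     "wayback-0.5.0.tar.gz",
#     "329e68fd7456b1c17649e7815c5df26456ec77883c6b577d1b2de8fbce4b93a5",
# )
```

SHA256 is the sensible default here; MD5 is listed for completeness but is not suitable for detecting deliberate tampering.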

Provenance

The following attestation bundles were made for wayback-0.5.0.tar.gz:

Publisher: cd.yml on edgi-govdata-archiving/wayback

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file wayback-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: wayback-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 32.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for wayback-0.5.0-py3-none-any.whl:

Algorithm    Hash digest
SHA256       5d85673751a9f6f2894318a74ed9b9dfccc270b2dc1c76635bf89295fd504b7c
MD5          d3c595c5ad86da67c76f9ca16f370783
BLAKE2b-256  e84e8f11f39436879dd9e0e71d23608ae6cb12999772ae7e4f389fde46b7be8c

Provenance

The following attestation bundles were made for wayback-0.5.0-py3-none-any.whl:

Publisher: cd.yml on edgi-govdata-archiving/wayback

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
