Module for reading Apple's .webarchive files

Project description

pywebarchive is a Python 3 module for reading Apple's .webarchive files. It includes Webarchive Extractor, a tool to convert those files to standard HTML pages that you can open in any browser.

pywebarchive is stable enough for everyday use. It remains in beta because its support for the .webarchive format is a work in progress; in particular, some pages using advanced HTML5 features may not convert perfectly.

Webarchive Extractor

Windows builds are available from the releases page on GitHub. They are standalone executables that run on Windows 7 and later. On other platforms, Webarchive Extractor is included with the pywebarchive source code.

Information for Developers

Here's an example of how to use the webarchive module:

import webarchive

# Open the archive, then write it out as a standard HTML page.
archive = webarchive.open("example.webarchive")
archive.extract("example.html")
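The same two calls can be wrapped in a loop to convert a whole folder of archives. The following is only a minimal sketch: the helper name convert_all and its open_archive parameter are hypothetical and not part of pywebarchive; the only module calls it relies on are webarchive.open() and extract(), as shown in the example above.

```python
from pathlib import Path


def convert_all(source_dir, output_dir, open_archive=None):
    """Convert every .webarchive file in source_dir to an HTML page in output_dir.

    convert_all and open_archive are hypothetical names used for illustration;
    by default the helper falls back to webarchive.open().
    """
    if open_archive is None:
        # Imported lazily so the helper can be tested with a stand-in opener.
        import webarchive
        open_archive = webarchive.open

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    converted = []
    for path in sorted(Path(source_dir).glob("*.webarchive")):
        # example.webarchive becomes example.html in the output folder.
        target = output_dir / (path.stem + ".html")
        archive = open_archive(str(path))
        archive.extract(str(target))
        converted.append(target)
    return converted
```

Calling convert_all("archives", "html") would then write one .html file per archive found in the (assumed) archives folder and return the list of output paths. Passing the opener as a parameter is a design choice made here so that the loop can be exercised without real archive files.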

For detailed documentation, try python3 -m pydoc webarchive.

The source distribution also includes two webarchive extraction tools:

  • extractor.py is a command-line version.
  • extractor-gui.py is a GUI version using Tkinter.
