htmldata

Extract and modify HTML/CSS URLs, translate HTML documents <-> list data structures.

Project description

The htmldata module translates HTML documents to and from list data structures. This makes it possible to read and write HTML documents programmatically, with a great deal of flexibility.
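To illustrate the general idea of moving between HTML and a list structure, here is a minimal sketch built only on Python's standard `html.parser` module. It does not use htmldata and is not its API: the names `html_to_list` and `list_to_html` and the item layout (plain strings for text, `(name, attrs)` tuples for tags) are assumptions made for this example.

```python
from html import escape
from html.parser import HTMLParser


class _ListBuilder(HTMLParser):
    """Collect a document as a flat list of text strings and tag tuples."""

    def __init__(self):
        super().__init__()  # character references in text are decoded
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Opening tag: (name, {attribute: value})
        self.items.append((tag, dict(attrs)))

    def handle_endtag(self, tag):
        # Closing tag: name prefixed with "/", no attributes
        self.items.append(("/" + tag, {}))

    def handle_data(self, data):
        # Text between tags is stored as a plain string
        self.items.append(data)


def html_to_list(doc):
    """Hypothetical helper: parse HTML text into a flat list."""
    builder = _ListBuilder()
    builder.feed(doc)
    builder.close()
    return builder.items


def list_to_html(items):
    """Hypothetical helper: serialize the flat list back to HTML text."""
    parts = []
    for item in items:
        if isinstance(item, str):
            parts.append(escape(item, quote=False))
            continue
        name, attrs = item
        attr_text = "".join(
            " %s" % key if value is None
            else ' %s="%s"' % (key, escape(value, quote=True))
            for key, value in attrs.items()
        )
        parts.append("<%s%s>" % (name, attr_text))
    return "".join(parts)


# Read a document, change one text node, and write it back out.
items = html_to_list('<p class="x">Hello <b>world</b></p>')
items[3] = "there"
print(list_to_html(items))
```

This sketch drops comments and doctype declarations, writes self-closing tags as an open/close pair, and escapes the contents of `<script>` and `<style>` as ordinary text, so it is only a starting point for real documents.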

Functions are also available for extracting and/or modifying all URLs present in the HTML or stylesheets of a document.
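As a rough picture of what extracting and modifying URLs involves, the following sketch again uses only the Python standard library rather than htmldata itself. The `rewrite_urls` helper, the `URL_ATTRS` set, and the example base URL are hypothetical, and the sketch only covers URLs in HTML attributes, not those inside stylesheets.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: attributes treated as URL-bearing in this sketch.
URL_ATTRS = {"href", "src", "action"}


class _UrlRewriter(HTMLParser):
    """Re-emit a document, passing each URL attribute through a callback."""

    def __init__(self, transform):
        # Keep character references in text untouched so text round-trips.
        super().__init__(convert_charrefs=False)
        self.transform = transform
        self.found = []  # original URLs, in document order
        self.out = []    # pieces of the rewritten document

    def handle_starttag(self, tag, attrs):
        parts = [tag]
        for name, value in attrs:
            if name in URL_ATTRS and value is not None:
                self.found.append(value)
                value = self.transform(value)
            if value is None:
                parts.append(name)
            else:
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s>" % " ".join(parts))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def rewrite_urls(doc, transform):
    """Hypothetical helper: return (rewritten document, list of original URLs)."""
    rewriter = _UrlRewriter(transform)
    rewriter.feed(doc)
    rewriter.close()
    return "".join(rewriter.out), rewriter.found


# Make every relative URL absolute against an assumed base address.
base = "http://example.com/dir/"
new_doc, urls = rewrite_urls(
    '<a href="page.html">link</a><img src="/logo.png">',
    lambda url: urljoin(base, url),
)
print(urls)
print(new_doc)
```

The callback could just as well prepend a proxy address or map remote URLs to local file names, which is the kind of wrapping and mirroring the description mentions. Self-closing tags are written as an open/close pair in this sketch.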

I have found this library useful for writing robots, for “wrapping” all of the URLs on websites inside my own proxy CGI script, for filtering HTML, and for doing flexible wget-like mirroring.

It keeps things as simple as possible, so it should be easy to learn.

Supports XHTML, too.

Release history

1.1.1: Oct 26, 2007 (this version)

1.1.0: Sep 15, 2005

1.0.9: Sep 15, 2005

1.0.7: Apr 27, 2005

1.0.6: Feb 7, 2005

1.0.5: Feb 7, 2005

1.0.4: Dec 11, 2004
