Extract information from HTML pages that have a repetitive pattern
Project description
This package finds repetitive format patterns in an HTML page that contains one or more lists, and extracts the HTML fragments that make up those patterns. The idea is that a typical HTML data page containing a list of items presents a repetitive pattern to the human eye (the page format). This pattern can be recognized automatically, and the data in the list can then be extracted.
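The idea described above can be illustrated with a small sketch. This is not HtmlList's own API or algorithm, which this page does not document. It is a simplified stand-in that uses only the Python 3 standard library and assumes a "pattern" means sibling elements sharing the same nested tag structure. The names `Node`, `TreeBuilder`, `extract_repeated_items`, and `min_repeats` are hypothetical and exist only for this example.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not become "current".
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    """A minimal element node; items holds child Nodes and text strings in order."""

    def __init__(self, tag, parent=None):
        self.tag, self.parent, self.items = tag, parent, []

    @property
    def children(self):
        return [i for i in self.items if isinstance(i, Node)]

    def signature(self):
        # Structural fingerprint: the tag plus the fingerprints of its children.
        return "%s(%s)" % (self.tag, ",".join(c.signature() for c in self.children))

    def text(self):
        parts = [i.text() if isinstance(i, Node) else i.strip() for i in self.items]
        return " ".join(p for p in parts if p)


class TreeBuilder(HTMLParser):
    """Builds a rough element tree (assumption: reasonably well-formed HTML)."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root
        self.nodes = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.items.append(node)
        self.nodes.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag; ignore stray closes.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.items.append(data)


def extract_repeated_items(html, min_repeats=3):
    """Return the text of the largest group of structurally identical siblings."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    best = []
    for node in builder.nodes:
        children = node.children
        if not children:
            continue
        counts = Counter(c.signature() for c in children)
        sig, count = counts.most_common(1)[0]
        if count >= min_repeats and count > len(best):
            best = [c for c in children if c.signature() == sig]
    return [item.text() for item in best]


SAMPLE = """
<html><body>
<h1>Books</h1>
<ul>
<li><b>Dune</b> <i>Herbert</i></li>
<li><b>Emma</b> <i>Austen</i></li>
<li><b>Ulysses</b> <i>Joyce</i></li>
</ul>
<p>Footer</p>
</body></html>
"""

print(extract_repeated_items(SAMPLE))
# → ['Dune Herbert', 'Emma Austen', 'Ulysses Joyce']
```

Matching on exact tag structure keeps the sketch short, but it is brittle: a list whose items differ slightly (for example, an optional image in some entries) would need a fuzzier comparison than string equality of signatures.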
Download files
Source Distributions
HtmlList-2.2.2.zip (393.0 kB)
HtmlList-2.2.2.tar.gz (359.5 kB)
Built Distribution
HtmlList-2.2.2-py2.6.egg (456.6 kB)