FTP and Web spiders and mirroring utilities
Project description
This module provides FTP and Web spiders and mirroring utilities in one convenient script. FTP sites are crawled by walking a remote directory tree. Websites are crawled by visiting URLs extracted from HTML pages. Downloads during mirroring are multi-threaded for supported protocols. Other features include listing bad URLs along with the URLs of the pages that contain them, and finding horrifically malformed HTML. Requires Python 2.3.
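As a rough illustration of crawling by visiting URLs extracted from HTML pages, here is a minimal sketch. It is not this module's code: the names LinkExtractor, extract_links, crawl, and the fetch callable are hypothetical, and it is written against the modern Python 3 standard library rather than the Python 2.3 this module targets.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute, de-duplicated href targets from <a> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL and drop #fragments.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                if absolute not in self.links:
                    self.links.append(absolute)


def extract_links(html, base_url):
    """Return the links found in html, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl. fetch(url) is an assumed callable returning HTML text, or None on failure."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        html = fetch(url)
        if html is None:
            continue  # a real spider might record this as a bad URL
        for link in extract_links(html, url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Passing fetch in as a parameter keeps the sketch free of network access, so it can be exercised with a plain dictionary of pages, for example crawl("http://example.com/", pages.get). A real spider would substitute an HTTP download function and would usually restrict the crawl to a single host.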