FTP and Web spiders and mirroring utilities

Project description

This module provides FTP and Web spiders and mirroring utilities in one convenient script. FTP sites are crawled by walking a remote directory tree. Websites are crawled by visiting URLs extracted from HTML pages. Downloads during mirroring are multi-threaded for supported protocols. Other features include listing bad URLs together with the URLs of the pages containing them, and finding horrifically malformed HTML. Requires Python 2.3.
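
To illustrate the web-crawling idea described above (visit a page, extract its URLs, visit those in turn, and record bad URLs along with the page that contained them), here is a minimal sketch. It is not this module's code or API: the names `LinkExtractor`, `crawl`, `fetch` and `max_pages` are hypothetical, and it uses the modern Python 3 standard library rather than Python 2.3. Fetching is passed in as a function so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collect href values of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl starting at start_url.

    fetch(url) is a caller-supplied (hypothetical) function that returns
    the page's HTML text, or None if the URL could not be retrieved.
    Returns (visited, bad): the list of pages fetched, and a dict mapping
    each bad URL to the page that linked to it.
    """
    visited = []
    bad = {}
    seen = {start_url}
    queue = deque([(start_url, None)])
    while queue and len(visited) < max_pages:
        url, parent = queue.popleft()
        html = fetch(url)
        if html is None:
            bad[url] = parent
            continue
        visited.append(url)
        parser = LinkExtractor(url)
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append((link, url))
    return visited, bad
```

For a real site, `fetch` could wrap `urllib.request.urlopen` and return `None` on errors; a mirroring tool would additionally restrict the crawl to one host and save each page to disk.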
