FTP and Web spiders and mirroring utilities

Project description

This module provides FTP and Web spiders and mirroring utilities in one convenient script. FTP sites are crawled by walking a remote directory tree. Websites are crawled by visiting URLs extracted from HTML pages. Downloads during mirroring are multi-threaded for supported protocols. Other features include listing bad URLs together with the URLs of the pages containing them, and finding horrifically malformed HTML. Requires Python 2.3.
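To illustrate the crawling idea described above (visit a page, extract its URLs, follow them, and record bad URLs with the page that referenced them), here is a minimal sketch. It is not this module's API: the names `LinkExtractor`, `extract_links`, `crawl`, and the in-memory `SITE` dictionary are all hypothetical, and the sketch assumes a modern Python 3 standard library rather than Python 2.3. The `fetch` callable stands in for a real HTTP request so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collect href/src attribute values from an HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # value is None for attributes written without a value
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_links(base_url, html):
    """Return absolute URLs found in html, resolved against base_url."""
    parser = LinkExtractor()
    parser.feed(html)
    # Resolve relative links and drop #fragments.
    return [urldefrag(urljoin(base_url, link))[0] for link in parser.links]


def crawl(start_url, fetch):
    """Breadth-first crawl starting at start_url.

    fetch(url) must return HTML text, or raise KeyError/OSError on failure.
    Returns (visited, bad): the pages fetched in order, and a dict mapping
    each bad URL to the page that linked to it.
    """
    visited, bad = [], {}
    seen = {start_url}
    queue = deque([(start_url, None)])
    while queue:
        url, referrer = queue.popleft()
        try:
            html = fetch(url)
        except (KeyError, OSError):
            bad[url] = referrer
            continue
        visited.append(url)
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append((link, url))
    return visited, bad


# Assumption: a tiny in-memory "site" replaces real HTTP requests.
SITE = {
    "http://example.test/": '<a href="a.html">A</a> <a href="missing.html">M</a>',
    "http://example.test/a.html": '<a href="/">home</a> <img src="#top">',
}

visited, bad = crawl("http://example.test/", SITE.__getitem__)
print(visited)
print(bad)
```

Passing the fetch function in as a parameter keeps the traversal logic separate from the transport, so the same loop could in principle be driven by an HTTP client or by a thread pool of downloaders.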
