FetchAnything

A command-line tool to fetch files from websites recursively.

Installation

You can install FetchAnything using pip:

pip install fetchanything

Or from source:

git clone https://github.com/chaochungkuo/fetchanything.git
cd fetchanything
pip install -e .

Usage

Basic usage:

fetchanything <URL> [options]

Options

  • -l, --level LEVEL: Maximum crawl depth (default: 2)
  • -f, --filter PATTERN: File pattern to match (e.g., "*.pdf", "*.jpg")
  • -u, --url-pattern PATTERN: Regex pattern to match URLs for crawling (e.g., ".*/blog/.*")
  • -o, --out DIRECTORY: Output directory (default: downloads)
  • -v, --verbose: Enable verbose output
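
To make the option list concrete, here is a minimal sketch of how these flags could be declared with Python's argparse. It is not FetchAnything's actual source: the function name build_parser is hypothetical, and treating an omitted --filter or --url-pattern as "no restriction" (None) is an assumption. Only the flag names and the defaults for --level and --out come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the tool's real code)."""
    parser = argparse.ArgumentParser(
        prog="fetchanything",
        description="Fetch files from websites recursively.",
    )
    parser.add_argument("url", metavar="URL", help="Start URL for the crawl")
    parser.add_argument("-l", "--level", type=int, default=2,
                        help="Maximum crawl depth (default: 2)")
    # Assumption: None means "do not filter by file name".
    parser.add_argument("-f", "--filter", metavar="PATTERN", default=None,
                        help='File pattern to match, e.g. "*.pdf"')
    # Assumption: None means "crawl every discovered URL".
    parser.add_argument("-u", "--url-pattern", metavar="PATTERN", default=None,
                        help="Regex that URLs must match to be crawled")
    parser.add_argument("-o", "--out", metavar="DIRECTORY", default="downloads",
                        help="Output directory (default: downloads)")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable verbose output")
    return parser


args = build_parser().parse_args(["https://example.com", "--filter", "*.pdf", "-l", "1"])
print(args.level, args.filter, args.out, args.verbose)  # 1 *.pdf downloads False
```

Options that are not passed keep their defaults, which is why --out resolves to downloads in this sketch.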

Examples

  1. Download all PDF files from a website up to depth 2:
fetchanything https://example.com --level 2 --filter "*.pdf" --out download_pdf
  2. Download all files from a website up to depth 1:
fetchanything https://example.com --level 1 --out downloads
  3. Download all JPG images with verbose output:
fetchanything https://example.com --filter "*.jpg" -v
  4. Download PDFs only from blog pages:
fetchanything https://example.com --filter "*.pdf" --url-pattern ".*/blog/.*"
  5. Download files only from a specific subdomain:
fetchanything https://example.com --url-pattern "https://docs\\.example\\.com/.*"
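
Before running a crawl, it can help to check which URLs a given pair of patterns would accept. The sketch below does this with the standard library only. The matching semantics are assumptions, not documented behavior: the --filter glob is tested against the last path segment, and the --url-pattern regex is anchored at the start of the URL. The helper names filename_matches and url_allowed are hypothetical.

```python
import posixpath
import re
from fnmatch import fnmatch
from urllib.parse import urlparse


def filename_matches(url: str, file_glob: str) -> bool:
    """Assumed semantics: the glob is tested against the last path segment."""
    name = posixpath.basename(urlparse(url).path)
    return fnmatch(name, file_glob)


def url_allowed(url: str, url_regex: str) -> bool:
    """Assumed semantics: the regex must match from the start of the URL."""
    return re.match(url_regex, url) is not None


# Patterns taken from the examples above.
print(filename_matches("https://example.com/blog/report.pdf", "*.pdf"))                 # True
print(url_allowed("https://example.com/blog/2024/post.html", r".*/blog/.*"))            # True
print(url_allowed("https://docs.example.com/guide/", r"https://docs\.example\.com/.*"))  # True
print(url_allowed("https://example.com/guide/", r"https://docs\.example\.com/.*"))       # False
```

Escaping the dots in the subdomain pattern matters: an unescaped "." matches any character, so "docs.example.com" would also accept a host such as "docsXexample.com".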

Features

  • Recursive website crawling with depth control
  • File pattern matching
  • URL pattern filtering
  • Progress tracking with tqdm
  • Verbose logging option
  • Persistent HTTP sessions
  • Error handling and graceful interruption
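
The depth control listed above can be illustrated with a small, self-contained sketch. It is not the tool's implementation: it uses an iterative breadth-first queue in place of recursion, reads pages from an in-memory dictionary (PAGES, hypothetical) instead of the network, and assumes that depth counts link hops from the start URL. How FetchAnything itself counts depth is not stated here.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_depth=2):
    """Breadth-first crawl; returns every URL reachable within max_depth link hops."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # at the depth limit: record the URL but do not expand it
        html = fetch(url)
        if html is None:
            continue  # not an HTML page we know about (e.g. a file to download)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links against the page URL
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen


# Hypothetical site used in place of real HTTP requests.
PAGES = {
    "https://example.com/": '<a href="/a.html">A</a> <a href="/files/x.pdf">X</a>',
    "https://example.com/a.html": '<a href="/b.html">B</a>',
    "https://example.com/b.html": '<a href="/c.html">C</a>',
}

found = crawl("https://example.com/", PAGES.get, max_depth=2)
print(sorted(found))
```

With max_depth=2 the crawl reaches b.html (two hops from the start page) but never discovers c.html, which is three hops away. The seen set also keeps pages that link to each other from being visited twice.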

Requirements

  • Python 3.7 or higher
  • requests
  • beautifulsoup4
  • tqdm
  • urllib3

License

MIT License

Download files

Source Distribution

fetchanything-0.2.0.tar.gz (17.5 kB)

Uploaded Source

Built Distribution

fetchanything-0.2.0-py3-none-any.whl (18.1 kB)

Uploaded Python 3

File details

Details for the file fetchanything-0.2.0.tar.gz.

File metadata

  • Download URL: fetchanything-0.2.0.tar.gz
  • Upload date:
  • Size: 17.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for fetchanything-0.2.0.tar.gz
Algorithm    Hash digest
SHA256       be0cf4c8fb003c36ea4a4250c019d8be8d294f3c2f82cbbba4b7cd8828ddf1fd
MD5          963ffcfc6e98183a1fdfc3cb98c3cef9
BLAKE2b-256  2ecaf18cb91d0b3af40ccc5209cf4927032917d306b9336bce124403f4585a39
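
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's hashlib. The function names sha256_of and verify are hypothetical, and the file is assumed to have been saved locally under its published name.

```python
import hashlib

# SHA256 digest published above for fetchanything-0.2.0.tar.gz.
EXPECTED_SHA256 = "be0cf4c8fb003c36ea4a4250c019d8be8d294f3c2f82cbbba4b7cd8828ddf1fd"


def sha256_of(path, chunk_size=65536) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For example, verify("fetchanything-0.2.0.tar.gz") should return True for an intact copy of the source distribution. pip can also enforce digests at install time through its hash-checking mode (the --require-hashes option with a requirements file).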

Provenance

The following attestation bundles were made for fetchanything-0.2.0.tar.gz:

Publisher: python-publish.yml on chaochungkuo/fetchanything

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file fetchanything-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: fetchanything-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 18.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for fetchanything-0.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       d4f8d913125e09463b452ef58576dd8a6ee6d0deb47b7f73793c20f8fbac8e1e
MD5          e41d286962ecf2d99627a33c2746abfc
BLAKE2b-256  eea2a0d4d28066586d7a38dd3bfe2570c6f6815f8c2815e7800a1597f4a5863f

Provenance

The following attestation bundles were made for fetchanything-0.2.0-py3-none-any.whl:

Publisher: python-publish.yml on chaochungkuo/fetchanything

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
