
High-performance, async, parallel HTTP downloader for Python 3


HyperFetch: Asynchronous HTTP Downloader


HyperFetch is a feature-rich asynchronous HTTP downloader designed for high-performance parallel downloads. It offers a wide range of features, including retry logic, rate limiting, chunked downloads, progress tracking, and more.

Features

  • Asynchronous Parallel Downloads: Download multiple URLs concurrently for maximum efficiency.
  • Retry Logic: Configurable retry mechanism with exponential backoff for handling transient errors.
  • Timeout Control: Granular control over connection and read timeouts.
  • Rate Limiting: Prevent server overload with configurable rate limiting.
  • Chunked Downloads: Support for downloading large files in chunks using HTTP Range requests.
  • Progress Tracking: Monitor download progress with progress callbacks.
  • Content Validation: Verify downloaded content using checksums.
  • Caching: Avoid redundant downloads with a built-in caching mechanism.
  • Redirect Handling: Control redirect behavior.
  • SSL/TLS Verification: Configurable SSL/TLS verification settings.
  • Custom Headers: Customize request headers.
  • Cookies: Support for storing and sending cookies.
  • Download Queues: Manage large numbers of URLs with download queues.
  • Download Scheduling: Schedule downloads for specific times.
  • Plugin System: Extend functionality with custom plugins.
  • Logging: Comprehensive logging for debugging and monitoring.
  • Proxy Support: HTTP/HTTPS and SOCKS5 proxy support.
  • URL Skipping: Skip specific URLs based on a callback.
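The retry behavior above typically means exponential backoff with jitter. A minimal stdlib sketch of the idea, independent of HyperFetch's internals (the base delay, cap, and full-jitter formula here are illustrative assumptions, not the library's actual parameters):

```python
import random


def backoff_delays(max_attempts: int, base: float = 0.5, cap: float = 30.0):
    """Yield one delay (in seconds) per retry attempt.

    The raw delay doubles each attempt (base * 2**attempt) up to a cap;
    full jitter then picks a uniform value in [0, delay] so that many
    clients retrying at once do not hammer the server in lockstep.
    """
    for attempt in range(max_attempts):
        delay = min(cap, base * (2 ** attempt))
        yield random.uniform(0, delay)
```

A caller would sleep for each yielded delay between failed attempts and give up once the generator is exhausted.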

Installation

pip install hyperfetch-py

Quick Usage

import asyncio
from hyper_fetch.downloader import AsyncDownloader
from hyper_fetch.types import DownloadRequest


async def main():
    req = DownloadRequest.make("https://httpbin.org/headers")
    downloader = AsyncDownloader()
    result = await downloader.download(req)
    print(result.content)


if __name__ == "__main__":
    asyncio.run(main())

Usage

import asyncio
from pathlib import Path
from typing import List

import aiofiles

from hyper_fetch.downloader import AsyncDownloader
from hyper_fetch.types import (
    RetryConfig,
    ChunkConfig,
    SSLConfig,
    DownloadRequest,
    ProgressInfo,
)


def download(
        urls: List[str],
        output_dir: str,
        concurrency: int,
        retry: int,
        timeout: int,
        chunk_size: int,
        verify: bool,
):
    """Download files from URLs"""
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)

    # Configure downloader
    retry_config = RetryConfig(max_attempts=retry)
    chunk_config = ChunkConfig(enabled=True, size=chunk_size)
    ssl_config = SSLConfig(verify=verify)

    # Progress callback
    def show_progress(url: str, progress: ProgressInfo):
        if progress.total_bytes:
            percentage = (progress.bytes_downloaded / progress.total_bytes) * 100
            print(f"{url}: {percentage:.1f}% complete")

    async def run_downloads():
        downloader = AsyncDownloader(concurrency=concurrency, retry_config=retry_config)
        downloader.add_progress_callback(show_progress)

        # Create download requests
        requests = [
            DownloadRequest(
                url=url,
                context={"output_path": output_path / Path(url).name},
                chunk_config=chunk_config,
                ssl=ssl_config,
            )
            for url in urls
        ]

        # Download files
        results = await downloader.download_many(requests)

        # Save files
        for result in results:
            if result.error:
                print(f"Error downloading {result.url}: {result.error}")
                continue

            output_file = result.context["output_path"]
            async with aiofiles.open(output_file, "wb") as f:
                await f.write(result.content)

            print(f"Downloaded {result.url} to {output_file}")

    asyncio.run(run_downloads())
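The chunked downloads configured via ChunkConfig above rest on HTTP Range requests: the file is split into byte ranges and each range is fetched separately. The range arithmetic can be sketched in plain Python (the header format is standard HTTP; the function itself is illustrative, not HyperFetch's API):

```python
def byte_ranges(total_size: int, chunk_size: int):
    """Split a file of total_size bytes into inclusive (start, end) pairs.

    Each pair maps to a request header of the form:
        Range: bytes=start-end
    HTTP byte ranges are inclusive on both ends, hence the -1 / +1 below.
    """
    ranges = []
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges
```

For example, a 10-byte file with 4-byte chunks yields the ranges (0, 3), (4, 7), and (8, 9), which can then be fetched concurrently and reassembled in order.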

Configuration

You can configure HyperFetch using the following classes:

  • RetryConfig: Configure retry behavior.
  • TimeoutConfig: Configure connection and read timeouts.
  • RateLimiter: Configure rate limiting.
  • AsyncDownloader: Main downloader class with various configuration options.
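Rate limiting of the kind RateLimiter provides is commonly implemented as a token bucket. A minimal synchronous sketch of the technique (HyperFetch's actual RateLimiter API and semantics may differ; this class and its parameters are illustrative only):

```python
import time


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False to signal "slow down"."""
        now = time.monotonic()
        # Refill proportionally to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

An async variant would await a sleep until the next token is due instead of returning False, which is the natural fit for a downloader built on asyncio.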

Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue.

License

Copyright Dr. Masroor Ehsan 2025.

Distributed under the MIT License. See the LICENSE file for more information.
