
High-performance, async, parallel HTTP downloader for Python 3


HyperFetch: Asynchronous HTTP Downloader


HyperFetch is a feature-rich asynchronous HTTP downloader designed for high-performance parallel downloads. It offers a wide range of features, including retry logic, rate limiting, chunked downloads, progress tracking, and more.

Features

  • Asynchronous Parallel Downloads: Download multiple URLs concurrently for maximum efficiency.
  • Retry Logic: Configurable retry mechanism with exponential backoff for handling transient errors.
  • Timeout Control: Granular control over connection and read timeouts.
  • Rate Limiting: Prevent server overload with configurable rate limiting.
  • Chunked Downloads: Support for downloading large files in chunks using HTTP Range requests.
  • Progress Tracking: Monitor download progress with progress callbacks.
  • Content Validation: Verify downloaded content using checksums.
  • Caching: Avoid redundant downloads with a built-in caching mechanism.
  • Redirect Handling: Control redirect behavior.
  • SSL/TLS Verification: Configurable SSL/TLS verification settings.
  • Custom Headers: Customize request headers.
  • Cookies: Support for storing and sending cookies.
  • Download Queues: Manage large numbers of URLs with download queues.
  • Download Scheduling: Schedule downloads for specific times.
  • Plugin System: Extend functionality with custom plugins.
  • Logging: Comprehensive logging for debugging and monitoring.
  • Proxy Support: HTTP/HTTPS and SOCKS5 proxy support.
  • URL Skipping: Skip specific URLs based on a callback.
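The retry behavior above typically means exponential backoff with jitter. A minimal stdlib sketch of the idea, independent of HyperFetch's internals (the base delay, cap, and full-jitter formula here are illustrative assumptions, not the library's actual parameters):

```python
import random


def backoff_delays(max_attempts: int, base: float = 0.5, cap: float = 30.0):
    """Yield one delay (in seconds) per retry attempt.

    The raw delay doubles each attempt (base * 2**attempt) up to a cap;
    full jitter then picks a uniform value in [0, delay] so that many
    clients retrying at once do not hammer the server in lockstep.
    """
    for attempt in range(max_attempts):
        delay = min(cap, base * (2 ** attempt))
        yield random.uniform(0, delay)
```

A caller would sleep for each yielded delay between failed attempts and give up once the generator is exhausted.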

Installation

pip install hyperfetch-py

Quick Usage

import asyncio
from hyper_fetch.downloader import AsyncDownloader
from hyper_fetch.types import DownloadRequest


async def main():
    req = DownloadRequest.make("https://httpbin.org/headers")
    downloader = AsyncDownloader()
    result = await downloader.download(req)
    print(result.content)


if __name__ == "__main__":
    asyncio.run(main())

Usage

import asyncio
from pathlib import Path
from typing import List

import aiofiles

from hyper_fetch.downloader import AsyncDownloader
from hyper_fetch.types import (
    RetryConfig,
    ChunkConfig,
    SSLConfig,
    DownloadRequest,
    ProgressInfo,
)


def download(
        urls: List[str],
        output_dir: str,
        concurrency: int,
        retry: int,
        timeout: int,
        chunk_size: int,
        verify: bool,
):
    """Download files from URLs"""
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)

    # Configure downloader
    retry_config = RetryConfig(max_attempts=retry)
    chunk_config = ChunkConfig(enabled=True, size=chunk_size)
    ssl_config = SSLConfig(verify=verify)

    # Progress callback
    def show_progress(url: str, progress: ProgressInfo):
        if progress.total_bytes:
            percentage = (progress.bytes_downloaded / progress.total_bytes) * 100
            print(f"{url}: {percentage:.1f}% complete")

    async def run_downloads():
        downloader = AsyncDownloader(concurrency=concurrency, retry_config=retry_config)
        downloader.add_progress_callback(show_progress)

        # Create download requests
        requests = [
            DownloadRequest(
                url=url,
                context={"output_path": output_path / Path(url).name},
                chunk_config=chunk_config,
                ssl=ssl_config,
            )
            for url in urls
        ]

        # Download files
        results = await downloader.download_many(requests)

        # Save files
        for result in results:
            if result.error:
                print(f"Error downloading {result.url}: {result.error}")
                continue

            output_file = result.context["output_path"]
            async with aiofiles.open(output_file, "wb") as f:
                await f.write(result.content)

            print(f"Downloaded {result.url} to {output_file}")

    asyncio.run(run_downloads())
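The chunked downloads configured via ChunkConfig above rest on HTTP Range requests: the file is split into byte ranges and each range is fetched separately. The range arithmetic can be sketched in plain Python (the header format is standard HTTP; the function itself is illustrative, not HyperFetch's API):

```python
def byte_ranges(total_size: int, chunk_size: int):
    """Split a file of total_size bytes into inclusive (start, end) pairs.

    Each pair maps to a request header of the form:
        Range: bytes=start-end
    HTTP byte ranges are inclusive on both ends, hence the -1 / +1 below.
    """
    ranges = []
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges
```

For example, a 10-byte file with 4-byte chunks yields the ranges (0, 3), (4, 7), and (8, 9), which can then be fetched concurrently and reassembled in order.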

Configuration

You can configure HyperFetch using the following classes:

  • RetryConfig: Configure retry behavior.
  • TimeoutConfig: Configure connection and read timeouts.
  • RateLimiter: Configure rate limiting.
  • AsyncDownloader: Main downloader class with various configuration options.
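Rate limiting of the kind RateLimiter provides is commonly implemented as a token bucket. A minimal synchronous sketch of the technique (HyperFetch's actual RateLimiter API and semantics may differ; this class and its parameters are illustrative only):

```python
import time


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False to signal "slow down"."""
        now = time.monotonic()
        # Refill proportionally to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

An async variant would await a sleep until the next token is due instead of returning False, which is the natural fit for a downloader built on asyncio.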

Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue.

License

Copyright Dr. Masroor Ehsan 2025.

Distributed under the MIT License. See the LICENSE file for more information.
