py-downx

Flexible download manager


Project description

Introduction
Installation
Quick Start
- Command Line Interface
Core Data Types & Enums
Download Manager API (DownloadManager)
Advanced Features
Utility Functions
License

1. Introduction

pydown is a flexible Python library for managing file downloads. It provides a unified, high-level API to handle downloads over multiple protocols, including HTTP(S), FTP, and SFTP, with robust support for advanced features like concurrency, download resuming, and speed limiting.

Features

Multi-Protocol Support: Natively handles HTTP, HTTPS, FTP, and SFTP URLs.
Concurrent Downloads: Download multiple files simultaneously using an efficient asynchronous worker pool.
Pause & Resume: Pause downloads and resume them later, even after the application restarts.
Error Handling & Retries: Automatically retries failed downloads with configurable exponential backoff.
Speed Limiting: Throttle download bandwidth to a specified maximum rate.
Real-time Monitoring: Use observers to get live feedback on download progress, speed, status changes, and errors.
Duplicate Handling: Configure strategies (skip, overwrite, rename) for handling duplicate download requests.

2. Installation

Dependencies

pydown depends on the following libraries, which will be installed automatically: httpx, validators, humanize, dataclasses-json, and paramiko.

Installation

Install via PyPI:

pip install py-downx

The distribution is published on PyPI as py-downx. This document uses pydown as the import name and as the command name.

3. Quick Start

This example demonstrates how to download a file using the DownloadManager.

from pydown import DownloadManager, create_download_request

# 1. Initialize the Download Manager
# This will manage a queue of downloads with up to 3 concurrent workers.
manager = DownloadManager(max_concurrent_downloads=3)

# 2. Create a download request for a test file
# The file will be saved as '100MB.zip' in the current directory.
request = create_download_request(
    name="Large Test File",
    url="http://speedtest.tele2.net/100MB.zip",
    file_path="100MB.zip"
)

# 3. Add the request to the manager's queue
manager.add_download(request)
print("Download added to the queue.")

# 4. Start the download workers
manager.start()
print("Download manager started.")

# 5. Wait for all downloads to complete
manager.wait_for_completion()
print("All downloads have finished.")

# 6. Stop the manager and clean up resources
manager.stop()
print("Manager stopped.")

3.1. Command Line Interface

pydown includes a command-line tool that exposes the library's functionality from the terminal.

Installation with CLI Support

After installing the package, the pydown command will be available in your terminal:

pip install py-downx
pydown --help

Basic Usage

# Download a single file
pydown https://example.com/file.zip

# Download with custom output path
pydown https://example.com/file.zip -o /path/to/save/file.zip

# Download multiple files concurrently
pydown url1 url2 url3 -c 5 -d ./downloads/

# Batch download from a file containing URLs
pydown --batch urls.txt -d ./downloads/

Advanced CLI Features

Concurrent Downloads and Performance

# Set maximum concurrent downloads (-c), segments per download (-s), and a speed limit in bytes per second
pydown https://example.com/largefile.zip -c 3 -s 8 --speed-limit 1000000
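
The following is a minimal sketch of how a bytes-per-second limit such as `--speed-limit 1000000` could be enforced. It is not taken from pydown's source; the names `throttle_delay` and `throttled_copy` are hypothetical, and the approach (sleeping until the average rate drops back to the limit) is only one of several possible throttling schemes.

```python
import time
from typing import Callable, Iterable, Iterator


def throttle_delay(bytes_transferred: int, elapsed_seconds: float, limit_bps: int) -> float:
    """Return how long to sleep so the average rate stays at or below limit_bps."""
    if limit_bps <= 0:
        return 0.0  # assumption: a non-positive limit means "unlimited"
    # Time the transfer should have taken at the configured rate.
    expected_seconds = bytes_transferred / limit_bps
    return max(0.0, expected_seconds - elapsed_seconds)


def throttled_copy(
    chunks: Iterable[bytes],
    limit_bps: int,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[bytes]:
    """Yield chunks unchanged, sleeping between them to respect the limit."""
    start = clock()
    sent = 0
    for chunk in chunks:
        sent += len(chunk)
        delay = throttle_delay(sent, clock() - start, limit_bps)
        if delay > 0:
            sleep(delay)
        yield chunk
```

The `clock` and `sleep` parameters are injectable so the logic can be exercised without real waiting.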

Authentication and Headers

# HTTP headers as a JSON string (cookies use the same form via --cookies)
pydown https://api.example.com/data.json --headers '{"Authorization": "Bearer token"}'

# FTP/SFTP with credentials
pydown ftp://user:pass@server/file.txt
pydown sftp://user:pass@server/file.txt
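
Because `--headers` and `--cookies` take JSON strings, a malformed value is an easy mistake to make on the command line. The sketch below shows one way such a string could be validated before use. `parse_header_json` is a hypothetical helper, not part of pydown, and the error messages are illustrative.

```python
import json
from typing import Dict


def parse_header_json(raw: str) -> Dict[str, str]:
    """Parse a JSON object mapping header names to values; reject anything else."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"headers must be valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("headers must be a JSON object, e.g. '{\"Name\": \"value\"}'")
    # Coerce keys and values to strings, since HTTP header fields are text.
    return {str(name): str(value) for name, value in data.items()}
```

In a POSIX shell, wrapping the JSON in single quotes (as in the examples above) keeps the inner double quotes intact.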

Session Management

# Save download session for later resuming
pydown https://example.com/file.zip --save-session mysession.json

# Resume previous session
pydown --resume mysession.json

Batch Operations

Create a text file with URLs (one per line):

# my_downloads.txt
https://example.com/file1.zip
https://example.com/file2.pdf
https://cdn.example.com/data.json

Then download all files:

pydown --batch my_downloads.txt -d ./downloads/ -c 5
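
As an illustration of the batch format, the sketch below reads a one-URL-per-line list. It assumes that blank lines and lines starting with `#` are ignored and that only the four schemes named in the feature list are accepted. Neither assumption is confirmed by this document, and `parse_url_lines` is a hypothetical name.

```python
from typing import Iterable, List
from urllib.parse import urlparse

# Schemes listed under "Multi-Protocol Support" in this document.
SUPPORTED_SCHEMES = {"http", "https", "ftp", "sftp"}


def parse_url_lines(lines: Iterable[str]) -> List[str]:
    """Return the URLs found in `lines`, skipping blanks and '#' comments."""
    urls = []
    for line_number, line in enumerate(lines, start=1):
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue  # assumption: comment and blank lines are ignored
        scheme = urlparse(entry).scheme.lower()
        if scheme not in SUPPORTED_SCHEMES:
            raise ValueError(f"line {line_number}: unsupported URL {entry!r}")
        urls.append(entry)
    return urls
```

Passing an open file object works as well as a list of strings, since both iterate line by line.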

Output Control

# Quiet mode (no progress bars)
pydown https://example.com/file.zip -q

# Verbose output with detailed logging
pydown https://example.com/file.zip -v

# Log to file
pydown https://example.com/file.zip --log-file downloads.log

Error Handling and Retries

# Configure retry behavior and timeouts
pydown https://example.com/file.zip --retries 5 --timeout 60

# Handle duplicate files (skip, overwrite, or rename)
pydown https://example.com/file.zip --duplicate rename
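
The document does not say what name the `rename` strategy produces. The sketch below assumes a common convention, appending ` (1)`, ` (2)`, and so on before the extension. `unique_name` is a hypothetical helper, not pydown's implementation.

```python
import os
from typing import Set


def unique_name(filename: str, taken: Set[str]) -> str:
    """Return `filename`, or a numbered variant that is not in `taken`."""
    if filename not in taken:
        return filename
    # splitext only splits the last extension: "a.tar.gz" -> ("a.tar", ".gz").
    stem, ext = os.path.splitext(filename)
    counter = 1
    while f"{stem} ({counter}){ext}" in taken:
        counter += 1
    return f"{stem} ({counter}){ext}"
```

Here `taken` stands in for the set of names already present in the output directory; a real implementation would check the filesystem instead.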

CLI Options Reference

Option	Description	Default
`urls`	URLs to download (positional arguments)	-
`-o, --output`	Output file path (for single downloads)	Auto-generated
`-d, --directory`	Output directory	Current directory
`-c, --concurrent`	Maximum concurrent downloads	3
`-s, --segments`	Maximum segments per download	8
`--speed-limit`	Speed limit in bytes per second	Unlimited
`--timeout`	Connection timeout in seconds	30
`--retries`	Maximum retry attempts	3
`--duplicate`	Duplicate handling (`skip`, `overwrite`, `rename`)	`skip`
`--headers`	HTTP headers as JSON string	None
`--cookies`	HTTP cookies as JSON string	None
`--proxy`	Proxy URL	None
`--batch`	File containing URLs to download	None
`--save-session`	Save session to JSON file	None
`--resume`	Resume from saved session file	None
`-q, --quiet`	Suppress progress output	False
`-v, --verbose`	Verbose output	False
`--no-progress`	Disable progress bars	False
`--log-file`	Log to file	None
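
To make the defaults in the table concrete, here is a small `argparse` sketch that mirrors a subset of the options. It is an illustration written from the table alone, not pydown's actual parser; the program name `pydown-sketch` and the function `build_parser` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring some options and defaults from the table above."""
    parser = argparse.ArgumentParser(prog="pydown-sketch")
    parser.add_argument("urls", nargs="*", help="URLs to download")
    parser.add_argument("-o", "--output", help="output file path (single download)")
    parser.add_argument("-d", "--directory", default=".", help="output directory")
    parser.add_argument("-c", "--concurrent", type=int, default=3)
    parser.add_argument("-s", "--segments", type=int, default=8)
    parser.add_argument("--speed-limit", type=int, default=None,
                        help="bytes per second; None means unlimited")
    parser.add_argument("--timeout", type=int, default=30)
    parser.add_argument("--retries", type=int, default=3)
    parser.add_argument("--duplicate", choices=["skip", "overwrite", "rename"],
                        default="skip")
    parser.add_argument("-q", "--quiet", action="store_true")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser
```

For example, `build_parser().parse_args(["https://example.com/file.zip", "-c", "5"])` yields `concurrent=5` while every other option keeps its table default.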

Examples

Simple Download:
```
pydown https://example.com/file.zip
```

Multiple Files with Custom Settings:

pydown https://site1.com/file1.zip https://site2.com/file2.pdf \
       -d ~/Downloads/ -c 4 -s 6 --verbose

Authenticated Download:

pydown https://api.example.com/data.json \
       --headers '{"Authorization": "Bearer your-token"}' \
       --cookies '{"session": "abc123"}'

Batch Download with Session Save:

pydown --batch large_downloads.txt \
       --save-session backup.json \
       -d ./downloads/ -c 5 --verbose

Resume Interrupted Downloads:
```
pydown --resume backup.json
```

4. Core Data Types & Enums

`DownloadRequest`

A dataclass that holds all configuration and state for a single download. It is the central object you create and pass to the DownloadManager.

name: str: A human-readable name for the download.
url: str: The URL of the file to download.
file_path: str: The local path where the file will be saved.
status: DownloadStatus: The current status of the download (e.g., PENDING, COMPLETED).
priority: int: A numerical priority (higher numbers are processed first).
headers: Dict[str, str]: Custom HTTP headers.
max_retries: int: Maximum number of times to retry on failure.
speed_limit: Optional[int]: Speed limit in bytes per second.
checksum: Optional[str]: The expected checksum string for validation.
checksum_type: str: The algorithm to use (md5, sha1, sha256).
ftp_username: Optional[str]: Username for FTP/SFTP authentication.
ftp_password: Optional[str]: Password for FTP/SFTP authentication.
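
The `checksum` and `checksum_type` fields imply a verification step after the download finishes. The sketch below shows how that check could be done with the standard `hashlib` module. `compute_checksum` and `verify_checksum` are hypothetical names, and whether pydown compares digests case-insensitively is an assumption.

```python
import hashlib
from typing import Iterable

# Algorithms listed for checksum_type in this document.
ALLOWED_ALGORITHMS = {"md5", "sha1", "sha256"}


def compute_checksum(chunks: Iterable[bytes], checksum_type: str = "sha256") -> str:
    """Hash an iterable of byte chunks and return the hex digest."""
    if checksum_type not in ALLOWED_ALGORITHMS:
        raise ValueError(f"unsupported checksum type: {checksum_type!r}")
    digest = hashlib.new(checksum_type)
    for chunk in chunks:
        digest.update(chunk)  # incremental hashing avoids loading the whole file
    return digest.hexdigest()


def verify_checksum(chunks: Iterable[bytes], expected: str, checksum_type: str) -> bool:
    """Compare the computed digest with the expected one (case-insensitive)."""
    return compute_checksum(chunks, checksum_type) == expected.strip().lower()
```

For a file on disk, the chunks can come from `iter(lambda: f.read(65536), b"")` on a file opened in binary mode.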

`ProgressInfo`

A dataclass passed to observers during progress updates.

total_size: int: Total size of the file in bytes.
downloaded_size: int: Number of bytes downloaded so far.
speed: float: Current download speed in bytes per second.
eta: float: Estimated time remaining in seconds.
progress_percent: float: Download progress as a percentage (0-100).
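
The derived fields follow directly from the raw ones. The sketch below shows one plausible way to compute `progress_percent` and `eta` from `total_size`, `downloaded_size`, and `speed`. How pydown handles an unknown total size or a zero speed is not documented here, so the fallbacks (0.0 and infinity) are assumptions.

```python
def progress_percent(downloaded_size: int, total_size: int) -> float:
    """Return progress in the range 0-100; 0.0 when the total size is unknown."""
    if total_size <= 0:
        return 0.0  # assumption: an unknown size is reported as 0 or less
    return min(100.0, downloaded_size / total_size * 100)


def eta_seconds(downloaded_size: int, total_size: int, speed: float) -> float:
    """Return the estimated seconds remaining, or infinity when it cannot be estimated."""
    if speed <= 0 or total_size <= 0:
        return float("inf")
    remaining_bytes = max(0, total_size - downloaded_size)
    return remaining_bytes / speed
```

With 25 MB of a 100 MB file downloaded at 1 MB/s, this gives 25% progress and an ETA of 75 seconds.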

`DownloadStatus`

An Enum representing the state of a DownloadRequest.

PENDING: The download is waiting to be processed.
QUEUED: The download is in the queue, ready for a worker.
IN_PROGRESS: The download is actively being processed by a worker.
PAUSED: The download has been manually paused.
COMPLETED: The download finished successfully.
FAILED: The download failed after all retries.
CANCELLED: The download was cancelled by the user.
DUPLICATE: The download was skipped because it was identified as a duplicate.
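
A common use of a status enum like this is to tell finished downloads apart from ones that may still change. The sketch below redefines the eight members locally so it runs on its own. The member values and the choice of which states count as final are assumptions, and `SketchStatus` is deliberately not named after the library's class.

```python
from enum import Enum


class SketchStatus(Enum):
    """Local stand-in with the same member names as the list above."""
    PENDING = "pending"
    QUEUED = "queued"
    IN_PROGRESS = "in_progress"
    PAUSED = "paused"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"
    DUPLICATE = "duplicate"


# Assumption: these states are final and a download never leaves them on its own.
TERMINAL_STATES = frozenset({
    SketchStatus.COMPLETED,
    SketchStatus.FAILED,
    SketchStatus.CANCELLED,
    SketchStatus.DUPLICATE,
})


def is_finished(status: SketchStatus) -> bool:
    """Return True when no further work is expected for this download."""
    return status in TERMINAL_STATES
```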

5. Download Manager API (`DownloadManager`)

The DownloadManager is the main entry point for orchestrating all download operations.

Initialization & Lifecycle

__init__(self, max_concurrent_downloads: int = 3, duplicate_strategy: str = "skip", log_file: Optional[str] = None, quiet: bool = False)
- Initializes the manager.
- max_concurrent_downloads: The number of downloads to run in parallel.
- duplicate_strategy: How to handle duplicates: "skip", "overwrite", "rename".
- log_file: Path to a file for logging output.
- quiet: If True, suppresses console logging.

Adding & Managing Downloads

add_download(self, request: DownloadRequest) -> str
- Adds a single DownloadRequest to the queue. Returns the request URL as its unique ID.
add_downloads_from_json(self, json_file: str) -> List[str]
- Loads and adds multiple download requests from a JSON file.
pause_download(self, url: str) -> bool
- Pauses an active or pending download identified by its URL.
resume_download(self, url: str) -> bool
- Resumes a paused download.
cancel_download(self, url: str) -> bool
- Cancels a download. The partial file is not deleted.
export_downloads(self, json_file: str)
- Saves the state of all current downloads to a JSON file.

Controlling the Manager

start(self)
- Starts the worker threads to process the download queue.
stop(self)
- Stops the workers and cleans up resources. This should be called to ensure a graceful exit.
wait_for_completion(self)
- Blocks until the download queue is empty and all active downloads are finished.
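
The start / wait / stop lifecycle described above is a standard worker-pool pattern. The sketch below reproduces it with the standard `queue` and `threading` modules to show how the three calls relate. `MiniWorkerPool` is a hypothetical class written for illustration and says nothing about how DownloadManager is implemented internally.

```python
import queue
import threading
from typing import Callable, List, Optional


class MiniWorkerPool:
    """Tiny thread pool with the same start/wait/stop shape as described above."""

    def __init__(self, handler: Callable[[str], None], max_workers: int = 3) -> None:
        self._handler = handler
        self._max_workers = max_workers
        self._queue: "queue.Queue[Optional[str]]" = queue.Queue()
        self._threads: List[threading.Thread] = []

    def add(self, item: str) -> None:
        self._queue.put(item)

    def start(self) -> None:
        for _ in range(self._max_workers):
            thread = threading.Thread(target=self._run, daemon=True)
            thread.start()
            self._threads.append(thread)

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            try:
                if item is None:  # sentinel placed by stop()
                    return
                self._handler(item)
            finally:
                self._queue.task_done()  # always mark the item as processed

    def wait_for_completion(self) -> None:
        self._queue.join()  # blocks until every queued item is marked done

    def stop(self) -> None:
        for _ in self._threads:
            self._queue.put(None)  # one sentinel per worker
        for thread in self._threads:
            thread.join()
        self._threads.clear()
```

Calling `stop()` after `wait_for_completion()` matters in this design: the sentinels are queued behind any remaining work, so stopping early would still drain the queue first.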

Monitoring & Observers

add_observer(self, observer: DownloadObserver)
- Registers a custom observer to receive real-time events.
remove_observer(self, observer: DownloadObserver)
- Unregisters an observer.
get_download_status(self, url: str) -> Optional[DownloadRequest]
- Retrieves the current state of a specific download.
get_all_downloads(self) -> Dict[str, DownloadRequest]
- Returns a dictionary of all downloads managed by the instance.

6. Advanced Features

Monitoring with Observers

Create a custom class that inherits from DownloadObserver to react to download events.

from pydown import DownloadManager, DownloadObserver, DownloadRequest, ProgressInfo, DownloadStatus

class MyCustomObserver(DownloadObserver):
    def on_progress(self, request: DownloadRequest, progress: ProgressInfo):
        print(f"[{request.name}] {progress.progress_percent:.1f}% at {progress.speed / 1024:.1f} KB/s")

    def on_status_change(self, request: DownloadRequest, old_status: DownloadStatus, new_status: DownloadStatus):
        print(f"[{request.name}] Status changed: {new_status.name}")

    def on_error(self, request: DownloadRequest, error: Exception):
        print(f"[{request.name}] An error occurred: {error}")

# Add it to the manager
manager = DownloadManager()
my_observer = MyCustomObserver()
manager.add_observer(my_observer)

Protocol-Specific Configuration

You can specify protocol-specific details, like FTP credentials, directly on the DownloadRequest object.

from pydown import create_download_request

ftp_request = create_download_request(
    name="FTP File",
    url="ftp://speedtest.tele2.net/1MB.zip",
    file_path="1MB.zip",
    ftp_username="anonymous",
    ftp_password="user@example.com"
)

# `manager` is a DownloadManager instance, created as in the earlier examples
manager.add_download(ftp_request)

Resume, Retry, and Duplicate Handling

Resume: Resuming is enabled by default. pydown creates a .partial file and will automatically pick up where it left off if the download is interrupted.
Retry: The manager automatically retries downloads on connection errors or server-side issues (HTTP 5xx). Configure this with max_retries on the DownloadRequest.
Duplicates: The duplicate_strategy on the DownloadManager controls behavior when a download is added that is identical to a previously completed one (based on URL, size, and checksum).
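
The retry and resume behavior above can be sketched in a few lines. The base delay, growth factor, and cap below are illustrative values, not pydown's defaults, and the function names are hypothetical. Resuming over HTTP with a `Range` header is the standard mechanism, but this document does not confirm that pydown uses it.

```python
from typing import Dict, Optional


def backoff_delay(attempt: int, base: float = 1.0, factor: float = 2.0, cap: float = 60.0) -> float:
    """Return the wait before retry number `attempt` (0-based), capped at `cap` seconds."""
    return min(cap, base * factor ** attempt)


def should_retry(status_code: Optional[int]) -> bool:
    """Retry on connection errors (represented here as None) and HTTP 5xx responses."""
    return status_code is None or 500 <= status_code < 600


def resume_headers(bytes_on_disk: int) -> Dict[str, str]:
    """Build the HTTP Range header that asks the server to continue from an offset."""
    if bytes_on_disk <= 0:
        return {}  # nothing downloaded yet, so request the whole file
    return {"Range": f"bytes={bytes_on_disk}-"}
```

With these values the first four retries wait 1, 2, 4, and 8 seconds. A server that honors the range request answers with status 206; a 200 response means it sent the whole file and the partial data should be discarded.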

7. Utility Functions

pydown provides helper functions to simplify common tasks.

create_download_request(name: str, url: str, **kwargs) -> DownloadRequest
- A convenient factory to create a DownloadRequest object.
cookies_from_requests_session(session: 'requests.Session') -> Dict[str, str]
- Extracts cookies from a requests.Session object to use in a DownloadRequest.
headers_from_requests_session(session: 'requests.Session') -> Dict[str, str]
- Extracts headers from a requests.Session object.

Example:

import requests
from pydown import create_download_request, cookies_from_requests_session

# Log in to a site using the requests library
session = requests.Session()
session.post("https://example.com/login", data={"user": "...", "pass": "..."})

# Create a download request using the session's cookies
request = create_download_request(
    name="Authenticated Download",
    url="https://example.com/file.zip",
    cookies=cookies_from_requests_session(session)
)

8. License

MIT

Project details


Release history

1.1.0 (Jun 24, 2025)
1.0.1 (Jun 24, 2025)
1.0.0 (Jun 24, 2025), this version

Download files

Source Distribution

py_downx-1.0.0.tar.gz (19.7 kB), uploaded Jun 24, 2025, Source

Built Distribution

py_downx-1.0.0-py3-none-any.whl (23.2 kB), uploaded Jun 24, 2025, Python 3

File details

Details for the file py_downx-1.0.0.tar.gz.

File metadata

Download URL: py_downx-1.0.0.tar.gz
Upload date: Jun 24, 2025
Size: 19.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.3

File hashes

Hashes for py_downx-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`a75ee856f84652912a7d54090199a25729ff2bb6b958d8c692efd780d43b0248`
MD5	`8cea0243dfd5da5287e04115bb05b9e7`
BLAKE2b-256	`39c40fa6b88c3b856273af02791993441d60b98395cf1b2667a742bf52339ab5`


File details

Details for the file py_downx-1.0.0-py3-none-any.whl.

File metadata

Download URL: py_downx-1.0.0-py3-none-any.whl
Upload date: Jun 24, 2025
Size: 23.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.3

File hashes

Hashes for py_downx-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`70d2af8c51bb598b7b5175fdf553981d1cfe8d1d1c90ecb9d180eee0a466119e`
MD5	`f61414ed467c1f19c5bf27b5942463e4`
BLAKE2b-256	`3b8d67e398d026d0049b85ff9c2e420ac5ef7e4052ee6b64e90d5dd75aaf11f6`

