Flardl
Flardl - Adaptive Multi-Site Downloading of Lists
Who would flardls bear?
Features
Flardl adaptively downloads a list of files from a list of federated web servers. Federated, in this case, means that one can download the same file from each server in the list. For lists of a few hundred or more files of ~1MB in size, the download speed can approach Gbit/s line limits, typically 300X higher than a curl-based script.
The main speed-up is obtained by asynchronous I/O; the use of multiple servers provides stability and dynamic adaptability in the face of unknown server loads and net weather.
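The adaptive behavior described above can be illustrated with a minimal sketch. This is not flardl's actual API: the function names, the server names, and the simulated delays are all assumptions made for illustration, and the network transfer is replaced by `asyncio.sleep` so the example runs offline. Each server gets one worker that pulls from a shared queue, so a faster server naturally ends up serving more of the files.

```python
import asyncio


async def fake_fetch(server: str, filename: str, delay: float) -> str:
    """Hypothetical stand-in for an HTTP GET; sleeps to simulate transfer time."""
    await asyncio.sleep(delay)
    return f"{server}/{filename}"


async def server_worker(server, delay, queue, results):
    """Pull file names from the shared queue until it is empty."""
    while True:
        try:
            filename = queue.get_nowait()
        except asyncio.QueueEmpty:
            return
        url = await fake_fetch(server, filename, delay)
        results.setdefault(server, []).append(url)


async def download_all(files, server_delays):
    """Run one worker per server against a single shared work queue."""
    queue = asyncio.Queue()
    for name in files:
        queue.put_nowait(name)
    results = {}
    workers = [
        server_worker(server, delay, queue, results)
        for server, delay in server_delays.items()
    ]
    await asyncio.gather(*workers)
    return results


files = [f"file{i:03d}.dat" for i in range(30)]
# Assumed per-request latencies in seconds; real values are unknown in advance.
server_delays = {"fast.example.org": 0.005, "slow.example.org": 0.05}
results = asyncio.run(download_all(files, server_delays))
for server, urls in results.items():
    print(server, len(urls))
```

No server speeds are configured anywhere in the scheduling logic; the split between servers emerges from how quickly each worker returns to the queue. A real downloader would likely keep several requests in flight per server rather than one.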
The name flardl could be either an acronym involving downloading, or a nonsense word. You pick.
Theory
Much has been written under the rubric of queueing theory, which we are purposefully discarding here. We take a semi-empirical approach based around chemical rate theory; this is in many ways inadequate, but a full model of the downloading process requires prior knowledge (such as file sizes) we don't usually possess.
The simplest version of downloading simply launches every request as quickly as possible at a single server and lets the server handle the queueing. The server handles overlapped responses, which quickly drives the bandwidth to the maximum set by intervening network policy and hardware (such as ISP throttling), where it stays until all requests are completed. However, several effects make this a bit too simple:
- Most servers will apply a policy that drops requests once the queue depth gets too high. These requests must be re-queued, and if that is done stupidly, the available bandwidth gets eaten up by unfulfilled requests. The queue-depth policy is not known in advance, and it may depend on a total queue depth that includes other users.
- A server may decide you are executing a Denial-of-Service (DoS) attack and respond by severely throttling further requests from your IP address. This throttling can last for hours or days, or even become permanent blacklisting. This "death penalty" can sometimes be triggered by the activity of other users at the same institution who share the same public IP address. I have seen practical classes brought to a complete halt by
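The queue-depth effect can be sketched with a simulated server. Everything here is hypothetical: `FakeServer`, its `max_depth` limit, the worker counts, and the back-off delay are illustrative assumptions, not flardl's implementation or any real server's policy. The sketch compares a client that keeps more requests in flight than the server tolerates with one that stays within the limit; dropped requests are re-queued after a short back-off.

```python
import asyncio


class FakeServer:
    """Hypothetical server that drops requests beyond a hidden queue depth."""

    def __init__(self, max_depth: int) -> None:
        self.max_depth = max_depth
        self.in_flight = 0
        self.rejected = 0

    async def get(self, filename: str) -> str:
        if self.in_flight >= self.max_depth:
            self.rejected += 1
            raise ConnectionError(f"queue full, dropped {filename}")
        self.in_flight += 1
        try:
            await asyncio.sleep(0.001)  # simulated transfer time
            return filename
        finally:
            self.in_flight -= 1


async def worker(server, queue, done, backoff):
    """Fetch files one at a time; re-queue any request the server drops."""
    while True:
        try:
            filename = queue.get_nowait()
        except asyncio.QueueEmpty:
            return
        try:
            done.append(await server.get(filename))
        except ConnectionError:
            await asyncio.sleep(backoff)  # back off before retrying
            queue.put_nowait(filename)    # re-queue the dropped request


async def download(files, server, n_workers, backoff=0.005):
    """Run n_workers concurrent workers, i.e. at most n_workers requests in flight."""
    queue = asyncio.Queue()
    for name in files:
        queue.put_nowait(name)
    done = []
    await asyncio.gather(
        *(worker(server, queue, done, backoff) for _ in range(n_workers))
    )
    return done


files = [f"file{i:03d}.dat" for i in range(40)]

greedy = FakeServer(max_depth=3)
greedy_done = asyncio.run(download(files, greedy, n_workers=8))

polite = FakeServer(max_depth=3)
polite_done = asyncio.run(download(files, polite, n_workers=3))

print("greedy rejections:", greedy.rejected)
print("polite rejections:", polite.rejected)
```

Both clients eventually retrieve every file, but the greedy one spends requests on rejections. In practice the client cannot read `max_depth` directly, which is why the limit has to be inferred from observed behavior.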
Requirements
Flardl is tested under Python 3.11 on Linux, macOS, and Windows, and under 3.9 and 3.10 on Linux. Under the hood, flardl relies on httpx (https://www.python-httpx.org/) and is supported on whatever platforms that library works under, for both HTTP/1.1 and HTTP/2. HTTP/3 support could easily be added via aioquic (https://github.com/aiortc/aioquic) once enough servers are running HTTP/3 to make that worthwhile.
Installation
You can install Flardl via pip from PyPI:
$ pip install flardl
Usage
Please see the Command-line Reference for details.
Contributing
Contributions are very welcome. To learn more, see the Contributor Guide.
License
Distributed under the terms of the BSD 3-clause license, Flardl is free and open source software.
Issues
If you encounter any problems, please file an issue along with a detailed description.
Credits
Flardl was written by Joel Berendzen.
This project was generated from @cjolowicz's Hypermodern Python Cookiecutter template.
File details
Details for the file flardl-0.0.3.tar.gz.
File metadata
- Download URL: flardl-0.0.3.tar.gz
- Upload date:
- Size: 17.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.1 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | bf242d3d4c20cba4f0cdc3ba1478f8f17d46283f1be899df539e675c374863ca
MD5 | 696c7fb02a3d38da8eb6959cb811885b
BLAKE2b-256 | 92cc1924365020408f2d84d3b0d7b5f9418ffdf0d3e045ea95acc39c21df860f
File details
Details for the file flardl-0.0.3-py3-none-any.whl.
File metadata
- Download URL: flardl-0.0.3-py3-none-any.whl
- Upload date:
- Size: 17.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.1 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 3dd175c61e4fd8ce89e557cee5526f19d9decb46560d245f7135396f3a4d9b21
MD5 | 050b815e7831e7fe9c0245028d78bec0
BLAKE2b-256 | 8902e60abbf4e016cdde71fdc973abc534a29a6a711105bd37962623234802b3