The parallel downloader.
Scylla
Scylla downloads files in parallel.
It aims to speed up downloading many files at once, not to be as fast as possible for any single download.
There are probably plenty of tools out there that are faster per download; that is not Scylla's goal.
Scylla exists because I repeatedly had to download a bunch of files, and doing so with wget took a while.
It can be used from the command line or as an importable module.
From the command line it is quite easy to use:
kazaamjt@workstation:~$ scylla -f url_list -o artifacts/
Alternatively, you can import the module in a script.
CLI Options
As mentioned, the CLI is quite simple and does not have many options:
kazaamjt@workstation:~$ scylla --help
Usage: scylla [OPTIONS]

Options:
  -f, --file TEXT               Path to a file containing urls.
  -o, --output-dir TEXT         Directory to store the retrieved files.
  -s, --max-concurrent INTEGER  Maximum number of simultanious files to
                                download. Setting this to 0 will make scylla
                                try to download all the files at once.
  --help                        Show this message and exit.
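The effect of `--max-concurrent` can be pictured with a small asyncio sketch. This is not scylla's actual implementation, only an illustration of one common way to cap simultaneous tasks with a semaphore, where 0 means no cap. The names `run_limited` and `fake_download` are made up for this example, and the download itself is simulated with a short sleep.

```python
import asyncio
from typing import List


async def run_limited(names: List[str], max_concurrent: int = 5) -> int:
    """Run one simulated download per name and return the peak concurrency."""
    # 0 means "no limit": allow as many slots as there are jobs.
    slots = max_concurrent if max_concurrent > 0 else max(len(names), 1)
    semaphore = asyncio.Semaphore(slots)
    active = 0
    peak = 0

    async def fake_download(name: str) -> None:
        nonlocal active, peak
        async with semaphore:  # wait here until a slot is free
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stands in for the real network transfer
            active -= 1

    await asyncio.gather(*(fake_download(n) for n in names))
    return peak


if __name__ == "__main__":
    files = [f"file-{i}" for i in range(10)]
    print(asyncio.run(run_limited(files, max_concurrent=3)))
    print(asyncio.run(run_limited(files, max_concurrent=0)))
```

With a limit of 3, no more than three simulated downloads are ever in flight; with 0, all ten start at once.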
Using as a module
Using scylla as a module is not very difficult either.
Do note that the module is asynchronous (it uses asyncio), so it does have some gotchas.
Also note that the module is fully typed, which hopefully makes it easier to use.
To use it, simply import the module and instantiate the Downloader class:
import asyncio
from pathlib import Path

import scylla

urls = [
    "https://www.kernel.org/pub/linux/kernel/v5.x/linux-5.10.17.tar.xz",
    "https://example.com/",
]


async def main() -> None:
    # Create and start the downloader inside the same coroutine.
    downloader = scylla.Downloader(urls, Path("."))
    await downloader.start()


if __name__ == "__main__":
    asyncio.run(main())
The most important gotcha here is making sure the Downloader class is instantiated in
the same event loop that Downloader.start is called from.
We do this here by wrapping both in a single async function.
And that's it, really.
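To illustrate the gotcha, here is a small stand-alone sketch. `LoopBound` is a hypothetical stand-in and not part of scylla; it simply remembers which event loop was running when it was created and refuses to start on a different one. Whether scylla fails in exactly this way is an assumption. The point is only to contrast the safe pattern with the unsafe one.

```python
import asyncio


class LoopBound:
    """Hypothetical stand-in for an object tied to the loop it was created in."""

    def __init__(self) -> None:
        # Must be called from inside a running event loop.
        self._loop = asyncio.get_running_loop()

    async def start(self) -> str:
        if asyncio.get_running_loop() is not self._loop:
            raise RuntimeError("started in a different event loop")
        return "ok"


async def make() -> LoopBound:
    return LoopBound()


async def safe() -> str:
    # Create and start inside the same coroutine, hence the same loop.
    bound = LoopBound()
    return await bound.start()


def unsafe() -> str:
    # Each asyncio.run() call creates a fresh loop, so the object built in
    # the first call is started in a second, different loop.
    bound = asyncio.run(make())
    try:
        return asyncio.run(bound.start())
    except RuntimeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(asyncio.run(safe()))
    print(unsafe())
```

Keeping construction and `start()` in one coroutine, as in `safe()`, sidesteps the problem entirely.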
The class has a couple more init parameters, intended for more advanced usage:
- max_concurrent: int = 5
Same as the CLI option, this sets the maximum number of downloads that run simultaneously.
- chunk_report_cb: Optional[ChunkReportCallback] = None
The function passed to this parameter is called every time a chunk has been successfully downloaded and saved.
Useful for tracking download progress.
- report_done_cb: Optional[Callable[[str], None]] = None
The function passed to this parameter is called every time a download completes.
The argument passed to it is the name of the file.
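A sketch of how the completion callback could be used to track progress. Only `report_done_cb` is shown, since its signature (`Callable[[str], None]`) is given above; the exact signature of `ChunkReportCallback` is not stated here, so it is left out. The `DoneTracker` class and the `download_all` helper are made up for this example. That `Downloader` is reachable as `scylla.Downloader` and accepts `max_concurrent` and `report_done_cb` as keyword arguments is an assumption based on the parameter names listed above.

```python
from pathlib import Path
from typing import List


class DoneTracker:
    """Collects the names of finished files and prints a running count."""

    def __init__(self, total: int) -> None:
        self.total = total
        self.finished: List[str] = []

    def __call__(self, filename: str) -> None:
        # Matches Callable[[str], None]: receives the name of the finished file.
        self.finished.append(filename)
        print(f"[{len(self.finished)}/{self.total}] {filename}")


async def download_all(urls: List[str], target: Path) -> List[str]:
    import scylla  # imported here so the tracker can be tried without scylla

    tracker = DoneTracker(total=len(urls))
    # Assumption: the parameters are accepted as keyword arguments.
    downloader = scylla.Downloader(
        urls, target, max_concurrent=2, report_done_cb=tracker
    )
    await downloader.start()
    return tracker.finished


if __name__ == "__main__":
    # Exercise the tracker on its own, without touching the network.
    demo = DoneTracker(total=2)
    demo("linux-5.10.17.tar.xz")
    demo("index.html")
```

With scylla installed, `asyncio.run(download_all(urls, Path(".")))` would create and start the downloader in the same event loop, in line with the gotcha described earlier.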
File details
Details for the file scylla-http-1.0.0.tar.gz.
File metadata
- Download URL: scylla-http-1.0.0.tar.gz
- Upload date:
- Size: 7.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.9.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 21bf3724e24c31673eb252d10d635b456774991466573dff5ad9f6cea5a2b205 |
| MD5 | d2c9eed4bd2d3bb09e54c0abf164c71e |
| BLAKE2b-256 | f970a798a00ce346c0dff61cd4ac0785dc1c9ba64894e3a7cf0abc67caaf15c3 |
File details
Details for the file scylla_http-1.0.0-py3-none-any.whl.
File metadata
- Download URL: scylla_http-1.0.0-py3-none-any.whl
- Upload date:
- Size: 6.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.9.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0932a15d840d2b58cc7c92b9af496772e2da718b9a6fb9c9cb715949ddd1fbf0 |
| MD5 | f0dabaaed928f351ed8d40b62cad9f3b |
| BLAKE2b-256 | 77af17383aab18cf2ccec2106d42f572391710997c41e160d0b6583af49a50ff |