bdownload
A multi-threaded aria2-like batch file downloading library for Python
Installation
- via PyPI:
  pip install bdownload
- from within the source directory locally:
  pip install .
  Note that you should git clone or download the source tarball (and unpack it, of course) from the repository first.
Usage: as a Python package
Importing
from bdownload import BDownloader
or
import bdownload
Signatures
class bdownload.BDownloader(max_workers=None, min_split_size=1024*1024, chunk_size=1024*100, proxy=None, cookies=None, user_agent=None, logger=None, progress='mill', num_pools=20, pool_maxsize=20)
Create and initialize a BDownloader object for executing download jobs.
- The max_workers parameter specifies the number of parallel downloading threads; its default value is determined by the number of processors multiplied by 5.
- min_split_size denotes the size in bytes of the file pieces split to be downloaded in parallel, which defaults to 1024*1024 bytes (i.e. 1MB).
- The chunk_size parameter specifies the chunk size in bytes of every HTTP range request, which takes a default value of 1024*100 (i.e. 100KB) if not provided.
- proxy supports both HTTP and SOCKS proxies, in the forms http://[user:pass@]host:port and socks5://[user:pass@]host:port, respectively.
- If cookies needs to be set, it must take the form cookie_key=cookie_value, with multiple pairs separated by a space character if applicable, e.g. 'key1=val1 key2=val2'.
- When user_agent is not given, it defaults to 'bdownload/VERSION', with VERSION replaced by the package's version number.
- The logger parameter specifies an event logger. If logger is not None, it must be an object of type logging.Logger; otherwise, a default module-level logger returned by logging.getLogger(__name__) is used.
- progress determines the style of the progress bar displayed while downloading files. Possible values are 'mill' and 'bar', with 'mill' being the default.
- The num_pools parameter has the same meaning as num_pools in urllib3.PoolManager and will eventually be passed to it. Specifically, num_pools specifies the number of connection pools to cache.
- pool_maxsize will be passed to the underlying requests.adapters.HTTPAdapter. It specifies the maximum number of connections to save that can be reused in the urllib3 connection pool.
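To illustrate the documented cookies format, the space-separated 'key=value' string maps to an ordinary cookie dict as sketched below. This is only an illustration of the format: BDownloader parses the string internally, so none of this is part of its API.

```python
# The documented cookies format: space-separated 'key=value' pairs.
cookies = "key1=val1 key2=val2"

# Split on whitespace, then on the first '=' of each pair,
# yielding the equivalent cookie dict.
cookie_dict = dict(pair.split("=", 1) for pair in cookies.split())
```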
BDownloader.downloads(path_urls)
Submit multiple downloading jobs at a time.
path_urls accepts a list of tuples of the form (path, url), where path should be a file name, possibly prefixed with an absolute or relative path, and url should be a URL string, which may consist of multiple TAB-separated URLs pointing to the same file. A valid path_urls, for example, could be [('/opt/files/bar.tar.bz2', 'https://foo.cc/bar.tar.bz2'), ('./xfile.7z', 'https://bar.cc/xfile.7z\thttps://foo.cc/xfile.7z')].
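Such a path_urls list can be assembled by joining each file's mirror URLs with a TAB character. A minimal sketch (the file names and URLs here are illustrative only):

```python
# Hypothetical targets: each entry pairs a save path with one or more
# mirror URLs for the same file.
targets = [
    ("/opt/files/bar.tar.bz2", ["https://foo.cc/bar.tar.bz2"]),
    ("./xfile.7z", ["https://bar.cc/xfile.7z", "https://foo.cc/xfile.7z"]),
]

# Join each file's mirrors with a TAB to form the (path, url) tuples
# that downloads() expects.
path_urls = [(path, "\t".join(urls)) for path, urls in targets]
```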
BDownloader.download(path, url)
Submit a single downloading job.
- Similar to BDownloader.downloads(); in fact, it is just a special case thereof, with [(path, url)] composed from the given parameters as the input.
BDownloader.close()
Wait for all submitted download jobs to finish and perform the cleanup. When a BDownloader instance is used as a context manager, as the examples below suggest, this cleanup is triggered automatically on leaving the with block.
Examples
- Single file downloading
import unittest
import tempfile
import os
import hashlib

from bdownload import BDownloader


class TestBDownloader(unittest.TestCase):
    def setUp(self):
        self.tmp_dir = tempfile.TemporaryDirectory()

    def tearDown(self):
        self.tmp_dir.cleanup()

    def test_bdownloader_download(self):
        file_path = os.path.join(self.tmp_dir.name, "aria2-x86_64-win.zip")
        file_url = "https://github.com/Jesseatgao/aria2-patched-static-build/releases/download/1.35.0-win-linux/aria2-x86_64-win.zip"
        file_sha1_exp = "16835c5329450de7a172412b09464d36c549b493"

        with BDownloader(max_workers=20, progress='mill') as downloader:
            downloader.download(file_path, file_url)

        hashf = hashlib.sha1()
        with open(file_path, mode='rb') as f:
            hashf.update(f.read())
        file_sha1 = hashf.hexdigest()

        self.assertEqual(file_sha1_exp, file_sha1)


if __name__ == '__main__':
    unittest.main()
- Batch file downloading
import unittest
import tempfile
import os
import hashlib

from bdownload import BDownloader


class TestBDownloader(unittest.TestCase):
    def setUp(self):
        self.tmp_dir = tempfile.TemporaryDirectory()

    def tearDown(self):
        self.tmp_dir.cleanup()

    def test_bdownloader_downloads(self):
        files = [
            {
                "file": os.path.join(self.tmp_dir.name, "aria2-x86_64-linux.tar.xz"),
                "url": "https://github.com/Jesseatgao/aria2-patched-static-build/releases/download/1.35.0-win-linux/aria2-x86_64-linux.tar.xz",
                "sha1": "d02dfdab7517e78a257f4403e502f1acc2a795e4"
            },
            {
                "file": os.path.join(self.tmp_dir.name, "mkvtoolnix-x86_64-linux.tar.xz"),
                "url": "https://github.com/Jesseatgao/MKVToolNix-static-builds/releases/download/v47.0.0-mingw-w64-win32v1.0/mkvtoolnix-x86_64-linux.tar.xz",
                "sha1": "19b0c7fc20839693cc0929f092f74820783a9750"
            }
        ]

        file_urls = [(f["file"], f["url"]) for f in files]

        with BDownloader(max_workers=20, progress='mill') as downloader:
            downloader.downloads(file_urls)

        for f in files:
            hashf = hashlib.sha1()
            with open(f["file"], mode='rb') as fd:
                hashf.update(fd.read())
            file_sha1 = hashf.hexdigest()

            self.assertEqual(f["sha1"], file_sha1)


if __name__ == '__main__':
    unittest.main()
Usage: as a command-line script
Synopsis
bdownload [-h] -o OUTPUT [OUTPUT ...] --url URL [URL ...] [-D DIR]
[-p PROXY] [-n MAX_WORKERS] [-k MIN_SPLIT_SIZE]
[-s CHUNK_SIZE] [-e COOKIE] [--user-agent USER_AGENT]
[-P {mill,bar}] [--num-pools NUM_POOLS]
[--pool-size POOL_SIZE]
Description
-o OUTPUT [OUTPUT ...], --output OUTPUT [OUTPUT ...]
one or more file names, e.g. -o file1.zip ~/file2.tgz, paired with the URLs specified by --url
--url URL [URL ...]
URL(s) for the files to be downloaded, each of which might consist of TAB-separated URLs pointing to the same file, e.g. --url https://yoursite.net/yourfile.7z, --url "https://yoursite01.net/thefile.7z\thttps://yoursite02.com/thefile.7z", or --url "http://foo.cc/file1.zip" "http://bar.cc/file2.tgz\thttp://bar2.cc/file2.tgz"
-D DIR, --dir DIR
path to save the downloaded files
-p PROXY, --proxy PROXY
proxy in the form of "http://[user:pass@]host:port" or "socks5://[user:pass@]host:port"
-n MAX_WORKERS, --max-workers MAX_WORKERS
number of worker threads [default: 20]
-k MIN_SPLIT_SIZE, --min-split-size MIN_SPLIT_SIZE
file split size in bytes, "1048576, 1024K or 2M" for example [default: 1M]
-s CHUNK_SIZE, --chunk-size CHUNK_SIZE
every request range size in bytes, "10240, 10K or 1M" for example [default: 100K]
-e COOKIE, --cookie COOKIE
cookies in the form of "cookie_key=cookie_value cookie_key2=cookie_value2"
--user-agent USER_AGENT
custom user agent
-P {mill,bar}, --progress {mill,bar}
progress indicator [default: mill]
--num-pools NUM_POOLS
number of connection pools [default: 20]
--pool-size POOL_SIZE
max number of connections in the pool [default: 20]
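Putting the options together, a typical invocation might look like the following (the output names, URLs, and option values are illustrative only):

bdownload -o bar.tar.bz2 xfile.7z --url "https://foo.cc/bar.tar.bz2" "https://bar.cc/xfile.7z\thttps://foo.cc/xfile.7z" -D ./downloads -n 32 -P bar

Each name given to -o is paired, in order, with the corresponding URL (or TAB-separated URL group) given to --url.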