
bdownload


A multi-threaded aria2-like batch file downloading library for Python

Installation

  • via PyPI

    pip install bdownload

  • from within source directory locally

    pip install .

    Note that you should git clone the repository, or download and unpack the source tarball, first

Usage: as a Python package

Importing

    from bdownload import BDownloader

        or

    import bdownload

Signatures

class bdownload.BDownloader(max_workers=None, min_split_size=1024*1024, chunk_size=1024*100, proxy=None, cookies=None, user_agent=None, logger=None, progress='mill', num_pools=20, pool_maxsize=20)

    Create and initialize a BDownloader object for executing download jobs.

  • The max_workers parameter specifies the number of parallel downloading threads; its default value is the number of processors multiplied by 5.

  • min_split_size denotes the size in bytes of the file pieces that are split to be downloaded in parallel, which defaults to 1024*1024 bytes (i.e. 1MB).

  • The chunk_size parameter specifies the chunk size in bytes of every HTTP range request, which takes a default value of 1024*100 (i.e. 100KB) if not provided.

  • proxy supports both HTTP and SOCKS proxies, in the forms of http://[user:pass@]host:port and socks5://[user:pass@]host:port, respectively.

  • If cookies need to be set, they must take the form of cookie_key=cookie_value, with multiple pairs separated by a space character if applicable, e.g. 'key1=val1 key2=val2'.

  • When user_agent is not given, it will default to 'bdownload/VERSION', with VERSION being replaced by the package's version number.

  • The logger parameter specifies an event logger. If logger is not None, it must be an object of type logging.Logger. Otherwise, it will use a default module-level logger returned by logging.getLogger(__name__).

  • progress determines the style of the progress bar displayed while downloading files. Possible values are 'mill' and 'bar', and 'mill' is the default.

  • The num_pools parameter has the same meaning as num_pools in urllib3.PoolManager and will eventually be passed to it. Specifically, num_pools specifies the number of connection pools to cache.

  • pool_maxsize will be passed to the underlying requests.adapters.HTTPAdapter. It specifies the maximum number of connections to save that can be reused in the urllib3 connection pool.
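As a rough illustration of the parameter formats described above, the following sketch shows the documented default for max_workers and the documented string forms of proxy and cookies. The cookie parsing shown is only a plausible reading of the documented format, not bdownload's actual implementation:

```python
import os

# Documented default for max_workers: number of processors * 5
default_workers = (os.cpu_count() or 1) * 5

# Documented string formats for the proxy and cookies parameters:
proxy = "socks5://user:pass@127.0.0.1:1080"   # or "http://[user:pass@]host:port"
cookies = "key1=val1 key2=val2"               # space-separated key=value pairs

# One plausible way such a cookie string maps to key/value pairs
# (illustrative only -- not necessarily how bdownload parses it):
cookie_pairs = dict(pair.split("=", 1) for pair in cookies.split())
print(cookie_pairs)  # {'key1': 'val1', 'key2': 'val2'}
```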

BDownloader.downloads(path_urls)

    Submit multiple downloading jobs at a time.

  • path_urls accepts a list of tuples of the form (path, url), where path should be a pathname, possibly prefixed with an absolute or relative path, and url should be a URL string, which may consist of multiple TAB-separated URLs pointing to the same file. A valid path_urls, for example, could be [('/opt/files/bar.tar.bz2', 'https://foo.cc/bar.tar.bz2'), ('./xfile.7z', 'https://bar.cc/xfile.7z\thttps://foo.cc/xfile.7z')].
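The multi-URL entry in the example above can be assembled with a plain TAB join; the hosts here are the illustrative ones from the example, not real mirrors:

```python
# Each entry of path_urls is a (path, url) tuple; a single url string may
# carry several TAB-separated mirrors of the same file.
mirrors = ["https://bar.cc/xfile.7z", "https://foo.cc/xfile.7z"]
path_urls = [
    ("/opt/files/bar.tar.bz2", "https://foo.cc/bar.tar.bz2"),
    ("./xfile.7z", "\t".join(mirrors)),
]
```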

BDownloader.download(path, url)

    Submit a single downloading job.

  • Similar to BDownloader.downloads(); in fact, it is just a special case of that method, with [(path, url)] composed from the given parameters as the input.

BDownloader.close()

    Wait for the download jobs to complete and perform the cleanup.

Examples

  • Single file downloading
import unittest
import tempfile
import os
import hashlib

from bdownload import BDownloader


class TestBDownloader(unittest.TestCase):
    def setUp(self):
        self.tmp_dir = tempfile.TemporaryDirectory()

    def tearDown(self):
        self.tmp_dir.cleanup()

    def test_bdownloader_download(self):
        file_path = os.path.join(self.tmp_dir.name, "aria2-x86_64-win.zip")
        file_url = "https://github.com/Jesseatgao/aria2-patched-static-build/releases/download/1.35.0-win-linux/aria2-x86_64-win.zip"
        file_sha1_exp = "16835c5329450de7a172412b09464d36c549b493"

        with BDownloader(max_workers=20, progress='mill') as downloader:
            downloader.download(file_path, file_url)

        hashf = hashlib.sha1()
        with open(file_path, mode='rb') as f:
            hashf.update(f.read())
        file_sha1 = hashf.hexdigest()

        self.assertEqual(file_sha1_exp, file_sha1)


if __name__ == '__main__':
    unittest.main()
  • Batch file downloading
import unittest
import tempfile
import os
import hashlib

from bdownload import BDownloader


class TestBDownloader(unittest.TestCase):
    def setUp(self):
        self.tmp_dir = tempfile.TemporaryDirectory()

    def tearDown(self):
        self.tmp_dir.cleanup()

    def test_bdownloader_downloads(self):
        files = [
            {
                "file": os.path.join(self.tmp_dir.name, "aria2-x86_64-linux.tar.xz"),
                "url": "https://github.com/Jesseatgao/aria2-patched-static-build/releases/download/1.35.0-win-linux/aria2-x86_64-linux.tar.xz",
                "sha1": "d02dfdab7517e78a257f4403e502f1acc2a795e4"
            },
            {
                "file": os.path.join(self.tmp_dir.name, "mkvtoolnix-x86_64-linux.tar.xz"),
                "url": "https://github.com/Jesseatgao/MKVToolNix-static-builds/releases/download/v47.0.0-mingw-w64-win32v1.0/mkvtoolnix-x86_64-linux.tar.xz",
                "sha1": "19b0c7fc20839693cc0929f092f74820783a9750"
            }
        ]

        file_urls = [(f["file"], f["url"]) for f in files]

        with BDownloader(max_workers=20, progress='mill') as downloader:
            downloader.downloads(file_urls)

        for f in files:
            hashf = hashlib.sha1()
            with open(f["file"], mode='rb') as fd:
                hashf.update(fd.read())
            file_sha1 = hashf.hexdigest()

            self.assertEqual(f["sha1"], file_sha1)


if __name__ == '__main__':
    unittest.main()

Usage: as a command-line script

Synopsis

bdownload [-h] -o OUTPUT [OUTPUT ...] --url URL [URL ...] [-D DIR]
                [-p PROXY] [-n MAX_WORKERS] [-k MIN_SPLIT_SIZE]
                [-s CHUNK_SIZE] [-e COOKIE] [--user-agent USER_AGENT]
                [-P {mill,bar}] [--num-pools NUM_POOLS]
                [--pool-size POOL_SIZE]

Description

-o OUTPUT [OUTPUT ...], --output OUTPUT [OUTPUT ...]

    one or more file names, e.g. -o file1.zip ~/file2.tgz, paired with URLs specified by --url

--url URL [URL ...]

    URL(s) for the files to be downloaded, each of which may consist of TAB-separated URLs pointing to the same file, e.g. --url https://yoursite.net/yourfile.7z, --url "https://yoursite01.net/thefile.7z\thttps://yoursite02.com/thefile.7z", or --url "http://foo.cc/file1.zip" "http://bar.cc/file2.tgz\thttp://bar2.cc/file2.tgz"

-D DIR, --dir DIR

    path to save the downloaded files

-p PROXY, --proxy PROXY

    proxy in the form of "http://[user:pass@]host:port" or "socks5://[user:pass@]host:port"

-n MAX_WORKERS, --max-workers MAX_WORKERS

    number of worker threads [default: 20]

-k MIN_SPLIT_SIZE, --min-split-size MIN_SPLIT_SIZE

    file split size in bytes, e.g. "1048576", "1024K" or "2M" [default: 1M]

-s CHUNK_SIZE, --chunk-size CHUNK_SIZE

    size in bytes of every range request, e.g. "10240", "10K" or "1M" [default: 100K]

-e COOKIE, --cookie COOKIE

    cookies in the form of "cookie_key=cookie_value cookie_key2=cookie_value2"

--user-agent USER_AGENT

    custom user agent

-P {mill,bar}, --progress {mill,bar}

    progress indicator [default: mill]

--num-pools NUM_POOLS

    number of connection pools [default: 20]

--pool-size POOL_SIZE

    max number of connections in the pool [default: 20]
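Putting the options together, a downloading session might be invoked as below. This is an illustrative sketch only: the example.com URLs are placeholders, and bash's $'...' quoting is used so that \t becomes a literal TAB between mirror URLs, as the --url description requires.

```shell
# Fetch two files into ./downloads with 10 worker threads, 512K split
# size, 50K range-request chunks, and the 'bar' progress style; the
# second file lists two hypothetical mirrors joined by a real TAB.
bdownload -o file1.zip file2.tgz \
    --url "https://example.com/file1.zip" \
          $'https://example.com/file2.tgz\thttps://mirror.example.org/file2.tgz' \
    -D ./downloads -n 10 -k 512K -s 50K -P bar
```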

Download files

Source Distribution

    bdownload-0.0.4.tar.gz (18.7 kB)

Built Distribution

    bdownload-0.0.4-py2.py3-none-any.whl (14.0 kB)
