A tool for splitting data into batches, for example for parallel processing.

A simple Python utility for splitting a list or any iterable into batches of a specified size.

Overview

batch-div divides a list, or any other iterable, into sublists (batches) of a given size. It is useful when you want to process data in chunks, for example in parallel or batch processing workflows.

It provides functionality similar to itertools.batched (Python 3.12+) but also works on older Python versions. All results are returned as lists, so len() works on both the outer collection and each inner batch, which is handy for progress tracking. Because everything is materialized as lists, the tool is not suitable for datasets too large to fit in memory.
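For inputs too large to hold in memory, a lazy generator is one possible alternative. The following is a minimal sketch using only the standard library; iter_batches is a hypothetical helper written for illustration and is not part of batch-div. It yields one batch at a time, at the cost of losing len() on the outer collection.

```python
from itertools import islice


def iter_batches(iterable, size):
    """Lazily yield lists of up to `size` items.

    Hypothetical helper for illustration; not part of batch-div.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(iterable)
    while True:
        # Pull at most `size` items from the shared iterator.
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Only one batch is held in memory at a time.
for batch in iter_batches(range(7), 3):
    print(batch)
```

The trade-off is that the total number of batches is unknown until the input is exhausted, so progress can only be reported as a running count.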

Installation

pip install batch-div

Usage

import batch_div

tasks = [0, 1, 2, 3, 4, 5, 6]
for batch in batch_div(tasks, 3):
    print(batch)

Output:

[0, 1, 2]
[3, 4, 5]
[6]

The first argument is any iterable to split (lists, tuples, generators, etc.), and the second is the batch size. The input is divided into consecutive chunks of that size, with the last batch containing the remaining elements if the input doesn't divide evenly.
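The splitting behavior described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: split_into_batches is a hypothetical name, and the sketch is not taken from the batch-div source, whose actual implementation and error handling may differ.

```python
def split_into_batches(iterable, size):
    """Return a list of lists, each holding up to `size` consecutive items.

    Hypothetical stand-in for illustration; not the batch-div implementation.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    # Materialize the input so generators and tuples can be sliced.
    items = list(iterable)
    return [items[i:i + size] for i in range(0, len(items), size)]


# A generator works as well as a list; the last batch holds the remainder.
batches = split_into_batches((n * n for n in range(7)), 3)
print(batches)
print(len(batches))
```

Materializing the input first is what makes len() available on the result, and it is also why this approach needs the whole dataset to fit in memory.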

Use Case Example

import batch_div

items = list(range(100))
batches = batch_div(items, 10)
print(f"Total batches: {len(batches)}")  # -> 10

for i, batch in enumerate(batches):
    print(f"Processing batch {i+1}/{len(batches)} ({len(batch)} items)")
    # Insert your parallel or batch processing here
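One way to fill in the processing step is to hand each batch to a worker pool. The sketch below uses concurrent.futures from the standard library; process_batch is a placeholder workload, and split_into_batches is a hypothetical stand-in used so that the example runs without batch-div installed.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(iterable, size):
    """Hypothetical stand-in for the batching step; not part of batch-div."""
    items = list(iterable)
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_batch(batch):
    """Placeholder workload: sum the items in one batch."""
    return sum(batch)


items = list(range(100))
batches = split_into_batches(items, 10)

# executor.map returns results in the same order as the input batches.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(process_batch, batches))

print(f"Processed {len(results)} batches, total = {sum(results)}")
```

Threads suit I/O-bound work; for CPU-bound work, ProcessPoolExecutor from the same module offers the same map interface.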

License

CC0 1.0 Universal (Public Domain Dedication)
