A tool used for processing data in batches, such as in parallel processing.

Project description

A simple Python utility for splitting a list into batches of a specified size.

Overview

batch-div divides a list into sublists (batches) of a given size. It is useful when you want to process data in chunks — for example, in parallel or batch processing workflows.

It offers functionality similar to itertools.batched (Python 3.12+) but also works on older Python versions. Both the outer result and each batch are returned as lists, so len() works on either, which is handy for progress tracking. Because everything is materialized in memory as lists, the tool is not suitable for datasets too large to fit in memory.

Installation

pip install batch-div

Usage

import batch_div

tasks = [0, 1, 2, 3, 4, 5, 6]

for batch in batch_div(tasks, 3):
    print(batch)

Output:

[0, 1, 2]
[3, 4, 5]
[6]

The first argument is the list to split, and the second is the batch size. The input is divided into consecutive chunks of that size, with the last batch containing the remaining elements if the list doesn't divide evenly.
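The splitting rule described above can be illustrated with plain list slicing. This is only a minimal sketch of equivalent behavior, not batch-div's actual source; the function name split_into_batches and its size check are assumptions made for illustration.

```python
def split_into_batches(items, size):
    """Hypothetical stand-in showing the splitting rule; not batch-div's own code."""
    if size < 1:
        # Assumption: a batch size below 1 is treated as an error in this sketch.
        raise ValueError("size must be at least 1")
    # Step through the list in strides of `size`; the final slice is
    # shorter when the length is not a multiple of `size`.
    return [items[i:i + size] for i in range(0, len(items), size)]


tasks = [0, 1, 2, 3, 4, 5, 6]
batches = split_into_batches(tasks, 3)
print(batches)       # [[0, 1, 2], [3, 4, 5], [6]]
print(len(batches))  # 3
```

Because both the outer and inner collections are lists, len() works at either level, which matches the behavior described in the overview.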

Use Case Example

import batch_div

items = list(range(100))

batches = batch_div(items, 10)
print(f"Total batches: {len(batches)}")  # -> 10

for i, batch in enumerate(batches):
    print(f"Processing batch {i+1}/{len(batches)} ({len(batch)} items)")
    # Insert your parallel or batch processing here
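One way the placeholder comment above might be filled in is with the standard library's concurrent.futures. The sketch below is an assumption about how a caller could process batches in parallel, not something the package provides; split_into_batches is a hypothetical stand-in for batch_div so the example runs without the package installed, and process_batch is a made-up unit of work.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(items, size):
    # Hypothetical stand-in for batch_div (assumed to behave the same way).
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_batch(batch):
    # Hypothetical per-batch work: here, just sum the items.
    return sum(batch)


items = list(range(100))
batches = split_into_batches(items, 10)

with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in input order, so results[i] belongs to batches[i].
    results = list(pool.map(process_batch, batches))

print(len(results))  # 10
print(sum(results))  # 4950
```

Threads suit I/O-bound work; for CPU-bound work, ProcessPoolExecutor from the same module exposes the same map() interface.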

License

CC0 1.0 Universal (Public Domain Dedication)

Download files

Source Distribution

batch_div-0.0.4.tar.gz (3.3 kB)

Uploaded Source

Built Distribution

batch_div-0.0.4-py3-none-any.whl (3.5 kB)

Uploaded Python 3

File details

Details for the file batch_div-0.0.4.tar.gz.

File metadata

  • Download URL: batch_div-0.0.4.tar.gz
  • Upload date:
  • Size: 3.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.3

File hashes

Hashes for batch_div-0.0.4.tar.gz
Algorithm Hash digest
SHA256 e884e367fc48b17d736380b713ed39666a8c16bc23cba823474ba4c8ce030af9
MD5 9d7beeb1823121d2d76954513e8dc75a
BLAKE2b-256 9650e6183917c155f1b67f5a33e13bb13917fdf851ad1c9ab8155f5aaa51e2a8

File details

Details for the file batch_div-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: batch_div-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 3.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.3

File hashes

Hashes for batch_div-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 cb813d960af4402d1e8f59ec3413caec02ffcbd4edf366c4b65cc028554f18d4
MD5 349036d0a9ace57f9c0d46c23b6935d2
BLAKE2b-256 1c6fbc9ca2a081a764e3b5bfbe9ca5b97a5f768d0ac77f200781d8bfd7f35687
