Parallel Iteration with File-Based Coordination

Laufband: Embarrassingly parallel, embarrassingly simple!

Laufband enables parallel iteration over a dataset from multiple processes, using file-based locking and communication to ensure that each item is processed exactly once.

Installation

Install Laufband using pip:

pip install laufband

Usage

Using Laufband is similar to the familiar tqdm progress bar for sequential iteration.

from laufband import Laufband

data = list(range(100))
for item in Laufband(data):
    # Process each item in the dataset
    pass

The true power of Laufband emerges when you run your script in parallel. Multiple processes will coordinate using file-based locking to ensure that each item in the dataset is processed by only one process.
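To make the idea of file-based coordination more concrete, here is a minimal sketch of one way processes can claim items through the filesystem. It is not Laufband's implementation: the `claim` and `iterate_once` helpers, the `claims` directory, and the marker-file naming are all hypothetical. The sketch relies only on the fact that `os.open` with `O_CREAT | O_EXCL` fails if the file already exists.

```python
import os
import tempfile
from pathlib import Path


def claim(claim_dir: Path, index: int) -> bool:
    """Try to claim item `index`; return True only for the first caller.

    Hypothetical helper for illustration, not part of Laufband.
    """
    marker = claim_dir / f"{index}.claimed"
    try:
        # O_CREAT | O_EXCL: creation fails if the marker file already exists.
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def iterate_once(items, claim_dir: Path):
    """Yield only the items that this caller managed to claim."""
    claim_dir.mkdir(parents=True, exist_ok=True)
    for index, item in enumerate(items):
        if claim(claim_dir, index):
            yield item


claim_dir = Path(tempfile.mkdtemp()) / "claims"
first_pass = list(iterate_once(range(5), claim_dir))
# A second pass over the same directory stands in for a second worker.
second_pass = list(iterate_once(range(5), claim_dir))
```

In this sketch the second pass finds every item already claimed and yields nothing, which is the behaviour you want from any additional worker that arrives late.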

Here's a typical example demonstrating parallel processing with Laufband and file-based locking for shared resource access:

import json
import time
from pathlib import Path
from laufband import Laufband

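output_file = Path("data.json")
data = list(range(100))

worker = Laufband(data, desc="using Laufband")

# Create the output file once; a process that starts later must not reset it
with worker.lock:
    if not output_file.exists():
        output_file.write_text(json.dumps({"processed_data": []}))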

for item in worker:
    # Simulate some computationally intensive task
    time.sleep(0.1)
    with worker.lock:
        # Access and modify a shared resource (e.g., a file) safely using the lock
        file_content = json.loads(output_file.read_text())
        file_content["processed_data"].append(item)
        output_file.write_text(json.dumps(file_content))

To execute this script (main.py) in parallel, you can use a command like the following in your terminal (this example launches 10 background processes):

for i in {1..10} ; do python main.py & done
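If you prefer to start the workers from Python rather than from a shell loop, a small launcher along the following lines is one option. This is a sketch, not part of Laufband: `launch_workers` is a hypothetical helper built on the standard `subprocess` module, and unlike the shell loop it waits for all workers to finish.

```python
import subprocess
import sys


def launch_workers(script: str, n_workers: int = 10) -> list[int]:
    """Start `n_workers` copies of `script` and return their exit codes.

    Hypothetical helper for illustration, not part of Laufband.
    """
    # Start every process first so that they run concurrently.
    procs = [subprocess.Popen([sys.executable, script]) for _ in range(n_workers)]
    # Then wait for each one and collect its exit code.
    return [proc.wait() for proc in procs]
```

Calling `launch_workers("main.py")` would start ten copies of the script above with the current Python interpreter.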

[!IMPORTANT] The different processes may finish at different times. Therefore, the order of items in file_content is not guaranteed. If the order is important, you will need to implement sorting logic afterwards.
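One possible way to restore the order is to sort the stored list once all workers have finished. The sketch below assumes the `data.json` layout from the example above and items that can be compared directly (integers here); `sort_processed_data` is a hypothetical helper name.

```python
import json
from pathlib import Path


def sort_processed_data(path: Path) -> list:
    """Sort the "processed_data" list stored in `path` and write it back."""
    content = json.loads(path.read_text())
    content["processed_data"].sort()
    path.write_text(json.dumps(content))
    return content["processed_data"]
```

Run it only after every worker has exited, for example with `sort_processed_data(Path("data.json"))`, so that no process appends to the file while it is being rewritten.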

Failure Policy

In Laufband, a job will be automatically marked as failed if the iteration is interrupted by:

  • an unhandled Exception
  • or an explicit break.

from laufband import Laufband

data = list(range(100))

# Example 1: break
for item in Laufband(data):
    if item == 50:
        break  # Job 50 will be marked as failed

# Example 2: Exception
for item in Laufband(data):
    if item == 70:
        raise ValueError("Something went wrong")  # Job 70 will be marked as failed

If you want to exit early but still mark the job as successfully completed, you should use Laufband.close() instead of break:

from laufband import Laufband

data = list(range(100))

worker = Laufband(data)

for item in worker:
    if item == 50:
        worker.close()  # Job 50 will be marked as completed, and iteration will stop cleanly
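Since only an unhandled exception interrupts the iteration, one option is to catch expected errors inside the loop body. The following is a sketch under stated assumptions: `process_all` and `process` are hypothetical names, and the function accepts any iterable so that it runs without Laufband installed. Whether an item whose error was caught should count as completed is a decision for your application.

```python
def process_all(iterable, process):
    """Apply `process` to each item, recording errors instead of raising."""
    errors = {}
    for item in iterable:
        try:
            process(item)
        except ValueError as exc:
            # The exception is handled here, so the loop keeps iterating.
            errors[item] = str(exc)
    return errors


def process(item):
    # Hypothetical task that fails for one specific item.
    if item == 70:
        raise ValueError("Something went wrong")


errors = process_all(range(100), process)
```

With Laufband, the same function could be called as `process_all(Laufband(data), process)`.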

Examples

ASE Calculator

For atomistic data, the ASE package is widely used to calculate energies and forces of atomic configurations using either ab initio methods or machine-learned interatomic potentials (MLIPs).

You can use Laufband to parallelize these calculations easily, with automatic checkpointing and without duplicated work or manual bookkeeping.

The following example uses a MACE foundation model to compute energies and forces on the ASE S22 dataset.

[!TIP] You can safely run this script multiple times, even across multiple SLURM jobs, without any modifications. Laufband will automatically coordinate which configurations are processed. For local parallelization, you can use bash:

for i in {1..10} ; do python main.py & done

import ase.io
from ase.collections import s22
from laufband import Laufband
from mace.calculators import mace_mp

# Initialize calculator
calc = mace_mp(model="medium", dispersion=False, default_dtype="float32")

worker = Laufband(list(s22))

for atoms in worker:
    atoms.calc = calc
    atoms.get_potential_energy()
    with worker.lock:
        ase.io.write("frames.xyz", atoms, append=True)
