BatchFactory

Composable, cache‑aware batch processing pipelines for LLMs, APIs, and dataset generation.

Status: alpha – expect breaking changes. Names and APIs will shift rapidly.


Install

pip install batchfactory   # coming soon: pip install -U batchfactory

Quick start

import batchfactory as bf

project = bf.CacheFolder("example1", 1, 0, 0)
broker = bf.brokers.ConcurrentLLMCallBroker(project("cache/llm_broker.jsonl"))

# Build a small graph that rewrites passages into short English poems

g = (
    bf.ReadMarkdownLinesOp("./data/*.txt", "keyword", directory_str_field="directory")
    | bf.ShuffleOp(42)
    | bf.TakeFirstNOp(3)
    | bf.GenerateLLMRequestOp(
        'Rewrite the passage from "{directory}" titled "{keyword}" as a four-line English poem.',
        model="gpt-4o-mini@openai",
    )
    | bf.ConcurrentLLMCallOp(project("cache/llm_call1.jsonl"), broker)
    | bf.ExtractResponseTextOp()
    | bf.SaveJsonlOp(project("out/poems.jsonl"), output_fields=["keyword", "text", "directory"])
    | bf.PrintTextOp()
)

# Compile the chain once, then resume from any cached state and pump entries through it
g = g.compile()
g.resume()
g.pump(dispatch_broker=True, reset_input=True)
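
Once the graph has run, the saved poems can be inspected with plain Python. The sketch below assumes that SaveJsonlOp writes one JSON object per line containing the fields listed in output_fields (keyword, text, directory). The helper names load_poems and format_poem are hypothetical and are not part of BatchFactory.

```python
import json
from pathlib import Path


def load_poems(path):
    """Read a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines between records
                records.append(json.loads(line))
    return records


def format_poem(record):
    """Render one record as a heading line followed by the poem text.

    Missing fields are rendered as empty strings (an assumption made
    here so that partial records do not raise).
    """
    heading = f'{record.get("directory", "")} / {record.get("keyword", "")}'
    return f'{heading}\n{record.get("text", "")}'
```

To use it, pass the location of the saved file, for example `for rec in load_poems(path): print(format_poem(rec))`. Where project("out/poems.jsonl") lands on disk depends on how CacheFolder lays out its directory, so look the path up rather than hard-coding it.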

Core ideas (WIP)

  • Batch‑centric: every op consumes/produces lists of Entry objects.
  • Atomic Ops: small, single‑purpose units you compose with | or wire().
  • Graph‑first: compile once, then resume() and pump() as external brokers finish work.
  • Cache‑mindful: transparent on‑disk ledgers allow you to pause and restart without recomputation.
  • Broker pattern: offload long‑running or external work (LLMs, search APIs, human labeling) to pluggable brokers.
  • Loop‑aware: native For, While, and If controls let graphs loop through subgraphs cleanly, which suits multi-agent or staged workflows.
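
To make three of these ideas concrete (batch-centric ops, composition with |, and an on-disk ledger), here is a minimal, self-contained sketch in plain Python. It does not use BatchFactory and makes no claim about how the library is implemented. Every class name (Op, Chain, TakeFirstN, CachedMap) and the ledger layout (one {"id", "result"} object per line) is an assumption chosen for illustration.

```python
import json
from pathlib import Path


class Op:
    """Hypothetical base class: an op maps a list of entries to a new list."""

    def run(self, entries):
        raise NotImplementedError

    def __or__(self, other):
        # `a | b` builds a pipeline that runs a, then b.
        return Chain([self, other])


class Chain(Op):
    """A sequence of ops applied left to right."""

    def __init__(self, ops):
        self.ops = list(ops)

    def __or__(self, other):
        return Chain(self.ops + [other])

    def run(self, entries):
        for op in self.ops:
            entries = op.run(entries)
        return entries


class TakeFirstN(Op):
    """Keep only the first n entries of the batch."""

    def __init__(self, n):
        self.n = n

    def run(self, entries):
        return entries[: self.n]


class CachedMap(Op):
    """Apply fn to each entry's "text", recording results in a JSONL ledger."""

    def __init__(self, fn, ledger_path):
        self.fn = fn
        self.ledger_path = Path(ledger_path)

    def _load(self):
        # Rebuild the id -> result map from earlier runs, if any.
        if not self.ledger_path.exists():
            return {}
        with self.ledger_path.open(encoding="utf-8") as fh:
            rows = [json.loads(line) for line in fh if line.strip()]
        return {row["id"]: row["result"] for row in rows}

    def run(self, entries):
        done = self._load()
        out = []
        with self.ledger_path.open("a", encoding="utf-8") as fh:
            for entry in entries:
                if entry["id"] not in done:
                    # Only uncached entries trigger the expensive call.
                    done[entry["id"]] = self.fn(entry["text"])
                    row = {"id": entry["id"], "result": done[entry["id"]]}
                    fh.write(json.dumps(row) + "\n")
                out.append({**entry, "result": done[entry["id"]]})
        return out
```

With these definitions, `TakeFirstN(2) | CachedMap(str.upper, "ledger.jsonl")` yields a pipeline whose run() method takes a list of dicts with "id" and "text" keys. Running it a second time on the same entries reads the results back from the ledger instead of calling the function again, which is the pause-and-restart behaviour the cache-mindful bullet describes.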

Roadmap

  • Stabilise op names
  • Replace OpGraph executor with pluggable schedulers
  • Public docs & more examples

© 2025 · MIT License
