Composable, cache-aware batch processing pipelines for LLMs, APIs, and dataset generation.
BatchFactory
Status: alpha – expect breaking changes. Names and APIs will shift rapidly.
Install
pip install batchfactory # coming soon: pip install -U batchfactory
Quick start
import batchfactory as bf
project = bf.CacheFolder("example1", 1, 0, 0)
broker = bf.brokers.ConcurrentLLMCallBroker(project("cache/llm_broker.jsonl"))
# Build a small graph that rewrites passages into short English poems
g = (
    bf.ReadMarkdownLinesOp("./data/*.txt", "keyword", directory_str_field="directory")
    | bf.ShuffleOp(42)
    | bf.TakeFirstNOp(3)
    | bf.GenerateLLMRequestOp(
        'Rewrite the passage from "{directory}" titled "{keyword}" as a four-line English poem.',
        model="gpt-4o-mini@openai",
    )
    | bf.ConcurrentLLMCallOp(project("cache/llm_call1.jsonl"), broker)
    | bf.ExtractResponseTextOp()
    | bf.SaveJsonlOp(project("out/poems.jsonl"), output_fields=["keyword", "text", "directory"])
    | bf.PrintTextOp()
)
g = g.compile()
g.resume()
g.pump(dispatch_broker=True, reset_input=True)
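Once the graph has run, the saved output can be checked with a few lines of plain Python. The sketch below assumes that SaveJsonlOp writes one JSON object per line with the listed output_fields ("keyword", "text", "directory") as top-level keys; that layout is an assumption, not something the quick start confirms. The helper names read_jsonl and summarize are hypothetical, and the path argument is whatever location project("out/poems.jsonl") resolves to on your machine.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one dict per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def summarize(path, fields=("keyword", "text", "directory")):
    """Return (record_count, records_missing_any_expected_field).

    The default field names mirror output_fields in the quick start;
    the record layout itself is assumed, not taken from the library.
    """
    records = list(read_jsonl(path))
    incomplete = [r for r in records if any(k not in r for k in fields)]
    return len(records), incomplete
```

Calling summarize(path) on the output file gives a quick sanity check that every record carries the three expected fields before the data is used downstream.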
Core ideas (WIP)
- Batch‑centric: every op consumes/produces lists of Entry objects.
- Atomic Ops: small, single‑purpose units you compose with | or wire().
- Graph‑first: compile once, then resume() and pump() as external brokers finish work.
- Cache‑mindful: transparent on‑disk ledgers allow you to pause and restart without recomputation.
- Broker pattern: offload long‑running or external work (LLMs, search APIs, human labeling) to pluggable brokers.
- Loop‑aware: native For, While, and If controls let graphs loop through subgraphs cleanly, which is ideal for multi-agent or staged workflows.
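To make the first few ideas concrete, here is a minimal sketch in plain Python, independent of BatchFactory. It shows batch-in/batch-out ops, composition with |, and an on-disk ledger that lets a rerun skip finished work. Every name in it (Entry as defined here, Op, cached_map, the ledger record layout) is a hypothetical stand-in chosen for illustration; none of it is the library's actual implementation or API.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class Entry:
    """Hypothetical record: an id plus a dict of fields."""
    idx: str
    data: dict = field(default_factory=dict)


class Op:
    """A single-purpose step mapping a list of entries to a list of entries."""

    def __init__(self, fn: Callable[[list], list]):
        self.fn = fn

    def __call__(self, batch: list) -> list:
        return self.fn(batch)

    def __or__(self, other: "Op") -> "Op":
        # `a | b` builds a new op that runs a, then feeds its output to b.
        return Op(lambda batch: other(self(batch)))


def cached_map(ledger: Path, key: str, out: str, work: Callable[[str], str]) -> Op:
    """Apply `work` to entry.data[key], recording results in a JSONL ledger."""

    def run(batch: list) -> list:
        # Load results from earlier runs, keyed by entry id.
        done = {}
        if ledger.exists():
            with ledger.open(encoding="utf-8") as f:
                for line in f:
                    rec = json.loads(line)
                    done[rec["idx"]] = rec["value"]
        # Compute only what is missing and append it to the ledger.
        with ledger.open("a", encoding="utf-8") as f:
            for e in batch:
                if e.idx not in done:
                    done[e.idx] = work(e.data[key])
                    f.write(json.dumps({"idx": e.idx, "value": done[e.idx]}) + "\n")
                e.data[out] = done[e.idx]
        return batch

    return Op(run)
```

With these pieces, Op(lambda batch: batch[:2]) | cached_map(Path("ledger.jsonl"), "text", "loud", str.upper) builds a two-step pipeline. Running it twice on the same entries calls the work function only during the first run, because the second run reads the results back from the ledger file.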
Roadmap
- Stabilise op names
- Replace OpGraph executor with pluggable schedulers
- Public docs & more examples
© 2025 · MIT License