AntFlow: Async execution library with concurrent.futures-style API and advanced pipelines

Maintainers

rodolfonobrega

AntFlow

Why AntFlow?

The name 'AntFlow' is inspired by the efficiency of an ant colony, where each ant (worker) performs its specialized function, and together they contribute to the colony's collective goal. Similarly, AntFlow orchestrates independent workers to achieve complex asynchronous tasks seamlessly.

The Problem I Had to Solve

I was processing massive amounts of data using OpenAI's Batch API. The workflow was complex:

Upload batches of data to OpenAI
Wait for processing to complete
Download the results
Save to database
Repeat for the next batch

Initially, I processed 10 batches at a time using basic async. But here's the problem: I had to wait for ALL 10 batches to complete before starting the next group.

The Bottleneck

Imagine this scenario:

9 batches complete in 5 minutes
1 batch gets stuck and takes 30 minutes
I waste 25 minutes waiting for that one slow batch while my system sits idle

With hundreds of batches to process, these delays accumulated into hours of wasted time. Even worse, one failed batch would block the entire pipeline.
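The cost of group-at-a-time processing can be put into numbers. The following is a small arithmetic sketch, not AntFlow code: `grouped_makespan` is a hypothetical helper, and the durations simply restate the scenario above (minutes per batch, with a second group of ten ordinary batches added for illustration).

```python
def grouped_makespan(durations, group_size):
    """Total wall time when each group must fully finish before the next starts."""
    total = 0
    for start in range(0, len(durations), group_size):
        group = durations[start:start + group_size]
        total += max(group)  # the slowest item gates the whole group
    return total


# First group: nine 5-minute batches plus one stuck 30-minute batch.
# Second group: ten ordinary 5-minute batches.
durations = [5] * 9 + [30] + [5] * 10
total_minutes = grouped_makespan(durations, group_size=10)
print(total_minutes)  # 35
```

The first group costs 30 minutes even though nine of its ten batches were done after 5.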

The Solution: AntFlow

I built AntFlow to solve this exact problem. Instead of batch-by-batch processing, AntFlow uses worker pools where:

✅ Each worker handles tasks independently
✅ When a worker finishes, it immediately grabs the next task
✅ Slow tasks don't block fast ones
✅ Concurrency stays at the configured level while work remains (e.g., 10 tasks running simultaneously)
✅ Built-in retry logic for failed tasks
✅ Multi-stage pipelines for complex workflows
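The worker-pool idea in the list above can be sketched with nothing but the standard library. This is an illustration of the general pattern using `asyncio.Queue`, not AntFlow's implementation; `run_pool`, `worker`, and `double` are hypothetical names introduced here.

```python
import asyncio


async def worker(queue, results, handler):
    # Each worker loops forever: take the next item as soon as it is free.
    while True:
        item = await queue.get()
        try:
            results[item] = await handler(item)
        except Exception as exc:
            results[item] = exc  # record the failure instead of killing the worker
        finally:
            queue.task_done()


async def run_pool(items, handler, workers=10):
    queue = asyncio.Queue()
    for item in items:
        queue.put_nowait(item)
    results = {}
    tasks = [asyncio.create_task(worker(queue, results, handler)) for _ in range(workers)]
    await queue.join()  # returns once every queued item has been handled
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return results


async def double(x):
    await asyncio.sleep(0.01)
    return x * 2


pool_results = asyncio.run(run_pool(range(5), double, workers=2))
print(pool_results)
```

Because workers pull from a shared queue, a slow item occupies only the one worker that picked it up.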

Result: My OpenAI batch processing went from taking hours to completing in a fraction of the time, with automatic retry handling and zero idle time.
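To get a feel for the size of the gain, here is a rough simulation under stated assumptions: 100 batches, 10 workers, every batch takes 5 minutes except one that takes 30. The numbers are illustrative, not measurements from the author's workload, and `pool_makespan` is a hypothetical helper rather than part of AntFlow.

```python
import heapq


def pool_makespan(durations, workers):
    """Wall time when each free worker immediately takes the next task."""
    free_at = [0] * workers  # min-heap of times at which each worker becomes free
    heapq.heapify(free_at)
    finish = 0
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + duration
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


durations = [30] + [5] * 99  # minutes; the first batch is the stuck one

# Group-at-a-time: each group of 10 waits for its slowest member.
grouped = sum(max(durations[i:i + 10]) for i in range(0, len(durations), 10))
pooled = pool_makespan(durations, workers=10)
print(grouped, pooled)  # 75 55
```

In this toy scenario the stuck batch delays only its own worker, so the pool finishes in 55 minutes instead of 75.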

AntFlow: Modern async execution library with concurrent.futures-style API and advanced pipelines

Key Features

🚀 Worker Pool Architecture

Independent workers that never block each other
Automatic task distribution
Optimal resource utilization

🔄 Multi-Stage Pipelines

Chain operations with configurable worker pools per stage
Each stage runs independently
Data flows automatically between stages
Priority Queues: Assign priority to items to bypass sequential processing (NEW)
Interactive Control: Resume pipelines and inject items into any stage (NEW)
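As a rough picture of what a priority queue buys you, the sketch below uses the standard library's `asyncio.PriorityQueue`. It does not show AntFlow's priority API, which is not described here; `drain_by_priority` is a hypothetical helper, and the "lower number is served first" convention is an assumption of this example.

```python
import asyncio
import itertools


async def drain_by_priority(entries):
    """entries: iterable of (priority, item); lower numbers are served first."""
    queue = asyncio.PriorityQueue()
    counter = itertools.count()  # tie-breaker keeps insertion order within a priority
    for priority, item in entries:
        queue.put_nowait((priority, next(counter), item))
    order = []
    while not queue.empty():
        _, _, item = queue.get_nowait()
        order.append(item)
    return order


order = asyncio.run(
    drain_by_priority([(5, "routine-a"), (5, "routine-b"), (0, "urgent")])
)
print(order)  # ['urgent', 'routine-a', 'routine-b']
```

The item submitted last is handled first because its priority value is lowest.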

💪 Built-in Resilience

Per-task retry with exponential backoff
Per-stage retry for transactional operations
Failed tasks don't stop the pipeline
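Per-task retry with exponential backoff can be sketched in a few lines of plain asyncio. This is a minimal illustration of the technique, not AntFlow's retry code (the Requirements section lists tenacity as a dependency); `call_with_retries`, `base_delay`, and `flaky` are hypothetical names.

```python
import asyncio


async def call_with_retries(func, *args, retries=3, base_delay=0.1):
    """Try func up to retries + 1 times, doubling the delay after each failure."""
    for attempt in range(retries + 1):
        try:
            return await func(*args)
        except Exception:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)


attempts = {"n": 0}


async def flaky():
    # Fails twice, then succeeds, to mimic a transient error.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


retry_result = asyncio.run(call_with_retries(flaky, retries=3, base_delay=0.001))
print(retry_result, attempts["n"])  # ok 3
```

With `retries=3` the function is attempted at most four times in total.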

📊 Real-time Monitoring & Dashboards

Built-in Progress Bar - Simple progress=True flag for terminal progress
Three Dashboard Levels - Compact, Detailed, and Full dashboards
Custom Dashboards - Implement DashboardProtocol for your own UI
Worker State Tracking - Know what each worker is doing in real-time
Performance Metrics - Track items processed, failures, avg time per worker
Error Summary - Aggregated error statistics with get_error_summary()
StatusTracker - Real-time item tracking with full history

🎯 Familiar API

Drop-in async replacement for concurrent.futures
submit(), map(), as_completed() methods
Clean, intuitive interface
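For readers who have not used `concurrent.futures`, this is the standard-library shape that the three method names come from. The snippet uses only `ThreadPoolExecutor` from the standard library; AntFlow's async signatures may differ from these.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def square(x):
    return x * x


with ThreadPoolExecutor(max_workers=4) as executor:
    # submit(): schedule one call and get a future back
    single = executor.submit(square, 3).result()

    # map(): apply a function to every input, results in input order
    mapped = list(executor.map(square, range(5)))

    # as_completed(): iterate over futures in the order they finish
    futures = [executor.submit(square, n) for n in range(5)]
    completed = sorted(f.result() for f in as_completed(futures))

print(single, mapped, completed)
```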

✨ Fluent APIs (NEW)

Pipeline.quick() - One-liner for simple pipelines
Pipeline.create() - Fluent builder pattern
Result Streaming - pipeline.stream() for processing results as they complete

Use Cases

✅ Perfect for:

Batch API Processing - OpenAI, Anthropic, any batch API
ETL Pipelines - Extract, transform, load at scale
Web Scraping - Fetch, parse, store web data efficiently
Data Processing - Process large datasets with retry logic
Microservices - Chain async service calls with error handling

⚡ Real-world Impact:

Process large batches without bottlenecks
Automatic retry for transient failures
Zero idle time = maximum throughput
Clear observability with metrics and callbacks

Quick Install

pip install AntFlow

Quick Start

AntFlow offers three equivalent ways to create pipelines. Choose based on your needs:

Method 1: Fluent Builder API (Concise & Recommended)

import asyncio
from antflow import Pipeline

async def fetch(x):
    await asyncio.sleep(0.1)
    return f"data_{x}"

async def main():
    items = range(10)
    results = await (
        Pipeline.create()
        .add("Fetch", fetch, workers=5, retries=3)
        .run(items, progress=True)
    )
    print(f"Processed {len(results)} items")

if __name__ == "__main__":
    asyncio.run(main())

Method 2: Stage Objects (Full Control)

import asyncio
from antflow import Pipeline, Stage

async def process(x):
    await asyncio.sleep(0.1)
    return x * 2

async def main():
    items = range(10)
    stage = Stage(name="Process", workers=5, tasks=[process])
    pipeline = Pipeline(stages=[stage])
    results = await pipeline.run(items, progress=True)
    print(f"Processed {len(results)} items")

if __name__ == "__main__":
    asyncio.run(main())

Method 3: Quick One-Liner

import asyncio
from antflow import Pipeline

async def simple_task(x):
    return x + 1

async def main():
    results = await Pipeline.quick(range(10), simple_task, workers=5, progress=True)
    print(f"Processed {len(results)} items")

if __name__ == "__main__":
    asyncio.run(main())

Which Method to Choose?

Method	When to Use
Stage objects	Fine-grained control, custom callbacks, task concurrency limits
Fluent API	Clean multi-stage pipelines, quick prototyping
Pipeline.quick()	Simple scripts, single-task processing

All three methods produce the same result - they're just different ways to express the same thing.

Built-in Progress & Dashboards

All display options are optional. By default, pipelines run silently.

import asyncio
from antflow import Pipeline

async def task(x):
    await asyncio.sleep(0.01)
    return x * 2

async def main():
    items = range(50)
    # Dashboard options: "compact", "detailed", "full"
    results = await Pipeline.quick(items, task, workers=5, dashboard="detailed")

if __name__ == "__main__":
    asyncio.run(main())

Tip: For multi-stage pipelines, use dashboard="detailed" to see progress per stage and identify bottlenecks.

Stream Results

Process results as they complete:

import asyncio
from antflow import Pipeline

async def process(x):
    await asyncio.sleep(0.1)
    return f"result_{x}"

async def main():
    pipeline = Pipeline.create().add("Process", process, workers=5).build()
    
    async for result in pipeline.stream(range(10)):
        print(f"Got: {result.value}")

if __name__ == "__main__":
    asyncio.run(main())
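For comparison, the same "handle each result as soon as it is ready" idea can be written in plain asyncio with `asyncio.as_completed`. This is a sketch of the general pattern, not of how `pipeline.stream()` works internally; `stream_results` and `slow_then_fast` are hypothetical names.

```python
import asyncio


async def slow_then_fast(x):
    # Earlier items sleep longer, so completion order differs from input order.
    await asyncio.sleep(0.01 * (5 - x))
    return f"result_{x}"


async def stream_results(items):
    tasks = [asyncio.create_task(slow_then_fast(i)) for i in items]
    for finished in asyncio.as_completed(tasks):
        yield await finished  # yields in completion order, not input order


async def collect():
    return [result async for result in stream_results(range(5))]


streamed = asyncio.run(collect())
print(streamed)
```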

Traditional API

For full control, use the traditional Stage and Pipeline API:

import asyncio
from antflow import Pipeline, Stage

async def upload_batch(batch_data):
    await asyncio.sleep(0.1)
    return "batch_id"

async def check_status(batch_id):
    await asyncio.sleep(0.1)
    return "result_url"

async def download_results(result_url):
    await asyncio.sleep(0.1)
    return "processed_data"

async def save_to_db(processed_data):
    await asyncio.sleep(0.1)
    return "saved"

async def main():
    # Build the pipeline with explicit stages
    upload_stage = Stage(name="Upload", workers=10, tasks=[upload_batch])
    check_stage = Stage(name="Check", workers=10, tasks=[check_status])
    download_stage = Stage(name="Download", workers=10, tasks=[download_results])
    save_stage = Stage(name="Save", workers=5, tasks=[save_to_db])

    pipeline = Pipeline(stages=[upload_stage, check_stage, download_stage, save_stage])

    # Process with progress bar
    batches = ["batch1", "batch2", "batch3"]
    results = await pipeline.run(batches, progress=True)
    print(f"Results: {len(results)} items")

if __name__ == "__main__":
    asyncio.run(main())

What happens: Each stage has its own worker pool. Workers process tasks independently. As soon as a worker finishes, it picks the next task. No waiting. No idle time. Maximum throughput.
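The behaviour described above, where each stage has its own pool and items move on as soon as they are ready, can be approximated with one `asyncio.Queue` between each pair of stages. This is a simplified sketch, not AntFlow's implementation: `run_stages` and `stage_worker` are hypothetical names, and the handlers are assumed not to raise.

```python
import asyncio


async def stage_worker(handler, inbox, outbox):
    while True:
        item = await inbox.get()
        try:
            await outbox.put(await handler(item))  # hand the result to the next stage
        finally:
            inbox.task_done()


async def run_stages(items, stages):
    """stages: list of (handler, worker_count) pairs, applied in order."""
    queues = [asyncio.Queue() for _ in range(len(stages) + 1)]
    tasks = [
        asyncio.create_task(stage_worker(handler, queues[i], queues[i + 1]))
        for i, (handler, count) in enumerate(stages)
        for _ in range(count)
    ]
    for item in items:
        queues[0].put_nowait(item)
    # Joining inboxes in order guarantees each stage has forwarded all its items.
    for queue in queues[:-1]:
        await queue.join()
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    out = queues[-1]
    return [out.get_nowait() for _ in range(out.qsize())]


async def add_one(x):
    await asyncio.sleep(0.01)
    return x + 1


async def times_two(x):
    await asyncio.sleep(0.01)
    return x * 2


staged = asyncio.run(run_stages(range(4), [(add_one, 2), (times_two, 3)]))
print(sorted(staged))  # [2, 4, 6, 8]
```

Results arrive in completion order, which is why the example sorts them before printing.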

Core Concepts

AsyncExecutor: Simple Concurrent Execution

For straightforward parallel processing, AsyncExecutor provides a concurrent.futures-style API:

import asyncio
from antflow import AsyncExecutor

async def process_item(x):
    await asyncio.sleep(0.1)
    return x * 2

async def main():
    async with AsyncExecutor(max_workers=10) as executor:
        # Using map() - returns list directly (like list(executor.map(...)) in concurrent.futures)
        # retries=3 means it will try up to 4 times total with exponential backoff
        results = await executor.map(process_item, range(100), retries=3)
        print(f"Processed {len(results)} items")

asyncio.run(main())

Pipeline: Multi-Stage Processing

For complex workflows with multiple steps, you can build a Pipeline:

import asyncio
from antflow import Pipeline, Stage

async def fetch(x):
    await asyncio.sleep(0.1)
    return f"data_{x}"

async def process(x):
    await asyncio.sleep(0.1)
    return x.upper()

async def save(x):
    await asyncio.sleep(0.1)
    return f"saved_{x}"

async def main():
    # Define stages with different worker counts
    fetch_stage = Stage(
        name="Fetch",
        workers=10,
        tasks=[fetch],
        # Limit specific tasks to avoid rate limits
        task_concurrency_limits={"fetch": 2}
    )
    
    process_stage = Stage(name="Process", workers=5, tasks=[process])
    save_stage = Stage(name="Save", workers=3, tasks=[save])

    # Build and run pipeline
    pipeline = Pipeline(stages=[fetch_stage, process_stage, save_stage])
    results = await pipeline.run(range(50), progress=True)

    print(f"Completed: {len(results)} items")
    print(f"Stats: {pipeline.get_stats()}")

if __name__ == "__main__":
    asyncio.run(main())

Why different worker counts?

Fetch: I/O bound, use more workers (10)
Process: CPU bound, moderate workers (5)
Save: Rate-limited API, fewer workers (3)
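The `task_concurrency_limits={"fetch": 2}` setting in the example above caps one task below its stage's worker count. A common way to get that effect in plain asyncio is a semaphore; the sketch below shows the technique only and makes no claim about how AntFlow enforces the limit. `limited_fetch` and the peak counter are hypothetical.

```python
import asyncio


async def run_limited(item_count=10, limit=2):
    gate = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def limited_fetch(x):
        nonlocal running, peak
        async with gate:  # at most `limit` coroutines are inside this block
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for a rate-limited call
            running -= 1
            return f"data_{x}"

    results = await asyncio.gather(*(limited_fetch(i) for i in range(item_count)))
    return results, peak


limited_results, peak = asyncio.run(run_limited())
print(len(limited_results), peak)  # 10 2
```

All ten calls are started at once, but no more than two are ever in flight.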

Real-Time Monitoring with StatusTracker

Track every item as it flows through your pipeline with StatusTracker. Get real-time status updates, query current states, and access complete event history.

from antflow import Pipeline, Stage, StatusTracker
import asyncio

# Mock tasks
async def fetch(x): return x
async def process(x): return x * 2
async def save(x): return x

# 1. Define a callback for real-time updates
async def log_event(event):
    print(f"Item {event.item_id}: {event.status} @ {event.stage}")

tracker = StatusTracker(on_status_change=log_event)

# Define stages
stage1 = Stage(name="Fetch", workers=5, tasks=[fetch])
stage2 = Stage(name="Process", workers=3, tasks=[process])
stage3 = Stage(name="Save", workers=5, tasks=[save])

pipeline = Pipeline(
    stages=[stage1, stage2, stage3],
    status_tracker=tracker
)

# 2. Run pipeline (logs will print in real-time)
async def main():
    items = range(50)
    results = await pipeline.run(items)

    # 3. Get final statistics
    stats = tracker.get_stats()
    print(f"Completed: {stats['completed']}")
    print(f"Failed: {stats['failed']}")

    # Get full history for an item
    history = tracker.get_history(item_id=0)

asyncio.run(main())

See the examples/ directory for more advanced usage, including built-in dashboards (dashboard="compact", "detailed", "full") and a Web Dashboard example (examples/web_dashboard/).

Monitoring: Dashboard vs StatusTracker

AntFlow provides two complementary monitoring mechanisms:

Dashboard (Polling): Built-in visual monitoring with periodic updates. Perfect for interactive debugging and real-time progress visualization. See Dashboard Guide.
StatusTracker (Event-driven): Async callbacks invoked immediately on events. Ideal for logging to external systems, integrating with monitoring tools, and complete event history. See StatusTracker Guide.

See Monitoring Guide for a detailed comparison and examples of both mechanisms.
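The difference between the two mechanisms can be shown with a toy example. `TinyTracker` and `poll_counts` below are hypothetical stand-ins written for this illustration; they are not AntFlow's `StatusTracker` or dashboard classes.

```python
import asyncio


class TinyTracker:
    """Hypothetical stand-in that records events and notifies a callback."""

    def __init__(self, on_change=None):
        self.on_change = on_change
        self.counts = {}
        self.history = []

    async def record(self, item_id, status):
        self.history.append((item_id, status))
        self.counts[status] = self.counts.get(status, 0) + 1
        if self.on_change is not None:
            await self.on_change(item_id, status)  # event-driven: fires immediately


async def poll_counts(tracker, snapshots, interval=0.005):
    while True:  # polling: samples the current state on a timer
        snapshots.append(dict(tracker.counts))
        await asyncio.sleep(interval)


async def demo():
    events, snapshots = [], []

    async def on_change(item_id, status):
        events.append((item_id, status))

    tracker = TinyTracker(on_change=on_change)
    poller = asyncio.create_task(poll_counts(tracker, snapshots))
    for item_id in range(5):
        await asyncio.sleep(0.01)  # stand-in for real work
        await tracker.record(item_id, "completed")
    poller.cancel()
    await asyncio.gather(poller, return_exceptions=True)
    return events, snapshots, tracker


events, snapshots, tracker = asyncio.run(demo())
print(len(events), tracker.counts)
```

The callback sees every event exactly once, while the poller sees only whatever the state happens to be at each tick.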

Documentation

AntFlow has comprehensive documentation to help you get started and master advanced features:

🚀 Getting Started

Quick Start Guide - Get up and running in minutes
Installation Guide - Installation instructions

📚 User Guides

AsyncExecutor Guide - Using the concurrent.futures-style API
Concurrency Control - Managing concurrency limits and semaphores
Pipeline Guide - Building multi-stage workflows
Monitoring Guide - Dashboard vs StatusTracker comparison
Dashboard Guide - Real-time monitoring and dashboards
Error Handling - Managing failures and retries
Worker Tracking - Monitoring individual workers

💡 Examples

Examples Index - Start Here: List of all 11+ example scripts
Basic Examples - Simple use cases to get started
Advanced Examples - Complex workflows and patterns

📖 API Reference

API Index - Complete API documentation
AsyncExecutor - Executor API reference
Pipeline - Pipeline API reference
StatusTracker - Status tracking and monitoring
Exceptions - Exception types
Types - Type definitions
Utils - Utility functions

You can also build and serve the documentation locally using mkdocs:

pip install mkdocs-material
mkdocs serve

Then open your browser to http://127.0.0.1:8000.

Requirements

Python 3.9+
tenacity >= 8.0.0

Note: For Python 3.9-3.10, the taskgroup backport is automatically installed.

Running Tests

To run the test suite, first install the development dependencies from the project root:

pip install -e ".[dev]"

Then, you can run the tests using pytest:

pytest

Contributing

Contributions are welcome! Please see our Contributing Guidelines.

License

MIT License - see LICENSE file for details.

Made with ❤️ to solve real problems in production

Release history

Version	Released
0.8.3 (this version)	May 23, 2026
0.8.2	May 22, 2026
0.8.1	May 22, 2026
0.8.0	May 21, 2026
0.7.3	Mar 12, 2026
0.7.2	Jan 15, 2026
0.7.1	Jan 15, 2026
0.7.0	Jan 15, 2026
0.6.0	Jan 2, 2026
0.5.0	Jan 2, 2026
0.4.1	Dec 18, 2025
0.4.0	Dec 10, 2025
0.3.6	Dec 2, 2025
0.3.5	Dec 2, 2025
0.3.4	Nov 29, 2025
0.3.3	Nov 29, 2025
0.3.2	Nov 28, 2025
0.3.1	Nov 28, 2025
0.3.0	Nov 27, 2025
0.2.1	Nov 12, 2025
0.2.0	Nov 12, 2025
0.1.0	Nov 12, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

antflow-0.8.3.tar.gz (68.8 kB view details)

Uploaded May 23, 2026 Source

Built Distribution

antflow-0.8.3-py3-none-any.whl (46.9 kB view details)

Uploaded May 23, 2026 Python 3

File details

Details for the file antflow-0.8.3.tar.gz.

File metadata

Download URL: antflow-0.8.3.tar.gz
Upload date: May 23, 2026
Size: 68.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for antflow-0.8.3.tar.gz
Algorithm	Hash digest
SHA256	`019a20d2804c2c98d9c14b4cb50804ae92ef789d7b34722c94e6c6d0de95484d`
MD5	`26beb5b7fe628612f46e65905b655cbb`
BLAKE2b-256	`c9a628b75f097b6208cd785cab45c97dcec3b992fbd75a88af49d5483be7d778`

See more details on using hashes here.

Provenance

The following attestation bundles were made for antflow-0.8.3.tar.gz:

Publisher: pypi.yml on rodolfonobrega/AntFlow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: antflow-0.8.3.tar.gz
- Subject digest: 019a20d2804c2c98d9c14b4cb50804ae92ef789d7b34722c94e6c6d0de95484d
- Sigstore transparency entry: 1615551081
- Sigstore integration time: May 23, 2026
Source repository:
- Permalink: rodolfonobrega/AntFlow@ac83cfb8930027bc76dcc338384ef1d6e72568e8
- Branch / Tag: refs/tags/v0.8.3
- Owner: https://github.com/rodolfonobrega
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi.yml@ac83cfb8930027bc76dcc338384ef1d6e72568e8
- Trigger Event: release

File details

Details for the file antflow-0.8.3-py3-none-any.whl.

File metadata

Download URL: antflow-0.8.3-py3-none-any.whl
Upload date: May 23, 2026
Size: 46.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for antflow-0.8.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6cc6b6af1737216532fab8f98b6c549a7977a9e808b57f93331d0ed4a1a2fe37`
MD5	`84e84b0aa2be4bae50d927ecf7f08c87`
BLAKE2b-256	`30f6f94d162c7fb6e25e602e41bbb0d91ba3bb153761a276e5bed8c8db540d2c`

See more details on using hashes here.

Provenance

The following attestation bundles were made for antflow-0.8.3-py3-none-any.whl:

Publisher: pypi.yml on rodolfonobrega/AntFlow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: antflow-0.8.3-py3-none-any.whl
- Subject digest: 6cc6b6af1737216532fab8f98b6c549a7977a9e808b57f93331d0ed4a1a2fe37
- Sigstore transparency entry: 1615551088
- Sigstore integration time: May 23, 2026
Source repository:
- Permalink: rodolfonobrega/AntFlow@ac83cfb8930027bc76dcc338384ef1d6e72568e8
- Branch / Tag: refs/tags/v0.8.3
- Owner: https://github.com/rodolfonobrega
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi.yml@ac83cfb8930027bc76dcc338384ef1d6e72568e8
- Trigger Event: release
