Parallel runtime unifying Python's concurrency ecosystem

VedaRT

Unified Parallel Runtime for Python

VedaRT (Versatile Execution and Dynamic Adaptation Runtime) unifies Python's fragmented concurrency ecosystem—threads, processes, asyncio, and GPU—under a single adaptive, observable API inspired by Rust's Rayon.

Why VedaRT?

Python's concurrency landscape is fragmented:

  • asyncio - Great for I/O, terrible for CPU work
  • threading - Limited by GIL, unpredictable performance
  • multiprocessing - High overhead, serialization issues
  • Ray/Dask - Over-engineered for local workloads
  • CuPy/Numba - Manual GPU management, no CPU fallback
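
Each of these forces a different calling convention for the same logical operation. For contrast, a minimal stdlib sketch of the manual pool selection this fragmentation forces (parallel_map is a hypothetical helper, not part of any library):

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def parallel_map(fn, items, io_bound):
    """Manual executor juggling: the caller must guess whether the
    workload is I/O- or CPU-bound and manage the pool lifetime."""
    pool_cls = ThreadPoolExecutor if io_bound else ProcessPoolExecutor
    with pool_cls() as pool:
        return list(pool.map(fn, items))

squares = parallel_map(lambda x: x * x, range(5), io_bound=True)
# squares == [0, 1, 4, 9, 16]
```

Getting io_bound wrong here silently costs performance; that guess is what an adaptive runtime is meant to remove.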

VedaRT solves this by providing:

One API for all execution modes
Automatic scheduling (threads/processes/GPU)
Zero setup - works out of the box
Deterministic replay for debugging
Observable with built-in telemetry

Features

🚀 Zero-Boilerplate Parallel Computing

No executor setup, no pool management. Just write your logic:

import vedart as veda

# Parallel iteration - automatically optimized
result = veda.par_iter(range(1000)).map(lambda x: x**2).sum()

🧠 Adaptive Scheduling

Automatic executor selection based on workload characteristics:

# I/O-bound → threads
results = veda.par_iter(urls).map(fetch_url).collect()

# CPU-bound → processes  
results = veda.par_iter(data).map(heavy_computation).collect()

# GPU-compatible → GPU
results = veda.par_iter(matrices).gpu_map(matrix_multiply).collect()

🎯 Type-Safe & Modern

Full type hints, mypy strict mode compliance:

from vedart import par_iter
from typing import List

def process_data(items: List[int]) -> List[int]:
    return par_iter(items).map(lambda x: x * 2).collect()

📊 Built-in Telemetry

Rich observability with zero configuration:

import vedart as veda

# Your parallel work
result = veda.par_iter(data).map(process).collect()

# Get metrics
metrics = veda.telemetry.snapshot()
print(f"Tasks executed: {metrics.tasks_executed}")
print(f"Avg latency: {metrics.avg_latency_ms}ms")

# Export to Prometheus
metrics.export_prometheus()  # Ready for Grafana

🔬 Deterministic Debugging

Reproduce bugs reliably with deterministic mode:

with veda.deterministic(seed=42):
    # Exact same execution order every time
    result = flaky_parallel_computation()

⚡ GPU Acceleration

Seamless GPU offload with automatic CPU fallback:

@veda.gpu
def matrix_multiply(A, B):
    return A @ B  # Runs on GPU if available, CPU otherwise
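
One way such a decorator can be structured, sketched with a hypothetical gpu wrapper (this illustrates the fallback pattern only, not VedaRT's actual implementation):

```python
import functools

def gpu(fn):
    """Illustrative GPU-or-CPU dispatch: use CuPy when importable,
    otherwise run the function unchanged on the CPU."""
    try:
        import cupy as xp
    except ImportError:
        xp = None  # no GPU backend installed; CPU fallback

    @functools.wraps(fn)
    def wrapper(*args):
        if xp is None:
            return fn(*args)           # CPU path
        moved = [xp.asarray(a) for a in args]
        return xp.asnumpy(fn(*moved))  # GPU path; copy result back to host
    return wrapper

@gpu
def add(a, b):
    return a + b

add(2, 3)  # 5 on either path
```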

Quick Start

Basic Parallel Iteration

import vedart as veda

# Parallel map
result = veda.par_iter(range(1000)).map(lambda x: x**2).sum()
# Output: 332833500

# Map + filter chain
result = (
    veda.par_iter(range(100))
    .map(lambda x: x * 2)
    .filter(lambda x: x > 50)
    .collect()
)

# Fold/reduce operations
product = veda.par_iter([1, 2, 3, 4, 5]).fold(1, lambda acc, x: acc * x)
# Output: 120

Scoped Parallel Execution

from vedart import scope
import time

def slow_task(x):
    time.sleep(0.1)
    return x ** 2

# Spawn tasks in parallel scope
with scope() as s:
    futures = [s.spawn(slow_task, i) for i in range(5)]
    results = s.wait_all()  # [0, 1, 4, 9, 16]

Complex Data Pipeline

import vedart as veda

def preprocess(item):
    # CPU-bound preprocessing
    return item.lower().strip()

async def save_to_db(item):
    # I/O-bound async operation ('db' is an assumed database client)
    await db.save(item)

# Mixed execution modes in one pipeline
results = (
    veda.par_iter(raw_data)
    .map(preprocess)           # Parallel CPU work
    .async_map(save_to_db)     # Async I/O
    .collect()
)

GPU Acceleration

import vedart as veda
import numpy as np

@veda.gpu  # Automatically uses CuPy/Numba if available
def matrix_ops(A, B):
    return A @ B + A.T

A = np.random.rand(1000, 1000)
B = np.random.rand(1000, 1000)
result = matrix_ops(A, B)  # Runs on GPU, falls back to CPU

Installation

# Basic installation
pip install vedart

# With GPU support
pip install vedart[gpu]

# With telemetry
pip install vedart[telemetry]

# Everything
pip install vedart[all]

Core Concepts

Parallel Iterators

Rayon-style parallel iterators with lazy evaluation:

from vedart import par_iter

# Chaining operations
result = (
    par_iter(data)
    .map(transform)      # Transform each item
    .filter(predicate)   # Filter items
    .fold(init, reducer) # Reduce to single value
)

# Common operations
par_iter(items).sum()         # Sum all items
par_iter(items).count()       # Count items  
par_iter(items).max()         # Find maximum
par_iter(items).collect()     # Collect to list
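
The laziness can be pictured with plain generators; a toy single-threaded model of the chaining semantics (LazyChain is illustrative, not VedaRT's parallel implementation):

```python
class LazyChain:
    """Toy model of a lazy iterator chain: each stage wraps a
    generator; nothing runs until a terminal operation is called."""
    def __init__(self, iterable):
        self._it = iter(iterable)

    def map(self, fn):
        return LazyChain(fn(x) for x in self._it)

    def filter(self, pred):
        return LazyChain(x for x in self._it if pred(x))

    def collect(self):
        return list(self._it)  # terminal: drives the whole pipeline

    def fold(self, init, fn):
        acc = init
        for x in self._it:     # terminal: consumes the chain
            acc = fn(acc, x)
        return acc

result = LazyChain(range(10)).map(lambda x: x * 2).filter(lambda x: x > 10).collect()
# result == [12, 14, 16, 18]
```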

Scoped Execution

Structured concurrency with automatic cleanup:

from vedart import scope

with scope() as s:
    # All spawned tasks finish before exiting scope
    f1 = s.spawn(task1)
    f2 = s.spawn(task2, arg1, arg2)
    results = s.wait_all()
# Guaranteed: all tasks complete, resources cleaned up
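
The guarantee can be approximated on top of the stdlib; a sketch of the scope() semantics using concurrent.futures (this Scope class is illustrative, not the library's implementation):

```python
from concurrent.futures import ThreadPoolExecutor

class Scope:
    """Structured-concurrency sketch: every spawned task completes
    before the with-block exits."""
    def __enter__(self):
        self._pool = ThreadPoolExecutor()
        self._futures = []
        return self

    def spawn(self, fn, *args):
        future = self._pool.submit(fn, *args)
        self._futures.append(future)
        return future

    def wait_all(self):
        return [f.result() for f in self._futures]

    def __exit__(self, *exc_info):
        self._pool.shutdown(wait=True)  # block until every task finishes
        return False

with Scope() as s:
    futures = [s.spawn(lambda x: x * x, i) for i in range(5)]
    results = s.wait_all()
# results == [0, 1, 4, 9, 16]
```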

Configuration

Customize runtime behavior:

import vedart as veda

# Builder pattern
config = (
    veda.Config.builder()
    .num_threads(8)
    .num_processes(4)
    .enable_gpu(True)
    .telemetry(True)
    .build()
)

veda.init(config)

# Or use presets
veda.init(veda.Config.thread_only())  # Thread pool only
veda.init(veda.Config.adaptive())     # Full adaptive scheduling

Telemetry & Monitoring

Built-in metrics collection:

import vedart as veda

# Run your workload
result = veda.par_iter(data).map(process).collect()

# Get metrics snapshot
metrics = veda.telemetry.snapshot()

print(f"Tasks executed: {metrics.tasks_executed}")
print(f"Tasks failed: {metrics.tasks_failed}")
print(f"Avg latency: {metrics.avg_latency_ms}ms")
print(f"P99 latency: {metrics.p99_latency_ms}ms")
print(f"CPU usage: {metrics.cpu_utilization_percent}%")

# Export formats
metrics.export_json()        # JSON format
metrics.export_prometheus()  # Prometheus format

Deterministic Mode

Reproducible execution for testing:

import vedart as veda

# Same seed = same execution order
with veda.deterministic(seed=42):
    result1 = parallel_workload()

with veda.deterministic(seed=42):
    result2 = parallel_workload()

assert result1 == result2  # Always true

# Save execution trace
with veda.deterministic(seed=42, trace_file="debug.trace"):
    buggy_function()

# Replay exact execution
veda.replay("debug.trace")
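
The core idea, that the same seed yields the same schedule, can be illustrated with a toy seeded runner (purely conceptual; this is not VedaRT's scheduler):

```python
import random

def run_seeded(tasks, seed):
    """Toy deterministic scheduler: a fixed seed fixes the order in
    which queued tasks execute."""
    order = list(range(len(tasks)))
    random.Random(seed).shuffle(order)  # reproducible permutation
    return [tasks[i]() for i in order]

tasks = [lambda i=i: i * i for i in range(5)]
assert run_seeded(tasks, seed=42) == run_seeded(tasks, seed=42)  # same seed, same order
```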

Performance

Benchmarks vs Alternatives

Workload               VedaRT   Ray     Dask    asyncio   threading
CPU-bound (uniform)    1.0x     0.9x    0.77x   N/A       1.05x
CPU-bound (variable)   1.0x     0.67x   0.48x   N/A       0.83x
I/O-bound              1.0x     1.11x   N/A     1.05x     0.91x
GPU-accelerated        1.0x     0.83x   0.56x   N/A       0.05x
Mixed workload         1.0x     0.71x   0.62x   N/A       0.78x

Task spawn overhead: ~85ns per task
Memory overhead: <5% vs raw threading
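
These figures depend heavily on hardware and workload. A rough way to gauge per-task spawn overhead on your own machine, using the stdlib for illustration rather than VedaRT's internal timers:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def spawn_overhead_ns(n=10_000):
    """Approximate per-task submit-and-complete cost for a stdlib
    thread pool, in nanoseconds per task."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        start = time.perf_counter()
        futures = [pool.submit(int) for _ in range(n)]
        for f in futures:
            f.result()
        elapsed = time.perf_counter() - start
    return elapsed / n * 1e9

print(f"~{spawn_overhead_ns():.0f} ns per task")
```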

Examples

Explore real-world examples in the examples/ directory.

Run any example:

python examples/01_hello_parallel.py

Comparison with Alternatives

vs Ray

VedaRT: Zero setup, local-first design
Ray: Heavy dependencies, complex setup for local use

vs Dask

VedaRT: Lightweight, simple API
Dask: High memory overhead, scheduler complexity

vs asyncio

VedaRT: Works for both I/O and CPU workloads
asyncio: Only I/O-bound, steep learning curve

vs threading/multiprocessing

VedaRT: Automatic mode selection, adaptive scaling
stdlib: Manual pool management, no adaptation

Requirements

  • Python: 3.10, 3.11, 3.12, 3.13
  • OS: Linux, macOS, Windows
  • Optional:
    • CuPy or Numba for GPU support
    • psutil for system metrics (auto-installed)

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Run tests: pytest tests/
  4. Ensure code quality:
    ruff check src/vedart tests
    black src/vedart tests
    mypy src/vedart --strict
    
  5. Commit your changes (git commit -m 'feat: add amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

Development Setup

# Clone repository
git clone https://github.com/TIVerse/vedart.git
cd vedart

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Run full CI suite locally
./run_ci_tests.sh

Roadmap

v1.0 (Current) ✅

  • Adaptive scheduler (threads/processes/async/GPU)
  • Parallel iterators
  • Scoped execution
  • Telemetry and metrics
  • Deterministic mode

v1.1 (Planned)

  • Custom executor plugins
  • Advanced load balancing strategies
  • Distributed tracing integration

v2.0 (Future)

  • Multi-node distributed execution
  • Network-aware scheduling
  • Fault tolerance and checkpointing

Authors

TIVerse Team

Acknowledgments

Inspired by:

  • Rayon - Rust's data parallelism library
  • Ray - Distributed computing framework
  • Tokio - Async runtime design patterns

License

MIT License - See LICENSE file for details.

Download files

Download the file for your platform.

Source Distribution

vedart-1.0.0.tar.gz (42.1 kB)

Uploaded Source

Built Distribution

vedart-1.0.0-py3-none-any.whl (50.1 kB)

Uploaded Python 3

File details

Details for the file vedart-1.0.0.tar.gz.

File metadata

  • Download URL: vedart-1.0.0.tar.gz
  • Upload date:
  • Size: 42.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for vedart-1.0.0.tar.gz
Algorithm Hash digest
SHA256 a51358ea42a77557f7c3858ec18cb8952b89c77bdfdc74a5d37fb00af06e6569
MD5 ee86c5fbf411288be091c689c2ca520b
BLAKE2b-256 60023ed673f54544782453601826d9e8158c04dbc66a160e83e87fa12bb164b9

File details

Details for the file vedart-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: vedart-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 50.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for vedart-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 50f1b00a74692ac2d14c2e9d5fe7acaaaa7c1be69feaa0f14d8f5cecb7ce86e4
MD5 f1b68e6723f0292c0fdca27555d267ff
BLAKE2b-256 7688d6fe307c92f96b0ac725bc0009ad79aea25b9f390b384ed6c3b36f2dc7e8
