# Tadween-core

Tadween-core is a modular, embedded micro-orchestrator for building complex, stateful processing pipelines, from simple linear chains to full DAG-based workflows, with zero external services required.
## Quick Start
Tadween-core provides the orchestration logic and contracts for any asynchronous, stateful workflow.
```python
from tadween_core.workflow import Workflow
from tadween_core.broker import InMemoryBroker

# 1. Initialize core components
broker = InMemoryBroker()
workflow = Workflow(broker=broker, name="MyCustomPipeline")

# 2. Add stages (Handlers + Policies); IngestHandler and AIHandler
#    are user-defined Handler implementations
workflow.add_stage("ingest", IngestHandler())
workflow.add_stage("process", AIHandler())

# 3. Link them into a DAG
workflow.link("ingest", "process")
workflow.set_entry_point("ingest")

# 4. Build and run
workflow.build()
workflow.submit({"data": "source_uri"})
```
Explore the `examples/` directory for complete, runnable use cases.
## Core Philosophy
Tadween-core is built on the principle that every step of a pipeline should be optional, swappable, and "declarative." It provides the contracts and orchestration, while specific providers implement the computational logic.
- Composable: Build complex DAGs from simple, reusable stages.
- Environment Agnostic: Run locally, in production, or distributed.
- Type-Safe: Leveraging Pydantic for robust I/O and state management.
- Stateful: Integrated caching and persistence layers.
- Concurrency-First: Managed thread and process-based task queues resolve I/O and CPU bottlenecks by allowing stages to run in parallel where the DAG permits.
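The concurrency point can be illustrated outside the library. The sketch below is plain Python with hypothetical `download`/`decompress` stand-ins, not Tadween-core API; it only shows why per-item pipelining beats batch-sequential stages when the DAG permits it:

```python
from concurrent.futures import ThreadPoolExecutor

def download(item):
    # stand-in for an I/O-bound stage (e.g. fetching a file)
    return f"{item}-downloaded"

def decompress(payload):
    # stand-in for a CPU-bound stage
    return payload.replace("downloaded", "ready")

items = ["a", "b", "c"]

# Per-item pipelining: each item enters the next stage as soon as its
# own download finishes, instead of waiting for the whole batch.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(download, item) for item in items]
    results = [decompress(f.result()) for f in futures]

print(results)  # ['a-ready', 'b-ready', 'c-ready']
```

In a strictly linear pipeline, `decompress` could not start until every `download` had completed; here the two stages overlap per item.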
Use Tadween-core if you are building a high-performance product that needs to process heavy data concurrently on a single node with minimal overhead.
## System Architecture
The framework is composed of several independent but integrated modules:
| Component | Description | Documentation |
|---|---|---|
| Handler | The unit of execution (Computational or Operational). | docs |
| Task Queue | Manages async execution (Threads or Processes). | docs |
| Broker | The message bus for inter-stage communication. | docs |
| Cache | High-performance, type-safe caching system. | docs |
| Artifact | The core data model used in the persistence layer. | docs |
| Repository | Persistence layer for Artifacts. | docs |
| Stage | Domain layer; bundles a task queue, cache, and repository, and is managed by the Workflow. | docs |
| Workflow | Orchestrates the DAG and manages stage-level logic. | docs |
For implementation details, refer to the specific component documentation linked above.
## Installation & Tests
Ensure you have `uv` installed.

```shell
# Run tests
uv run pytest

# Verbose output
uv run pytest -v -s
```
## Logger
The library logger follows Python best practices and is silent by default. To configure logging quickly, use `tadween_core.set_logger`, which installs a default configuration on the library's parent logger. Review the implementation here.

The root logger for the library uses the namespace `tadween`, and each major component logger uses standard hierarchical naming, for example `tadween.cache`, `tadween.stage`, `tadween.repo`, etc.
You may provide your own logger via dependency injection (DI), which is supported by most major components.
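Because the loggers follow standard hierarchical naming, plain `logging` configuration on the `tadween` namespace is enough. A minimal sketch (the handler and format choices are illustrative, not library defaults):

```python
import logging

# Attach one handler to the library's root logger; child loggers such as
# "tadween.cache" propagate to it by default.
lib_logger = logging.getLogger("tadween")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s: %(message)s"))
lib_logger.addHandler(handler)
lib_logger.setLevel(logging.INFO)

# Component loggers inherit the configuration through propagation.
logging.getLogger("tadween.cache").info("configured via the parent logger")
```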
### Custom Logger Integration
If you want to avoid interacting with the library's logger hierarchy, you can either:

- Use a logger with a different name (e.g., `"my_custom_logger"`).
- Disable propagation by setting `logger.propagate = False`.

Make sure to attach handlers to loggers that have propagation disabled.
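The second option takes a few lines of standard `logging` code (the logger name here is the hypothetical one from the example above):

```python
import logging

custom = logging.getLogger("my_custom_logger")
custom.propagate = False                    # detach from ancestor loggers
custom.addHandler(logging.StreamHandler())  # required: records no longer propagate up
custom.setLevel(logging.DEBUG)

custom.debug("handled locally, never reaches the root logger")
```

Without the explicit handler, records from this logger would be silently dropped once propagation is off.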
## The Backstory
Tadween began as a simple, monolithic end-to-end audio-to-text pipeline. The initial prototype was full of hardcoded bits, no clear standards, and no normalization — so any change felt like walking through a minefield. Building it revealed what a solid system really needs, especially when it must run in different environments: local machines, production servers, serverless platforms, or even distributed setups with many GPUs.
Early pipeline design was strictly stage-based: an ASR stage, then an LLM stage. The ASR stage was one big function with dozens of flags: should we run ASR only? ASR + alignment? ASR + alignment + diarization? The output was always the same Pydantic model, with many optional fields set to None. That result then needed to be normalized and cleaned of potential ASR hallucination artifacts before being fed into the LLM, which handled context reconstruction (audio is noisy and ASR quality isn't always perfect), insight extraction, and Q&A.
Because ASR and LLM work are expensive, we introduced an artifact model to track progress and let us retry or resume from where we left off.
The linear nature of the pipeline stages forced a choice: either pass the whole batch through each stage sequentially, or iterate over each artifact and process it one by one — also sequentially. Imagine a pipeline that first needs to download files (I/O) and then decompress them (CPU). In a linear setup you’d wait for all downloads to finish before starting decompression — slow and wasteful unless you write custom code to run I/O and CPU tasks concurrently.
Or imagine swapping out a diarization component: swapping pyannote for nvidia-toolkit should be simple, but it wasn’t.
Storage choices had the same friction. Do we save artifacts to the filesystem? That’s fine locally, but serverless often needs S3. In production you might prefer a database. These environment differences made it clear we needed something more flexible.
We decided to stop patching the pipeline and start building the engine we actually needed. We spent months "dividing and conquering" these pain points:
- We turned the monolithic pipeline into a generic `Workflow` builder.
- We abstracted messaging into `Broker`s and execution into `Task Queue`s.
- We abstracted the domain needs into `Stage`s.
- We abstracted the persistence layer into `Repo`s, with a built-in implementation for lazy loading.
- We redesigned the sequential, dependent I/O stream into `StagePolicy.resolve_inputs`, in addition to a typed caching system.
- We redesigned the Artifact to be lazy-loaded and state-aware, moving from a single "God Object" to a modular entity with persistent "Parts."
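The lazy-loading idea behind the redesigned Artifact can be sketched in isolation. `LazyPart`, its `loader` callable, and the sample payload below are hypothetical names for illustration, not Tadween-core API:

```python
from typing import Any, Callable

class LazyPart:
    """Hypothetical 'Part': the payload is loaded only on first access."""

    _UNSET = object()

    def __init__(self, loader: Callable[[], Any]):
        self._loader = loader        # e.g. reads from disk, S3, or a DB
        self._value: Any = self._UNSET

    def get(self) -> Any:
        if self._value is self._UNSET:
            self._value = self._loader()  # expensive work happens once
        return self._value

calls = []
part = LazyPart(loader=lambda: calls.append("load") or "transcript text")

part.get()  # triggers the loader
part.get()  # cached; the loader is not called again
print(len(calls))  # 1
```

The state-aware entity never pays for a Part it does not touch, which is what makes retry and resume cheap.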