Semantiva: An HPC-ready, domain-driven, type-oriented framework that delivers semantic transparency to advanced scientific computing.

Semantiva

Overview

Semantiva is an open-source, Python-based framework that unifies Domain-Driven Design, Type-Oriented Development, and semantic transparency to streamline data operations. It offers a structured way to define and process domain-specific data types and algorithms, ensuring clarity, consistency, and adaptability even in complex data-driven scenarios.

By enforcing type-safe relationships between data and algorithms, Semantiva simplifies the creation of transparent, interpretable workflows, so teams can focus on solving domain problems rather than battling ambiguous data models.

Semantiva also uses a dual-channel pipeline, in which data and metadata context flow in parallel. This enables dynamic parameter injection: each operation can fetch the parameters it needs from the evolving context. The approach increases reusability (the same operation can be driven by different metadata to serve multiple use cases) and supports on-the-fly configuration changes without code rewrites.
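The dual-channel idea can be illustrated with a minimal, framework-independent sketch. It does not use Semantiva's API: the `Payload` dataclass, the `run` helper, and the `scale`/`offset` functions below are hypothetical stand-ins, and resolving parameters by matching function argument names against context keys is an assumption made only for this illustration.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Payload:
    """Hypothetical stand-in: data travels together with its context."""
    data: Any
    context: Dict[str, Any] = field(default_factory=dict)


def run(payload: Payload, operations: List[Callable[..., Any]]) -> Payload:
    """Apply each operation, reading its parameters from the context."""
    for op in operations:
        # The first argument is the data; the rest are looked up by name.
        names = list(inspect.signature(op).parameters)[1:]
        kwargs = {name: payload.context[name] for name in names}
        payload = Payload(op(payload.data, **kwargs), payload.context)
    return payload


def offset(value: float, addend: float) -> float:
    return value + addend


def scale(value: float, factor: float) -> float:
    return value * factor


# The same two operations, driven by two different contexts.
result_a = run(Payload(2.0, {"addend": 1.0, "factor": 10.0}), [offset, scale])
result_b = run(Payload(2.0, {"addend": -1.0, "factor": 0.5}), [offset, scale])
print(result_a.data)  # 30.0
print(result_b.data)  # 0.5
```

Neither `offset` nor `scale` hardcodes its parameters, so changing the context is enough to change the behaviour of the whole chain.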

Additionally, Semantiva is designed to be AI-compatible, allowing for collaboration with intelligent systems that can reason about, optimize, and even co-develop complex workflows using its well-defined semantic structures.

Install

```
pip install semantiva
```

Quickstart (create a local pipeline)

Create hello_pipeline.yaml with the following contents:

```
extensions: ["semantiva-examples"]

trace:
  driver: jsonl
  output_path: ./trace
  options:
    detail: all

pipeline:
  nodes:
    - processor: FloatValueDataSource
      parameters:
        value: 2.0
    - processor: FloatAddOperation
      parameters:
        addend: 1.0
    - processor: FloatMultiplyOperation
      parameters:
        factor: 10.0
    - processor: FloatCollectValueProbe
      context_key: "result"
    - processor: template:"result_{result}.txt":path
    - processor: FloatTxtFileSaver
```

Run the pipeline:
```
semantiva run hello_pipeline.yaml -v
```
This creates JSONL trace artifacts in ./trace and writes the computation result to result_<value>.txt using the configured processors.
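As a sanity check, the arithmetic the quickstart pipeline describes can be reproduced in plain Python. This sketch does not call Semantiva. It only mirrors the configured values (source value 2.0, addend 1.0, factor 10.0) and fills the `result_{result}.txt` template with Python's own `str.format`. How Semantiva itself formats the number in the file name is not stated here, so the file name printed below is an assumption.

```python
# Values copied from hello_pipeline.yaml.
value = 2.0
addend = 1.0
factor = 10.0

# FloatAddOperation followed by FloatMultiplyOperation.
result = (value + addend) * factor

# Fill the template the same way the YAML names it; the exact
# number formatting used by Semantiva may differ (assumption).
filename = "result_{result}.txt".format(result=result)

print(result)    # 30.0
print(filename)  # result_30.0.txt
```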

Why Semantiva?

Semantiva is more than a pipeline runner—it is a semantic framework for reproducible, contract-validated data workflows. Pipelines compile into deterministic graphs with stable IDs, components are checked against a contract catalog, and payloads always flow with typed data and structured context. With first-class parametric sweeps and pluggable execution backends, Semantiva combines clarity, provenance, and flexibility in one framework.
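To see why stable IDs can survive cosmetic edits, consider the following sketch. It is not Semantiva's implementation: the canonical form (JSON with sorted keys), the `NAMESPACE` value, and the helper names are all assumptions chosen for illustration. The idea is to hash a canonical serialization of the pipeline spec with SHA256 and derive a UUIDv5 from the digest, so that key order and whitespace do not affect the identifier.

```python
import hashlib
import json
import uuid

# Hypothetical namespace, chosen only for this sketch.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/pipeline-ids")


def canonical_bytes(spec: dict) -> bytes:
    """Serialize with sorted keys and fixed separators."""
    return json.dumps(spec, sort_keys=True, separators=(",", ":")).encode("utf-8")


def pipeline_id(spec: dict) -> uuid.UUID:
    """Derive a deterministic UUIDv5 from the SHA256 of the canonical form."""
    digest = hashlib.sha256(canonical_bytes(spec)).hexdigest()
    return uuid.uuid5(NAMESPACE, digest)


# Same content, different key order: a cosmetic change.
spec_a = {"nodes": [{"processor": "FloatAddOperation", "parameters": {"addend": 1.0}}]}
spec_b = {"nodes": [{"parameters": {"addend": 1.0}, "processor": "FloatAddOperation"}]}
# Different parameter value: a semantic change.
spec_c = {"nodes": [{"processor": "FloatAddOperation", "parameters": {"addend": 2.0}}]}

print(pipeline_id(spec_a) == pipeline_id(spec_b))  # True
print(pipeline_id(spec_a) == pipeline_id(spec_c))  # False
```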

Key Principles

Domain-Driven Design (DDD)
- Aligns data types, algorithms, and operations with core domain concepts.
- Ensures each module speaks a consistent “domain language,” reducing misunderstandings and promoting maintainability.
Type-Oriented Development
- Establishes robust contracts between data and operations.
- Minimizes errors by validating data structures at definition time, preventing mismatches or incompatible operations.
Semantic Transparency & AI-Readiness
- Retains full traceability of how data is transformed and why particular operations are invoked.
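The type-oriented principle above can be sketched as a check that runs before any data is processed. The classes and the `check_chain` function below are hypothetical and are not part of Semantiva. They only show how declared input and output types let a mismatch between adjacent processors be reported as a diagnostic at definition time.

```python
from dataclasses import dataclass
from typing import List, Type


class FloatData:
    """Hypothetical data type."""


class TextData:
    """Hypothetical data type."""


@dataclass(frozen=True)
class ProcessorSpec:
    """Hypothetical declaration of what a processor consumes and produces."""
    name: str
    input_type: Type
    output_type: Type


def check_chain(chain: List[ProcessorSpec]) -> List[str]:
    """Return one diagnostic per incompatible pair of adjacent processors."""
    diagnostics = []
    for upstream, downstream in zip(chain, chain[1:]):
        if not issubclass(upstream.output_type, downstream.input_type):
            diagnostics.append(
                f"{upstream.name} outputs {upstream.output_type.__name__}, "
                f"but {downstream.name} expects {downstream.input_type.__name__}"
            )
    return diagnostics


add = ProcessorSpec("Add", FloatData, FloatData)
upper = ProcessorSpec("Upper", TextData, TextData)

print(check_chain([add, add]))    # []
print(check_chain([add, upper]))  # ['Add outputs FloatData, but Upper expects TextData']
```

Returning a list of messages instead of raising keeps the check non-fatal, so all mismatches in a chain can be reported at once.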

Features

Semantic Components: Processors, data types, and contexts are _SemantivaComponents with machine-readable metadata mapped to RDF predicates—ready for knowledge-graph integration.
Deterministic Graph Identity: Pipelines compile into canonical graphs with stable node/pipeline IDs (UUIDv5 + SHA256). Cosmetic YAML changes won’t alter provenance.
Contract-Validated Processors: A table-driven ruleset validates every component (input/output types, metadata, context usage). Results are diagnostics, not crashes.
Typed Payloads and Rich Context: Workflows operate on Payload(data, context), where data is a BaseDataType and context is a structured ContextType—ensuring safe parameter resolution and traceability.
Parametric Sweeps: Define systematic parameter grids in YAML (ranges, sequences, context-driven variables). Sweeps generate typed collections for experimentation without boilerplate.
Flexible Execution: Responsibilities are cleanly separated into Executor (how tasks run), Orchestrator (how the graph is traversed), and Transport (how messages/events are passed). Ships with in-memory defaults, ready to scale out.
Tracing and Observability: Optional tracing emits before/after/error events with deterministic IDs. Zero overhead when disabled.
Modular & Extensible Architecture
- Supports adding new data types, processor types, and domain ontologies without disrupting existing components.
- Adapts naturally to diverse applications—ranging from basic string manipulations to advanced imaging pipelines or HPC-scale workloads.
- Allows intelligent agents to interact with and modify workflows dynamically, making it a natural fit for AI-assisted design and automation.
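The Parametric Sweeps feature listed above amounts to expanding a grid of parameter values into individual runs. The sketch below shows that expansion in plain Python with `itertools.product`. The `expand_grid` helper and the dictionary layout are assumptions for illustration and do not reflect Semantiva's YAML sweep syntax.

```python
from itertools import product
from typing import Any, Dict, Iterable, List


def expand_grid(grid: Dict[str, Iterable[Any]]) -> List[Dict[str, Any]]:
    """Return one parameter dict per combination of the grid's values."""
    names = list(grid)
    combos = product(*(grid[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


runs = expand_grid({"addend": [0.0, 1.0], "factor": [1.0, 10.0, 100.0]})
print(len(runs))  # 6
print(runs[0])    # {'addend': 0.0, 'factor': 1.0}

# Apply the same computation to every combination.
results = [(2.0 + run["addend"]) * run["factor"] for run in runs]
print(results)    # [2.0, 20.0, 200.0, 3.0, 30.0, 300.0]
```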

Benefits

Clarity & Consistency: Well-defined semantics for data and operations ensure that both humans and AI systems understand precisely how information flows and transforms.
Adaptive Workflows: Easily extend pipelines with new steps or data types, minimizing rework when domain requirements evolve.
Scalability & HPC Integration: A pipeline-oriented design lets users scale operations seamlessly, whether on local machines or high-performance clusters.
AI-Driven Collaboration: Structured metadata enables AI systems to assist with workflow optimizations, debugging, and dynamic pipeline generation.
Interdisciplinary Collaboration: A shared language of data and processor types fosters better communication across physics, mathematics, engineering, and software teams.
Dual-Channel Pipelines: Semantiva processes data and metadata context in parallel, enabling dynamic parameter injection and runtime adaptation without code rewrites.
Dynamic Parameter Injection: Parameters come from the context stream (not hardcoded), improving composability and reuse; change behavior without redeploys.
Advanced Reusability: Keep operations generic; put thresholds/routing/domain parameters in context to reduce duplication and enable mix-and-match pipelines.

AI-Enhanced Development Potential

Semantiva is not just an execution framework—it is also an AI-compatible co-design environment that enables advanced AI assistants to:

Understand Workflow Semantics: AI can analyze the framework’s structural metadata, reasoning about data flow, dependencies, and logical constraints.
Generate & Modify Pipelines: Given a high-level task description, AI can suggest or even implement workflow modifications that align with Semantiva’s principles.
Explain & Debug Operations: AI can trace execution paths, highlight inefficiencies, and generate human-readable explanations of complex workflows.
Enhance Cross-Domain Usability: By maintaining semantic clarity, AI systems can generalize Semantiva’s use cases across industries without needing deep domain-specific re-engineering.

This makes Semantiva uniquely suited to the evolving landscape of human-AI collaboration, ensuring that future AI-driven applications remain interpretable, adaptable, and semantically sound.

Core Components

Data Operations
- Abstract classes that enforce type-safe transformations, ensuring data flows remain coherent and domain-accurate.
Context Processors
- Manage contextual or environmental information affecting data processing, enhancing adaptability and domain awareness.
Pipelines
- Orchestrate the execution of multiple operations, combining data transformations and context adaptations into a coherent workflow.
- Semantiva pipelines propagate both data and metadata context in parallel, empowering operations to dynamically fetch parameters. This supports fluid, on-the-fly changes to how data is processed.
Data Types & Processor Types
- Define the structure and constraints of domain-specific data, alongside compatible operations (e.g., Image ↔ ImageOperation), guaranteeing semantic integrity.
Execution Tools
- Utilities for executing, monitoring, and debugging pipelines, supporting straightforward deployment and scaling.
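The pairing of a data type with its compatible operations (for example Image and ImageOperation) can be sketched with abstract base classes. Everything below is a hypothetical illustration and not Semantiva's class hierarchy: `BaseData`, `Operation`, and `Invert` are invented names, and only the Image/ImageOperation pairing is taken from the text.

```python
from abc import ABC, abstractmethod
from typing import Generic, TypeVar


class BaseData:
    """Hypothetical base class wrapping a raw value."""

    def __init__(self, value):
        self.value = value


class Image(BaseData):
    """Hypothetical image type: value is a list of rows of floats."""


class Text(BaseData):
    """Hypothetical text type, used to show a rejected input."""


D = TypeVar("D", bound=BaseData)


class Operation(ABC, Generic[D]):
    """An operation bound to exactly one data type."""

    data_type: type = BaseData

    def __call__(self, data: D) -> D:
        # Reject data of the wrong type before any processing happens.
        if not isinstance(data, self.data_type):
            raise TypeError(
                f"{type(self).__name__} expects {self.data_type.__name__}, "
                f"got {type(data).__name__}"
            )
        return self._process(data)

    @abstractmethod
    def _process(self, data: D) -> D:
        ...


class ImageOperation(Operation[Image]):
    data_type = Image


class Invert(ImageOperation):
    def _process(self, data: Image) -> Image:
        return Image([[1.0 - pixel for pixel in row] for row in data.value])


inverted = Invert()(Image([[0.0, 1.0]]))
print(inverted.value)  # [[1.0, 0.0]]
```

Calling `Invert()(Text("abc"))` raises `TypeError`, which is the kind of mismatch a type-bound operation is meant to catch.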

Pipeline Configuration & Node Factories

Semantiva pipelines are defined declaratively via YAML. Users implement processors and register them; nodes are generated by factories based on these configurations. Class resolvers (e.g., slice:, rename:, delete:) and parameter resolvers (e.g., model:) resolve processors and parameters at load time, eliminating manual node instantiation.
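Prefix-based resolution of this kind can be sketched with a small registry that maps a prefix to a factory. The code below is not Semantiva's resolver mechanism. Only the prefixes `rename:` and `delete:` come from the text, while the argument syntax after each prefix (`rename:old:new`, `delete:key`), the `register` decorator, and the `resolve` function are assumptions made for this illustration.

```python
from typing import Callable, Dict

ContextFn = Callable[[dict], dict]

# Hypothetical registry: prefix -> factory that builds a context function.
RESOLVERS: Dict[str, Callable[[str], ContextFn]] = {}


def register(prefix: str):
    """Register a factory under the given prefix."""
    def decorator(factory: Callable[[str], ContextFn]):
        RESOLVERS[prefix] = factory
        return factory
    return decorator


@register("rename")
def make_rename(spec: str) -> ContextFn:
    old, new = spec.split(":")  # assumed syntax: rename:old:new

    def rename(context: dict) -> dict:
        updated = dict(context)
        updated[new] = updated.pop(old)
        return updated

    return rename


@register("delete")
def make_delete(spec: str) -> ContextFn:  # assumed syntax: delete:key
    def delete(context: dict) -> dict:
        return {key: value for key, value in context.items() if key != spec}

    return delete


def resolve(reference: str) -> ContextFn:
    """Split off the prefix and delegate to the registered factory."""
    prefix, _, spec = reference.partition(":")
    if prefix not in RESOLVERS:
        raise KeyError(f"no resolver registered for prefix '{prefix}'")
    return RESOLVERS[prefix](spec)


context = {"result": 30.0, "tmp": 1}
context = resolve("rename:result:final")(context)
context = resolve("delete:tmp")(context)
print(context)  # {'final': 30.0}
```

Because the string reference is turned into a callable when the configuration is loaded, the pipeline definition never has to instantiate nodes by hand.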

Getting Started with Semantiva

Run from CLI

```
semantiva inspect semantiva/examples/simple_pipeline.yaml
semantiva inspect --extended semantiva/examples/simple_pipeline.yaml
semantiva run semantiva/examples/simple_pipeline.yaml --context experiment=AB42 --context seed=1234
semantiva run semantiva/examples/simple_pipeline.yaml
semantiva run semantiva/examples/simple_pipeline.yaml --validate
semantiva run semantiva/examples/simple_pipeline.yaml --dry-run
```
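When launching runs from a script, the repeated `--context key=value` arguments shown above can be assembled programmatically. The helper below is a hypothetical convenience, not part of Semantiva. It only builds the argument list from the `run` subcommand and the `--context` flag that appear in the commands above, and it does not execute anything.

```python
import shlex
from typing import Dict, List, Optional


def build_run_command(
    pipeline: str, context: Optional[Dict[str, object]] = None
) -> List[str]:
    """Build an argv list for `semantiva run` with --context pairs."""
    command = ["semantiva", "run", pipeline]
    for key, value in (context or {}).items():
        command += ["--context", f"{key}={value}"]
    return command


cmd = build_run_command(
    "semantiva/examples/simple_pipeline.yaml",
    {"experiment": "AB42", "seed": 1234},
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where the `semantiva` command is installed.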

To quickly dive into Semantiva, explore the following resources:

Advanced Workflow Demo:
Check out the Semantiva Imaging repository for a detailed demo on designing advanced imaging pipelines.
Extended Documentation:
Visit api.semantiva.org for comprehensive reference material on Semantiva's architecture, principles, and usage.

These resources offer a practical roadmap to mastering the framework and leveraging its full potential in your projects.

License

Semantiva is released under the Apache License 2.0, promoting collaborative development and broad adoption.

Acknowledgments

This framework draws inspiration from the rigorous demands of transparency and traceability in data-driven systems, particularly exemplified by the ALICE O2 project at CERN. The lessons learned from managing large-scale, high-throughput data in that environment—combined with the need for robust, domain-aligned workflows—shaped Semantiva’s emphasis on type-safe design, semantic clarity, and modular extensibility. By blending these concepts with principles of ontology-driven computing, Semantiva aims to deliver the same level of reliability and interpretability for any domain requiring advanced data processing and HPC integration.

Project details

Release history

- 0.5.0: Nov 15, 2025
- 0.5.0rc11 (pre-release): Nov 10, 2025 (this version)
- 0.4.0: Jun 8, 2025
- 0.4.0rc1 (pre-release): Jun 7, 2025
- 0.3.0: Mar 11, 2025
- 0.2.1.dev0 (pre-release): Mar 11, 2025
- 0.2.0: Jan 27, 2025
- 0.1.1: Jan 21, 2025
- 0.1.0: Jan 20, 2025

Download files

Source Distribution

semantiva-0.5.0rc11.tar.gz (253.1 kB)

Uploaded Nov 10, 2025 Source

Built Distribution

semantiva-0.5.0rc11-py3-none-any.whl (232.7 kB)

Uploaded Nov 10, 2025 Python 3

File details

Details for the file semantiva-0.5.0rc11.tar.gz.

File metadata

Download URL: semantiva-0.5.0rc11.tar.gz
Upload date: Nov 10, 2025
Size: 253.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: pdm/2.26.1 CPython/3.10.12 Linux/6.11.0-1018-azure

File hashes

Hashes for semantiva-0.5.0rc11.tar.gz
Algorithm	Hash digest
SHA256	`35f8fd7dd6827a1c5c63c789f50d09abee7dab4bc95096765ce54f765b7ca497`
MD5	`80f1f57a45b911665fe1f9a82ab51e94`
BLAKE2b-256	`83d931aa6d63ffe45e994165786df3ec49262827aed7d99f9b956264839fa6e0`

File details

Details for the file semantiva-0.5.0rc11-py3-none-any.whl.

File metadata

Download URL: semantiva-0.5.0rc11-py3-none-any.whl
Upload date: Nov 10, 2025
Size: 232.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: pdm/2.26.1 CPython/3.10.12 Linux/6.11.0-1018-azure

File hashes

Hashes for semantiva-0.5.0rc11-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2185f40ac06d642e9ce1629432877ace7077858bf8dc30351ece42041cacf410`
MD5	`db6d510d3af9a02f5fc1f473e794692f`
BLAKE2b-256	`e7d8083735a0fbf646bf3bf509f237fc27e0501e047ce5e31900c5bf6eb61121`
