Mock OCEL 2.0 event log generator for LangChain multi-agent runs
Project description
ocelgen — Open Agent Traces Dataset Generator
Generate realistic multi-agent workflow trace datasets with LLM-enriched content. Built for the AI agent ecosystem.
The problem
Real agent traces are scarce. Production multi-agent systems generate rich execution data — LLM prompts, tool calls, agent reasoning, handoff messages — but these traces are proprietary and rarely shared. Teams building agent observability, evaluation, and debugging tools lack open datasets to develop against.
The solution
ocelgen generates structurally valid, semantically rich agent traces that look and feel like real multi-agent executions:
- Full trace content — LLM prompts and completions, tool call inputs/outputs, agent reasoning, inter-agent messages
- 10 enterprise domains — customer support, code review, incident response, financial analysis, and 6 more (plus custom domains via YAML)
- 3 workflow patterns — sequential, supervisor/worker, parallel fan-out/fan-in
- Labeled deviations — 10 types of anomalies (wrong tools, skipped steps, timeouts) with ground-truth annotations
- OCEL 2.0 standard — compatible with process mining tools (PM4Py, Celonis)
- Any LLM backend — OpenRouter, OpenAI, Anthropic, local models via OpenAI-compatible API
Quick start
pip install open-agent-traces
Development setup
git clone https://github.com/juliensimon/ocel-generator.git && cd ocel-generator
uv sync
LLM setup
Enrichment requires an OpenAI-compatible endpoint. Pick one:
Cloud (OpenRouter, OpenAI, etc.)
export OPENAI_API_KEY="your-key"
# Default: OpenRouter with Gemini Flash. Override with --model:
ocelgen enrich output.jsonocel -d customer-support-triage --model anthropic/claude-sonnet-4
Local (llama.cpp, Ollama, vLLM, etc.)
# Example: start llama.cpp with auto-download from Hugging Face
llama-server -hfr unsloth/Qwen3-30B-A3B-GGUF:Q6_K -ngl 99 -c 4096
# Point ocelgen at the local endpoint (no API key needed)
ocelgen enrich output.jsonocel -d customer-support-triage \
--model unsloth/Qwen3-30B-A3B-GGUF:Q6_K \
--base-url http://localhost:8080/v1
Generate and enrich
# Generate traces
ocelgen generate --pattern sequential --runs 50 --noise 0.2
# Enrich with LLM-generated content
ocelgen enrich output.jsonocel --domain customer-support-triage
# Or run the full pipeline (generate + enrich + upload to HF)
ocelgen pipeline --domain customer-support-triage --namespace your-hf-username
# Use custom domains defined in YAML
ocelgen pipeline --domain my-domain --config domains.yaml --namespace your-hf-username
Use the pre-built dataset
Skip generation — load the dataset directly from Hugging Face:
from datasets import load_dataset
ds = load_dataset("juliensimon/open-agent-traces", "incident-response")
for event in ds["train"]:
if event["run_id"] == "run-0000":
print(f"{event['event_type']:25s} | {event['agent_role']:12s} | {event['reasoning'][:60] if event['reasoning'] else ''}")
10 domains available: customer-support-triage · code-review-pipeline · market-research · legal-document-analysis · data-pipeline-debugging · content-generation · financial-analysis · incident-response · academic-paper-review · ecommerce-product-enrichment
Who is this for?
- Agent observability teams — build dashboards with realistic trace data (timestamps, token counts, costs)
- ML researchers — train anomaly detectors on labeled conformant vs deviant traces
- Process mining researchers — apply OCEL 2.0 conformance checking to agent workflows
- Agent framework developers — test LangGraph, CrewAI, AutoGen, Smolagents against realistic traces
- Evaluation teams — benchmark agent reasoning quality across domains and architectures
Documentation
- Quick Start — first dataset in 5 minutes
- User Guide — CLI reference, patterns, domains, custom YAML config, model configuration
- Dataset on Hugging Face — 17,000+ events, ready to use
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file open_agent_traces-0.1.0.tar.gz.
File metadata
- Download URL: open_agent_traces-0.1.0.tar.gz
- Upload date:
- Size: 465.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8a83fae97c2022277eb6573ba9528927c43f27adb26e63c9acb3eef683472970
|
|
| MD5 |
66a897ca43c990c0c1839f37962ecdea
|
|
| BLAKE2b-256 |
821e95b7a1e43eb5a414cd925e5fa880ba31f5dd8cf79114d052ce33fa179344
|
Provenance
The following attestation bundles were made for open_agent_traces-0.1.0.tar.gz:
Publisher:
publish.yml on juliensimon/ocel-generator
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
open_agent_traces-0.1.0.tar.gz -
Subject digest:
8a83fae97c2022277eb6573ba9528927c43f27adb26e63c9acb3eef683472970 - Sigstore transparency entry: 1233382082
- Sigstore integration time:
-
Permalink:
juliensimon/ocel-generator@667b1bec7f822cf8f6e4bcd586faa74935f9a5c5 -
Branch / Tag:
refs/tags/v0.1.0 - Owner: https://github.com/juliensimon
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@667b1bec7f822cf8f6e4bcd586faa74935f9a5c5 -
Trigger Event:
push
-
Statement type:
File details
Details for the file open_agent_traces-0.1.0-py3-none-any.whl.
File metadata
- Download URL: open_agent_traces-0.1.0-py3-none-any.whl
- Upload date:
- Size: 61.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
40c3f0892de7b85c77ae984fe4965277cc5f7417da8aeed9254537168cb0c8bf
|
|
| MD5 |
f1f075aa158bb1974648b1e24948d965
|
|
| BLAKE2b-256 |
653b53de8a16cf5b02dfc522810dc1a65a61c0d9278571790c3e13bda7d7c31a
|
Provenance
The following attestation bundles were made for open_agent_traces-0.1.0-py3-none-any.whl:
Publisher:
publish.yml on juliensimon/ocel-generator
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
open_agent_traces-0.1.0-py3-none-any.whl -
Subject digest:
40c3f0892de7b85c77ae984fe4965277cc5f7417da8aeed9254537168cb0c8bf - Sigstore transparency entry: 1233382091
- Sigstore integration time:
-
Permalink:
juliensimon/ocel-generator@667b1bec7f822cf8f6e4bcd586faa74935f9a5c5 -
Branch / Tag:
refs/tags/v0.1.0 - Owner: https://github.com/juliensimon
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@667b1bec7f822cf8f6e4bcd586faa74935f9a5c5 -
Trigger Event:
push
-
Statement type: