Skip to main content

Mock OCEL 2.0 event log generator for LangChain multi-agent runs

Project description

ocelgen — Open Agent Traces Dataset Generator

Generate realistic multi-agent workflow trace datasets with LLM-enriched content. Built for the AI agent ecosystem.

Dataset on HF PyPI CI License: MIT Python 3.11+ OCEL 2.0 OpenAI Compatible

Parallel workflow trace — market research domain

The problem

Real agent traces are scarce. Production multi-agent systems generate rich execution data — LLM prompts, tool calls, agent reasoning, handoff messages — but these traces are proprietary and rarely shared. Teams building agent observability, evaluation, and debugging tools lack open datasets to develop against.

The solution

ocelgen generates structurally valid, semantically rich agent traces that look and feel like real multi-agent executions:

  • Full trace content — LLM prompts and completions, tool call inputs/outputs, agent reasoning, inter-agent messages
  • 10 enterprise domains — customer support, code review, incident response, financial analysis, and 6 more (plus custom domains via YAML)
  • 3 workflow patterns — sequential, supervisor/worker, parallel fan-out/fan-in
  • Labeled deviations — 10 types of anomalies (wrong tools, skipped steps, timeouts) with ground-truth annotations
  • OCEL 2.0 standard — compatible with process mining tools (PM4Py, Celonis)
  • Any LLM backend — OpenRouter, OpenAI, Anthropic, local models via OpenAI-compatible API

Quick start

pip install open-agent-traces

Development setup

git clone https://github.com/juliensimon/ocel-generator.git && cd ocel-generator
uv sync

LLM setup

Enrichment requires an OpenAI-compatible endpoint. Pick one:

Cloud (OpenRouter, OpenAI, etc.)

export OPENAI_API_KEY="your-key"
# Default: OpenRouter with Gemini Flash. Override with --model:
ocelgen enrich output.jsonocel -d customer-support-triage --model anthropic/claude-sonnet-4

Local (llama.cpp, Ollama, vLLM, etc.)

# Example: start llama.cpp with auto-download from Hugging Face
llama-server -hfr unsloth/Qwen3-30B-A3B-GGUF:Q6_K -ngl 99 -c 4096

# Point ocelgen at the local endpoint (no API key needed)
ocelgen enrich output.jsonocel -d customer-support-triage \
  --model unsloth/Qwen3-30B-A3B-GGUF:Q6_K \
  --base-url http://localhost:8080/v1

Generate and enrich

# Generate traces
ocelgen generate --pattern sequential --runs 50 --noise 0.2

# Enrich with LLM-generated content
ocelgen enrich output.jsonocel --domain customer-support-triage

# Or run the full pipeline (generate + enrich + upload to HF)
ocelgen pipeline --domain customer-support-triage --namespace your-hf-username

# Use custom domains defined in YAML
ocelgen pipeline --domain my-domain --config domains.yaml --namespace your-hf-username

Use the pre-built dataset

Skip generation — load the dataset directly from Hugging Face:

from datasets import load_dataset

ds = load_dataset("juliensimon/open-agent-traces", "incident-response")

for event in ds["train"]:
    if event["run_id"] == "run-0000":
        print(f"{event['event_type']:25s} | {event['agent_role']:12s} | {event['reasoning'][:60] if event['reasoning'] else ''}")

10 domains available: customer-support-triage · code-review-pipeline · market-research · legal-document-analysis · data-pipeline-debugging · content-generation · financial-analysis · incident-response · academic-paper-review · ecommerce-product-enrichment

Who is this for?

  • Agent observability teams — build dashboards with realistic trace data (timestamps, token counts, costs)
  • ML researchers — train anomaly detectors on labeled conformant vs deviant traces
  • Process mining researchers — apply OCEL 2.0 conformance checking to agent workflows
  • Agent framework developers — test LangGraph, CrewAI, AutoGen, Smolagents against realistic traces
  • Evaluation teams — benchmark agent reasoning quality across domains and architectures

Documentation

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

open_agent_traces-0.1.0.tar.gz (465.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

open_agent_traces-0.1.0-py3-none-any.whl (61.2 kB view details)

Uploaded Python 3

File details

Details for the file open_agent_traces-0.1.0.tar.gz.

File metadata

  • Download URL: open_agent_traces-0.1.0.tar.gz
  • Upload date:
  • Size: 465.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for open_agent_traces-0.1.0.tar.gz
Algorithm Hash digest
SHA256 8a83fae97c2022277eb6573ba9528927c43f27adb26e63c9acb3eef683472970
MD5 66a897ca43c990c0c1839f37962ecdea
BLAKE2b-256 821e95b7a1e43eb5a414cd925e5fa880ba31f5dd8cf79114d052ce33fa179344

See more details on using hashes here.

Provenance

The following attestation bundles were made for open_agent_traces-0.1.0.tar.gz:

Publisher: publish.yml on juliensimon/ocel-generator

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file open_agent_traces-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for open_agent_traces-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 40c3f0892de7b85c77ae984fe4965277cc5f7417da8aeed9254537168cb0c8bf
MD5 f1f075aa158bb1974648b1e24948d965
BLAKE2b-256 653b53de8a16cf5b02dfc522810dc1a65a61c0d9278571790c3e13bda7d7c31a

See more details on using hashes here.

Provenance

The following attestation bundles were made for open_agent_traces-0.1.0-py3-none-any.whl:

Publisher: publish.yml on juliensimon/ocel-generator

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page