Generate agent training data from Codex and Pi traces

Teich

v2/ is the experimental trace-first package for collecting raw agent sessions and converting them into training-ready data.

What it does today

  • Runs Codex and Pi in a shared Docker runtime with uv, npm, @openai/codex, and @mariozechner/pi-coding-agent
  • Configures Codex through a mounted CODEX_HOME/config.toml
  • Configures Pi through an isolated mounted ~/.pi/agent/settings.json
  • Exports raw session traces from mounted Codex and Pi session directories
  • Writes a trace-folder README.md for upload
  • Exposes Python conversion helpers for training data preparation

Usage

# Initialize a project
uvx teich init my-project
cd my-project

# Run with the configured agent provider and model settings
uvx teich generate -c config.yaml

Local OSS providers

If you want Codex to talk to a local provider such as LM Studio or Ollama, set the provider in the config or through environment variables (PowerShell syntax shown):

$env:TEICH_PROVIDER='LMstudio'
$env:TEICH_MODEL='gemma-4'
$env:TEICH_API_KEY='llm'
$env:TEICH_BASE_URL='http://localhost:1234/v1'
python -m teich generate -c test_run/config.yaml

v2 maps LMstudio and ollama onto Codex's native --oss --local-provider ... flow.
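
The environment-variable example above uses PowerShell syntax. On other shells, the same run can be launched from Python. This is a minimal sketch, not part of teich: `teich_env`, `generate_command`, and `run_generate` are hypothetical helper names, and the values are simply the ones from the example above.

```python
import os
import subprocess
import sys


def teich_env(base=None):
    """Return a copy of the environment with the TEICH_* variables set.

    Hypothetical helper; the values mirror the PowerShell example above.
    """
    env = dict(os.environ if base is None else base)
    env.update(
        {
            "TEICH_PROVIDER": "LMstudio",
            "TEICH_MODEL": "gemma-4",
            "TEICH_API_KEY": "llm",
            "TEICH_BASE_URL": "http://localhost:1234/v1",
        }
    )
    return env


def generate_command(config_path="test_run/config.yaml"):
    """Build the same `python -m teich generate -c ...` command as above."""
    return [sys.executable, "-m", "teich", "generate", "-c", config_path]


def run_generate(config_path="test_run/config.yaml"):
    """Run teich with the local-provider environment (requires teich installed)."""
    return subprocess.run(generate_command(config_path), env=teich_env(), check=False)
```

Calling `run_generate()` starts the run. The two smaller helpers are separate so the environment and command can be inspected without launching anything.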

Configuration model

Important fields in config.yaml:

agent:
  provider: codex  # or pi

model:
  model: codex-mini-latest
  approval_policy: never
  sandbox: danger-full-access
  reasoning_effort: null

api:
  provider: openai
  base_url: null
  api_key: null

Legacy model.approval_mode is still accepted and normalized internally.
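
As an illustration of what such a normalization could look like, the sketch below folds a legacy `approval_mode` key into `approval_policy`. It is an assumption, not the code teich uses: `normalize_model_config` is a hypothetical name, and the real implementation may translate values rather than copy them.

```python
def normalize_model_config(model: dict) -> dict:
    """Fold a legacy `approval_mode` key into `approval_policy`.

    Hypothetical sketch: the legacy value is copied unchanged, and an
    explicit `approval_policy` always wins over the legacy key.
    """
    normalized = dict(model)  # leave the caller's dict untouched
    legacy = normalized.pop("approval_mode", None)
    if legacy is not None and normalized.get("approval_policy") is None:
        normalized["approval_policy"] = legacy
    return normalized
```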

Python conversion API

from pathlib import Path
from teich import convert_traces_to_training_data

examples = convert_traces_to_training_data(Path("./output"))

The converter currently maps example-style raw traces into message/tool records with:

  • system/developer instructions
  • user messages
  • assistant messages
  • reasoning_content
  • tool calls
  • tool results

Development

uv pip install -e ".[dev]"
pytest tests/test_config.py tests/test_cli.py tests/test_runner.py -q

Architecture

  • Shared Docker runtime: container image includes Node.js, uv, uvx, @openai/codex, and @mariozechner/pi-coding-agent
  • Isolated Pi config: Pi runs with a mounted per-run ~/.pi/agent directory inside the container
  • Codex config: generated config.toml under a mounted CODEX_HOME
  • Session export: raw JSONL sessions are copied from mounted Codex or Pi session storage into the user output directory
  • Upload-first output: traces are preserved in raw form before later conversion
  • Provider-aware boundary: agent.provider selects either the Codex or Pi raw-trace path
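
The session export step can be pictured as a recursive copy of JSONL files. This is a sketch under stated assumptions, not the code in runner.py: `export_sessions` is a hypothetical name, and it assumes sessions are `*.jsonl` files somewhere under the mounted session directory.

```python
import shutil
from pathlib import Path


def export_sessions(session_dir: Path, output_dir: Path) -> list[Path]:
    """Copy every *.jsonl file under session_dir into output_dir.

    Hypothetical sketch: the relative directory layout is preserved and
    the raw files are copied byte-for-byte, without any conversion.
    """
    copied = []
    for source in sorted(session_dir.rglob("*.jsonl")):
        target = output_dir / source.relative_to(session_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy contents and file metadata
        copied.append(target)
    return copied
```

Copying the files untouched matches the upload-first idea: the raw traces stay available even if the conversion logic changes later.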

Project Structure

v2/
├── docker/
│   └── codex-runtime.Dockerfile
├── src/teich/
│   ├── __init__.py
│   ├── __main__.py
│   ├── cli.py
│   ├── config.py
│   ├── converter.py
│   ├── runner.py
│   └── trace_readme.py
└── tests/
    ├── test_cli.py
    ├── test_config.py
    └── test_runner.py
