Skip to main content

Generate agent training data from Codex and Pi traces

Project description

Teich

v2/ is the experimental trace-first package for collecting raw agent sessions and converting them into training-ready data.

What it does today

  • Runs Codex and Pi in a shared Docker runtime with uv, npm, @openai/codex, and @mariozechner/pi-coding-agent
  • Configures Codex through a mounted CODEX_HOME/config.toml
  • Configures Pi through an isolated mounted ~/.pi/agent/settings.json
  • Exports raw session traces from mounted Codex and Pi session directories
  • Writes a trace-folder README.md for upload
  • Exposes Python conversion helpers for training data preparation

Usage

# Initialize a project
uvx teich init my-project
cd my-project

# Run with the configured agent provider and model settings
uvx teich generate -c config.yaml

Local OSS providers

If you want Codex to talk to a local provider like LM Studio or Ollama, set the provider in config or env:

$env:TEICH_PROVIDER='LMstudio'
$env:TEICH_MODEL='gemma-4'
$env:TEICH_API_KEY='llm'
$env:TEICH_BASE_URL='http://localhost:1234/v1'
python -m teich generate -c test_run/config.yaml

v2 maps LMstudio and ollama onto Codex's native --oss --local-provider ... flow.

Configuration model

Important fields in config.yaml:

agent:
  provider: codex  # or pi

model:
  model: codex-mini-latest
  approval_policy: never
  sandbox: danger-full-access
  reasoning_effort: null

api:
  provider: openai
  base_url: null
  api_key: null

Legacy model.approval_mode is still accepted and normalized internally.

Python conversion API

from pathlib import Path
from teich import convert_traces_to_training_data

examples = convert_traces_to_training_data(Path("./output"))

The converter currently maps example-style raw traces into message/tool records with:

  • system/developer instructions
  • user messages
  • assistant messages
  • reasoning_content
  • tool calls
  • tool results

Development

uv pip install -e ".[dev]"
pytest tests/test_config.py tests/test_cli.py tests/test_runner.py -q

Architecture

  • Shared Docker runtime: container image includes Node.js, uv, uvx, @openai/codex, and @mariozechner/pi-coding-agent
  • Isolated Pi config: Pi runs with a mounted per-run ~/.pi/agent directory inside the container
  • Codex config: generated config.toml under a mounted CODEX_HOME
  • Session export: raw JSONL sessions are copied from mounted Codex or Pi session storage into the user output directory
  • Upload-first output: traces are preserved in raw form before later conversion
  • Provider-aware boundary: agent.provider selects either the Codex or Pi raw-trace path

Project Structure

v2/
├── docker/
│   └── codex-runtime.Dockerfile
├── src/teich/
│   ├── __init__.py
│   ├── __main__.py
│   ├── cli.py
│   ├── config.py
│   ├── converter.py
│   ├── runner.py
│   └── trace_readme.py
└── tests/
    ├── test_cli.py
    ├── test_config.py
    └── test_runner.py

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

teich-0.1.1a1.tar.gz (72.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

teich-0.1.1a1-py3-none-any.whl (35.9 kB view details)

Uploaded Python 3

File details

Details for the file teich-0.1.1a1.tar.gz.

File metadata

  • Download URL: teich-0.1.1a1.tar.gz
  • Upload date:
  • Size: 72.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for teich-0.1.1a1.tar.gz
Algorithm Hash digest
SHA256 f331d9783457518d6873b017e9d94045ec2e7390bccc6d31ab4a78f2a25a812a
MD5 9d4af140f92d6ff0b76d172bb2f2042d
BLAKE2b-256 0d27436cf91720edec6fc3e009b3397b4350510106d43fe3b24e37738b301f96

See more details on using hashes here.

File details

Details for the file teich-0.1.1a1-py3-none-any.whl.

File metadata

  • Download URL: teich-0.1.1a1-py3-none-any.whl
  • Upload date:
  • Size: 35.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for teich-0.1.1a1-py3-none-any.whl
Algorithm Hash digest
SHA256 6444fd0e83e85057ee7bc424b42f56a058ce38bea74f87e5cacc285311b417da
MD5 0c58171522c3c301a48b4c1d1ed32b03
BLAKE2b-256 0028f596545cc45bd87d035488c28909420d47fdf1141baaf6d04dbf0a8e2b77

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page