Skip to main content

Generate agent training data from Codex and Pi traces

Project description

Teich

v2/ is the experimental trace-first package for collecting raw agent sessions and converting them into training-ready data.

What it does today

  • Runs Codex and Pi in a shared Docker runtime with uv, npm, @openai/codex, and @mariozechner/pi-coding-agent
  • Configures Codex through a mounted CODEX_HOME/config.toml
  • Configures Pi through an isolated mounted ~/.pi/agent/settings.json
  • Exports raw session traces from mounted Codex and Pi session directories
  • Writes a trace-folder README.md for upload
  • Exposes Python conversion helpers for training data preparation

Usage

# Initialize a project
uvx teich init my-project
cd my-project

# Run with the configured agent provider and model settings
uvx teich generate -c config.yaml

Local OSS providers

If you want Codex to talk to a local provider like LM Studio or Ollama, set the provider in config or env:

$env:TEICH_PROVIDER='LMstudio'
$env:TEICH_MODEL='gemma-4'
$env:TEICH_API_KEY='llm'
$env:TEICH_BASE_URL='http://localhost:1234/v1'
python -m teich generate -c test_run/config.yaml

v2 maps LMstudio and ollama onto Codex's native --oss --local-provider ... flow.

Configuration model

Important fields in config.yaml:

agent:
  provider: codex  # or pi

model:
  model: codex-mini-latest
  approval_policy: never
  sandbox: danger-full-access
  reasoning_effort: null

api:
  provider: openai
  base_url: null
  api_key: null

Legacy model.approval_mode is still accepted and normalized internally.

Python conversion API

from pathlib import Path
from teich import convert_traces_to_training_data

examples = convert_traces_to_training_data(Path("./output"))

The converter currently maps example-style raw traces into message/tool records with:

  • system/developer instructions
  • user messages
  • assistant messages
  • reasoning_content
  • tool calls
  • tool results

Development

uv pip install -e ".[dev]"
pytest tests/test_config.py tests/test_cli.py tests/test_runner.py -q

Architecture

  • Shared Docker runtime: container image includes Node.js, uv, uvx, @openai/codex, and @mariozechner/pi-coding-agent
  • Isolated Pi config: Pi runs with a mounted per-run ~/.pi/agent directory inside the container
  • Codex config: generated config.toml under a mounted CODEX_HOME
  • Session export: raw JSONL sessions are copied from mounted Codex or Pi session storage into the user output directory
  • Upload-first output: traces are preserved in raw form before later conversion
  • Provider-aware boundary: agent.provider selects either the Codex or Pi raw-trace path

Project Structure

v2/
├── docker/
│   └── codex-runtime.Dockerfile
├── src/teich/
│   ├── __init__.py
│   ├── __main__.py
│   ├── cli.py
│   ├── config.py
│   ├── converter.py
│   ├── runner.py
│   └── trace_readme.py
└── tests/
    ├── test_cli.py
    ├── test_config.py
    └── test_runner.py

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

teich-0.1.1a2.tar.gz (223.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

teich-0.1.1a2-py3-none-any.whl (36.1 kB view details)

Uploaded Python 3

File details

Details for the file teich-0.1.1a2.tar.gz.

File metadata

  • Download URL: teich-0.1.1a2.tar.gz
  • Upload date:
  • Size: 223.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for teich-0.1.1a2.tar.gz
Algorithm Hash digest
SHA256 71f8d8f4bd8436a9656d1b0dc5a71ff2f97a0f8cc5b6673cb2dd0fb3aa892929
MD5 5dada5ef3b95e66039b9be77ca7e98e0
BLAKE2b-256 63e73363348bd7fd8afc7847662a5668c9503ffe3eb65502843b7a0175864f35

See more details on using hashes here.

File details

Details for the file teich-0.1.1a2-py3-none-any.whl.

File metadata

  • Download URL: teich-0.1.1a2-py3-none-any.whl
  • Upload date:
  • Size: 36.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for teich-0.1.1a2-py3-none-any.whl
Algorithm Hash digest
SHA256 d56f86980426431aba13ac6e6659535aa6051c9ee33a27cdad270b8df1c232e1
MD5 f6e064f58d7116af432604ba028251bd
BLAKE2b-256 39e47aff99ece7a9fc3fe10cc76d7a2bec19c8b79f2e6665ca4476b687c9b317

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page