Codex-backed text transformation and Kokoro TTS command-line tools.

agent-tools

Python CLI tools for:

  • transforming raw text into TTS-ready narration and synthesizing it in one command
  • transforming piped text through the private, ChatGPT-backed Codex backend that local Codex uses
  • synthesizing the result to WAV with Kokoro-82M

This repo is intentionally wired to the local Codex login state in ~/.codex/.

Status

This is an experimental public package with a private Codex dependency.

The transform command mirrors the current request shape used by the local Codex source tree and depends on ChatGPT-backed auth in ~/.codex/auth.json.

It does not use:

  • codex exec
  • codex app-server
  • the public OpenAI API key flow

That means:

  • you must already be logged into local Codex
  • backend compatibility can break if Codex internals or backend contracts change
  • this package is best suited for users who already use local Codex

Requirements

  • Python 3.12+
  • local Codex already logged in via ChatGPT
  • espeak-ng installed for best Kokoro English fallback behavior

Install

cd repos/agent-tools
uv venv
uv pip install -e ".[dev]"

Public package install:

pip install ai-nd-co-agent-tools

UI-enabled install:

pip install "ai-nd-co-agent-tools[ui]"

Install a CUDA-enabled PyTorch runtime for this CLI environment:

agent-tools install-cuda

Pass an explicit track if you do not want auto-detection:

agent-tools install-cuda --cuda-track cu130
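
After installing, it can be useful to confirm that the GPU actually executes work rather than merely being reported as available. The following is a minimal sketch of such a probe; the function name `probe_device` is hypothetical and this is not the package's own validation logic, only an illustration of the idea.

```python
def probe_device() -> str:
    """Return "cuda" only if a small tensor operation really runs on the GPU."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except (ImportError, OSError):
        return "cpu"
    if not torch.cuda.is_available():
        return "cpu"
    try:
        # Allocate and compute on the device; a mismatched driver/runtime
        # pairing typically fails here even when is_available() is True.
        x = torch.ones(8, device="cuda")
        total = float((x * 2).sum().item())
        torch.cuda.synchronize()
        return "cuda" if total == 16.0 else "cpu"
    except Exception:
        return "cpu"


print(probe_device())
```

Falling back to "cpu" on any failure keeps the check safe to run in environments with no GPU or no PyTorch at all.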

Usage

Single-command path: ttsify

echo "Turn this note into natural spoken narration." | agent-tools ttsify --output-file out.wav

ttsify uses a built-in rewrite prompt stored in the package and then pipes the transformed text into Kokoro TTS.

Default ttsify settings:

  • model: gpt-5.4-mini
  • voice: af_heart

Configurable via env vars:

AGENT_TOOLS_CODEX_MODEL=gpt-5.4-mini
AGENT_TOOLS_CODEX_REASONING_EFFORT=medium
AGENT_TOOLS_KOKORO_VOICE=af_heart
AGENT_TOOLS_KOKORO_LANGUAGE=a
AGENT_TOOLS_KOKORO_SPEED=1.0
AGENT_TOOLS_KOKORO_DEVICE=auto

CLI flags override env vars.
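
The precedence described above (CLI flag, then env var, then default) can be sketched as follows. The helper `resolve_setting` and the `DEFAULTS` table are hypothetical names used for illustration; only the env var names and the two default values come from this README.

```python
import os
from typing import Mapping, Optional

# Defaults as listed in this README; the resolver itself is an illustrative sketch.
DEFAULTS = {
    "AGENT_TOOLS_CODEX_MODEL": "gpt-5.4-mini",
    "AGENT_TOOLS_KOKORO_VOICE": "af_heart",
}


def resolve_setting(
    env_name: str,
    flag_value: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Pick the CLI flag if given, else the env var, else the documented default."""
    if flag_value is not None:
        return flag_value
    if env.get(env_name):  # an empty string is treated as unset
        return env[env_name]
    return DEFAULTS[env_name]


print(resolve_setting("AGENT_TOOLS_KOKORO_VOICE", env={}))
```

Passing `env` explicitly keeps the function easy to exercise without touching the real process environment.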

Queue for playback on Windows:

echo "Turn this note into natural spoken narration." | agent-tools ttsify --output-mode play --source agent-a

Codex integration

Install the supported Codex integration for the current platform:

agent-tools install-codex-integration

  • On native Windows Codex, this installs a notify command in ~/.codex/config.toml.
  • On non-Windows, this keeps the Stop-hook integration path.
  • The compatibility alias agent-tools install-codex-stop-hook remains available.

Windows debug logs:

  • ~/.codex/notify_tts.log
  • ~/.codex/notify_tts_agent_tools.log

On Windows, Codex passes the notify payload as the final JSON argv argument to the installed Python command. No PowerShell or bash wrapper is used.

The installed command enqueues the generated audio, starts the background controller if needed, and returns immediately.
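
For readers writing their own notify handler, receiving the payload as the final JSON argv argument might look like the sketch below. The function name `parse_notify_payload` is hypothetical, and treating the payload as a JSON object is an assumption; the payload's fields are not documented here, so none are named.

```python
import json
import sys
from typing import Any, Sequence


def parse_notify_payload(argv: Sequence[str]) -> dict[str, Any]:
    """Decode the last argv entry as a JSON object (assumed shape)."""
    if len(argv) < 2:
        raise ValueError("expected a JSON payload as the final argument")
    payload = json.loads(argv[-1])
    if not isinstance(payload, dict):
        raise ValueError("notify payload is not a JSON object")
    return payload


if len(sys.argv) >= 2 and sys.argv[-1].lstrip().startswith("{"):
    # Only runs when the script is invoked with a JSON object as the last argument.
    print(sorted(parse_notify_payload(sys.argv)))
```

Because the payload arrives as a single argument, no shell quoting layer has to be undone before decoding it.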

Transform text

echo "Rewrite this into short spoken narration." | agent-tools transform \
  --system-prompt-file prompt_examples/rewrite_for_tts.md

Optional controls:

echo "Input text" | agent-tools transform \
  --system-prompt-file prompt_examples/rewrite_for_tts.md \
  --model gpt-5 \
  --reasoning-effort medium \
  --fast

Text to speech

echo "Hello world." | agent-tools tts --output-file hello.wav
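
A quick way to sanity-check the result is to read the duration of the generated audio. This sketch uses only the standard library and assumes the output is a PCM WAV that Python's `wave` module can parse; the helper name `wav_duration_seconds` is hypothetical.

```python
import io
import wave


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the playback length of an in-memory PCM WAV file."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

For example, `wav_duration_seconds(Path("hello.wav").read_bytes())` (with `from pathlib import Path`) would report the length of the file written above.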

Queue already-prepared speech on Windows:

echo "Hello world." | agent-tools tts --output-mode play --source agent-a

Desktop controller UI

agent-tools ui

If the controller is already running, this focuses the existing window instead of starting a second process.

End-to-end pipeline

cat input.txt | agent-tools transform \
  --system-prompt-file prompt_examples/rewrite_for_tts.md \
  | agent-tools tts --voice af_heart --output-file out.wav
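
The same pipeline can be driven from Python with `subprocess`. This is a sketch under the assumption that `agent-tools` is on PATH; the function names `build_pipeline` and `run_pipeline` are hypothetical, and only flags shown in this README are used.

```python
import subprocess
from pathlib import Path


def build_pipeline(
    prompt_file: str, voice: str, output_file: str
) -> tuple[list[str], list[str]]:
    """Build the argv lists for the transform and tts stages."""
    transform_cmd = ["agent-tools", "transform", "--system-prompt-file", prompt_file]
    tts_cmd = ["agent-tools", "tts", "--voice", voice, "--output-file", output_file]
    return transform_cmd, tts_cmd


def run_pipeline(text: str, prompt_file: str, voice: str, output_file: str) -> Path:
    """Feed text through transform, then pipe the rewritten text into tts."""
    transform_cmd, tts_cmd = build_pipeline(prompt_file, voice, output_file)
    rewritten = subprocess.run(
        transform_cmd, input=text, capture_output=True,
        encoding="utf-8", check=True,
    ).stdout
    subprocess.run(tts_cmd, input=rewritten, encoding="utf-8", check=True)
    return Path(output_file)
```

A call such as `run_pipeline(Path("input.txt").read_text(), "prompt_examples/rewrite_for_tts.md", "af_heart", "out.wav")` mirrors the shell pipeline above; `check=True` surfaces a failure in either stage as an exception.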

Notes

  • ttsify is the recommended end-user path; transform and tts remain available as building blocks.
  • transform reads stdin by default and writes plain text to stdout.
  • tts reads stdin by default and writes WAV bytes to stdout unless --output-file is set.
  • tts and ttsify support --output-mode play on Windows.
  • in play mode, audio is queued into a single background controller process.
  • agent-tools ui launches or focuses the popup/tray controller window.
  • controller shortcuts: Space pause/resume, Esc stop, Ctrl+R replay, Ctrl+N next.
  • tts and ttsify default to --device auto.
  • auto device selection uses a real CUDA probe, not just torch.cuda.is_available().
  • agent-tools install-cuda installs a CUDA-enabled PyTorch build into the current Python environment and validates it in a fresh subprocess by default.
  • transform refreshes ChatGPT tokens when the Codex backend returns 401.
  • Native Windows Codex uses notify; hooks.json lifecycle hooks are not used there.
  • semantic-release now owns future Python package version bumps and py-v* tags.
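
The token refresh behavior noted above follows a common retry-once pattern, sketched here. This is not the package's implementation: `send` and `refresh` are hypothetical callables standing in for the backend request and the token refresh step.

```python
from typing import Callable, Tuple


def call_with_refresh(
    send: Callable[[str], Tuple[int, str]],
    refresh: Callable[[], str],
    token: str,
) -> Tuple[int, str]:
    """Send once; on HTTP 401, refresh the token and retry exactly one time."""
    status, body = send(token)
    if status == 401:
        token = refresh()
        status, body = send(token)
    return status, body
```

Retrying only once avoids looping forever when the refresh itself yields a token the backend still rejects, which is the case where rerunning the login is the remaining option.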

CPU performance

Measured on this machine on April 15, 2026 with forced CPU:

| Scenario | Wall time | Audio time | Real-time factor |
| --- | --- | --- | --- |
| first-ever cold init after dependency/model setup | ~43.1s | n/a | n/a |
| cached init | ~2.9s | n/a | n/a |
| warm short | 0.309s | 4.80s | 0.064 |
| warm medium | 1.199s | 15.53s | 0.077 |
| warm long | 2.514s | 26.70s | 0.094 |

Interpretation:

  • warm CPU generation on this machine is roughly 10x to 16x faster than realtime (real-time factors of 0.064 to 0.094)
  • the main cost is cold startup/model load, not steady-state synthesis
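
The real-time factor is wall time divided by audio time, and its inverse is the speedup over realtime. A small sketch that recomputes both from the warm measurements reported above (the function name `real_time_factor` is illustrative):

```python
def real_time_factor(wall_seconds: float, audio_seconds: float) -> float:
    """Seconds of compute spent per second of generated audio."""
    return wall_seconds / audio_seconds


# (scenario, wall time in s, audio time in s) from the warm measurements above
rows = [
    ("warm short", 0.309, 4.80),
    ("warm medium", 1.199, 15.53),
    ("warm long", 2.514, 26.70),
]

for name, wall, audio in rows:
    rtf = real_time_factor(wall, audio)
    print(f"{name}: RTF {rtf:.3f}, about {1 / rtf:.1f}x realtime")
```

A factor below 1.0 means synthesis finishes faster than the audio takes to play.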

To reproduce locally:

python scripts/benchmark_tts_cpu.py

Troubleshooting

  • Missing ~/.codex/auth.json: run codex login
  • Expired auth: rerun codex login if refresh fails permanently
  • Missing espeak-ng: install it for better English fallback behavior
  • Slow first run: expected; Kokoro downloads voices/models and initializes the pipeline
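
The first and third checks above can be automated before running any command. This is a minimal sketch, not part of the package; the function name `preflight` is hypothetical, and it only tests for the presence of ~/.codex/auth.json and an espeak-ng executable on PATH.

```python
import shutil
from pathlib import Path
from typing import Optional


def preflight(codex_home: Optional[Path] = None) -> list[str]:
    """Return human-readable problems; an empty list means both checks passed."""
    home = codex_home if codex_home is not None else Path.home() / ".codex"
    problems = []
    if not (home / "auth.json").is_file():
        problems.append("missing auth.json: run codex login")
    if shutil.which("espeak-ng") is None:
        problems.append("espeak-ng not found on PATH: install it for better English fallback")
    return problems


for problem in preflight():
    print(problem)
```

The check does not validate the contents of auth.json, so an expired login still has to be detected at request time.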
