A lightweight command-line interface that makes it easy to test VoiceChatEngine.

VoxTerm - Minimalist CLI for Voice Chat

VoxTerm is a lightweight command-line interface that makes it easy to use voice chat APIs (like OpenAI's Realtime API) from your terminal. No fancy UI, no complex frameworks - just simple keyboard controls for voice conversations.

📋 What is VoxTerm?

VoxTerm is a thin CLI wrapper that adds keyboard controls to voice engines. Think of it as the minimal glue between your keyboard and a voice API:

Your Keyboard → VoxTerm → VoiceEngine → AI Voice API

🎯 Philosophy

  • Minimalist: ~500 lines of code total
  • Simple: Just keyboard input and print statements
  • Focused: Does one thing - CLI controls for voice chat
  • Non-blocking: Never interferes with real-time audio
  • Flexible: Works with any voice engine that has the right methods

🚀 Quick Start

from voxterm import VoxTermCLI
from voicechatengine import VoiceEngine

# Create your voice engine
engine = VoiceEngine(api_key="your-key")

# Wrap it with VoxTerm
cli = VoxTermCLI(engine, mode="push_to_talk")

# Run it
import asyncio
asyncio.run(cli.run())

That's it! You can now:

  • Hold SPACE to talk
  • Press M to mute
  • Press Q to quit

📁 Project Structure

VoxTerm is intentionally tiny:

voxterm/
├── __init__.py      # Package exports
├── cli.py           # Main CLI class (100 lines)
├── modes.py         # Input modes (200 lines)
└── keyboard.py      # Keyboard handling (200 lines)

cli.py - The Main CLI

class VoxTermCLI:
    def __init__(self, voice_engine, mode="push_to_talk"):
        self.engine = voice_engine
        self.mode = mode

    async def run(self):
        # Connect engine
        # Setup keyboard
        # Print messages
        # That's all!
        ...  # Outline only; the steps above are implemented in cli.py
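
As a rough illustration of the outline above, here is a minimal sketch of what such a run loop could look like. It is not the actual VoxTermCLI source: `FakeEngine`, `SketchCLI`, and `quit_event` are hypothetical names, and the keyboard step is replaced by an event that the demo sets immediately.

```python
import asyncio


class FakeEngine:
    """Hypothetical stand-in for a voice engine; it only records calls."""

    def __init__(self):
        self.calls = []
        self.on_text_response = None
        self.on_user_transcript = None

    async def connect(self):
        self.calls.append("connect")

    async def disconnect(self):
        self.calls.append("disconnect")


class SketchCLI:
    """Illustrative only; not the real VoxTermCLI."""

    def __init__(self, voice_engine, mode="push_to_talk"):
        self.engine = voice_engine
        self.mode = mode
        # Hypothetical: a real CLI would set this when Q is pressed.
        self.quit_event = asyncio.Event()

    async def run(self):
        # Route engine output to plain print statements.
        self.engine.on_user_transcript = lambda text: print(f"You: {text}")
        self.engine.on_text_response = lambda text: print(f"AI: {text}")
        await self.engine.connect()
        try:
            # Keyboard setup would happen here; the sketch just waits to quit.
            await self.quit_event.wait()
        finally:
            # Always disconnect, even if something fails while running.
            await self.engine.disconnect()


async def demo():
    engine = FakeEngine()
    cli = SketchCLI(engine)
    cli.quit_event.set()  # simulate an immediate "Q" press
    await cli.run()
    return engine.calls


calls = asyncio.run(demo())
```

The `try`/`finally` is a design choice of the sketch, so the engine is disconnected even when the loop exits through an exception.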

modes.py - Input Modes

Simple classes that handle different interaction patterns:

  • PushToTalkMode: Hold key → record → release → send
  • AlwaysOnMode: Continuous listening with VAD
  • TextMode: Type messages instead of speaking
  • TurnBasedMode: Explicit turn-taking

Each mode is just a class with on_key_down() and on_key_up() methods.
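
To make the two-method shape concrete, here is a hedged sketch of a push-to-talk style mode. `PushToTalkSketch` and `RecordingEngine` are made-up names for illustration, and the assumption that releasing the key maps to `stop_listening()` is mine rather than something taken from modes.py.

```python
import asyncio


class RecordingEngine:
    """Hypothetical engine stub that logs which methods were awaited."""

    def __init__(self):
        self.log = []

    async def start_listening(self):
        self.log.append("start_listening")

    async def stop_listening(self):
        self.log.append("stop_listening")


class PushToTalkSketch:
    """Hold a key to record, release it to stop (illustrative only)."""

    def __init__(self, engine, talk_key="space"):
        self.engine = engine
        self.talk_key = talk_key
        self.recording = False

    async def on_key_down(self, key: str):
        # Ignore key auto-repeat while the key is already held.
        if key == self.talk_key and not self.recording:
            self.recording = True
            await self.engine.start_listening()

    async def on_key_up(self, key: str):
        # Only stop if this mode actually started a recording.
        if key == self.talk_key and self.recording:
            self.recording = False
            await self.engine.stop_listening()


async def demo():
    engine = RecordingEngine()
    mode = PushToTalkSketch(engine)
    await mode.on_key_down("space")
    await mode.on_key_down("space")  # auto-repeat, ignored
    await mode.on_key_up("space")
    return engine.log


log = asyncio.run(demo())
```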

keyboard.py - Keyboard Input

Basic keyboard handling that works across platforms:

keyboard = SimpleKeyboard()
keyboard.on_space(on_press_func, on_release_func)
keyboard.on_key('m', mute_func)
keyboard.start()
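
The registration calls above suggest a simple callback registry. The sketch below shows one way such a registry could dispatch events; `KeyDispatcher` and its `dispatch()` method are hypothetical and leave out the platform-specific key capture that the real SimpleKeyboard would need.

```python
class KeyDispatcher:
    """Hypothetical callback registry; not the real SimpleKeyboard."""

    def __init__(self):
        self._press = {}
        self._release = {}

    def on_key(self, key, on_press, on_release=None):
        self._press[key] = on_press
        if on_release is not None:
            self._release[key] = on_release

    def on_space(self, on_press, on_release):
        # Space is just a key with both a press and a release handler.
        self.on_key(" ", on_press, on_release)

    def dispatch(self, key, pressed):
        """Call the handler for a key event; return True if one was registered."""
        table = self._press if pressed else self._release
        handler = table.get(key)
        if handler is None:
            return False
        handler()
        return True


events = []
keyboard = KeyDispatcher()
keyboard.on_space(lambda: events.append("talk"), lambda: events.append("send"))
keyboard.on_key("m", lambda: events.append("mute"))

keyboard.dispatch(" ", pressed=True)   # space pressed
keyboard.dispatch(" ", pressed=False)  # space released
keyboard.dispatch("m", pressed=True)
handled = keyboard.dispatch("x", pressed=True)  # no handler registered
```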

🎮 Usage Modes

Push-to-Talk (Default)

$ python -m voxterm --mode ptt

🎤 Voice Chat (push_to_talk mode)
Commands: [space] talk, [m] mute, [q] quit

[Hold SPACE to talk...]
🔴 Recording... (2.3s) Sending...
You: How's the weather today?
AI: I don't have access to real-time weather data...

Always-On (VAD)

$ python -m voxterm --mode always_on

🎤 Always listening (VAD active)
[Just speak naturally, AI will respond when you pause]

Text Mode

$ python -m voxterm --mode text

💬 Type your messages:
You: Hello!
AI: Hi there! How can I help you today?
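
The `--mode` flag in the commands above could be parsed along these lines. This is a sketch under assumptions, not VoxTerm's actual argument handling: `parse_mode` and `MODE_ALIASES` are hypothetical, and the mapping of `ptt` to `push_to_talk` is inferred from the push-to-talk example output.

```python
import argparse

# Assumed mapping from CLI flag values to mode names; only "ptt",
# "always_on", and "text" appear in the commands shown in this document.
MODE_ALIASES = {
    "ptt": "push_to_talk",
    "always_on": "always_on",
    "text": "text",
}


def parse_mode(argv):
    """Return the mode name for a list of command-line arguments."""
    parser = argparse.ArgumentParser(prog="voxterm")
    # Push-to-talk is documented as the default mode.
    parser.add_argument("--mode", choices=sorted(MODE_ALIASES), default="ptt")
    args = parser.parse_args(argv)
    return MODE_ALIASES[args.mode]


mode = parse_mode(["--mode", "text"])
```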

🔧 Integration

VoxTerm expects a voice engine with these methods:

# Required methods (all coroutines, so each call is awaited)
await engine.connect()
await engine.disconnect()
await engine.start_listening()
await engine.stop_listening()
await engine.send_text(text)  # text: str

# Required callbacks (each is called with a single str argument)
engine.on_text_response = lambda text: print(f"AI: {text}")
engine.on_user_transcript = lambda text: print(f"You: {text}")

Works out of the box with:

  • voicechatengine.VoiceEngine
  • Any engine with a similar interface
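
One way to write down "a similar interface" is a `typing.Protocol`. The sketch below is an assumption about how the listed methods and callbacks might be typed. `VoiceEngineLike` and `EchoEngine` are hypothetical names, and VoxTerm itself may not perform any such check.

```python
import asyncio
from typing import Callable, Optional, Protocol, runtime_checkable


@runtime_checkable
class VoiceEngineLike(Protocol):
    """Hypothetical name for the interface listed above."""

    on_text_response: Optional[Callable[[str], None]]
    on_user_transcript: Optional[Callable[[str], None]]

    async def connect(self) -> None: ...
    async def disconnect(self) -> None: ...
    async def start_listening(self) -> None: ...
    async def stop_listening(self) -> None: ...
    async def send_text(self, text: str) -> None: ...


class EchoEngine:
    """Toy engine that echoes text back; it makes no network calls."""

    def __init__(self):
        self.on_text_response = None
        self.on_user_transcript = None

    async def connect(self) -> None:
        pass

    async def disconnect(self) -> None:
        pass

    async def start_listening(self) -> None:
        pass

    async def stop_listening(self) -> None:
        pass

    async def send_text(self, text: str) -> None:
        # Report the user's text, then a fake reply, through the callbacks.
        if self.on_user_transcript:
            self.on_user_transcript(text)
        if self.on_text_response:
            self.on_text_response(f"echo: {text}")


async def demo():
    engine = EchoEngine()
    replies = []
    engine.on_text_response = replies.append
    await engine.connect()
    await engine.send_text("Hello!")
    await engine.disconnect()
    # isinstance() only checks that the members exist, not their signatures.
    return isinstance(engine, VoiceEngineLike), replies


conforms, replies = asyncio.run(demo())
```

An engine like `EchoEngine` can be handy for trying out key bindings without an API key.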

🎨 Customization

Custom Modes

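class MyCustomMode:
    def __init__(self, engine):
        self.engine = engine

    async def on_key_down(self, key: str):
        if key == "r":  # Custom recording key
            await self.engine.start_listening()

    async def on_key_up(self, key: str):
        if key == "r":  # Stop recording when the key is released
            await self.engine.stop_listening()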

Custom Key Bindings

cli = VoxTermCLI(engine)
cli.keyboard.on_key('t', lambda: print("Custom action!"))
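
A key binding like the one above is a plain function. If it needs to trigger an async engine method, and if key callbacks are invoked from a separate listener thread (an assumption; this document does not say how VoxTerm delivers key events), one option is to hand the coroutine to the event loop with `asyncio.run_coroutine_threadsafe`. `toggle_mute` and `make_handler` below are hypothetical helpers.

```python
import asyncio


async def toggle_mute(state):
    """Hypothetical async action a key binding might trigger."""
    state["muted"] = not state["muted"]


def make_handler(loop, state):
    # Returns a plain callable suitable for a key binding. It schedules the
    # coroutine on the event loop instead of running it in the calling thread.
    def handler():
        return asyncio.run_coroutine_threadsafe(toggle_mute(state), loop)

    return handler


async def demo():
    state = {"muted": False}
    loop = asyncio.get_running_loop()
    handler = make_handler(loop, state)
    # Simulate a keyboard listener by calling the handler from a worker thread.
    future = await loop.run_in_executor(None, handler)
    # Wait for the scheduled coroutine to finish before reading the state.
    await asyncio.wrap_future(future)
    return state["muted"]


muted = asyncio.run(demo())
```

Because the handler returns immediately after scheduling, the thread delivering key events is never blocked by the async work.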

🚫 What VoxTerm Doesn't Do

  • ❌ No UI rendering or colors
  • ❌ No audio processing
  • ❌ No network/WebSocket handling
  • ❌ No state management
  • ❌ No configuration files
  • ❌ No fancy terminal graphics

VoxTerm just connects your keyboard to a voice engine. The voice engine handles everything else.

📊 Why So Simple?

Real-world usage showed that for CLI voice chat, you need:

  1. A way to trigger recording (keyboard)
  2. A way to see what was said (print)
  3. Different modes for different use cases

That's exactly what VoxTerm provides - nothing more, nothing less.

🏃 Example: Complete Voice Chat in 10 Lines

import asyncio
from voxterm import VoxTermCLI
from voicechatengine import VoiceEngine

async def main():
    engine = VoiceEngine(api_key="your-key")
    cli = VoxTermCLI(engine, mode="push_to_talk")
    await cli.run()

if __name__ == "__main__":
    asyncio.run(main())

📝 License

MIT - Use it however you want!


Remember: VoxTerm is just the keyboard controls. Your voice engine does the actual work. We just make it easy to use from the command line! 🎤

Download files

Source Distribution

voxterm-0.0.3.tar.gz (29.0 kB view details)

Uploaded Source

Built Distribution

voxterm-0.0.3-py3-none-any.whl (34.1 kB view details)

Uploaded Python 3

File details

Details for the file voxterm-0.0.3.tar.gz.

File metadata

  • Download URL: voxterm-0.0.3.tar.gz
  • Upload date:
  • Size: 29.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.23

File hashes

Hashes for voxterm-0.0.3.tar.gz
Algorithm Hash digest
SHA256 6bd80d08fb349f2668c165a8884a96fe98ce784cbb26f741c6ce89b881a6c611
MD5 eaf093fafedc01f3992856335559f6a6
BLAKE2b-256 10d2c4bf7cd5b5e28bd0d1894c4dc4340006b11d89baa247c96ec3cd8f13b50e

File details

Details for the file voxterm-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: voxterm-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 34.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.23

File hashes

Hashes for voxterm-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 36d459b830f34b3afbcfc9903f4f3f695fb89121938f70c694f0f587785b4ded
MD5 0e2359df6ca81a19de749a4da235cac3
BLAKE2b-256 dd9fbdfcae0f5e36d340ffe6e56ae19aa2926eef88dc146b0003d77be205b924
