A lightweight command-line interface that makes it easy to test VoiceChatEngine.
VoxTerm - Minimalist CLI for Voice Chat
VoxTerm is a lightweight command-line interface that makes it easy to use voice chat APIs (like OpenAI's Realtime API) from your terminal. No fancy UI, no complex frameworks - just simple keyboard controls for voice conversations.
📋 What is VoxTerm?
VoxTerm is a thin CLI wrapper that adds keyboard controls to voice engines. Think of it as the minimal glue between your keyboard and a voice API:
Your Keyboard → VoxTerm → VoiceEngine → AI Voice API
🎯 Philosophy
- Minimalist: ~500 lines of code total
- Simple: Just keyboard input and print statements
- Focused: Does one thing - CLI controls for voice chat
- Non-blocking: Never interferes with real-time audio
- Flexible: Works with any voice engine that has the right methods
🚀 Quick Start
from voxterm import VoxTermCLI
from voicechatengine import VoiceEngine
# Create your voice engine
engine = VoiceEngine(api_key="your-key")
# Wrap it with VoxTerm
cli = VoxTermCLI(engine, mode="push_to_talk")
# Run it
import asyncio
asyncio.run(cli.run())
That's it! Now you have:
- Hold SPACE to talk
- Press M to mute
- Press Q to quit
📁 Project Structure
VoxTerm is intentionally tiny:
voxterm/
├── __init__.py # Package exports
├── cli.py # Main CLI class (100 lines)
├── modes.py # Input modes (200 lines)
└── keyboard.py # Keyboard handling (200 lines)
cli.py - The Main CLI
class VoxTermCLI:
    def __init__(self, voice_engine, mode="push_to_talk"):
        self.engine = voice_engine
        self.mode = mode

    async def run(self):
        # Connect engine
        # Setup keyboard
        # Print messages
        # That's all!
        ...
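To make the outline above concrete, here is a minimal sketch of how such a run loop could be wired. It is not VoxTerm's actual code: `SketchCLI`, `StubEngine`, `quit()` and the `printed` list are hypothetical names invented for illustration, and the stub engine only records calls instead of handling audio.

```python
import asyncio


class StubEngine:
    """Hypothetical stand-in for a voice engine; records calls, does no audio."""

    def __init__(self):
        self.calls = []
        self.on_text_response = None
        self.on_user_transcript = None

    async def connect(self):
        self.calls.append("connect")

    async def disconnect(self):
        self.calls.append("disconnect")


class SketchCLI:
    """Illustrative only; an assumption about the shape of run(), not VoxTermCLI."""

    def __init__(self, voice_engine, mode="push_to_talk"):
        self.engine = voice_engine
        self.mode = mode
        self.printed = []  # kept so the output can be inspected afterwards
        self._quit = None

    def _show(self, line):
        self.printed.append(line)
        print(line)

    def quit(self):
        # Would be bound to the Q key in a real CLI
        self._quit.set()

    async def run(self):
        # Create the event inside the running loop (safe on older Python versions)
        self._quit = asyncio.Event()
        # Print messages: route engine callbacks to plain print statements
        self.engine.on_user_transcript = lambda text: self._show(f"You: {text}")
        self.engine.on_text_response = lambda text: self._show(f"AI: {text}")
        # Connect engine
        await self.engine.connect()
        self._show(f"Voice Chat ({self.mode} mode)")
        try:
            # Wait until something calls quit(); keyboard setup is omitted here
            await self._quit.wait()
        finally:
            await self.engine.disconnect()
```

Waiting on an `asyncio.Event` keeps the loop free for the engine's own tasks, which fits the non-blocking goal stated in the philosophy section.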
modes.py - Input Modes
Simple classes that handle different interaction patterns:
- PushToTalkMode: Hold key → record → release → send
- AlwaysOnMode: Continuous listening with VAD
- TextMode: Type messages instead of speaking
- TurnBasedMode: Explicit turn-taking
Each mode is just a class with on_key_down() and on_key_up() methods.
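As an illustration of the `on_key_down()` / `on_key_up()` pattern, a push-to-talk mode might look roughly like this. The sketch assumes that `stop_listening()` is what triggers sending the recorded audio; `PushToTalkSketch`, `RecordingEngine` and the `talk_key` parameter are hypothetical names, not the classes shipped in modes.py.

```python
import asyncio


class RecordingEngine:
    """Hypothetical stub exposing only the two listening methods a mode needs."""

    def __init__(self):
        self.calls = []

    async def start_listening(self):
        self.calls.append("start_listening")

    async def stop_listening(self):
        self.calls.append("stop_listening")


class PushToTalkSketch:
    """Hold key -> record, release -> stop (assumed to send the audio)."""

    def __init__(self, engine, talk_key="space"):
        self.engine = engine
        self.talk_key = talk_key
        self.recording = False

    async def on_key_down(self, key):
        # Ignore other keys and repeated key-down events while already recording
        if key == self.talk_key and not self.recording:
            self.recording = True
            await self.engine.start_listening()

    async def on_key_up(self, key):
        # Only stop if this mode actually started a recording
        if key == self.talk_key and self.recording:
            self.recording = False
            await self.engine.stop_listening()
```

The `recording` flag guards against key auto-repeat, which many terminals report as a stream of key-down events while a key is held.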
keyboard.py - Keyboard Input
Basic keyboard handling that works across platforms:
keyboard = SimpleKeyboard()
keyboard.on_space(on_press_func, on_release_func)
keyboard.on_key('m', mute_func)
keyboard.start()
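The registration calls above suggest a simple callback table behind the scenes. The following sketch shows one way such a table could work; `KeyDispatcher` and its `dispatch()` method are hypothetical and stand in for the platform-specific key reading, which is deliberately left out.

```python
class KeyDispatcher:
    """Hypothetical callback registry; does not read the real keyboard."""

    def __init__(self):
        self._press = {}
        self._release = {}

    def on_key(self, key, on_press, on_release=None):
        # Register a press handler and, optionally, a release handler
        self._press[key] = on_press
        if on_release is not None:
            self._release[key] = on_release

    def on_space(self, on_press, on_release):
        # Convenience wrapper for the talk key
        self.on_key(" ", on_press, on_release)

    def dispatch(self, key, pressed=True):
        # Called by whatever backend reads key events; returns True if handled
        table = self._press if pressed else self._release
        handler = table.get(key)
        if handler is None:
            return False
        handler()
        return True


events = []
keys = KeyDispatcher()
keys.on_space(lambda: events.append("talk"), lambda: events.append("send"))
keys.on_key("m", lambda: events.append("mute"))

keys.dispatch(" ", pressed=True)
keys.dispatch(" ", pressed=False)
keys.dispatch("m")
```

Keeping registration separate from event reading means the same table can be driven by any platform backend, or by a test.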
🎮 Usage Modes
Push-to-Talk (Default)
$ python -m voxterm --mode ptt
🎤 Voice Chat (push_to_talk mode)
Commands: [space] talk, [m] mute, [q] quit
[Hold SPACE to talk...]
🔴 Recording... (2.3s) Sending...
You: How's the weather today?
AI: I don't have access to real-time weather data...
Always-On (VAD)
$ python -m voxterm --mode always_on
🎤 Always listening (VAD active)
[Just speak naturally, AI will respond when you pause]
Text Mode
$ python -m voxterm --mode text
💬 Type your messages:
You: Hello!
AI: Hi there! How can I help you today?
🔧 Integration
VoxTerm expects a voice engine with these methods:
# Required methods
async engine.connect()
async engine.disconnect()
async engine.start_listening()
async engine.stop_listening()
async engine.send_text(text: str)
# Required callbacks
engine.on_text_response = func(text: str)
engine.on_user_transcript = func(text: str)
Works out of the box with:
- voicechatengine.VoiceEngine
- Any engine with a similar interface
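The expected interface can be written down as a `typing.Protocol`, which makes it easy to check a custom engine before handing it to the CLI. This is a sketch under assumptions: `VoiceEngineLike` and `EchoEngine` are hypothetical names, and the callback signatures are inferred from the listing above rather than taken from VoxTerm's source.

```python
import asyncio
from typing import Callable, Optional, Protocol, runtime_checkable


@runtime_checkable
class VoiceEngineLike(Protocol):
    """Assumed shape of an engine, based on the required methods and callbacks."""

    on_text_response: Optional[Callable[[str], None]]
    on_user_transcript: Optional[Callable[[str], None]]

    async def connect(self) -> None: ...
    async def disconnect(self) -> None: ...
    async def start_listening(self) -> None: ...
    async def stop_listening(self) -> None: ...
    async def send_text(self, text: str) -> None: ...


class EchoEngine:
    """Hypothetical test engine: echoes text back instead of calling a voice API."""

    def __init__(self):
        self.on_text_response = None
        self.on_user_transcript = None
        self.connected = False

    async def connect(self):
        self.connected = True

    async def disconnect(self):
        self.connected = False

    async def start_listening(self):
        pass  # no microphone in this stub

    async def stop_listening(self):
        pass

    async def send_text(self, text):
        # Report what the user said, then produce a fake reply
        if self.on_user_transcript:
            self.on_user_transcript(text)
        if self.on_text_response:
            self.on_text_response(f"echo: {text}")
```

Note that `isinstance(engine, VoiceEngineLike)` only checks that the attributes exist; it does not verify that the methods are coroutines or that their signatures match.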
🎨 Customization
Custom Modes
class MyCustomMode:
    def __init__(self, engine):
        self.engine = engine

    async def on_key_down(self, key: str):
        if key == "r":  # Custom recording key
            await self.engine.start_listening()
Custom Key Bindings
cli = VoxTermCLI(engine)
cli.keyboard.on_key('t', lambda: print("Custom action!"))
🚫 What VoxTerm Doesn't Do
- ❌ No UI rendering or colors
- ❌ No audio processing
- ❌ No network/WebSocket handling
- ❌ No state management
- ❌ No configuration files
- ❌ No fancy terminal graphics
VoxTerm just connects your keyboard to a voice engine. The voice engine handles everything else.
📊 Why So Simple?
Real-world usage showed that for CLI voice chat, you need:
- A way to trigger recording (keyboard)
- A way to see what was said (print)
- Different modes for different use cases
That's exactly what VoxTerm provides - nothing more, nothing less.
🏃 Example: Complete Voice Chat in 10 Lines
import asyncio
from voxterm import VoxTermCLI
from voicechatengine import VoiceEngine
async def main():
    engine = VoiceEngine(api_key="your-key")
    cli = VoxTermCLI(engine, mode="push_to_talk")
    await cli.run()

if __name__ == "__main__":
    asyncio.run(main())
📝 License
MIT - Use it however you want!
Remember: VoxTerm is just the keyboard controls. Your voice engine does the actual work. We just make it easy to use from the command line! 🎤
File details
Details for the file voxterm-0.0.2.tar.gz.
File metadata
- Download URL: voxterm-0.0.2.tar.gz
- Upload date:
- Size: 29.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.23
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2d30bac233bfee208ea01547e0878b8e01394b892acff0e657ab8a318b60fc31 |
| MD5 | 0db995330dd4b7fc55fdf134b0f4c759 |
| BLAKE2b-256 | 9176375581f38c029b6ec1f62d6e2feef4f13a2705084f2eecac8eb747287214 |
File details
Details for the file voxterm-0.0.2-py3-none-any.whl.
File metadata
- Download URL: voxterm-0.0.2-py3-none-any.whl
- Upload date:
- Size: 34.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.23
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 64487c93f1858fbbae3622e8ca53553a7d7ee2acb81fb392174db495b4da5568 |
| MD5 | 5676132a692c80d1a52d767035128857 |
| BLAKE2b-256 | 1aa110894fc0c48c7199e2cc5eac0a880e28d3311d4e727f463c8fc437180966 |