Voice-driven AI assistant for real-time transcription and Gemini integration.
VoxAI
Voice-driven AI assistant that captures audio from any app (Teams, Zoom, browsers), transcribes it, and streams live text answers. It is suited to interviews, meetings, and knowledge work.
🚀 Quickstart
- Clone the repo

    git clone https://github.com/rtiwariops/voxai.git
    cd voxai

- Create & activate a Python virtual environment

    python3 -m venv .venv
    source .venv/bin/activate      # macOS/Linux
    # .venv\Scripts\activate       # Windows PowerShell

- Install Python dependencies

    pip install --upgrade pip
    pip install .

- Install Electron (if not already installed)

    npm install -g electron

- Configure environment variables

    cp .env.example .env

- Edit .env and provide:

    GENAI_API_KEY=sk-...
    GENAI_MODEL=gemini-1.5-flash

- Run VoxAI

    VoxAI
🎧 How It Works
- Start Recording
  Click Start to capture audio from your input device (e.g., BlackHole).
- Stop & Transcribe
  Click Stop. Whisper transcribes the full audio clip and shows the text under Transcript.
- Ask AI
  Click Ask AI to send the transcript to Gemini. Answers stream live in the Answer panel.
- Copy & Share
  Output is plain text, so you can reuse it in RAG or fine-tuning workflows, or anywhere else.
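The Start, Stop, and Ask AI steps above can be read as a small state machine. The sketch below is only an illustration of that flow, not VoxAI's actual code: the `Session` class, its method names, and the two stand-in callables are hypothetical, and the real Whisper and Gemini calls are replaced by fakes so the example runs without any audio hardware or API key.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List


@dataclass
class Session:
    """Hypothetical model of the Start -> Stop -> Ask AI flow."""

    transcribe: Callable[[bytes], str]        # stand-in for a Whisper call
    generate: Callable[[str], Iterable[str]]  # stand-in for a streaming Gemini call
    recording: bool = False
    chunks: List[bytes] = field(default_factory=list)
    transcript: str = ""

    def start(self) -> None:
        # Begin a fresh recording; any earlier clip is discarded.
        self.chunks.clear()
        self.transcript = ""
        self.recording = True

    def feed(self, chunk: bytes) -> None:
        # Audio that arrives while not recording is ignored.
        if self.recording:
            self.chunks.append(chunk)

    def stop(self) -> str:
        # One-shot transcription of the full clip, not incremental.
        if not self.recording:
            raise RuntimeError("stop() called before start()")
        self.recording = False
        self.transcript = self.transcribe(b"".join(self.chunks))
        return self.transcript

    def ask(self) -> Iterator[str]:
        # Yield answer pieces as they arrive so a UI can render them live.
        if not self.transcript:
            raise RuntimeError("nothing has been transcribed yet")
        yield from self.generate(self.transcript)


def fake_transcribe(audio: bytes) -> str:
    # Placeholder: a real implementation would run speech-to-text here.
    return f"{len(audio)} bytes of audio"


def fake_generate(prompt: str) -> Iterator[str]:
    # Placeholder: a real implementation would stream model output here.
    yield from prompt.split()


if __name__ == "__main__":
    session = Session(fake_transcribe, fake_generate)
    session.start()
    session.feed(b"ab")
    session.feed(b"cd")
    print("Transcript:", session.stop())
    for token in session.ask():
        print(token, end=" ", flush=True)
    print()
```

Keeping transcription and generation behind plain callables is a design choice made for this sketch: it lets the manual control flow be exercised independently of the speech and model backends.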
⚙️ Features
- Universal Audio Capture (BlackHole, etc.)
- Manual Control for long-form Q&A
- One-Shot, Full-Clip Transcription
- Live Token-by-Token AI Streaming
- Configurable Gemini Model (.env)
- Electron-Based Desktop UI
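Universal audio capture depends on selecting a loopback input device such as BlackHole. As a rough sketch of how that selection could work, the helper below picks the first input-capable device whose name contains a hint. The function name `pick_input_device` is hypothetical, and the device dictionaries (with `name` and `max_input_channels` keys) are an assumed shape chosen for illustration; the README does not say which audio library VoxAI uses.

```python
from typing import Any, Mapping, Optional, Sequence


def pick_input_device(
    devices: Sequence[Mapping[str, Any]], name_hint: str
) -> Optional[int]:
    """Return the index of the first input-capable device matching name_hint."""
    hint = name_hint.lower()
    for index, device in enumerate(devices):
        name = str(device.get("name", "")).lower()
        # Skip output-only devices: they cannot be recorded from.
        if hint in name and device.get("max_input_channels", 0) > 0:
            return index
    return None


# Sample data for illustration only; a real list would come from the audio backend.
SAMPLE_DEVICES = [
    {"name": "MacBook Pro Microphone", "max_input_channels": 1},
    {"name": "MacBook Pro Speakers", "max_input_channels": 0},
    {"name": "BlackHole 2ch", "max_input_channels": 2},
]

if __name__ == "__main__":
    print(pick_input_device(SAMPLE_DEVICES, "blackhole"))
```

Returning `None` when nothing matches lets the caller fall back to the system default input or show an error, rather than silently recording from the wrong device.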
📦 Installation Options
From source:

    git clone https://github.com/rtiwariops/voxai.git
    cd voxai
    pip install -e .
    VoxAI

From PyPI:

    pip install VoxAI
    VoxAI
🛠 Configuration
Set in .env:
| Variable | Description | Example |
|---|---|---|
| GENAI_API_KEY | Google Generative AI (Gemini) API key | sk-… |
| GENAI_MODEL | Gemini model to use | gemini-1.5-flash |
See .env.example for reference.
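To check a .env file before launching, something like the following could be used. It is a minimal sketch using only the standard library: `parse_env` and `missing_settings` are hypothetical helpers, not part of VoxAI, and the parser handles only simple `KEY=value` lines with optional quotes and `#` comments.

```python
from typing import Dict, List

# The two variables named in the configuration table.
REQUIRED = ("GENAI_API_KEY", "GENAI_MODEL")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of matching-style quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(values: Dict[str, str]) -> List[str]:
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED if not values.get(name)]


if __name__ == "__main__":
    try:
        with open(".env", encoding="utf-8") as handle:
            settings = parse_env(handle.read())
    except FileNotFoundError:
        settings = {}
    missing = missing_settings(settings)
    print("Missing settings:", ", ".join(missing) if missing else "none")
```

Running it from the project directory reports which of the two variables still need values.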