Voice-driven AI assistant for real-time transcription and Gemini integration.

Intended Audience
- Developers
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3
Topic
- Scientific/Engineering :: Artificial Intelligence

VoxAI

🚀 Features

Universal Audio Capture
Record mic, system audio, or any combination via BlackHole (macOS) or equivalent loopback drivers.
Manual Control
Start/Stop buttons let you set the exact boundaries of your prompt, which suits long, multi-sentence queries.
One-Shot Transcription
Whisper processes the entire recording in a single call—no fragmented sentences.
Live AI Streaming
Gemini’s response appears token by token, just like in ChatGPT’s streaming interface.
Configurable Model
Swap between Gemini variants by setting the `GENAI_MODEL` environment variable; no code changes are needed.
Zero-Install UI Bootstrapping
On first run, `voxai` sets up the minimal Electron UI source that ships inside the PyPI package, so once the prerequisites are in place you only need `pip install voxai` and `voxai`.

🎯 Quickstart

⚙️ Prerequisites

Python ≥3.7
Node.js ≥14
Electron (global)
```
npm install -g electron
```
macOS Audio Loopback (BlackHole 2ch + LadioCast):
1. Install BlackHole 2ch:
  `brew install blackhole-2ch`
2. Create a Multi-Output Device
  - Open Audio MIDI Setup
  - Click “+” → Create Multi-Output Device
  - Check both BlackHole 2ch and MacBook Pro Speakers
3. Select the Multi-Output Device
  System Preferences → Sound → Output → Multi-Output Device
4. Configure LadioCast
  - Input 1: BlackHole 2ch → route to Main
  - Main Output: MacBook Pro Speakers
  - Mute other channels
This routes all system and microphone audio into VoxAI.
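
Before the first launch, it can help to confirm that the tools listed above are actually on your `PATH`. The following is a minimal sketch, not part of VoxAI itself: the function name `missing_prerequisites` and the tool list are assumptions made for illustration, and the check only looks for the executables (it does not verify that Node.js is at least version 14).

```python
import shutil
import sys

# Minimum Python version taken from the Prerequisites list above.
MIN_PYTHON = (3, 7)
# Executables the Prerequisites list implies (assumed names on PATH).
REQUIRED_TOOLS = ("node", "npm", "electron")


def missing_prerequisites(which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means OK.

    `which` and `version` are injectable so the check can be exercised
    without touching the real system.
    """
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append("Python >= 3.7 required, found %d.%d" % version)
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append("'%s' not found on PATH" % tool)
    return problems
```

Calling `missing_prerequisites()` with no arguments checks the current interpreter and `PATH`; printing each returned string gives a quick checklist of what still needs installing.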

Install & Run

1) Install from PyPI

pip install voxai

2) Configure environment

cp .env.example .env

Edit `.env`:

GENAI_API_KEY=sk-…

GENAI_MODEL=gemini-1.5-flash

3) Launch the app

voxai

On first launch, VoxAI automatically runs `npm install` inside its bundled `electron/` folder, then opens a desktop window.
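
To make the first-launch behavior concrete, here is a rough sketch of how such a bootstrap step could be structured. It is an illustration under assumptions, not VoxAI's actual launcher: the helper names `first_launch_commands` and `launch` are hypothetical, and treating a missing `node_modules/` folder as the "first launch" signal is a guess at one reasonable approach.

```python
import subprocess
from pathlib import Path


def first_launch_commands(ui_dir):
    """Return the commands a launcher might run, in order, for `ui_dir`."""
    ui_dir = Path(ui_dir)
    commands = []
    # Assumption: no node_modules/ folder means dependencies are not installed yet.
    if not (ui_dir / "node_modules").is_dir():
        commands.append(["npm", "install"])
    # Start Electron on the UI folder (assumes a global `electron` on PATH).
    commands.append(["electron", "."])
    return commands


def launch(ui_dir):
    """Run each planned command inside the UI folder, stopping on failure."""
    for command in first_launch_commands(ui_dir):
        subprocess.run(command, cwd=str(ui_dir), check=True)
```

Separating the planning step from the execution step keeps the decision logic easy to inspect: you can print `first_launch_commands(...)` to see what would run without starting anything.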

📖 Usage

Once the UI opens:

Start: Click Start to begin recording.

Speak: Ask your question aloud, or let questions from Zoom, Teams, or any other tool play through your loopback device. There is no length limit.

Stop: Click Stop. Whisper transcribes your entire clip to the Transcript pane.

Ask AI: Click Ask AI. Gemini’s answer streams live into the Answer pane.

Copy & Share: Everything is plain text, so you can copy it, paste it, or feed it into your own RAG or fine-tuning pipeline.

🎧 How It Works

Start Recording
Click Start to capture audio from your input device (e.g., BlackHole).
Stop & Transcribe
Click Stop. Whisper transcribes the full audio clip and shows the text under Transcript.
Ask AI
Click Ask AI to send the transcript to Gemini. Answers stream live in the Answer panel.
Copy & Share
Output is plain text, so you can reuse it in RAG or fine-tuning workflows, or anywhere else.
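
The flow described above (buffer audio between Start and Stop, transcribe the whole clip once, then stream the answer) can be sketched roughly as follows. This is a simplified illustration, not VoxAI's source: the `Session` class is hypothetical, and the `transcribe` and `stream` callables are stand-ins for whatever Whisper and Gemini calls the application actually makes.

```python
from typing import Callable, Iterable, Iterator, List


class Session:
    """Buffers audio between Start and Stop, then transcribes it once."""

    def __init__(self, transcribe: Callable[[bytes], str]) -> None:
        self._transcribe = transcribe  # stand-in for a Whisper call
        self._chunks: List[bytes] = []
        self._recording = False
        self.transcript = ""

    def start(self) -> None:
        # A new recording discards any audio left from the previous one.
        self._chunks.clear()
        self._recording = True

    def feed(self, chunk: bytes) -> None:
        # Audio arriving outside a Start/Stop window is ignored.
        if self._recording:
            self._chunks.append(chunk)

    def stop(self) -> str:
        self._recording = False
        # One call over the whole clip, rather than one call per chunk.
        self.transcript = self._transcribe(b"".join(self._chunks))
        return self.transcript

    def ask(self, stream: Callable[[str], Iterable[str]]) -> Iterator[str]:
        # Yield each piece as it arrives so a UI can render it immediately.
        for piece in stream(self.transcript):
            yield piece
```

Passing the two callables in keeps the sketch independent of any particular speech or LLM library; in a real application they would wrap the actual transcription and streaming-generation calls.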

🛠 Configuration

Set in .env:

| Variable | Description | Example |
| --- | --- | --- |
| `GENAI_API_KEY` | Google Generative AI (Gemini) API key | `sk-…` |
| `GENAI_MODEL` | Gemini model to use | `gemini-1.5-flash` |

See .env.example for reference.
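
As an illustration of how these two variables might be read, here is a small standard-library sketch. It is not VoxAI's actual loader: the function names are hypothetical, and two behaviors are assumptions made for the example, namely that variables already set in the process environment take precedence over `.env`, and that `GENAI_MODEL` falls back to `gemini-1.5-flash` when unset.

```python
import os

# Assumed fallback; this is the example value from the table, not a confirmed default.
DEFAULT_MODEL = "gemini-1.5-flash"


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_settings(env_text, environ=None):
    """Return (api_key, model) from .env text plus the process environment."""
    environ = os.environ if environ is None else environ
    merged = parse_env(env_text)
    # Assumption: values already in the process environment win over .env.
    for name in ("GENAI_API_KEY", "GENAI_MODEL"):
        if name in environ:
            merged[name] = environ[name]
    api_key = merged.get("GENAI_API_KEY")
    if not api_key:
        raise ValueError("GENAI_API_KEY is not set")
    return api_key, merged.get("GENAI_MODEL", DEFAULT_MODEL)
```

Failing early when the key is missing gives a clearer error than a rejected API request later on.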

🛠️ Developer Guide

Install from source

git clone https://github.com/rtiwariops/voxai.git

cd voxai

pip install -e .

npm install -g electron

voxai

⚙️ Features

Universal Audio Capture (BlackHole, etc.)
Manual Control for long-form Q&A
One-Shot, Full-Clip Transcription
Live Token-by-Token AI Streaming
Configurable Gemini Model (.env)
Electron-Based Desktop UI

📜 License

Released under the MIT License.

This version: 0.1.20 (May 19, 2025)
