Local Voice Transcription System - Privacy-first, model-agnostic speech-to-text

🎤 Locivox

Local Voice Transcription System - Privacy-first, model-agnostic speech-to-text powered by AI

Locivox (Latin: loci = of a place, vox = voice) is an open-source speech-to-text (STT) system designed to run entirely on your machine, with no cloud dependencies. It starts with Whisper and is built to expand to other models.

✨ Features (Phase 1 - MVP)

✅ Real-time microphone capture with configurable settings
✅ Multiple STT engines: Faster-Whisper (recommended) and OpenAI-Whisper
✅ CPU-optimized for laptops without GPU
✅ Model-agnostic architecture - easily add new engines
✅ Multiple output formats: TXT, JSON, SRT subtitles
✅ Automatic language detection or manual selection
✅ Self-contained virtual environment - no global dependencies
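The model-agnostic idea can be pictured as a small engine interface plus a name-to-engine registry. The sketch below is illustrative only: `SpeechEngine`, `register_engine`, `get_engine`, and `EchoEngine` are hypothetical names chosen for this example and are not taken from the Locivox source.

```python
from typing import Callable, Dict, Optional, Protocol


class SpeechEngine(Protocol):
    """Hypothetical interface that every STT engine wrapper would satisfy."""

    def transcribe(self, audio_path: str, language: Optional[str] = None) -> str:
        ...


# Registry mapping an engine name (as it might appear in config) to a factory.
_ENGINES: Dict[str, Callable[[], SpeechEngine]] = {}


def register_engine(name: str, factory: Callable[[], SpeechEngine]) -> None:
    """Make an engine available under the given name."""
    _ENGINES[name] = factory


def get_engine(name: str) -> SpeechEngine:
    """Build the engine registered under `name`, or fail with a clear error."""
    try:
        return _ENGINES[name]()
    except KeyError:
        known = ", ".join(sorted(_ENGINES)) or "none"
        raise ValueError(f"Unknown engine {name!r}; registered: {known}") from None


class EchoEngine:
    """Stand-in engine for demonstration; a real one would wrap a model."""

    def transcribe(self, audio_path: str, language: Optional[str] = None) -> str:
        return f"[transcript of {audio_path}, language={language or 'auto'}]"


register_engine("echo", EchoEngine)
```

Under this assumed design, adding an engine means writing one wrapper class and one `register_engine` call; the rest of the program only ever asks for an engine by name.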

🚀 Quick Start

Prerequisites

Python 3.9 or higher
FFmpeg (required for audio processing)

Install FFmpeg:

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt update && sudo apt install ffmpeg

# Windows (use Chocolatey)
choco install ffmpeg

Installation

Clone or download the project, then enter its directory:

git clone https://github.com/mudaye/locivox.git
cd locivox

Create virtual environment:

python -m venv venv

# Activate it:
# macOS/Linux:
source venv/bin/activate

# Windows:
venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

The required model is downloaded the first time you run a transcription (about 140 MB for the base model).

💻 Usage

Interactive Recording Mode

Record from your microphone and transcribe:

python src/cli.py

Workflow:

Select your microphone device
Press ENTER to start recording
Speak into your microphone
Press ENTER to stop
Transcription appears in console and saves to output/ folder
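The last workflow step, saving to the output/ folder, might look roughly like the sketch below. It is not the project's code: `save_transcript` and the `transcript_YYYYMMDD_HHMMSS` naming scheme are assumptions made for illustration, and only the TXT and JSON formats are covered.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Optional


def save_transcript(text: str, output_dir: str = "output", fmt: str = "txt",
                    timestamp: bool = True, now: Optional[datetime] = None) -> Path:
    """Write `text` to `output_dir` as TXT or JSON and return the file path."""
    if fmt not in ("txt", "json"):
        raise ValueError(f"Unsupported format: {fmt!r}")
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Assumed naming scheme: transcript_YYYYMMDD_HHMMSS.<ext>
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    name = f"transcript_{stamp}.{fmt}" if timestamp else f"transcript.{fmt}"
    path = out_dir / name

    if fmt == "json":
        payload = json.dumps({"text": text}, ensure_ascii=False, indent=2)
    else:
        payload = text
    path.write_text(payload, encoding="utf-8")
    return path
```

The optional `now` parameter exists only so the filename can be made deterministic in tests.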

Transcribe Existing Audio File

python src/cli.py --file path/to/audio.wav

Advanced Options

# Use a different model size
python src/cli.py --model small

# Force a specific language (skip auto-detection)
python src/cli.py --language es

# Change output format
python src/cli.py --output-format json

# Use custom config file
python src/cli.py --config my_config.yaml

# Combine options
python src/cli.py --file audio.mp3 --model medium --output-format srt
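For readers curious how such flags are typically declared, here is a minimal `argparse` sketch. The flag names come from the commands above; the defaults, the `choices` lists, and the `build_parser` function are assumptions and may differ from what src/cli.py actually does.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the CLI flags shown in this README (defaults are assumed)."""
    parser = argparse.ArgumentParser(
        prog="cli.py", description="Local speech-to-text transcription"
    )
    parser.add_argument(
        "--file", help="Audio file to transcribe; omit to record from the microphone"
    )
    parser.add_argument(
        "--model", default="base",
        choices=["tiny", "base", "small", "medium", "large"],
    )
    parser.add_argument(
        "--language", default=None,
        help="Language code such as 'es'; omit for auto-detection",
    )
    parser.add_argument(
        "--output-format", default="txt", choices=["txt", "json", "srt"]
    )
    parser.add_argument("--config", default="config.yaml")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Note that `argparse` exposes `--output-format` as the attribute `args.output_format`.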

⚙️ Configuration

Edit config.yaml to customize behavior:

model:
  engine: "faster-whisper"  # or "openai-whisper"
  size: "base"              # tiny, base, small, medium, large
  language: "en"            # or "auto" for detection

audio:
  sample_rate: 16000        # Whisper expects 16kHz
  chunk_duration: 5         # Seconds per chunk

output:
  format: "txt"             # txt, json, srt
  timestamp: true           # Include timestamp in filename
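When `output.format` is set to `srt`, the transcript has to be written as numbered cues with `HH:MM:SS,mmm` timestamps. The sketch below shows one way to produce that layout from `(start, end, text)` segments; `format_timestamp` and `to_srt` are hypothetical helper names, not functions from the Locivox source.

```python
from typing import Iterable, Tuple


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) segments as numbered SRT cues."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    print(to_srt([(0.0, 2.5, "Hello"), (2.5, 4.0, "world")]))
```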

Model Sizes & Performance

Model	Parameters	Speed (CPU)	Quality	Memory
tiny	39M	~10x RT	Basic	<1GB
base	74M	~5x RT	Good	~1GB
small	244M	~3x RT	Better	~2GB
medium	769M	~1x RT	Great	~5GB
large	1550M	~0.5x RT	Best	~10GB

RT = real time. 1x means audio is transcribed as fast as it is spoken; higher is faster, and 0.5x means transcription takes about twice the audio's duration.

Recommendation: Start with base for best speed/quality balance on CPU.
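To turn the RT factors in the table into a rough wait time, divide the audio length by the factor. The helper below is a back-of-the-envelope sketch: `estimated_seconds` is a hypothetical name, and the factors are the approximate CPU figures from the table, so real timings will vary with hardware.

```python
# Approximate CPU real-time factors copied from the table above.
RT_FACTORS = {"tiny": 10.0, "base": 5.0, "small": 3.0, "medium": 1.0, "large": 0.5}


def estimated_seconds(audio_seconds: float, rt_factor: float) -> float:
    """Estimate processing time: audio duration divided by the RT factor."""
    if rt_factor <= 0:
        raise ValueError("rt_factor must be positive")
    return audio_seconds / rt_factor


if __name__ == "__main__":
    ten_minutes = 600.0
    for model, factor in RT_FACTORS.items():
        print(f"{model}: ~{estimated_seconds(ten_minutes, factor):.0f} s")
```

For a 10-minute recording this gives about 2 minutes with base and about 20 minutes with large.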

📁 Project Structure

locivox/
├── venv/                   # Virtual environment (created on setup)
├── src/
│   ├── __init__.py         # Package init
│   ├── cli.py              # Main CLI entry point
│   ├── audio_capture.py    # Microphone recording
│   ├── transcriber.py      # STT engine wrappers
│   └── utils.py            # Helper functions
├── output/                 # Generated transcripts
├── logs/                   # Application logs
├── models/                 # Downloaded models (auto-created)
├── config.yaml             # User configuration
├── requirements.txt        # Python dependencies
└── README.md               # This file

🛠️ Troubleshooting

"No audio devices found"

# List available devices
python -c "import sounddevice; print(sounddevice.query_devices())"

"FFmpeg not found"

Ensure FFmpeg is installed and in your PATH:

ffmpeg -version
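The same check can be done from Python, which is handy inside a startup script. This is a small sketch using the standard library's `shutil.which`; `find_executable` is a hypothetical helper name.

```python
import shutil
from typing import Optional


def find_executable(name: str = "ffmpeg") -> Optional[str]:
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    print(f"FFmpeg found at {path}" if path else "FFmpeg not found on PATH")
```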

Slow transcription on CPU

Use faster-whisper engine (2-4x faster than openai-whisper)
Use smaller models (tiny/base)
Reduce chunk duration in config

Import errors

Make sure virtual environment is activated:

# Check if venv is active (should show venv path)
which python  # macOS/Linux
where python  # Windows
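A platform-independent alternative is to ask the interpreter itself. Inside an environment created with `python -m venv`, `sys.prefix` points at the environment while `sys.base_prefix` points at the base installation. The function name `in_virtualenv` is just an illustrative choice.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style environment."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print(f"Interpreter: {sys.executable}")
    print("Virtual environment active" if in_virtualenv() else "No virtual environment active")
```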

🗺️ Roadmap

Phase 1: MVP CLI (You are here!)
Phase 2: Real-time streaming with chunked processing
Phase 3: Enhanced CLI with speaker diarization, multiple formats
Phase 4: GUI Desktop App with Electron/PyQt
Phase 5: Advanced features (translation, punctuation, custom vocabulary)
Phase 6: Multi-platform distribution with installers

See ROADMAP.md for detailed timeline.

🤝 Contributing

Contributions welcome! This is an open-source project.

Areas to contribute:

New STT engine integrations (Vosk, Coqui, wav2vec2)
Performance optimizations
GUI development
Documentation improvements
Bug fixes and testing

📄 License

MIT License - See LICENSE file

🙏 Acknowledgments

OpenAI Whisper - State-of-the-art STT model
Faster-Whisper - Optimized inference engine
sounddevice - Python audio library

📞 Support

Issues: Open an issue on GitHub
Discussions: Start a discussion for features/ideas
Logs: Check logs/locivox.log for debugging

Built with ❤️ for privacy-conscious developers

Project details

Release history

This version

0.4.1

Feb 17, 2026

0.4.0

Feb 17, 2026

Download files

Source Distribution

locivox-0.4.1.tar.gz (77.7 kB view details)

Uploaded Feb 17, 2026 Source

Built Distribution

locivox-0.4.1-py3-none-any.whl (70.8 kB view details)

Uploaded Feb 17, 2026 Python 3

File details

Details for the file locivox-0.4.1.tar.gz.

File metadata

Download URL: locivox-0.4.1.tar.gz
Upload date: Feb 17, 2026
Size: 77.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for locivox-0.4.1.tar.gz
Algorithm	Hash digest
SHA256	`a42dd8a583dcd6beea8299b91ac1ff2bfeb1544bbbb8c0096f7afca7c588a258`
MD5	`dfd01bcaef2a69366da98a52174d8d45`
BLAKE2b-256	`3cfa1462e77ab9ddea54c2ab411e7686f560cda91057c98bdf6614dfb4462763`

Provenance

The following attestation bundles were made for locivox-0.4.1.tar.gz:

Publisher: release.yml on mudaye/locivox

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: locivox-0.4.1.tar.gz
- Subject digest: a42dd8a583dcd6beea8299b91ac1ff2bfeb1544bbbb8c0096f7afca7c588a258
- Sigstore transparency entry: 956315855
- Sigstore integration time: Feb 17, 2026
Source repository:
- Permalink: mudaye/locivox@fb6ef03f2a22640abd472e457ea5e93f92755db5
- Branch / Tag: refs/tags/v0.4.1
- Owner: https://github.com/mudaye
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@fb6ef03f2a22640abd472e457ea5e93f92755db5
- Trigger Event: push

File details

Details for the file locivox-0.4.1-py3-none-any.whl.

File metadata

Download URL: locivox-0.4.1-py3-none-any.whl
Upload date: Feb 17, 2026
Size: 70.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for locivox-0.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3f7f2129d8639a1b94619ea0ec208692124adfa5b959eeca94a688cc9681f958`
MD5	`8d6754521e2c2691343ea91d9e710beb`
BLAKE2b-256	`f9f8b9dec5a750c30e36db3fca9342e7be9eab3d976cd532fd348d0c783f3de6`

Provenance

The following attestation bundles were made for locivox-0.4.1-py3-none-any.whl:

Publisher: release.yml on mudaye/locivox

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: locivox-0.4.1-py3-none-any.whl
- Subject digest: 3f7f2129d8639a1b94619ea0ec208692124adfa5b959eeca94a688cc9681f958
- Sigstore transparency entry: 956315858
- Sigstore integration time: Feb 17, 2026
Source repository:
- Permalink: mudaye/locivox@fb6ef03f2a22640abd472e457ea5e93f92755db5
- Branch / Tag: refs/tags/v0.4.1
- Owner: https://github.com/mudaye
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@fb6ef03f2a22640abd472e457ea5e93f92755db5
- Trigger Event: push
