Speak Now
A locally-hosted, low-latency speech-to-text solution with LLM integration.
Overview
Speak Now captures your speech in real time and lets you paste it as text, with optional LLM-based formatting. It's designed to be lightweight and efficient for everyday use.
Features
- Real-time transcription using local speech recognition
- Keyboard shortcuts for quick actions:
  - Toggle recording: Ctrl+Alt+Space
  - Paste raw text: Ctrl+
  - Format and paste: Alt+
- Text formatting via Google's Gemini API
- Format options include:
  - Natural - smooths out transcription
  - Formal - professional language
  - Concise - preserves key information while reducing length
  - Catgirl - adds a playful style (example custom format)
  - None - no formatting
- Simple GUI for monitoring status and selecting format options
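As a rough illustration of how the format options listed above could map to LLM instructions, the sketch below pairs each option with a prompt and builds the text that would be sent to the model. The prompt wording, the `FORMAT_PROMPTS` name, and the `build_prompt` helper are assumptions made for illustration and are not taken from the project's code.

```python
from typing import Optional

# Hypothetical prompt text for each format option named in the feature list.
FORMAT_PROMPTS: dict[str, Optional[str]] = {
    "Natural": "Smooth out this transcription without changing its meaning.",
    "Formal": "Rewrite this transcription in professional language.",
    "Concise": "Shorten this transcription while preserving key information.",
    "Catgirl": "Rewrite this transcription in a playful catgirl style.",
    "None": None,  # no formatting: the raw transcript is pasted as-is
}


def build_prompt(format_name: str, transcript: str) -> Optional[str]:
    """Return the prompt to send to the LLM, or None if no formatting is needed."""
    if format_name not in FORMAT_PROMPTS:
        raise ValueError(f"Unknown format option: {format_name!r}")
    instruction = FORMAT_PROMPTS[format_name]
    if instruction is None:
        return None
    return f"{instruction}\n\nTranscription:\n{transcript}"
```

Returning `None` for the "None" option lets the caller skip the API call entirely and paste the raw transcript, which keeps that path free of network latency.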
Setup
- Clone this repository
- Install dependencies: pip install -e .
- Set up your Gemini API key in stt_config.toml or as an environment variable
- Run the application: python stt_cache_v2.py
Configuration
A default config file will be generated on first run. You can customize:
- API settings (Gemini key, model)
- Speech-to-text model and options
- Keyboard shortcuts
- UI settings
- Formatting prompts
Current Status
This project is a work in progress. Basic functionality is implemented, but you may encounter bugs or limitations. Contributions and feedback are welcome!
License
MIT License
File details
Details for the file speak_now-0.1.0.tar.gz.
File metadata
- Download URL: speak_now-0.1.0.tar.gz
- Upload date:
- Size: 168.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7cb00951d4450af5a79fb692823eccc85389a59b833cc4d0a5d301fc77e74fad |
| MD5 | 2d408badd4122c014ffb4af4be27f39b |
| BLAKE2b-256 | cd1f0aa8dfeaf5c8afe796225a31abadeed2d5296d8ac139fe9a8793afac1ab0 |
File details
Details for the file speak_now-0.1.0-py3-none-any.whl.
File metadata
- Download URL: speak_now-0.1.0-py3-none-any.whl
- Upload date:
- Size: 15.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 18589ae6f464984ed9b3ffc9335e10ce7b66e83accf218ad3a920516d094785f |
| MD5 | 4a5389ec820310d00bf5eab1334cfbc7 |
| BLAKE2b-256 | 622441e11f3a15d5bbd96fc4e60611315939522d0e5ef15ebbc1c02b26939cad |