A real-time audio translator using Azure Cognitive Services and PyQt5
Real-Time Live Translator with PyQt GUI
This is a real-time live translator application built using Python, PyQt5, and Azure Cognitive Services. The application captures audio from the Windows system speakers using WASAPI loopback, translates the captured audio into a specified language, and displays the translated text in a GUI overlay that stays on top of all applications.
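The capture, translate, and display flow described above could be wired to the Azure Speech SDK roughly as follows. This is a minimal sketch, not the application's actual code: `to_target_code`, `build_recognizer`, and the `on_text` callback are hypothetical names. It assumes the captured audio has already been converted to 16 kHz, 16-bit mono PCM (the SDK's default push-stream format), and that the output locale must be reduced to a bare language code before it is used as a translation target. The SDK is imported lazily inside `build_recognizer`, so it is only needed when that function is called.

```python
def to_target_code(locale: str) -> str:
    """Reduce a locale such as "es-ES" to the bare language code "es".

    Assumption: translation targets are plain language codes, while the
    recognition language is a full locale. Regional targets would need
    extra handling that is not shown here.
    """
    language, _, suffix = locale.partition("-")
    # A four-letter suffix is a script tag (e.g. Hans), not a region: keep it.
    if len(suffix) == 4:
        return locale
    return language.lower()


def build_recognizer(key, region, input_locale, output_locale, on_text):
    """Return (recognizer, stream); on_text receives each translated phrase."""
    # Third-party dependency: azure-cognitiveservices-speech.
    import azure.cognitiveservices.speech as speechsdk

    target = to_target_code(output_locale)
    config = speechsdk.translation.SpeechTranslationConfig(
        subscription=key, region=region
    )
    config.speech_recognition_language = input_locale
    config.add_target_language(target)

    # Captured audio chunks are pushed into this stream as raw PCM bytes.
    stream = speechsdk.audio.PushAudioInputStream()
    audio_config = speechsdk.audio.AudioConfig(stream=stream)
    recognizer = speechsdk.translation.TranslationRecognizer(
        translation_config=config, audio_config=audio_config
    )

    def handle_recognized(evt):
        # Only forward results that actually carry a translation.
        if evt.result.reason == speechsdk.ResultReason.TranslatedSpeech:
            on_text(evt.result.translations[target])

    recognizer.recognized.connect(handle_recognized)
    return recognizer, stream
```

With this shape, starting translation would mean calling `recognizer.start_continuous_recognition()` and passing each captured chunk to `stream.write(chunk)`; stopping would call `stream.close()` and `recognizer.stop_continuous_recognition()`.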
Features
- Real-Time Translation: Captures and translates audio in real-time.
- Subtitle Overlay: Displays translated text as subtitles over all applications.
- Configurable Input/Output Languages: Users can specify input and output languages.
- Audio Device Selection: Users can choose from available audio devices.
- Configuration Persistence: Saves user preferences for input/output languages, selected audio device, and Azure settings.
- Customization Options: Customize the subtitle overlay's font, color, and appearance.
- Settings Management: Reset settings to default, import/export settings.
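The persistence and settings-management features could be backed by a small JSON store along these lines. This is a sketch under assumptions: `SettingsStore` and its method names are hypothetical, and the default values simply mirror the sample `config.json` shown under Configuration, since the application's real defaults are not documented here.

```python
import json
from pathlib import Path

# Hypothetical defaults, copied from the sample config.json in this README.
DEFAULTS = {
    "input_language": "en-US",
    "output_language": "es-ES",
    "audio_device_index": 0,
    "subtitle_font": "Arial,24",
    "subtitle_color": "#FFFFFF",
}


class SettingsStore:
    """Keeps settings in memory and mirrors them to a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        self.values = dict(DEFAULTS)

    def load(self):
        """Overlay saved values on the defaults; a missing file keeps defaults."""
        if self.path.exists():
            saved = json.loads(self.path.read_text(encoding="utf-8"))
            self.values = {**DEFAULTS, **saved}
        return self.values

    def save(self):
        """Write the current values to the settings file."""
        self.path.write_text(json.dumps(self.values, indent=2), encoding="utf-8")

    def reset(self):
        """Restore the defaults and persist them."""
        self.values = dict(DEFAULTS)
        self.save()

    def export_to(self, target):
        """Write a copy of the current values to another file."""
        Path(target).write_text(json.dumps(self.values, indent=2), encoding="utf-8")

    def import_from(self, source):
        """Replace the current values with those from another file, then persist."""
        imported = json.loads(Path(source).read_text(encoding="utf-8"))
        self.values = {**DEFAULTS, **imported}
        self.save()
```

Merging imported values over the defaults means a settings file written by an older version, with fewer keys, still yields a complete set of values.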
Requirements
- pyenv
- pyenv-virtualenv
- A valid Azure Cognitive Services subscription key and service region.
Installation
- Install pyenv and pyenv-virtualenv: follow the pyenv installation instructions, then the pyenv-virtualenv installation instructions.
- Clone this repository:
    git clone https://github.com/yourusername/repo-name.git
    cd repo-name
- Set up a Python version with pyenv. Install a specific Python version (e.g., 3.8.10):
    pyenv install 3.8.10
  Set the local Python version for this project:
    pyenv local 3.8.10
- Create and activate a virtual environment with pyenv-virtualenv. Create a virtual environment:
    pyenv virtualenv 3.8.10 translator-env
  Activate the virtual environment:
    pyenv activate translator-env
- Install the required packages:
    pip install -r requirements.txt
  If requirements.txt is not provided, you can manually install the dependencies:
    pip install pyaudio numpy scipy azure-cognitiveservices-speech PyQt5
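After installing, it can be useful to confirm that the dependencies are visible from the active virtual environment. The check below is an optional sketch and not part of the project; `missing_modules` is a hypothetical helper, and the import names are the ones these packages normally expose.

```python
import importlib.util

# Import names for the packages listed in the pip command above.
REQUIRED_MODULES = [
    "pyaudio",
    "numpy",
    "scipy",
    "azure.cognitiveservices.speech",
    "PyQt5",
]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ImportError:
            # Raised when a parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` inside the activated environment returns an empty list when everything is importable.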
Configuration
- Obtain an Azure Cognitive Services subscription key and service region. You can create a free Azure account to get them.
- Create a config.json file with your settings:
    {
      "input_language": "en-US",
      "output_language": "es-ES",
      "audio_device_index": 0,
      "subtitle_font": "Arial,24",
      "subtitle_color": "#FFFFFF",
      "azure_subscription_key": "YourAzureSubscriptionKey",
      "azure_region": "YourServiceRegion"
    }
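A quick sanity check of `config.json` before launching can catch the most common mistakes, such as leaving the sample placeholders in place. The following is a sketch only: `validate_config` is a hypothetical helper, and the rules it applies (a `#RRGGBB` color, a non-negative device index) are assumptions inferred from the sample values, not documented constraints.

```python
import re

REQUIRED_KEYS = (
    "input_language",
    "output_language",
    "audio_device_index",
    "subtitle_font",
    "subtitle_color",
    "azure_subscription_key",
    "azure_region",
)
# Placeholder strings used in the sample file above.
PLACEHOLDERS = {"YourAzureSubscriptionKey", "YourServiceRegion"}
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def validate_config(config):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]

    for key in ("azure_subscription_key", "azure_region"):
        if config.get(key) in PLACEHOLDERS:
            problems.append(f"{key} still holds the sample placeholder")

    index = config.get("audio_device_index")
    if index is not None:
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(index, int) or isinstance(index, bool) or index < 0:
            problems.append("audio_device_index must be a non-negative integer")

    color = config.get("subtitle_color")
    if color is not None:
        if not (isinstance(color, str) and HEX_COLOR.fullmatch(color)):
            problems.append("subtitle_color must look like #RRGGBB")

    return problems
```

To use it, parse the file with `json.load` and pass the resulting dictionary to `validate_config`; an empty list means none of these checks failed.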
Usage
- Run the application:
    python main.py
- Configure the input language, output language, and audio device using the GUI.
- Click the "Start Translation" button to begin capturing and translating audio.
- The translated text will be displayed in the GUI and as a subtitle overlay on top of all applications.
- Click the "Stop Translation" button to stop the translation process.
- To customize the subtitle overlay, click the "Settings" button and adjust the font, color, and other options as desired.
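For readers curious how an always-on-top subtitle overlay can be built with PyQt5, here is a minimal sketch. It is not the application's implementation: `parse_font_spec` and `create_overlay` are hypothetical names, and the "Family,size" reading of `subtitle_font` is an assumption based on the sample value "Arial,24". PyQt5 is imported inside `create_overlay`, so the font helper works without it.

```python
def parse_font_spec(spec, default_size=24):
    """Split a "Family,size" string such as "Arial,24" into (family, size)."""
    family, _, size = spec.rpartition(",")
    if not family:
        # No comma present: treat the whole string as the family name.
        return spec.strip(), default_size
    return family.strip(), int(size)


def create_overlay(text, font_spec="Arial,24", color="#FFFFFF"):
    """Build a frameless, always-on-top label.

    A QApplication must already exist before this is called.
    """
    # Third-party dependency: PyQt5.
    from PyQt5.QtCore import Qt
    from PyQt5.QtGui import QFont
    from PyQt5.QtWidgets import QLabel

    family, size = parse_font_spec(font_spec)
    label = QLabel(text)
    # Stay above other windows, with no title bar and no taskbar entry.
    label.setWindowFlags(
        Qt.WindowStaysOnTopHint | Qt.FramelessWindowHint | Qt.Tool
    )
    # Let the desktop show through everything except the text.
    label.setAttribute(Qt.WA_TranslucentBackground)
    label.setFont(QFont(family, size))
    label.setStyleSheet(f"color: {color};")
    label.setAlignment(Qt.AlignCenter)
    return label
```

To try it, create a `QApplication(sys.argv)`, call `create_overlay("Hola").show()`, and start the event loop with `app.exec_()`.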
Known Issues
- Ensure the correct audio device supporting WASAPI loopback is selected.
- The application currently supports only Windows due to the use of WASAPI for audio capture.
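When the wrong device is selected, it can help to list which devices PortAudio reports under the WASAPI host API. The sketch below assumes the dictionaries returned by PyAudio's `get_device_info_by_index` and `get_host_api_info_by_index`; the helper names are hypothetical, and matching the host API by the substring "WASAPI" in its name is an assumption. Whether a listed device can actually be opened in loopback mode depends on the PyAudio build in use, which this README does not specify.

```python
import sys


def platform_supported(platform=sys.platform):
    """WASAPI only exists on Windows, so nothing else can work."""
    return platform == "win32"


def wasapi_devices(devices, host_apis):
    """Return the device-info dicts whose host API is WASAPI."""
    wasapi_indexes = {
        api["index"] for api in host_apis if "WASAPI" in api["name"]
    }
    return [dev for dev in devices if dev["hostApi"] in wasapi_indexes]


def query_portaudio():
    """Collect device and host-API info dicts from PyAudio."""
    # Third-party dependency: pyaudio.
    import pyaudio

    pa = pyaudio.PyAudio()
    try:
        devices = [
            pa.get_device_info_by_index(i) for i in range(pa.get_device_count())
        ]
        host_apis = [
            pa.get_host_api_info_by_index(i)
            for i in range(pa.get_host_api_count())
        ]
    finally:
        pa.terminate()
    return devices, host_apis
```

On a Windows machine, `wasapi_devices(*query_portaudio())` gives the candidates, and each entry's `index` field is the value that would go into `audio_device_index`.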
Contributing
Contributions are welcome! Please open an issue or submit a pull request for any changes or improvements.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Hashes for real_time_translator-0.1.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | f41ed7ad008c175872520f0f83a173fc0956fca9468942f12e2ec2a5905b894c
MD5 | d51fed5a0485cf4658951d7b73d261d2
BLAKE2b-256 | 27bc3fe6be0090b0438cec0eb15133e7c9a6f05c30f5ed03c8a52642082e15a9
Hashes for real_time_translator-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 4287c0ec3dbcbf13278e962d5cf839c7eee44c25179b0f00c3aad43970c44870
MD5 | d28796180670411ba7ae8512d7a1ac7f
BLAKE2b-256 | 8d88769e33a730b06e4e7c3fe17af4f696bda480cf75d7d5b063ce99dfef6830
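The digests listed above can be checked against a downloaded file with Python's standard `hashlib` module. This is a small sketch; `file_digest` and `matches` are hypothetical helper names, and "blake2b-256" is treated here as BLAKE2b with a 32-byte digest.

```python
import hashlib


def file_digest(path, algorithm="sha256"):
    """Return the hex digest of a file, reading it in chunks."""
    if algorithm == "blake2b-256":
        hasher = hashlib.blake2b(digest_size=32)
    else:
        hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in 64 KiB chunks so large files are not loaded whole.
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path, expected_hex, algorithm="sha256"):
    """Compare a file's digest with a published hex digest."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, `matches("real_time_translator-0.1.0.tar.gz", "f41ed7ad008c175872520f0f83a173fc0956fca9468942f12e2ec2a5905b894c")` should be true for an intact source download.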