
A real-time audio translator using Azure Cognitive Services and PyQt5


Real-Time Live Translator with PyQt GUI

This is a real-time live translator application built using Python, PyQt5, and Azure Cognitive Services. The application captures audio from the Windows system speakers using WASAPI loopback, translates the captured audio into a specified language, and displays the translated text in a GUI overlay that stays on top of all applications.

Features

  • Real-Time Translation: Captures and translates audio in real time.
  • Subtitle Overlay: Displays translated text as subtitles over all applications.
  • Configurable Input/Output Languages: Users can specify input and output languages.
  • Audio Device Selection: Users can choose from available audio devices.
  • Configuration Persistence: Saves user preferences for input/output languages, selected audio device, and Azure settings.
  • Customization Options: Customize the subtitle overlay's font, color, and appearance.
  • Settings Management: Reset settings to their defaults, and import or export settings.
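The reset, import, and export behaviour listed above could be sketched roughly as follows. This is a minimal illustration only, not the application's actual code: the function names and the `DEFAULT_SETTINGS` dictionary are hypothetical, and the key names are borrowed from the config.json example in the Configuration section.

```python
import json
from pathlib import Path

# Hypothetical defaults; key names mirror the config.json example in this README.
DEFAULT_SETTINGS = {
    "input_language": "en-US",
    "output_language": "es-ES",
    "audio_device_index": 0,
    "subtitle_font": "Arial,24",
    "subtitle_color": "#FFFFFF",
}


def reset_settings():
    """Return a fresh copy of the defaults so callers cannot mutate them."""
    return dict(DEFAULT_SETTINGS)


def export_settings(settings, path):
    """Write the given settings dictionary to a JSON file at `path`."""
    Path(path).write_text(json.dumps(settings, indent=4), encoding="utf-8")


def import_settings(path):
    """Read settings from `path`, filling any missing keys from the defaults."""
    loaded = json.loads(Path(path).read_text(encoding="utf-8"))
    merged = reset_settings()
    merged.update(loaded)
    return merged
```

Merging imported values over a copy of the defaults means a partial settings file still yields a complete dictionary. Azure credentials are deliberately left out of the defaults here, since no sensible default exists for them.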

Requirements

  • pyenv
  • pyenv-virtualenv
  • A valid Azure Cognitive Services subscription key and service region.

Installation

Install pyenv and pyenv-virtualenv

Follow the installation instructions in the pyenv documentation to install pyenv.

Then install pyenv-virtualenv by following the instructions in the pyenv-virtualenv documentation.

Clone this repository

git clone https://github.com/jpshag/real-time-translator.git
cd real-time-translator

Set up a Python version with pyenv

Install a specific Python version (e.g., 3.8.10):

pyenv install 3.8.10

Set the local Python version for this project:

pyenv local 3.8.10

Create and activate a virtual environment with pyenv-virtualenv

Create a virtual environment:

pyenv virtualenv 3.8.10 translator-env

Activate the virtual environment:

pyenv activate translator-env

Install the required packages

If a requirements.txt is provided:

pip install -r requirements.txt

If requirements.txt is not provided, manually install the dependencies:

pip install pyaudio numpy scipy azure-cognitiveservices-speech PyQt5
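After installing, one way to confirm that the dependencies are importable is a small check like the one below. It is a sketch, not part of the project: `REQUIRED_MODULES` and `missing_modules` are hypothetical names, and the list assumes the usual import names for the pip packages above.

```python
import importlib.util

# Import names assumed to correspond to the pip packages listed above.
REQUIRED_MODULES = [
    "pyaudio",
    "numpy",
    "scipy",
    "azure.cognitiveservices.speech",
    "PyQt5",
]


def missing_modules(names):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "azure") is itself absent.
            found = False
        if not found:
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` inside the activated virtual environment should return an empty list once everything is installed.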

Configuration

Obtain an Azure Cognitive Services subscription key and service region. You can create a free account on the Azure website.

Create a config.json file with your settings:

{
    "input_language": "en-US",
    "output_language": "es-ES",
    "audio_device_index": 0,
    "subtitle_font": "Arial,24",
    "subtitle_color": "#FFFFFF",
    "azure_subscription_key": "YourAzureSubscriptionKey",
    "azure_region": "YourServiceRegion"
}
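Because a missing key or a leftover placeholder would otherwise only surface once translation starts, it can help to validate the file up front. The sketch below assumes the keys and value types shown in the example above; `validate_config` and `load_config` are hypothetical helpers, not functions provided by this project.

```python
import json

# Expected keys and types, taken from the config.json example above.
REQUIRED_KEYS = {
    "input_language": str,
    "output_language": str,
    "audio_device_index": int,
    "subtitle_font": str,
    "subtitle_color": str,
    "azure_subscription_key": str,
    "azure_region": str,
}
# Placeholder strings from the example that must be replaced with real values.
PLACEHOLDERS = {"YourAzureSubscriptionKey", "YourServiceRegion"}


def validate_config(config):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif not isinstance(config[key], expected_type):
            problems.append(f"{key} should be {expected_type.__name__}")
        elif config[key] in PLACEHOLDERS:
            problems.append(f"{key} still holds a placeholder value")
    return problems


def load_config(path="config.json"):
    """Load the config file and raise ValueError if validation finds problems."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    problems = validate_config(config)
    if problems:
        raise ValueError("; ".join(problems))
    return config
```

Returning a list of problems rather than failing on the first one lets a GUI show every issue at once.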

Usage

Run the application:

python main.py

Configure the input language, output language, and audio device using the GUI.

Click the "Start Translation" button to begin capturing and translating audio.

The translated text will be displayed in the GUI and as a subtitle overlay on top of all applications.

Click the "Stop Translation" button to stop the translation process.

To customize the subtitle overlay, click the "Settings" button and adjust the font, color, and other options as desired.
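The font and color settings are stored as strings in config.json ("Arial,24" and "#FFFFFF" in the example). Assuming the font string follows a "Family,Size" layout and the color is a six-digit hex value, which is inferred from that example rather than documented, they could be parsed along these lines. Both helper names are hypothetical.

```python
import re


def parse_subtitle_font(value):
    """Split a "Family,Size" string such as "Arial,24" into (family, size)."""
    family, _, size = value.rpartition(",")
    if not family.strip() or not size.strip().isdigit():
        raise ValueError(f"expected 'Family,Size', got {value!r}")
    return family.strip(), int(size)


def parse_hex_color(value):
    """Convert "#RRGGBB" into an (r, g, b) tuple of ints in the range 0..255."""
    if not re.fullmatch(r"#[0-9A-Fa-f]{6}", value):
        raise ValueError(f"expected '#RRGGBB', got {value!r}")
    return tuple(int(value[i:i + 2], 16) for i in (1, 3, 5))
```

Splitting on the last comma keeps a font family that itself contains a comma intact, and rejecting malformed values early avoids passing bad input to the GUI toolkit.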

Known Issues

  • Audio is captured only if the selected audio device supports WASAPI loopback, so make sure the correct device is selected.
  • The application currently supports only Windows because it relies on WASAPI for audio capture.
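Both issues can be caught early with a couple of checks at startup. The sketch below is illustrative only: the function names are hypothetical, and `capture_candidates` expects a list of dictionaries shaped like the results of PyAudio's `get_device_info_by_index()`. It only narrows the list to devices with input channels and does not itself confirm loopback support.

```python
import sys


def is_supported_platform(platform=None):
    """WASAPI is a Windows API, so only the "win32" platform qualifies."""
    platform = sys.platform if platform is None else platform
    return platform == "win32"


def capture_candidates(devices):
    """Keep devices that expose at least one input channel.

    `devices` is a list of dicts shaped like PyAudio's
    get_device_info_by_index() results.
    """
    return [
        (info["index"], info["name"])
        for info in devices
        if info.get("maxInputChannels", 0) > 0
    ]
```

A launcher could call `is_supported_platform()` before opening the GUI and show a clear message on other operating systems instead of failing during audio capture.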

Contributing

Contributions are welcome! Please open an issue or submit a pull request for any changes or improvements.

License

This project is licensed under the MIT License. See the LICENSE file for details.


