D-Bus service providing speech-to-text functionality for GNOME Shell

GNOME Speech2Text Service

A D-Bus service that provides speech-to-text functionality for the GNOME Shell Speech2Text extension.

Overview

This service performs the speech recognition itself, using OpenAI's Whisper API. It runs as a D-Bus service that the GNOME Shell extension calls to record audio and receive transcriptions.

Features

  • Real-time speech recognition using OpenAI Whisper
  • D-Bus interface for desktop integration
  • Audio recording with configurable duration
  • Multiple output modes (clipboard, text insertion, preview)
  • Error handling and recovery
  • Session management for multiple concurrent recordings

Installation

System Dependencies

This service requires several system packages to be installed:

# Ubuntu/Debian
sudo apt update && sudo apt install -y \
    python3 python3-pip python3-venv python3-dbus python3-gi \
    ffmpeg xdotool xclip wl-clipboard
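
To confirm that the command-line tools from the packages above are on PATH, a small check like the following may help. It is only a sketch: the executable names are assumed from the package names (wl-clipboard ships `wl-copy` and `wl-paste`), and which of them the service needs at runtime may depend on whether the session is X11 or Wayland.

```python
import shutil

# Executables assumed to come from the apt packages listed above.
# wl-clipboard provides wl-copy; the other names match their packages.
REQUIRED_TOOLS = ("ffmpeg", "xdotool", "xclip", "wl-copy")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing executables:", ", ".join(missing))
    else:
        print("All required executables were found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; normal use calls `missing_tools()` with no arguments.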

Service Installation

The service can be installed via pip:

pip install gnome-speech2text-service

Or from the source repository:

cd service/
pip install .

D-Bus Registration

After installation, you need to register the D-Bus service:

# Run the provided install script
./install.sh

The script will:

  • Set up the Python virtual environment
  • Install the service in the correct location
  • Register the D-Bus service files
  • Configure the desktop integration

Usage

Starting the Service

The service is automatically started by D-Bus when needed. You can also start it manually:

gnome-speech2text-service

Configuration

The service uses OpenAI's API for speech recognition. You'll need to:

  1. Get an OpenAI API key from the OpenAI Platform
  2. Configure it through the GNOME Shell extension preferences

D-Bus Interface

The service provides the following D-Bus methods:

  • StartRecording(duration, copy_to_clipboard, preview_mode) → recording_id
  • StopRecording(recording_id) → success
  • GetRecordingStatus(recording_id) → status, progress
  • CancelRecording(recording_id) → success
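
One way to try these methods from a script is through the `gdbus` command-line tool that ships with GLib. The sketch below builds the argument list for `gdbus call`. The bus name, object path and interface name are hypothetical placeholders (this document does not state them), and the argument types (an integer duration, boolean flags, a string recording ID) are assumptions as well.

```python
import shlex
import subprocess

# Hypothetical identifiers: replace with the values the service registers.
BUS_NAME = "org.example.Speech2Text"
OBJECT_PATH = "/org/example/Speech2Text"
INTERFACE = "org.example.Speech2Text"


def to_gvariant_text(value):
    """Format a Python bool, int or str as GVariant text for `gdbus call`."""
    if isinstance(value, bool):  # bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    raise TypeError(f"unsupported argument type: {type(value).__name__}")


def gdbus_call_argv(method, *args):
    """Build the argv for calling `method` on the session bus."""
    return [
        "gdbus", "call", "--session",
        "--dest", BUS_NAME,
        "--object-path", OBJECT_PATH,
        "--method", f"{INTERFACE}.{method}",
        *[to_gvariant_text(arg) for arg in args],
    ]


def call(method, *args):
    """Run the call and return gdbus's textual reply (requires the service)."""
    result = subprocess.run(
        gdbus_call_argv(method, *args),
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Only prints the command; use call(...) to actually invoke the service.
    print(shlex.join(gdbus_call_argv("StartRecording", 10, True, False)))
```

With the real identifiers filled in, `call("StartRecording", 10, True, False)` would return the reply as text, from which the recording ID can be read and passed to `StopRecording` or `CancelRecording`.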

Signals:

  • TranscriptionReady(recording_id, text)
  • RecordingProgress(recording_id, progress)
  • RecordingError(recording_id, error_message)
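
A client would typically subscribe to these signals and keep per-recording state. The following is a minimal sketch, assuming dbus-python and PyGObject (the `python3-dbus` and `python3-gi` packages from System Dependencies). The `RecordingTracker` class is a hypothetical helper, the interface name passed to `listen` must be supplied by the caller, and the type of `progress` is not specified here, so it is stored unchanged.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class RecordingState:
    progress: Any = None          # type not documented; stored as received
    text: Optional[str] = None    # set by TranscriptionReady
    error: Optional[str] = None   # set by RecordingError

    @property
    def finished(self) -> bool:
        return self.text is not None or self.error is not None


class RecordingTracker:
    """Hypothetical helper that records the latest state per recording_id."""

    def __init__(self) -> None:
        self.recordings: Dict[str, RecordingState] = {}

    def _state(self, recording_id) -> RecordingState:
        return self.recordings.setdefault(str(recording_id), RecordingState())

    def on_progress(self, recording_id, progress):
        self._state(recording_id).progress = progress

    def on_transcription_ready(self, recording_id, text):
        self._state(recording_id).text = str(text)

    def on_error(self, recording_id, error_message):
        self._state(recording_id).error = str(error_message)


def listen(tracker: RecordingTracker, interface: str) -> None:
    """Connect the tracker to the session bus and block in a GLib main loop."""
    # Imported lazily so the tracker can be used without D-Bus installed.
    import dbus
    from dbus.mainloop.glib import DBusGMainLoop
    from gi.repository import GLib

    DBusGMainLoop(set_as_default=True)
    bus = dbus.SessionBus()
    handlers = {
        "RecordingProgress": tracker.on_progress,
        "TranscriptionReady": tracker.on_transcription_ready,
        "RecordingError": tracker.on_error,
    }
    for signal_name, handler in handlers.items():
        bus.add_signal_receiver(
            handler, signal_name=signal_name, dbus_interface=interface
        )
    GLib.MainLoop().run()
```

Keeping the handlers separate from the bus wiring means the state logic can be exercised without a running service; `listen(RecordingTracker(), interface)` attaches it to the real signals.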

Development

Local Development

# Clone the repository
git clone https://github.com/kavehtehrani/gnome-speech2text.git
cd gnome-speech2text/service/

# Install in development mode
pip install -e .

# Run the service
gnome-speech2text-service

Testing

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

Requirements

  • Python: 3.8 or higher
  • System: Linux with D-Bus support
  • Desktop: GNOME Shell (tested on GNOME 46+)
  • API: OpenAI API key for speech recognition

License

This project is licensed under the GPL-2.0-or-later license. See the LICENSE file for details.

Contributing

Contributions are welcome! Please see the main repository for contribution guidelines: https://github.com/kavehtehrani/gnome-speech2text

Download files

Source Distribution

gnome_speech2text_service-1.0.0.tar.gz (9.9 kB)

Uploaded Source

Built Distribution

gnome_speech2text_service-1.0.0-py3-none-any.whl (9.1 kB)

Uploaded Python 3

File details

Details for the file gnome_speech2text_service-1.0.0.tar.gz.

File hashes

Hashes for gnome_speech2text_service-1.0.0.tar.gz
Algorithm    Hash digest
SHA256       6a5c7bd6148f643913ec3b5abd1b257f6f06e67c9386900631b19138a39af68b
MD5          0e6036de1a49fdbab22b7306407663d8
BLAKE2b-256  5d3dc8ec0ee1447a3165c1a1a5c1776fb68affdec89801b5a33d5e8839c0dca7
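
A downloaded source distribution can be checked against the SHA256 digest listed above. This is a minimal sketch using only the standard library; the helper name `sha256_of` is illustrative, and the file is assumed to be in the current directory when the function is called.

```python
import hashlib

# SHA256 digest of gnome_speech2text_service-1.0.0.tar.gz, copied from above.
EXPECTED_SHA256 = "6a5c7bd6148f643913ec3b5abd1b257f6f06e67c9386900631b19138a39af68b"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected digest."""
    return sha256_of(path) == expected
```

For example, `matches_expected("gnome_speech2text_service-1.0.0.tar.gz")` should return True for an intact download; pip can also enforce hashes itself through its hash-checking mode.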

File details

Details for the file gnome_speech2text_service-1.0.0-py3-none-any.whl.

File hashes

Hashes for gnome_speech2text_service-1.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       e2c9ad1ac7452f93483be932742bf2398e722749274efcb633cbded3bafa7ceb
MD5          cbadebdf3e9586f0ace0b5a7a3b6ad1c
BLAKE2b-256  56c9ca392be731af4dc56015459ea6112ea41b3e1b4ae470b6e42fbb9aba8f29
