
D-Bus service providing speech-to-text functionality for GNOME Shell


GNOME Speech2Text Service

A D-Bus service that provides speech-to-text functionality for the GNOME Shell Speech2Text extension.

Overview

This service performs the speech recognition itself, using OpenAI's Whisper API. It runs as a D-Bus service, and the GNOME Shell extension communicates with it to provide speech-to-text functionality on the desktop.

Features

  • Real-time speech recognition using OpenAI Whisper
  • D-Bus interface for desktop integration
  • Audio recording with configurable duration
  • Multiple output modes (clipboard, text insertion, preview)
  • Error handling and recovery
  • Session management for multiple concurrent recordings

Installation

System Dependencies

The service requires the following system packages:

# Ubuntu/Debian
sudo apt update && sudo apt install -y \
    python3 python3-pip python3-venv python3-dbus python3-gi \
    ffmpeg xdotool xclip wl-clipboard
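
To confirm that the command-line tools from the packages above are actually available, a small check like the following may help. It is only a sketch: the helper name `missing_tools` is hypothetical, and the list of executables is an assumption derived from the package names (`wl-copy` is one of the binaries shipped by wl-clipboard).

```python
import shutil

# Executables expected from the packages listed above (an assumption based on
# the package names; wl-copy is one of the binaries shipped by wl-clipboard).
REQUIRED_TOOLS = ["ffmpeg", "xdotool", "xclip", "wl-copy"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing executables:", ", ".join(missing))
    else:
        print("All required executables were found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.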

Service Installation

The service can be installed via pip:

pip install gnome-speech2text-service

Or from the source repository:

cd service/
pip install .

D-Bus Registration

After installation, you need to register the D-Bus service and desktop entry. Recommended options:

  1. Using the repository (local source install)
# From the repo root
./src/install-service.sh --local
  2. Using the bundled installer (PyPI install)
# From the repo root
./src/install-service.sh --pypi

The installer will:

  • Create a per-user virtual environment under ~/.local/share/gnome-speech2text-service/venv
  • Install the gnome-speech2text-service package
  • Register the D-Bus service at ~/.local/share/dbus-1/services/org.gnome.Speech2Text.service
  • Create a desktop entry at ~/.local/share/applications/gnome-speech2text-service.desktop
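
For orientation, a D-Bus session activation file is a short INI-style file with a `[D-BUS Service]` section containing `Name` and `Exec` keys. The sketch below shows roughly what such a file could contain. It is not taken from the installer: the bus name is an assumption inferred from the service file name, the `Exec` target is assumed to be the per-user wrapper described under "Starting the Service", and both function names are hypothetical.

```python
from pathlib import Path

# Assumptions: the bus name is inferred from the service file name, and the
# Exec target is the per-user wrapper path; the real installer may differ.
BUS_NAME = "org.gnome.Speech2Text"
WRAPPER = Path.home() / ".local/share/gnome-speech2text-service/gnome-speech2text-service"


def render_service_file(bus_name=BUS_NAME, exec_path=WRAPPER):
    """Build the text of a D-Bus session service activation file."""
    return f"[D-BUS Service]\nName={bus_name}\nExec={exec_path}\n"


def service_file_path(bus_name=BUS_NAME, home=None):
    """Return the per-user location where the activation file would live."""
    base = Path(home) if home is not None else Path.home()
    return base / ".local/share/dbus-1/services" / f"{bus_name}.service"


if __name__ == "__main__":
    print(service_file_path())
    print(render_service_file(), end="")
```

Comparing this output with the file the installer actually wrote is a quick way to see which executable D-Bus will launch on activation.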

Usage

Starting the Service

The service is D-Bus activated and starts automatically when requested by the extension. You can also start it manually:

# If the entry point is on PATH (pip install)
gnome-speech2text-service

# Or via the per-user wrapper created by the installer
~/.local/share/gnome-speech2text-service/gnome-speech2text-service

Configuration

The service uses OpenAI's API for speech recognition. You'll need to:

  1. Get an OpenAI API key from OpenAI Platform
  2. Configure it through the GNOME Shell extension preferences

D-Bus Interface

The service provides the following D-Bus methods:

  • StartRecording(duration, copy_to_clipboard, preview_mode) → recording_id
  • StopRecording(recording_id) → success
  • GetRecordingStatus(recording_id) → status, progress
  • CancelRecording(recording_id) → success
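
One way to try these methods from a script is to build a `gdbus call` command line. The sketch below does only that. It makes several assumptions that this README does not confirm: the bus name, object path and interface name are all guessed from the service file name, and the argument types (an integer duration and two booleans) are guesses too. The helper names are hypothetical.

```python
import subprocess

# Assumptions (not confirmed by this README): bus name, object path and
# interface name are inferred from the service file name.
BUS_NAME = "org.gnome.Speech2Text"
OBJECT_PATH = "/org/gnome/Speech2Text"
INTERFACE = "org.gnome.Speech2Text"


def to_gvariant(value):
    """Format a Python value in GVariant text form for gdbus."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    # Everything else is passed as a single-quoted string.
    return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"


def gdbus_call(method, *args):
    """Build the argv list for calling `method` on the session bus."""
    return [
        "gdbus", "call", "--session",
        "--dest", BUS_NAME,
        "--object-path", OBJECT_PATH,
        "--method", f"{INTERFACE}.{method}",
        *[to_gvariant(arg) for arg in args],
    ]


def run_gdbus_call(method, *args):
    """Run the call and return gdbus's textual reply (needs the service)."""
    result = subprocess.run(
        gdbus_call(method, *args), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Only prints the command; nothing is sent over D-Bus here.
    print(" ".join(gdbus_call("StartRecording", 30, True, False)))
```

Printing the command before running it makes it easy to check the guessed names against the output of `gdbus introspect` on a machine where the service is installed.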

Signals:

  • TranscriptionReady(recording_id, text)
  • RecordingProgress(recording_id, progress)
  • RecordingError(recording_id, error_message)
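
A client that listens for these signals needs some bookkeeping per `recording_id`. The following sketch shows one possible shape for it, with no D-Bus dependency. The `RecordingTracker` class is hypothetical and not part of the service, and the value types (a numeric progress, string IDs) are assumptions.

```python
class RecordingTracker:
    """Client-side bookkeeping for recordings, driven by service signals."""

    def __init__(self):
        self.progress = {}     # recording_id -> last reported progress
        self.transcripts = {}  # recording_id -> transcribed text
        self.errors = {}       # recording_id -> error message

    def handle(self, signal_name, *args):
        """Record the payload of one signal emitted by the service."""
        if signal_name == "RecordingProgress":
            recording_id, progress = args
            self.progress[recording_id] = progress
        elif signal_name == "TranscriptionReady":
            recording_id, text = args
            self.transcripts[recording_id] = text
        elif signal_name == "RecordingError":
            recording_id, message = args
            self.errors[recording_id] = message
        else:
            raise ValueError(f"unknown signal: {signal_name}")

    def is_finished(self, recording_id):
        """A recording is finished once it has a transcript or an error."""
        return recording_id in self.transcripts or recording_id in self.errors


if __name__ == "__main__":
    tracker = RecordingTracker()
    tracker.handle("RecordingProgress", "rec-1", 0.5)
    tracker.handle("TranscriptionReady", "rec-1", "hello world")
    print(tracker.is_finished("rec-1"), tracker.transcripts["rec-1"])
```

In a real client, the signal callbacks registered with the D-Bus library would forward their arguments to `handle`; keeping the state in a plain class like this makes it testable without a running session bus.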

Requirements

  • Python: 3.8 or higher
  • System: Linux with D-Bus support
  • Desktop: GNOME Shell (tested on GNOME 46+)

License

This project is licensed under the GPL-2.0-or-later license. See the LICENSE file for details.

Contributing

Contributions are welcome! Please see the main repository for contribution guidelines: https://github.com/kavehtehrani/gnome-speech2text
