D-Bus service providing speech-to-text functionality for GNOME Shell

Speech2Text Service

A D-Bus service that provides speech-to-text functionality for the GNOME Shell Speech2Text extension.

Overview

This service handles the actual speech recognition processing using OpenAI's Whisper model locally. It runs as a D-Bus service and communicates with the GNOME Shell extension to provide seamless speech-to-text functionality.

Features

  • Real-time speech recognition using OpenAI Whisper
  • D-Bus interface for seamless desktop integration
  • Audio recording with configurable duration
  • Multiple output modes (clipboard, text insertion, preview)
  • Error handling and recovery
  • Session management for multiple concurrent recordings

Installation

System Dependencies

This service requires several system packages to be installed (e.g. ffmpeg, clipboard tools). See the main README.md for the complete list of system dependencies.

Service Installation

The service is available on PyPI and is typically installed into a per-user virtual environment by the extension’s installer.

pip install speech2text-extension-service

PyPI Package: speech2text-extension-service

Or from the source repository:

cd service/
pip install .

D-Bus Registration

After installation, you need to register the D-Bus service and desktop entry. Recommended options:

  1. Using the repository (local source install)
     # From the repo root
     ./service/install-service.sh --local
  2. Using the bundled installer (PyPI install)
     # From the repo root
     ./service/install-service.sh --pypi

The installer will:

  • Create a per-user virtual environment under ~/.local/share/speech2text-extension-service/venv
  • Install the speech2text-extension-service package
  • Register the D-Bus service at ~/.local/share/dbus-1/services/org.gnome.Shell.Extensions.Speech2Text.service
  • Create a desktop entry at ~/.local/share/applications/speech2text-extension-service.desktop
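
For orientation, the registration step can be sketched as follows. A session service file is a small INI-style file with a [D-BUS Service] section containing Name and Exec keys. The bus name below is inferred from the service file name listed above. Pointing Exec at the per-user wrapper is an assumption, not something the installer is confirmed to do, and the helper names are hypothetical.

```python
from pathlib import Path

# Bus name inferred from the service file name listed above.
BUS_NAME = "org.gnome.Shell.Extensions.Speech2Text"


def render_service_file(exec_path: str, bus_name: str = BUS_NAME) -> str:
    """Return the text of a D-Bus session service activation file."""
    return f"[D-BUS Service]\nName={bus_name}\nExec={exec_path}\n"


def service_file_path(home: Path) -> Path:
    """Per-user directory where the session bus looks for service files."""
    return home / ".local/share/dbus-1/services" / f"{BUS_NAME}.service"


if __name__ == "__main__":
    home = Path.home()
    # Assumption: the activation file launches the per-user wrapper script.
    wrapper = home / ".local/share/speech2text-extension-service/speech2text-extension-service"
    print(service_file_path(home))
    print(render_service_file(str(wrapper)))
```

The sketch only prints the path and contents; it does not write anything, so it is safe to run next to a real installation.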

Usage

Starting the Service

The service is D-Bus activated and starts automatically when requested by the extension. You can also start it manually:

# If the entry point is on PATH (pip install)
speech2text-extension-service

# Or via the per-user wrapper created by the installer
~/.local/share/speech2text-extension-service/speech2text-extension-service
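
A script that needs to start the service manually could try the two locations above in order. This is a minimal sketch. The function name is hypothetical, and the lookup order (PATH first, then the wrapper) is an assumption rather than documented behaviour.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional

ENTRY_POINT = "speech2text-extension-service"
# Wrapper location relative to the home directory, as listed above.
WRAPPER_RELATIVE = Path(".local/share/speech2text-extension-service") / ENTRY_POINT


def find_service_command(
    home: Path,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return a path to the service executable, or None if neither exists."""
    # 1. Entry point on PATH (plain `pip install`).
    found = which(ENTRY_POINT)
    if found:
        return found
    # 2. Per-user wrapper created by the installer.
    wrapper = home / WRAPPER_RELATIVE
    return str(wrapper) if wrapper.is_file() else None


if __name__ == "__main__":
    print(find_service_command(Path.home()))
```

The `which` parameter exists only so the lookup can be exercised without touching the real PATH.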

Configuration

The service uses OpenAI's Whisper model locally for speech recognition. No API key is required. All processing happens on your local machine for complete privacy.

D-Bus Interface

The service provides the following D-Bus interface (stable; used by the GNOME extension):

Methods:

  • StartRecording(duration, copy_to_clipboard, preview_mode) → recording_id
  • StopRecording(recording_id) → success
  • CancelRecording(recording_id) → success
  • TypeText(text, copy_to_clipboard) → success
  • GetServiceStatus() → status
  • CheckDependencies() → all_available, missing_dependencies[]
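
For quick manual testing, the methods above can be invoked with the standard gdbus command-line tool. The sketch below only builds the command line. The bus name is inferred from the service file name. The object path and interface name are assumptions that follow the usual convention of mirroring the bus name, so check them against the service source before relying on them.

```python
import shlex
from typing import List, Sequence

BUS_NAME = "org.gnome.Shell.Extensions.Speech2Text"
# Assumptions: object path and interface name mirror the bus name.
OBJECT_PATH = "/org/gnome/Shell/Extensions/Speech2Text"
INTERFACE = BUS_NAME


def gdbus_call_argv(method: str, args: Sequence[str] = ()) -> List[str]:
    """Build a `gdbus call` command line for one method on the session bus.

    `args` must already be formatted as GVariant text (e.g. "true", "10").
    """
    return [
        "gdbus", "call", "--session",
        "--dest", BUS_NAME,
        "--object-path", OBJECT_PATH,
        "--method", f"{INTERFACE}.{method}",
        *args,
    ]


if __name__ == "__main__":
    # Methods without arguments avoid guessing at parameter types.
    print(shlex.join(gdbus_call_argv("GetServiceStatus")))
    print(shlex.join(gdbus_call_argv("CheckDependencies")))
```

The resulting list can be passed to `subprocess.run` on a machine where the service is registered. The argument types of methods such as StartRecording are not stated here, so the sketch leaves their formatting to the caller.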

Signals:

  • RecordingStarted(recording_id)
  • RecordingStopped(recording_id, reason)
  • TranscriptionReady(recording_id, text)
  • RecordingError(recording_id, error_message)
  • TextTyped(text, success)
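
A client that listens to these signals needs to keep per-recording state, since several recordings may be in flight. The following is a minimal sketch of such bookkeeping, independent of any D-Bus library. The class name and state labels are hypothetical, and the tracker simply records the most recent signal per recording_id without assuming a particular order.

```python
from typing import Dict


class RecordingTracker:
    """Track the last known state of each recording by its recording_id."""

    def __init__(self) -> None:
        self.state: Dict[str, str] = {}
        self.text: Dict[str, str] = {}

    def on_signal(self, name: str, *args) -> None:
        """Update state from a signal name and its arguments, as listed above."""
        if name == "TextTyped":
            return  # carries (text, success), not a recording_id
        recording_id = args[0]
        if name == "RecordingStarted":
            self.state[recording_id] = "recording"
        elif name == "RecordingStopped":
            self.state[recording_id] = "stopped"
        elif name == "TranscriptionReady":
            self.state[recording_id] = "transcribed"
            self.text[recording_id] = args[1]
        elif name == "RecordingError":
            self.state[recording_id] = "error"


if __name__ == "__main__":
    tracker = RecordingTracker()
    tracker.on_signal("RecordingStarted", "rec-1")
    tracker.on_signal("TranscriptionReady", "rec-1", "hello world")
    print(tracker.state, tracker.text)
```

In a real client, `on_signal` would be called from whatever signal subscription mechanism the chosen D-Bus binding provides.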

Requirements

  • Python: 3.8–3.13 (Python 3.14+ not supported yet)
  • System: Linux with D-Bus support
  • Desktop: GNOME Shell (tested on GNOME 46+)
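
The Python range above can be checked before creating a virtual environment. This is a small sketch with hypothetical names. It encodes only the bounds stated in this section.

```python
import sys
from typing import Tuple

MIN_SUPPORTED = (3, 8)       # lowest supported version
FIRST_UNSUPPORTED = (3, 14)  # 3.14+ is not supported yet


def is_supported_python(version: Tuple[int, ...] = sys.version_info[:2]) -> bool:
    """Return True if (major, minor) falls in the supported 3.8-3.13 range."""
    return MIN_SUPPORTED <= tuple(version[:2]) < FIRST_UNSUPPORTED


if __name__ == "__main__":
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```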

License

This project is licensed under GPL-3.0-or-later. See the LICENSE file for details.

Contributing

Contributions are welcome! Please see the main repository for contribution guidelines: https://github.com/kavehtehrani/speech2text-extension
