Hardware-agnostic middleware connecting sEMG silent-speech interfaces to LLM agents

Subvocal SDK: Physiological Silent Speech Interface Middleware

The Subvocal SDK is an open-source, hardware-agnostic middleware platform that connects surface electromyography (sEMG) interfaces to LLM-driven AI agents.

Rather than locking developers into a proprietary neckband or a closed whole-word vocabulary, the Subvocal SDK provides the software layers for high-accuracy, low-latency, open-vocabulary silent speech control: signal conditioning, deep learning training skeletons, articulatory phonetic shorthand simulators, and context-aware decoders.

🛠️ Installation

pip install subvocal

The base install is lightweight (pydantic + numpy) and covers the pipeline, hardware drivers, shorthand decoding, context, and the MCP server. Optional extras pull in heavier subsystems:

Extra	Enables	Installs
`subvocal[ml]`	Classifier training, inference, calibration (`subvocal.emg_core`)	scipy, scikit-learn, joblib, torch
`subvocal[hardware]`	Public-dataset drivers (Ninapro, PutEMG, CSL-HDEMG)	scipy, h5py
`subvocal[tts]`	Audio feedback outside macOS	pyttsx3
`subvocal[export]`	ONNX model export	onnx
`subvocal[all]`	Everything above	—
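
Because each extra only adds third-party packages, one way to see which optional subsystems are usable in a given environment is to probe for those packages. The sketch below uses only the standard library. The mapping mirrors the table above, the import names (for example `sklearn` for scikit-learn) are the packages' usual ones, and `available_extras` is a hypothetical helper that is not part of the SDK.

```python
from importlib.util import find_spec

# Import names of the packages each extra installs, taken from the table above.
# scikit-learn is imported as "sklearn".
EXTRA_MODULES = {
    "ml": ["scipy", "sklearn", "joblib", "torch"],
    "hardware": ["scipy", "h5py"],
    "tts": ["pyttsx3"],
    "export": ["onnx"],
}


def available_extras(probe=find_spec):
    """Return {extra: bool}, True when every module of that extra can be found.

    Hypothetical helper for illustration; `probe` is injectable for testing.
    """
    return {
        extra: all(probe(name) is not None for name in modules)
        for extra, modules in EXTRA_MODULES.items()
    }


for extra, ready in available_extras().items():
    print(f"subvocal[{extra}]: {'ready' if ready else 'missing dependencies'}")
```

The check only looks up module specs, so it does not pay the import cost of heavy packages such as torch.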

🚀 Quickstart

A complete pipeline—synthetic sEMG source through intent reconstruction to action execution—runs offline in a few lines:

from subvocal import SubvocalPipeline
from subvocal.core.testing import MockActionExecutor, MockContextProvider, MockLLMProvider
from subvocal.hardware.drivers import SyntheticSignalGenerator
from subvocal.core.models import CommandToken
import time

hardware = SyntheticSignalGenerator(fs=1000.0, num_channels=8)

def classify(frame):
    """Replace with subvocal.emg_core.ml.infer.InferenceEngine for real models."""
    arr = frame.to_numpy()
    if abs(arr).max() > 1.0:  # a command burst is present
        return CommandToken(text="gt", confidence=0.95, timestamp=time.time())
    return None

pipeline = SubvocalPipeline(
    hardware=hardware,
    classify_fn=classify,
    llm_provider=MockLLMProvider(),       # or resolve_provider() / ClaudeProvider() ...
    context_provider=MockContextProvider(),
    executor=MockActionExecutor(),
    phrase_timeout_seconds=0.5,
    on_action=lambda action, status: print("observed:", action.action_type, status),
)

hardware.start()
hardware.trigger_command("gt", duration_ms=120)
for _ in range(30):
    action = pipeline.step(window_ms=50)
    if action:
        print("Executed:", action.action_type, action.params)
        # -> Executed: goto {'arguments': ['google.com'], 'resolved_text': 'GOTO google.com', ...}
        break
    time.sleep(0.05)  # real-time pacing: the phrase ends after 0.5 s of silence

Swap in a real LLM provider (subvocal.core.llm_providers.ClaudeProvider, OpenAIProvider, GeminiProvider, LlamaProvider), a real driver (OpenBCICytonDriver, DelsysTrignoDriver, FileReplayDriver), and a trained classifier (subvocal.emg_core.ml.infer.InferenceEngine) without changing the pipeline code. subvocal.resolve_provider() picks the best provider for the environment automatically — a real LLM when an API key is present, the offline HeuristicProvider otherwise.
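
The selection rule described for subvocal.resolve_provider() (a real LLM when an API key is present, the offline heuristic otherwise) can be pictured with a small standalone sketch. This is not the SDK's implementation: the environment variable names, their priority order, and the `pick_provider` / `ProviderChoice` names are assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderChoice:
    name: str      # provider class name from the README
    offline: bool  # True when no network access is needed


# Assumed environment variables, checked in an assumed priority order.
API_KEY_VARS = [
    ("ANTHROPIC_API_KEY", "ClaudeProvider"),
    ("OPENAI_API_KEY", "OpenAIProvider"),
    ("GEMINI_API_KEY", "GeminiProvider"),
]


def pick_provider(env=None):
    """Return the first provider whose API key is set, else the offline heuristic."""
    env = os.environ if env is None else env
    for var, provider in API_KEY_VARS:
        if env.get(var):  # unset and empty values are both treated as absent
            return ProviderChoice(provider, offline=False)
    return ProviderChoice("HeuristicProvider", offline=True)


print(pick_provider({}).name)
print(pick_provider({"OPENAI_API_KEY": "example-key"}).name)
```

Passing the environment as a plain mapping keeps the rule easy to test without touching the real process environment.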

Production behavior

Typed errors: everything the SDK raises derives from subvocal.SubvocalError (HardwareError, ProviderError, ConfigurationError, PolicyViolationError, ...), each compatible with the builtin exception type it replaces.
Resilient providers: configurable per-request timeouts and exponential-backoff retries for transient failures (connection errors, HTTP 408/429/5xx); non-retryable statuses fail fast.
Observability: pipeline.stats exposes running counters (frames, tokens, intents, executed/blocked actions, errors, uptime), and on_token / on_intent / on_action / on_error observer callbacks stream pipeline lifecycle events without ever breaking the pipeline. Every phrase is JSONL-traced for audit.
Safety: pluggable policy engine with dry-run mode; set raise_on_policy_violation=True to turn rejections into PolicyViolationError.
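
As an illustration of the retry behavior listed above, here is a minimal sketch of exponential backoff that retries connection errors and HTTP 408/429/5xx and fails fast on everything else. It is not the SDK's code: `call_with_retries`, its parameters, and the delay values are hypothetical, the error classes are local stand-ins, and the choice of `RuntimeError` as the compatible builtin is an assumption. Per-request timeouts and jitter are left out for brevity.

```python
import time


class SubvocalError(Exception):
    """Local stand-in for the SDK's base error."""


class ProviderError(SubvocalError, RuntimeError):
    """Assumed to remain catchable as RuntimeError; the SDK's builtin base may differ."""


RETRYABLE_STATUSES = {408, 429} | set(range(500, 600))


def call_with_retries(send, max_attempts=4, base_delay=0.25, sleep=time.sleep):
    """Call send() -> (status, body), retrying transient failures with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            status, body = send()
        except ConnectionError as exc:
            failure = f"connection error: {exc}"
        else:
            if status < 400:
                return body
            if status not in RETRYABLE_STATUSES:
                # Non-retryable status: fail fast without sleeping.
                raise ProviderError(f"HTTP {status} is not retryable")
            failure = f"HTTP {status}"
        if attempt == max_attempts:
            raise ProviderError(f"gave up after {attempt} attempts ({failure})")
        # Delay doubles on each attempt: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

Because `ProviderError` also inherits from a builtin exception in this sketch, callers that already catch that builtin keep working, which is the compatibility property the typed-error bullet describes.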

MCP server

The SDK ships a stdio Model Context Protocol (MCP) server so that Claude Desktop, or any other MCP client, can use subvocal commands as tools. Start it with:

subvocal-mcp

Claude Desktop config:

{
  "mcpServers": {
    "subvocal": { "command": "subvocal-mcp" }
  }
}
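
For readers unfamiliar with what an MCP client sends over stdio, the sketch below builds two JSON-RPC 2.0 requests as newline-delimited JSON. The `tools/list` and `tools/call` methods come from the MCP specification; the tool name `inject_token` and its arguments are hypothetical, since the actual tool names exposed by subvocal-mcp are not listed here. A real session also begins with an `initialize` handshake, which is omitted.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def jsonrpc_request(method, params=None):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message) + "\n"


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name (hypothetical tool name and arguments).
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "inject_token", "arguments": {"text": "gt"}},
)

print(list_tools, end="")
print(call_tool, end="")
```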

📂 Repository Structure

subvocal/
├── src/subvocal/           # The installable package
│   ├── core/               # Data models, interfaces, pipeline, security policies, LLM providers
│   ├── hardware/           # HAL drivers (file replay, synthetic, OpenBCI, Delsys) + dataset loaders
│   ├── emg_core/           # DSP filters, TD10 features, classifiers (RF/CNN/GRU/Transformer)
│   ├── shorthand/          # Phonetic shorthand vocabulary, simulator, hybrid decoder
│   ├── context/            # User context schemas and phonetic context matching
│   ├── mcp/                # Model Context Protocol stdio server
│   └── tts/                # Multi-backend TTS feedback engine
├── tests/                  # Pytest suite
├── benchmarks/             # 50-case intent-reconstruction eval harnesses
├── tools/                  # Site/API-page builders, license audit, benchmark runner
└── docs/                   # GitHub Pages site (landing, docs, platform corpus, API reference)
    └── content/            # Markdown sources for the platform corpus and walkthrough

🚀 Core Features

Articulatory Shorthand Decoder: Overcomes the whole-word sEMG vocabulary ceiling. Decodes compressed phonetic consonant shorthand inputs (e.g. g gl -> Google) under heavy muscle-movement noise.
Asymmetric Levenshtein Distance: A dynamic programming string alignment cost matrix configured with physiological sEMG confusion clusters (Glottal, Labial, Alveolar, Velar, Rhotic) to discount vowel/consonant omissions in silent speech.
Command-Aware Context Prioritization: Dynamic target matching against active user contacts (TYPE), calendar events (SEARCH), browser URLs (GOTO), and active application screen elements (CLICK).
Physiological Signal Conditioning: Preprocessing filter configurations defaulting to AlterEgo's 1.3–50.0 Hz bandpass filter (designed for low-velocity articulatory gestures) with configuration support for the standard 20.0–450.0 Hz EMG band.
Classifiers (RF + Deep Learning): Custom pipelines to train scikit-learn Random Forest, PyTorch 1D CNN, GRU, and Transformer architectures on raw multi-channel sEMG traces.
Asynchronous Execution (V2 Architecture): Low-latency, thread-safe asynchronous pipeline orchestration built on LiveKit's OpsQueue and IncrementalDispatcher design.
Physiological Signal Monitoring: Real-time EMA-smoothed signal level activity detection and MOS-like connection quality scoring (evaluating saturation, drift, and dropouts).
Prometheus Telemetry: Integrated Prometheus metric exporter and pre-built Grafana monitoring dashboards for tracking SDK errors, session lifecycles, and action execution statistics.
HMAC-Signed Capability Grants: Secure token-based credentials (ActionGrants) specifying allowed command scopes, confidence thresholds, and dry-run policies, verified dynamically via the GrantsPolicy middleware.
MCP Integration: A zero-dependency stdio JSON-RPC server exposing pipeline status, token injection, phrase processing, and calibration as MCP tools.
Persistent Session Storage: SQLite and in-memory backends to serialize and reload session configurations, states, and active metrics.
Real-Time TCP Biometric Streaming: Dedicated TCP socket server broadcasting live signal attributes (quality, levels, tokens) to visualization dashboards.
Ingress/Egress Orchestration: Ingress failover management for primary sensors and simulation streams; egress dispatcher for audio TTS and dataset logs.
Intelligent Node Routing: Load-balanced session assignment based on CPU metrics or active session counts using selectors.
Zero-Dependency BrainFlow & DSP: A pure-Python fallback for the BrainFlow SDK. Seamlessly emulates SYNTHETIC_BOARD and OpenBCI CYTON_BOARD (via direct serial packet parsing) and recreates the DataFilter signal processing API (filtering, windowing, Welch PSD estimation, bandpower) without requiring native C++ binary dependencies.
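
To make the Asymmetric Levenshtein Distance feature concrete, here is a minimal dynamic-programming sketch. It is not the SDK's cost matrix: the letter-to-cluster assignments, the cost values, and the function names are assumptions chosen for illustration, as is ignoring spaces in the shorthand. The asymmetry is that a character missing from the shorthand (an omission) is cheap, especially a vowel, while an extra character in the shorthand is expensive.

```python
VOWELS = set("aeiou")

# Assumed letter membership for the confusion clusters named above.
CLUSTERS = {
    "glottal": set("h"),
    "labial": set("pbmfvw"),
    "alveolar": set("tdnszl"),
    "velar": set("kgq"),
    "rhotic": set("r"),
}
CLUSTER_OF = {ch: name for name, chars in CLUSTERS.items() for ch in chars}

SPURIOUS_COST = 1.0  # shorthand contains a character the target lacks


def omission_cost(ch):
    """Cost of a target character missing from the shorthand (illustrative values)."""
    return 0.25 if ch in VOWELS else 0.5


def substitution_cost(a, b):
    """Free for a match, discounted inside one confusion cluster, full cost otherwise."""
    if a == b:
        return 0.0
    cluster = CLUSTER_OF.get(a)
    if cluster is not None and cluster == CLUSTER_OF.get(b):
        return 0.5
    return 1.0


def shorthand_distance(observed, target):
    """Cost of explaining `target` by the decoded shorthand `observed`."""
    observed = observed.lower().replace(" ", "")
    target = target.lower()
    rows, cols = len(observed) + 1, len(target) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = dp[i - 1][0] + SPURIOUS_COST
    for j in range(1, cols):
        dp[0][j] = dp[0][j - 1] + omission_cost(target[j - 1])
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = min(
                dp[i - 1][j] + SPURIOUS_COST,
                dp[i][j - 1] + omission_cost(target[j - 1]),
                dp[i - 1][j - 1] + substitution_cost(observed[i - 1], target[j - 1]),
            )
    return dp[-1][-1]


candidates = ["google", "github", "gmail"]
best = min(candidates, key=lambda word: shorthand_distance("g gl", word))
print(best, shorthand_distance("g gl", "google"), shorthand_distance("google", "g gl"))
```

With these illustrative costs, the shorthand "g gl" is close to "google" because only three vowels are omitted, while the reverse direction is far more expensive. That directional difference is what makes the distance asymmetric.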
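
The EMA-smoothed activity detection mentioned under Physiological Signal Monitoring can be sketched in a few lines. This is a generic exponential moving average over the rectified signal, not the SDK's monitor: the function name, `alpha`, and `threshold` are hypothetical, and the quality scoring (saturation, drift, dropouts) is not covered.

```python
def ema_activity(samples, alpha=0.2, threshold=0.5):
    """Yield (level, active) per sample, where level is an EMA of |sample|.

    alpha controls smoothing (higher reacts faster); threshold is in signal units.
    """
    level = 0.0
    for x in samples:
        level = alpha * abs(x) + (1.0 - alpha) * level
        yield level, level > threshold


# A quiet baseline, a 10-sample burst of amplitude 2.0, then silence.
signal = [0.0] * 5 + [2.0] * 10 + [0.0] * 20
active_indices = [i for i, (_, active) in enumerate(ema_activity(signal)) if active]
print(active_indices[0], active_indices[-1])
```

The smoothing delays both onset and release: the detector needs a couple of burst samples to cross the threshold, and it stays active for a few samples after the burst ends, which suppresses flicker on short dropouts.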
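
The idea behind HMAC-signed capability grants can be shown with the standard library alone. The sketch below signs a grant with HMAC-SHA256 and checks scope and confidence only after the signature verifies. The grant field names (`scopes`, `min_confidence`, `dry_run`) and all function names are assumptions; the SDK's ActionGrants and GrantsPolicy may be structured differently.

```python
import hashlib
import hmac
import json


def sign_grant(grant, secret):
    """Return a hex HMAC-SHA256 signature over a canonical JSON encoding of the grant."""
    payload = json.dumps(grant, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_grant(grant, signature, secret):
    """Constant-time comparison of the expected and presented signatures."""
    return hmac.compare_digest(sign_grant(grant, secret), signature)


def is_allowed(grant, signature, secret, action_type, confidence):
    """Allow an action only if the grant is authentic and covers it."""
    if not verify_grant(grant, signature, secret):
        return False
    return action_type in grant["scopes"] and confidence >= grant["min_confidence"]


secret = b"example-secret"  # placeholder; use a randomly generated key in practice
grant = {"scopes": ["goto", "search"], "min_confidence": 0.8, "dry_run": False}
signature = sign_grant(grant, secret)

print(is_allowed(grant, signature, secret, "goto", 0.95))
print(is_allowed(grant, signature, secret, "click", 0.95))
```

Serializing with sorted keys and fixed separators gives a stable byte string, so the same grant always produces the same signature regardless of dictionary ordering.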

🧪 Development

git clone https://github.com/PranavKalkunte/subvocal.git
cd subvocal
pip install -e ".[all,dev]"

pytest                      # test suite
ruff check src tests       # lint
pyright                     # type check
python benchmarks/eval_runner.py   # 50-case heuristic benchmark

Runtime artifacts (traces, trained models) are written to the per-user data directory; override with SUBVOCAL_DATA_DIR / SUBVOCAL_MODELS_DIR.
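
The override behavior can be pictured as a simple environment lookup with a fallback. The sketch below is an assumption about the shape of that logic, not the SDK's code: `resolve_dir` is a hypothetical helper and the default paths are placeholders, since the actual per-user data directory is platform-dependent and not specified here.

```python
import os
from pathlib import Path


def resolve_dir(env_var, default, env=None):
    """Return the directory named by env_var if set and non-empty, else the default."""
    env = os.environ if env is None else env
    override = env.get(env_var)
    return Path(override if override else default).expanduser()


# Placeholder defaults for illustration only.
data_dir = resolve_dir("SUBVOCAL_DATA_DIR", "~/.subvocal")
models_dir = resolve_dir("SUBVOCAL_MODELS_DIR", data_dir / "models")
print(data_dir, models_dir)
```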

Contributions are welcome — see CONTRIBUTING.md for the workflow and quality gates, and SECURITY.md for vulnerability reporting.

📄 License

This repository is released under the MIT License. See LICENSE for details.

Project details

Release history

2.0.0 (this version): Jun 9, 2026

1.0.0rc1 (pre-release): Jun 9, 2026

Download files

Source Distribution

subvocal-2.0.0.tar.gz (305.4 kB), uploaded Jun 9, 2026, Source

Built Distribution

subvocal-2.0.0-py3-none-any.whl (117.0 kB), uploaded Jun 9, 2026, Python 3

File details

Details for the file subvocal-2.0.0.tar.gz.

File metadata

Download URL: subvocal-2.0.0.tar.gz
Upload date: Jun 9, 2026
Size: 305.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for subvocal-2.0.0.tar.gz
Algorithm	Hash digest
SHA256	`c44725c527f111eceea76447fadeb092b363bf3ea41b1dd3e4b30030e7245d18`
MD5	`6c05d2badfcbf23383f1fed90f63d9a2`
BLAKE2b-256	`63598fbe64535f8dbf59900a824f1e283a821d731c4080041462c8aae52e71fa`

File details

Details for the file subvocal-2.0.0-py3-none-any.whl.

File metadata

Download URL: subvocal-2.0.0-py3-none-any.whl
Upload date: Jun 9, 2026
Size: 117.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for subvocal-2.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`769b7529588e40be4909f34da0b43327976255aef63a6bbb9c7a7c6cb7268765`
MD5	`9b98ba93db12ab162bcfff51dfaffad2`
BLAKE2b-256	`15c1775c89d84593e6d33f28652cefa9e00c3d89eda24a6a69134ce98caa8135`
