Hardware-agnostic middleware connecting sEMG silent-speech interfaces to LLM agents
Subvocal SDK: Physiological Silent Speech Interface Middleware
The Subvocal SDK is an open-source, hardware-agnostic middleware platform that connects surface electromyography (sEMG) interfaces to LLM-driven AI agents.
Rather than locking developers to a proprietary neckband or a closed whole-word vocabulary, the Subvocal SDK provides the software rails—signal conditioning, deep learning training skeletons, articulatory phonetic shorthand simulators, and context-aware decoders—to enable high-accuracy, low-latency, and open-vocabulary silent speech control.
🛠️ Installation
```shell
pip install subvocal
```
The base install is lightweight (pydantic + numpy) and covers the pipeline, hardware drivers, shorthand decoding, context, and the MCP server. Optional extras pull in heavier subsystems:
| Extra | Enables | Installs |
|---|---|---|
| `subvocal[ml]` | Classifier training, inference, calibration (`subvocal.emg_core`) | scipy, scikit-learn, joblib, torch |
| `subvocal[hardware]` | Public-dataset drivers (Ninapro, PutEMG, CSL-HDEMG) | scipy, h5py |
| `subvocal[tts]` | Audio feedback outside macOS | pyttsx3 |
| `subvocal[export]` | ONNX model export | onnx |
| `subvocal[all]` | Everything above | All of the above |
🚀 Quickstart
A complete pipeline—synthetic sEMG source through intent reconstruction to action execution—runs offline in a few lines:
```python
import time

from subvocal import SubvocalPipeline
from subvocal.core.models import CommandToken
from subvocal.core.testing import MockActionExecutor, MockContextProvider, MockLLMProvider
from subvocal.hardware.drivers import SyntheticSignalGenerator

hardware = SyntheticSignalGenerator(fs=1000.0, num_channels=8)


def classify(frame):
    """Replace with subvocal.emg_core.ml.infer.InferenceEngine for real models."""
    arr = frame.to_numpy()
    if abs(arr).max() > 1.0:  # a command burst is present
        return CommandToken(text="gt", confidence=0.95, timestamp=time.time())
    return None


pipeline = SubvocalPipeline(
    hardware=hardware,
    classify_fn=classify,
    llm_provider=MockLLMProvider(),  # or resolve_provider() / ClaudeProvider() ...
    context_provider=MockContextProvider(),
    executor=MockActionExecutor(),
    phrase_timeout_seconds=0.5,
    on_action=lambda action, status: print("observed:", action.action_type, status),
)

hardware.start()
hardware.trigger_command("gt", duration_ms=120)

for _ in range(30):
    action = pipeline.step(window_ms=50)
    if action:
        print("Executed:", action.action_type, action.params)
        # -> Executed: goto {'arguments': ['google.com'], 'resolved_text': 'GOTO google.com', ...}
        break
    time.sleep(0.05)  # real-time pacing: the phrase ends after 0.5 s of silence
```
Swap in a real LLM provider (subvocal.core.llm_providers.ClaudeProvider, OpenAIProvider, GeminiProvider, LlamaProvider), a real driver (OpenBCICytonDriver, DelsysTrignoDriver, FileReplayDriver), and a trained classifier (subvocal.emg_core.ml.infer.InferenceEngine) without changing the pipeline code. subvocal.resolve_provider() picks the best provider for the environment automatically — a real LLM when an API key is present, the offline HeuristicProvider otherwise.
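The fallback rule described above (a real LLM when an API key is present, the offline heuristic otherwise) can be sketched as a small selection function. This is an illustration only, not the SDK's `resolve_provider()` implementation: the function name `pick_provider_name`, the environment-variable names, and the priority order are all assumptions made for the example.

```python
import os

# Hypothetical mapping from environment variable to provider name.
# Both the variable names and the priority order are assumptions for this sketch.
KEY_TO_PROVIDER = [
    ("ANTHROPIC_API_KEY", "ClaudeProvider"),
    ("OPENAI_API_KEY", "OpenAIProvider"),
    ("GEMINI_API_KEY", "GeminiProvider"),
]


def pick_provider_name(env=None):
    """Return the first provider whose API key is set, else the offline fallback."""
    env = os.environ if env is None else env
    for var, provider in KEY_TO_PROVIDER:
        if env.get(var):  # unset or empty keys are skipped
            return provider
    return "HeuristicProvider"


print(pick_provider_name({}))                                # -> HeuristicProvider
print(pick_provider_name({"OPENAI_API_KEY": "sk-example"}))  # -> OpenAIProvider
```

Passing the environment as a plain dict keeps the selection logic testable without touching real credentials.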
Production behavior
- Typed errors: everything the SDK raises derives from `subvocal.SubvocalError` (`HardwareError`, `ProviderError`, `ConfigurationError`, `PolicyViolationError`, ...), each compatible with the builtin exception type it replaces.
- Resilient providers: configurable per-request timeouts and exponential-backoff retries for transient failures (connection errors, HTTP 408/429/5xx); non-retryable statuses fail fast.
- Observability: `pipeline.stats` exposes running counters (frames, tokens, intents, executed/blocked actions, errors, uptime), and `on_token` / `on_intent` / `on_action` / `on_error` observer callbacks stream pipeline lifecycle events without ever breaking the pipeline. Every phrase is JSONL-traced for audit.
- Safety: pluggable policy engine with dry-run mode; set `raise_on_policy_violation=True` to turn rejections into `PolicyViolationError`.
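The retry policy in the list above (exponential backoff for HTTP 408/429/5xx, fail-fast otherwise) can be illustrated with a generic, standard-library sketch. It does not reproduce the SDK's provider code: `HttpStatusError`, `is_retryable`, `call_with_retries`, and the default attempt count and delays are hypothetical names and values chosen for the example.

```python
import random
import time

RETRYABLE_STATUSES = {408, 429}


class HttpStatusError(Exception):
    """Hypothetical error carrying an HTTP-like status code."""

    def __init__(self, status):
        super().__init__(f"status {status}")
        self.status = status


def is_retryable(status):
    """408, 429, and any 5xx status count as transient."""
    return status in RETRYABLE_STATUSES or 500 <= status <= 599


def call_with_retries(fn, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fn(); retry transient failures with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except HttpStatusError as exc:
            last_attempt = attempt == max_attempts - 1
            if not is_retryable(exc.status) or last_attempt:
                raise  # non-retryable statuses (and exhausted retries) fail fast
            delay = base_delay * (2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
            sleep(delay + random.uniform(0, delay * 0.1))  # small jitter


calls = []


def flaky():
    """Fails twice with 429, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise HttpStatusError(429)
    return "ok"


print(call_with_retries(flaky, sleep=lambda seconds: None))  # -> ok
```

Injecting `sleep` as a parameter lets tests run the backoff schedule without actually waiting.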
MCP server
The SDK ships a stdio Model Context Protocol server so Claude Desktop (or any MCP client) can ingest subvocal commands as tools:
```shell
subvocal-mcp
```
Claude Desktop config:
```json
{
  "mcpServers": {
    "subvocal": { "command": "subvocal-mcp" }
  }
}
```
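For clients other than Claude Desktop, it may help to see what travels over stdio. MCP's stdio transport exchanges newline-delimited JSON-RPC 2.0 messages, and the sketch below builds the opening messages a client would typically write to the server's stdin. It does not launch `subvocal-mcp`; the helper names, client name, client version, and protocol version string are placeholder assumptions, and no tool names are shown because they are not listed here.

```python
import json


def jsonrpc_request(msg_id, method, params=None):
    """Build a JSON-RPC 2.0 request (expects a response with the same id)."""
    msg = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def jsonrpc_notification(method):
    """Build a JSON-RPC 2.0 notification (no id, no response expected)."""
    return {"jsonrpc": "2.0", "method": method}


handshake = [
    jsonrpc_request(1, "initialize", {
        "protocolVersion": "2024-11-05",  # placeholder; use the revision your client targets
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.1"},  # hypothetical client
    }),
    jsonrpc_notification("notifications/initialized"),
    jsonrpc_request(2, "tools/list"),  # asks the server which tools it exposes
]

for message in handshake:
    print(json.dumps(message))  # one JSON object per line
```

A real client waits for the `initialize` response before sending the `notifications/initialized` notification and any further requests.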
📂 Repository Structure
```
subvocal/
├── src/subvocal/   # The installable package
│   ├── core/       # Data models, interfaces, pipeline, security policies, LLM providers
│   ├── hardware/   # HAL drivers (file replay, synthetic, OpenBCI, Delsys) + dataset loaders
│   ├── emg_core/   # DSP filters, TD10 features, classifiers (RF/CNN/GRU/Transformer)
│   ├── shorthand/  # Phonetic shorthand vocabulary, simulator, hybrid decoder
│   ├── context/    # User context schemas and phonetic context matching
│   ├── mcp/        # Model Context Protocol stdio server
│   └── tts/        # Multi-backend TTS feedback engine
├── tests/          # Pytest suite
├── benchmarks/     # 50-case intent-reconstruction eval harnesses
├── tools/          # Site/API-page builders, license audit, benchmark runner
└── docs/           # GitHub Pages site (landing, docs, platform corpus, API reference)
    └── content/    # Markdown sources for the platform corpus and walkthrough
```
🚀 Core Features
- Articulatory Shorthand Decoder: Overcomes the whole-word sEMG vocabulary ceiling. Decodes compressed phonetic consonant shorthand inputs (e.g. `g gl` -> Google) under heavy muscle-movement noise.
- Asymmetric Levenshtein Distance: A dynamic programming string alignment cost matrix configured with physiological sEMG confusion clusters (Glottal, Labial, Alveolar, Velar, Rhotic) to discount vowel/consonant omissions in silent speech.
- Command-Aware Context Prioritization: Dynamic target matching against active user contacts (`TYPE`), calendar events (`SEARCH`), browser URLs (`GOTO`), and active application screen elements (`CLICK`).
- Physiological Signal Conditioning: Preprocessing filter configurations defaulting to AlterEgo's `1.3–50.0 Hz` bandpass filter (designed for low-velocity articulatory gestures) with configuration support for standard `20.0–450.0 Hz` EMG.
- Classifiers (RF + Deep Learning): Custom pipelines to train scikit-learn Random Forest, PyTorch 1D CNN, GRU, and Transformer architectures on raw multi-channel sEMG traces.
- Asynchronous Execution (V2 Architecture): Low-latency, thread-safe asynchronous pipeline orchestration built on LiveKit's `OpsQueue` and `IncrementalDispatcher` design.
- Physiological Signal Monitoring: Real-time EMA-smoothed signal level activity detection and MOS-like connection quality scoring (evaluating saturation, drift, and dropouts).
- Prometheus Telemetry: Integrated Prometheus metric exporter and pre-built Grafana monitoring dashboards for tracking SDK errors, session lifecycles, and action execution statistics.
- HMAC-Signed Capability Grants: Secure token-based credentials (`ActionGrants`) specifying allowed command scopes, confidence thresholds, and dry-run policies, verified dynamically via the `GrantsPolicy` middleware.
- MCP Integration: A zero-dependency stdio JSON-RPC server exposing pipeline status, token injection, phrase processing, and calibration as MCP tools.
- Persistent Session Storage: SQLite and in-memory backends to serialize and reload session configurations, states, and active metrics.
- Real-Time TCP Biometric Streaming: Dedicated TCP socket server broadcasting live signal attributes (quality, levels, tokens) to visualization dashboards.
- Ingress/Egress Orchestration: Ingress failover management for primary sensors and simulation streams; egress dispatcher for audio TTS and dataset logs.
- Intelligent Node Routing: Load-balanced session assignment based on CPU metrics or active session counts using selectors.
- Zero-Dependency BrainFlow & DSP: A pure-Python fallback for the BrainFlow SDK. Seamlessly emulates `SYNTHETIC_BOARD` and OpenBCI `CYTON_BOARD` (via direct serial packet parsing) and recreates the `DataFilter` signal processing API (filtering, windowing, Welch PSD estimation, bandpower) without requiring native C++ binary dependencies.
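To make the asymmetric Levenshtein idea from the feature list concrete, here is a minimal sketch of a weighted edit distance between a shorthand string and a candidate word. It is not the SDK's decoder: the function names, the cluster memberships, and every cost value are illustrative assumptions. The asymmetry comes from charging little for characters the shorthand omits (especially vowels) while charging full price for shorthand characters that have no counterpart in the candidate.

```python
VOWELS = set("aeiou")

# Illustrative confusion clusters (labial, alveolar, velar, glottal, rhotic).
# The memberships are assumptions for this sketch, not the SDK's tables.
CLUSTERS = [set("pbmfv"), set("tdnszl"), set("kgq"), set("h"), set("r")]


def substitution_cost(a, b):
    """Free for a match, discounted within a confusion cluster, full otherwise."""
    if a == b:
        return 0.0
    if any(a in cluster and b in cluster for cluster in CLUSTERS):
        return 0.4
    return 1.0


def omission_cost(ch):
    """Cost of a candidate character that the shorthand left out."""
    return 0.2 if ch in VOWELS else 0.6


def shorthand_distance(shorthand, target):
    """Weighted edit distance from a shorthand string to a candidate word."""
    s = shorthand.lower().replace(" ", "")
    t = target.lower()
    rows, cols = len(s) + 1, len(t) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = dp[i - 1][0] + 1.0  # unexplained shorthand character
    for j in range(1, cols):
        dp[0][j] = dp[0][j - 1] + omission_cost(t[j - 1])
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = min(
                dp[i - 1][j] + 1.0,                          # drop shorthand char
                dp[i][j - 1] + omission_cost(t[j - 1]),      # char omitted in shorthand
                dp[i - 1][j - 1] + substitution_cost(s[i - 1], t[j - 1]),
            )
    return dp[-1][-1]


candidates = ["google", "github", "gmail"]
best = min(candidates, key=lambda word: shorthand_distance("g gl", word))
print(best)  # -> google
```

With these costs, `g gl` aligns to "google" by skipping only three vowels, whereas the reverse direction (treating "google" as the shorthand) is far more expensive, which is the intended asymmetry.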
🧪 Development
```shell
git clone https://github.com/PranavKalkunte/subvocal.git
cd subvocal
pip install -e ".[all,dev]"
pytest                            # test suite
ruff check src tests              # lint
pyright                           # type check
python benchmarks/eval_runner.py  # 50-case heuristic benchmark
```
Runtime artifacts (traces, trained models) are written to the per-user data directory; override with SUBVOCAL_DATA_DIR / SUBVOCAL_MODELS_DIR.
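The override behavior can be pictured as an environment lookup with a fallback. The sketch below is an assumption-laden illustration rather than the SDK's own resolution logic: `resolve_dir` is a hypothetical helper, and the default paths (`~/.subvocal` and a `models` subdirectory) are placeholders, since the actual per-user data directory is not specified here.

```python
import os
from pathlib import Path


def resolve_dir(env_var, default, env=None):
    """Return the directory named by env_var if set, else the given default."""
    env = os.environ if env is None else env
    override = env.get(env_var)
    return Path(override).expanduser() if override else Path(default).expanduser()


# Placeholder defaults; the SDK's real per-user location may differ.
data_dir = resolve_dir("SUBVOCAL_DATA_DIR", "~/.subvocal")
models_dir = resolve_dir("SUBVOCAL_MODELS_DIR", data_dir / "models")
print(data_dir, models_dir)
```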
Contributions are welcome — see CONTRIBUTING.md for the workflow and quality gates, and SECURITY.md for vulnerability reporting.
📄 License
This repository is open-sourced under the MIT License. See LICENSE for details.
Download files
File details
Details for the file subvocal-2.0.0.tar.gz.
File metadata
- Download URL: subvocal-2.0.0.tar.gz
- Upload date:
- Size: 305.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c44725c527f111eceea76447fadeb092b363bf3ea41b1dd3e4b30030e7245d18 |
| MD5 | 6c05d2badfcbf23383f1fed90f63d9a2 |
| BLAKE2b-256 | 63598fbe64535f8dbf59900a824f1e283a821d731c4080041462c8aae52e71fa |
File details
Details for the file subvocal-2.0.0-py3-none-any.whl.
File metadata
- Download URL: subvocal-2.0.0-py3-none-any.whl
- Upload date:
- Size: 117.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 769b7529588e40be4909f34da0b43327976255aef63a6bbb9c7a7c6cb7268765 |
| MD5 | 9b98ba93db12ab162bcfff51dfaffad2 |
| BLAKE2b-256 | 15c1775c89d84593e6d33f28652cefa9e00c3d89eda24a6a69134ce98caa8135 |