Skip to main content

Microsoft Teams voice/video (Conversational Video Interface) plugin for Hermes Agent

Project description

hermes-plugin-teams-voice

Microsoft Teams voice/video (Conversational Video Interface) for Hermes Agent, packaged as a standalone, pip-installable plugin — install it on top of a normal Hermes install, no fork required.

The plugin (name teams_voice) hosts the HMAC-authenticated WebSocket bridge that a media worker dials into, and drives the call: realtime (OpenAI/Azure speech-to-speech) or streaming (STT→agent→TTS), camera/screen vision, the avatar driver cues (expression / visemes / show-to-caller), group-call etiquette, DTMF, bilingual EN/AR, meeting recap/minutes, and SharePoint (OneDrive) file send.

Install on Hermes

Install into the same Python environment as Hermes — it discovers the plugin via the hermes_agent.plugins entry-point and imports it in-process. Target the python that runs hermes (Linux/macOS …/venv/bin/python, Windows …\venv\Scripts\python.exe), or activate that venv first and drop --python.

A — from PyPI (recommended):

uv pip install --python /path/to/hermes/venv/bin/python hermes-plugin-teams-voice
# or, with the Hermes venv activated:  pip install hermes-plugin-teams-voice

B — from GitHub (latest / pre-release):

uv pip install --python /path/to/hermes/venv/bin/python \
  "git+https://github.com/alaamh/hermes-plugin-teams-voice.git"

C — from a local checkout (development):

git clone https://github.com/alaamh/hermes-plugin-teams-voice.git
uv pip install --python /path/to/hermes/venv/bin/python -e ./hermes-plugin-teams-voice

Installing into the wrong environment means Hermes won't see the plugin. Faster audio (optional): add the numpy extra, e.g. hermes-plugin-teams-voice[numpy].

Enable + run

hermes plugins list                            # confirm: teams_voice   (source: entrypoint)
hermes plugins enable teams_voice              # entry-point plugins are opt-in
hermes teams-voice serve --handler realtime    # voice bridge; also: streaming | echo | logging
hermes gateway run                             # (separately) the Teams chat plane + cron

Configure

Config lives in Hermes's own files (this package ships none). Non-secret settings go in config.yaml; secrets go in .env and are referenced with ${VAR}.

~/.hermes/config.yaml — under plugins.entries.teams_voice.config:

plugins:
  enabled:
    - teams_voice                          # entry-point plugins are opt-in
  entries:
    teams_voice:
      config:
        shared_secret: ${TEAMS_VOICE_SHARED_SECRET}   # MUST byte-match the worker's secret
        host: 127.0.0.1
        port: 8443                         # voice WS the worker dials: ws://host:port/voice/msteams/stream
        share_point_site_id: ${TEAMS_SHAREPOINT_SITE_ID}  # optional: attach files/minutes to the chat
        meeting_recap: true                # optional: post minutes at call end
        allowlist: []                      # optional: caller AAD object ids (empty = allow all)
        session_scope: per-call            # per-call | per-thread | per-aad
        # Realtime (speech-to-speech) brain — Azure OpenAI Realtime:
        realtime:
          backend: azure                   # azure | openai
          azure_endpoint: https://<your-azure-resource>.cognitiveservices.azure.com
          azure_deployment: gpt-realtime
          azure_api_version: 2025-04-01-preview
          voice: cedar
          api_key: ${AZURE_FOUNDRY_API_KEY}
          vad_threshold: 0.5
          prefix_padding_ms: 300
          silence_duration_ms: 500

Public OpenAI instead of Azure: set backend: openai, model: gpt-realtime, api_key: ${OPENAI_API_KEY}, and drop the azure_* keys. Streaming (STT→agent→TTS) instead of realtime: omit the realtime: block and run hermes teams-voice serve --handler streaming (needs ffmpeg on PATH).

~/.hermes/.env — the secrets referenced above (plus Teams chat-plane creds if you also run hermes gateway run):

# Voice bridge
TEAMS_VOICE_SHARED_SECRET=<same value as the media worker's shared secret>
AZURE_FOUNDRY_API_KEY=<azure-openai-key>                 # or OPENAI_API_KEY for public OpenAI
TEAMS_SHAREPOINT_SITE_ID=<host>,<siteGuid>,<webGuid>     # optional (needs Graph Sites.ReadWrite.All)

# Teams chat plane (platforms/teams) — only if you run the gateway:
TEAMS_CLIENT_ID=<bot-app-id>
TEAMS_CLIENT_SECRET=<bot-app-secret>
TEAMS_TENANT_ID=<azure-ad-tenant-id>

shared_secret must byte-match the media worker's shared secret or the HMAC handshake fails. Full key reference (every option, streaming mode, DLP/audit, the required Microsoft Graph permissions): hermes_teams_voice/README.md.

Upgrade / uninstall

uv pip install --upgrade hermes-plugin-teams-voice
uv pip uninstall hermes-plugin-teams-voice     # then it disappears from `hermes plugins list`

How it loads

Hermes discovers pip plugins via the hermes_agent.plugins entry-point group. This package exposes:

[project.entry-points."hermes_agent.plugins"]
teams_voice = "hermes_teams_voice"

Hermes imports hermes_teams_voice and calls its register(ctx) — registering the teams-voice CLI, the status tool, and the session hook. Entry-point plugins are opt-in, so teams_voice must be in plugins.enabled (hermes plugins enable does this).

Requirements

  • A working Hermes Agent install (the host; not a PyPI package).
  • Python ≥ 3.10 and aiohttp; ffmpeg on PATH for streaming-mode TTS decode.
  • A media worker that bridges the live Teams call audio/video into this plugin over the HMAC WebSocket (open-source, separate repo).

Relationship to the bundled plugin

This is the same code as the in-tree plugins/teams_voice plugin, repackaged for pip distribution so you don't have to fork Hermes. Install it on vanilla Hermes; don't also keep a bundled teams_voice (same name → the entry-point would shadow it).

  • Voice/CVI works fully on vanilla Hermes.
  • Chat-plane governance + SharePoint file attach depend on the enhanced plugins/platforms/teams adapter; without it the plugin degrades gracefully (e.g. meeting minutes post as text instead of a SharePoint file card).

License

MIT (matches Hermes Agent). Created by Alaaeldin Elhenawy — Dubai, UAE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hermes_plugin_teams_voice-0.1.0.tar.gz (68.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hermes_plugin_teams_voice-0.1.0-py3-none-any.whl (85.9 kB view details)

Uploaded Python 3

File details

Details for the file hermes_plugin_teams_voice-0.1.0.tar.gz.

File metadata

File hashes

Hashes for hermes_plugin_teams_voice-0.1.0.tar.gz
Algorithm Hash digest
SHA256 8082166c10305bc51056f4600e4ca67eb80d02b1ea7a741a6ceeb57cd275dca6
MD5 55227175b8a761a2578eea756424db10
BLAKE2b-256 034e1669888f8b4b1f8e71c05b40618d0813466e4c358e9ac0c8e9ad47810905

See more details on using hashes here.

File details

Details for the file hermes_plugin_teams_voice-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for hermes_plugin_teams_voice-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 30eeb2a4aac549489c459495a06a1dc58a43136e540ebecef98363d863e8958d
MD5 d2bae6847c1660c3cbd93109b9135edd
BLAKE2b-256 e685b5d4f87d2477ada6b73635d7b3ca414f4818922e13875fd0d6af62a08539

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page