Skip to main content

Microsoft Teams voice/video (Conversational Video Interface) plugin for Hermes Agent

Project description

hermes-plugin-teams-voice

PyPI Python CI Docs License: MIT

Microsoft Teams voice/video (Conversational Video Interface) for Hermes Agent, packaged as a standalone, pip-installable plugin — install it on top of a normal Hermes install, no fork required.

The plugin (name teams_voice) hosts the HMAC-authenticated WebSocket bridge that a media worker dials into, and drives the call: realtime (OpenAI/Azure speech-to-speech) or streaming (STT→agent→TTS), camera/screen vision, the avatar driver cues (expression / visemes / show-to-caller), group-call etiquette, DTMF, bilingual EN/AR, meeting recap/minutes, and SharePoint (OneDrive) file send.

📖 Full setup, configuration & reference: https://docs.komaa.com/

Install on Hermes

Install into the same Python environment as Hermes — it discovers the plugin via the hermes_agent.plugins entry-point and imports it in-process. Target the python that runs hermes (Linux/macOS …/venv/bin/python, Windows …\venv\Scripts\python.exe), or activate that venv first and drop --python.

A — from PyPI (recommended):

uv pip install --python /path/to/hermes/venv/bin/python hermes-plugin-teams-voice
# or, with the Hermes venv activated:  pip install hermes-plugin-teams-voice

B — from GitHub (latest / pre-release):

uv pip install --python /path/to/hermes/venv/bin/python \
  "git+https://github.com/komaa-com/hermes-plugin-teams-voice.git"

C — from a local checkout (development):

git clone https://github.com/komaa-com/hermes-plugin-teams-voice.git
uv pip install --python /path/to/hermes/venv/bin/python -e ./hermes-plugin-teams-voice

Installing into the wrong environment means Hermes won't see the plugin. Faster audio (optional): add the numpy extra, e.g. hermes-plugin-teams-voice[numpy].

Enable + run

hermes plugins list                            # confirm: teams_voice   (source: entrypoint)
hermes plugins enable teams_voice              # entry-point plugins are opt-in
hermes teams-voice serve --handler realtime    # voice bridge; also: streaming | echo | logging
hermes gateway run                             # (separately) the Teams chat plane + cron

Configure

Config lives in Hermes's own files (this package ships none). Non-secret settings go in config.yaml; secrets go in .env and are referenced with ${VAR}.

~/.hermes/config.yaml — under plugins.entries.teams_voice.config:

plugins:
  enabled:
    - teams_voice                          # entry-point plugins are opt-in
  entries:
    teams_voice:
      config:
        shared_secret: ${TEAMS_VOICE_SHARED_SECRET}   # MUST byte-match the worker's secret
        host: 127.0.0.1
        port: 8443                         # voice WS the worker dials: ws://host:port/voice/msteams/stream
        share_point_site_id: ${TEAMS_SHAREPOINT_SITE_ID}  # optional: attach files/minutes to the chat
        meeting_recap: true                # optional: post minutes at call end
        allowlist: []                      # optional: caller AAD object ids (empty = allow all)
        session_scope: per-call            # per-call | per-thread | per-aad
        # Realtime (speech-to-speech) brain — Azure OpenAI Realtime:
        realtime:
          backend: azure                   # azure | openai
          azure_endpoint: https://<your-azure-resource>.cognitiveservices.azure.com
          azure_deployment: gpt-realtime
          azure_api_version: 2025-04-01-preview
          voice: cedar
          api_key: ${AZURE_FOUNDRY_API_KEY}
          vad_threshold: 0.5
          prefix_padding_ms: 300
          silence_duration_ms: 500

Public OpenAI instead of Azure: set backend: openai, model: gpt-realtime, api_key: ${OPENAI_API_KEY}, and drop the azure_* keys. Streaming (STT→agent→TTS) instead of realtime: omit the realtime: block and run hermes teams-voice serve --handler streaming (needs ffmpeg on PATH).

~/.hermes/.env — the secrets referenced above (plus Teams chat-plane creds if you also run hermes gateway run):

# Voice bridge
TEAMS_VOICE_SHARED_SECRET=<same value as the media worker's shared secret>
AZURE_FOUNDRY_API_KEY=<azure-openai-key>                 # or OPENAI_API_KEY for public OpenAI
TEAMS_SHAREPOINT_SITE_ID=<host>,<siteGuid>,<webGuid>     # optional (needs Graph Sites.ReadWrite.All)

# Teams chat plane (platforms/teams) — only if you run the gateway:
TEAMS_CLIENT_ID=<bot-app-id>
TEAMS_CLIENT_SECRET=<bot-app-secret>
TEAMS_TENANT_ID=<azure-ad-tenant-id>

shared_secret must byte-match the media worker's shared secret or the HMAC handshake fails. Full key reference (every option, streaming mode, DLP/audit, the required Microsoft Graph permissions): hermes_teams_voice/README.md.

Upgrade / uninstall

uv pip install --upgrade hermes-plugin-teams-voice
uv pip uninstall hermes-plugin-teams-voice     # then it disappears from `hermes plugins list`

How it loads

Hermes discovers pip plugins via the hermes_agent.plugins entry-point group. This package exposes:

[project.entry-points."hermes_agent.plugins"]
teams_voice = "hermes_teams_voice"

Hermes imports hermes_teams_voice and calls its register(ctx) — registering the teams-voice CLI, the status tool, and the session hook. Entry-point plugins are opt-in, so teams_voice must be in plugins.enabled (hermes plugins enable does this).

Requirements

  • A working Hermes Agent install (the host; not a PyPI package).
  • Python ≥ 3.10 and aiohttp; ffmpeg on PATH for streaming-mode TTS decode.
  • A media worker that bridges the live Teams call audio/video into this plugin over the HMAC WebSocket (open-source, separate repo).

Relationship to the bundled plugin

This is the same code as the in-tree plugins/teams_voice plugin, repackaged for pip distribution so you don't have to fork Hermes. Install it on vanilla Hermes; don't also keep a bundled teams_voice (same name → the entry-point would shadow it).

  • Voice/CVI works fully on vanilla Hermes.
  • Chat-plane governance + SharePoint file attach depend on the enhanced plugins/platforms/teams adapter; without it the plugin degrades gracefully (e.g. meeting minutes post as text instead of a SharePoint file card).

License

MIT — see LICENSE. Maintained by Komaa.com · docs at https://docs.komaa.com/

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hermes_plugin_teams_voice-0.1.1.tar.gz (68.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hermes_plugin_teams_voice-0.1.1-py3-none-any.whl (85.4 kB view details)

Uploaded Python 3

File details

Details for the file hermes_plugin_teams_voice-0.1.1.tar.gz.

File metadata

File hashes

Hashes for hermes_plugin_teams_voice-0.1.1.tar.gz
Algorithm Hash digest
SHA256 f936b35f1156153c092260da470b524f4ba33b01a03ed0e34a0baa6eda3ae948
MD5 388ae1b5b348cb440c793c538bae418f
BLAKE2b-256 c75a97732ac6d35a8f45b52c0120354746ac23c6df8f576c9fb06f1072c9d1bb

See more details on using hashes here.

File details

Details for the file hermes_plugin_teams_voice-0.1.1-py3-none-any.whl.

File metadata

File hashes

Hashes for hermes_plugin_teams_voice-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 92df53c48295bd1a6902135ae84186af41445c6593ecd1c328fc183ce61595e6
MD5 914f9f967f2c55ce98d8dc9028c54661
BLAKE2b-256 3b0281be09cbfc768104cb3c073b8cdd294333da3984471d965cb0d769112e25

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page