
Audio-Transcriber - A2A | AG-UI | MCP


Version: 0.6.8

Overview

Transcribe your .wav .mp4 .mp3 .flac files to text or record your own audio!

This repository is actively maintained - Contributions are welcome!

Contribution Opportunities:

  • Support new models

Built as a wrapper around OpenAI Whisper.

MCP

MCP Tools

| Function Name | Description | Tag(s) |
|---|---|---|
| transcribe_audio | Transcribes audio from a provided file or by recording from the microphone. | audio_processing |
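
With the server running in HTTP mode (see the MCP CLI section below), the tool can be invoked from a FastMCP-compatible Python client. The following is a minimal sketch; the parameter name passed to the tool is illustrative and should be checked against the schema returned by list_tools().

import asyncio

from fastmcp import Client

async def main():
    # Connect to the MCP server over streamable HTTP (default URL from the A2A CLI table)
    async with Client("http://localhost:8000/mcp") as client:
        tools = await client.list_tools()
        print([tool.name for tool in tools])  # should include "transcribe_audio"

        # The "file" argument name is illustrative; inspect the tool schema for the real one
        result = await client.call_tool("transcribe_audio", {"file": "meeting.mp3"})
        print(result)

asyncio.run(main())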

A2A Agent

Architecture Summary

---
config:
  layout: dagre
---
flowchart TB
 subgraph subGraph0["Agent Capabilities"]
        C["Agent"]
        B["A2A Server - Uvicorn/FastAPI"]
        D["MCP Tools"]
        F["Agent Skills"]
  end
    C --> D & F
    A["User Query"] --> B
    B --> C
    D --> E["Platform API"]

     C:::agent
     B:::server
     A:::server
    classDef server fill:#f9f,stroke:#333
    classDef agent fill:#bbf,stroke:#333,stroke-width:2px
    style B stroke:#000000,fill:#FFD600
    style D stroke:#000000,fill:#BBDEFB
    style F fill:#BBDEFB
    style A fill:#C8E6C9
    style subGraph0 fill:#FFF9C4

Component Interaction Diagram

sequenceDiagram
    participant User
    participant Server as A2A Server
    participant Agent as Agent
    participant Skill as Agent Skills
    participant MCP as MCP Tools

    User->>Server: Send Query
    Server->>Agent: Invoke Agent
    Agent->>Skill: Analyze Skills Available
    Skill->>Agent: Provide Guidance on Next Steps
    Agent->>MCP: Invoke Tool
    MCP-->>Agent: Tool Response Returned
    Agent-->>Agent: Return Results Summarized
    Agent-->>Server: Final Response
    Server-->>User: Output

Usage

CLI

| Short Flag | Long Flag | Description |
|---|---|---|
| -h | --help | See Usage |
| -b | --bitrate | Bitrate to use during recording |
| -c | --channels | Number of channels to use during recording |
| -d | --directory | Directory to save recording |
| -e | --export | Export txt, srt, and vtt files |
| -f | --file | File to transcribe |
| -l | --language | Language to transcribe |
| -m | --model | Model to use: <tiny, base, small, medium, large> |
| -n | --name | Name of recording |
| -r | --record | Number of seconds to record from the microphone |

audio-transcriber --file '~/Downloads/Federal_Reserve.mp4' --model 'large'
audio-transcriber --record 60 --directory '~/Downloads/' --name 'my_recording.wav' --model 'tiny'

MCP CLI

| Short Flag | Long Flag | Description |
|---|---|---|
| -h | --help | Display help information |
| -t | --transport | Transport method: 'stdio', 'http', or 'sse' [legacy] (default: stdio) |
| -s | --host | Host address for HTTP transport (default: 0.0.0.0) |
| -p | --port | Port number for HTTP transport (default: 8000) |
| | --auth-type | Authentication type: 'none', 'static', 'jwt', 'oauth-proxy', 'oidc-proxy', 'remote-oauth' (default: none) |
| | --token-jwks-uri | JWKS URI for JWT verification |
| | --token-issuer | Issuer for JWT verification |
| | --token-audience | Audience for JWT verification |
| | --oauth-upstream-auth-endpoint | Upstream authorization endpoint for OAuth Proxy |
| | --oauth-upstream-token-endpoint | Upstream token endpoint for OAuth Proxy |
| | --oauth-upstream-client-id | Upstream client ID for OAuth Proxy |
| | --oauth-upstream-client-secret | Upstream client secret for OAuth Proxy |
| | --oauth-base-url | Base URL for OAuth Proxy |
| | --oidc-config-url | OIDC configuration URL |
| | --oidc-client-id | OIDC client ID |
| | --oidc-client-secret | OIDC client secret |
| | --oidc-base-url | Base URL for OIDC Proxy |
| | --remote-auth-servers | Comma-separated list of authorization servers for Remote OAuth |
| | --remote-base-url | Base URL for Remote OAuth |
| | --allowed-client-redirect-uris | Comma-separated list of allowed client redirect URIs |
| | --eunomia-type | Eunomia authorization type: 'none', 'embedded', 'remote' (default: none) |
| | --eunomia-policy-file | Policy file for embedded Eunomia (default: mcp_policies.json) |
| | --eunomia-remote-url | URL for remote Eunomia server |

Using as an MCP Server

The MCP Server can be run in two modes: stdio (for local testing) or http (for networked access). To start the server, use the following commands:

Run in stdio mode (default):

audio-transcriber-mcp

Run in HTTP mode:

audio-transcriber-mcp --transport "http"  --host "0.0.0.0"  --port "8000"
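
To protect the HTTP endpoint with JWT verification, the authentication flags from the MCP CLI table above can be combined; the JWKS URI, issuer, and audience below are placeholders:

audio-transcriber-mcp --transport "http" --host "0.0.0.0" --port "8000" \
  --auth-type "jwt" \
  --token-jwks-uri "https://provider.com/.well-known/jwks.json" \
  --token-issuer "https://provider.com/" \
  --token-audience "audio-transcriber"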

Model Information

Model details courtesy of OpenAI Whisper.

| Size | Parameters | English-only model | Multilingual model | Required VRAM | Relative speed |
|---|---|---|---|---|---|
| tiny | 39 M | tiny.en | tiny | ~1 GB | ~32x |
| base | 74 M | base.en | base | ~1 GB | ~16x |
| small | 244 M | small.en | small | ~2 GB | ~6x |
| medium | 769 M | medium.en | medium | ~5 GB | ~2x |
| large | 1550 M | N/A | large | ~10 GB | 1x |
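
The transcription itself is handled by the openai-whisper library. For reference, a direct call with one of the model sizes above looks roughly like this (a sketch of the upstream library, not this project's own API):

import whisper

# Load one of the sizes from the table above; larger models are slower but more accurate
model = whisper.load_model("base")

# Transcribe a local file; Whisper auto-detects the language unless one is specified
result = model.transcribe("Federal_Reserve.mp4")
print(result["text"])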

Deploy MCP Server as a Service

The Audio-Transcriber MCP server can be deployed using Docker, with configurable authentication, middleware, and Eunomia authorization.

Using Docker Run

docker pull knucklessg1/audio-transcriber:latest

docker run -d \
  --name audio-transcriber-mcp \
  -p 8004:8004 \
  -e HOST=0.0.0.0 \
  -e PORT=8004 \
  -e TRANSPORT=http \
  -e AUTH_TYPE=none \
  -e EUNOMIA_TYPE=none \
  knucklessg1/audio-transcriber:latest

For advanced authentication (e.g., JWT, OAuth Proxy, OIDC Proxy, Remote OAuth) or Eunomia, add the relevant environment variables:

docker run -d \
  --name audio-transcriber-mcp \
  -p 8004:8004 \
  -e HOST=0.0.0.0 \
  -e PORT=8004 \
  -e TRANSPORT=http \
  -e AUTH_TYPE=oidc-proxy \
  -e OIDC_CONFIG_URL=https://provider.com/.well-known/openid-configuration \
  -e OIDC_CLIENT_ID=your-client-id \
  -e OIDC_CLIENT_SECRET=your-client-secret \
  -e OIDC_BASE_URL=https://your-server.com \
  -e ALLOWED_CLIENT_REDIRECT_URIS=http://localhost:*,https://*.example.com/* \
  -e EUNOMIA_TYPE=embedded \
  -e EUNOMIA_POLICY_FILE=/app/mcp_policies.json \
  knucklessg1/audio-transcriber:latest

Using Docker Compose

Create a docker-compose.yml file:

services:
  audio-transcriber-mcp:
    image: knucklessg1/audio-transcriber:latest
    environment:
      - HOST=0.0.0.0
      - PORT=8004
      - TRANSPORT=http
      - AUTH_TYPE=none
      - EUNOMIA_TYPE=none
    ports:
      - 8004:8004

For advanced setups with authentication and Eunomia:

services:
  audio-transcriber-mcp:
    image: knucklessg1/audio-transcriber:latest
    environment:
      - HOST=0.0.0.0
      - PORT=8004
      - TRANSPORT=http
      - AUTH_TYPE=oidc-proxy
      - OIDC_CONFIG_URL=https://provider.com/.well-known/openid-configuration
      - OIDC_CLIENT_ID=your-client-id
      - OIDC_CLIENT_SECRET=your-client-secret
      - OIDC_BASE_URL=https://your-server.com
      - ALLOWED_CLIENT_REDIRECT_URIS=http://localhost:*,https://*.example.com/*
      - EUNOMIA_TYPE=embedded
      - EUNOMIA_POLICY_FILE=/app/mcp_policies.json
    ports:
      - 8004:8004
    volumes:
      - ./mcp_policies.json:/app/mcp_policies.json

Run the service:

docker-compose up -d

Configure mcp.json for AI Integration

Configure mcp.json

{
  "mcpServers": {
    "audio_transcriber": {
      "command": "uv",
      "args": [
        "run",
        "--with",
        "audio-transcriber",
        "audio-transcriber-mcp"
      ],
      "env": {
        "WHISPER_MODEL": "medium",            // Optional
        "TRANSCRIBE_DIRECTORY": "~/Downloads" // Optional
      },
      "timeout": 200000
    }
  }
}
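
If uv is not available, the installed console script can be referenced directly instead. This variant is a sketch and assumes audio-transcriber is already installed on the PATH; note that the // comments above are illustrative and must be removed in strict JSON:

{
  "mcpServers": {
    "audio_transcriber": {
      "command": "audio-transcriber-mcp",
      "args": [],
      "env": {
        "WHISPER_MODEL": "medium",
        "TRANSCRIBE_DIRECTORY": "~/Downloads"
      },
      "timeout": 200000
    }
  }
}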

A2A CLI

Endpoints

  • Web UI: http://localhost:8000/ (if enabled)
  • A2A: http://localhost:8000/a2a (Discovery: /a2a/.well-known/agent.json)
  • AG-UI: http://localhost:8000/ag-ui (POST)

| Short Flag | Long Flag | Description |
|---|---|---|
| -h | --help | Display help information |
| | --host | Host to bind the server to (default: 0.0.0.0) |
| | --port | Port to bind the server to (default: 9000) |
| | --reload | Enable auto-reload |
| | --provider | LLM Provider: 'openai', 'anthropic', 'google', 'huggingface' |
| | --model-id | LLM Model ID (default: qwen3:4b) |
| | --base-url | LLM Base URL (for OpenAI-compatible providers) |
| | --api-key | LLM API Key |
| | --mcp-url | MCP Server URL (default: http://localhost:8000/mcp) |
| | --web | Enable Pydantic AI Web UI (default: False; env: ENABLE_WEB_UI) |
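
Once the server is running, the A2A discovery document can be fetched to confirm the agent is reachable (assuming the endpoints listed above):

curl http://localhost:8000/a2a/.well-known/agent.json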

Install Python Package

python -m pip install audio-transcriber

or

uv pip install --upgrade audio-transcriber

Ubuntu Dependencies

sudo apt-get update
sudo apt-get install libasound-dev portaudio19-dev libportaudio2 libportaudiocpp0 ffmpeg gcc -y



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

audio_transcriber-0.6.8.tar.gz (32.3 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

audio_transcriber-0.6.8-py3-none-any.whl (31.2 kB)

Uploaded Python 3

File details

Details for the file audio_transcriber-0.6.8.tar.gz.

File metadata

  • Download URL: audio_transcriber-0.6.8.tar.gz
  • Upload date:
  • Size: 32.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for audio_transcriber-0.6.8.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | c3f5837eda9d7d287bea7c35af876171c5b2c89917520a33720b6f1cb7e27788 |
| MD5 | fb9d373399ab2c6f921af680a1654fcc |
| BLAKE2b-256 | bc13ae15548d1af8d0bc8df3aee7b5654518ec296df92b6be2b8c49de7dd1929 |


File details

Details for the file audio_transcriber-0.6.8-py3-none-any.whl.

File metadata

File hashes

Hashes for audio_transcriber-0.6.8-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 2b731eecad65529affef6a7d1fab700457412f3f6d0a73428317f7ad8717d277 |
| MD5 | f9d59efea8048f639316a582ab001601 |
| BLAKE2b-256 | ad7915fdbab90436ff1373f1445ee06a293e1c6f9d36e22f23e6c0a8c42b9417 |

