monkeyplug

monkeyplug is a little script to censor profanity in audio files (intended for podcasts, but YMMV) in a few simple steps:

  1. The user provides a local audio file (or a URL pointing to an audio file which is downloaded)
  2. Either Whisper or the Vosk API is used to recognize speech in the audio file (or a pre-generated transcript can be loaded)
  3. Each recognized word is checked against a list of profanity or other words you'd like muted (supports text or JSON format)
  4. ffmpeg is used to create a cleaned audio file, muting or "bleeping" the objectionable words
  5. Optionally, the transcript can be saved for reuse in future processing runs
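
Steps 3 and 4 above can be illustrated with a minimal sketch. This is not monkeyplug's own code: the function names, the word-list formats (a JSON array of strings, or one entry per line) and the shape of the recognized-word records (`word`, `start`, `end` in seconds) are assumptions made for illustration. The only external behavior relied on is ffmpeg's `volume` filter with a timeline `enable` expression.

```python
import json
import string


def parse_swears(text, fmt="text"):
    """Parse a word list (assumed formats: JSON array of strings, or one entry per line)."""
    entries = json.loads(text) if fmt == "json" else text.splitlines()
    return {str(entry).strip().lower() for entry in entries if str(entry).strip()}


def normalize(word):
    """Lowercase a recognized word and strip surrounding punctuation."""
    return word.strip().lower().strip(string.punctuation)


def find_mute_ranges(words, swears):
    """Return (start, end) pairs, in seconds, for every word found in the list.

    `words` is a hypothetical list of dicts with 'word', 'start' and 'end' keys.
    """
    return [(w["start"], w["end"]) for w in words if normalize(w["word"]) in swears]


def build_mute_filter(ranges):
    """Build an ffmpeg audio filter string that silences each range."""
    return ",".join(
        f"volume=enable='between(t,{start:.3f},{end:.3f})':volume=0"
        for start, end in ranges
    )
```

The resulting string could be passed to ffmpeg as the value of `-af`, for example `ffmpeg -i input.mp3 -af "<filter>" output.mp3`.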

You can then use your favorite media player to play the cleaned audio file.

If provided a video file for input, monkeyplug will attempt to process the audio stream from the file and remultiplex it, copying the original video stream.

monkeyplug is part of a family of projects with similar goals:

Installation

Using pip, to install the latest release from PyPI:

python3 -m pip install -U monkeyplug

Or to install directly from GitHub:

python3 -m pip install -U 'git+https://github.com/mmguero/monkeyplug'

Prerequisites

monkeyplug requires:

To install FFmpeg, use your operating system's package manager or install binaries from ffmpeg.org. The Python dependencies are installed automatically when you install monkeyplug with pip, except for vosk and openai-whisper. Because monkeyplug can work with either speech recognition engine, neither is a hard installation requirement; the engine you select only needs to be available at runtime.
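
Because the speech recognition engine is only needed at runtime, it can be useful to check what is available before processing a file. The following is a small sketch, not part of monkeyplug; `check_prerequisites` and `usable_modes` are hypothetical helper names. It assumes the `openai-whisper` package is importable as `whisper` and the `vosk` package as `vosk`.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Report which runtime prerequisites can be found on this system."""
    return {
        # ffmpeg must be on PATH
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # find_spec returns None when a top-level module is not installed
        "whisper": importlib.util.find_spec("whisper") is not None,
        "vosk": importlib.util.find_spec("vosk") is not None,
    }


def usable_modes(status):
    """List the --mode values that should work given a status dict."""
    if not status["ffmpeg"]:
        return []
    return [mode for mode in ("whisper", "vosk") if status[mode]]
```

For example, `usable_modes(check_prerequisites())` returns an empty list when ffmpeg is missing or neither engine is installed.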

Usage

usage: monkeyplug <arguments>

options:
  -h, --help            show this help message and exit
  -v [true|false], --verbose [true|false]
                        Verbose/debug output
  -m <string>, --mode <string>
                        Speech recognition engine (whisper|vosk) (default: whisper)
  -i <string>, --input <string>
                        Input file (or URL)
  -o <string>, --output <string>
                        Output file
  -w <profanity file>, --swears <profanity file>
                        text or JSON file containing profanity (default: "swears.txt")
  --output-json <string>
                        Output file to store transcript JSON
  --input-transcript <string>
                        Load existing transcript JSON instead of performing speech recognition
  --save-transcript     Automatically save transcript JSON alongside output audio file
  --force-retranscribe  Force new transcription even if transcript file exists (overrides automatic reuse)                        
  -a <str>, --audio-params <str>
                        Audio parameters for ffmpeg (default depends on output audio codec)
  -c <int>, --channels <int>
                        Audio output channels (default: 2)
  -s <int>, --sample-rate <int>
                        Audio output sample rate (default: 48000)
  -r <str>, --bitrate <str>
                        Audio output bitrate (default: 256K)
  -q <int>, --vorbis-qscale <int>
                        qscale for libvorbis output (default: 5)
  -f <string>, --format <string>
                        Output file format (default: inferred from extension of --output, or "MATCH")
  --pad-milliseconds <int>
                        Milliseconds to pad on either side of muted segments (default: 0)
  --pad-milliseconds-pre <int>
                        Milliseconds to pad before muted segments (default: 0)
  --pad-milliseconds-post <int>
                        Milliseconds to pad after muted segments (default: 0)
  -b [true|false], --beep [true|false]
                        Beep instead of silence
  -z <int>, --beep-hertz <int>
                        Beep frequency hertz (default: 1000)
  --beep-mix-normalize [true|false]
                        Normalize mix of audio and beeps (default: False)
  --beep-audio-weight <int>
                        Mix weight for non-beeped audio (default: 1)
  --beep-sine-weight <int>
                        Mix weight for beep (default: 1)
  --beep-dropout-transition <int>
                        Dropout transition for beep (default: 0)
  --force [true|false]  Process file despite existence of embedded tag

VOSK Options:
  --vosk-model-dir <string>
                        VOSK model directory (default: ~/.cache/vosk)
  --vosk-read-frames-chunk <int>
                        WAV frame chunk (default: 8000)

Whisper Options:
  --whisper-model-dir <string>
                        Whisper model directory (~/.cache/whisper)
  --whisper-model-name <string>
                        Whisper model name (base.en)
  --torch-threads <int>
                        Number of threads used by torch for CPU inference (0)
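
To make the effect of the padding options concrete, here is a sketch of how padding could widen muted segments. It is an illustration only: whether monkeyplug clamps at zero or merges overlapping segments is not stated above, and `pad_and_merge` is a hypothetical helper. Times are in whole milliseconds.

```python
def pad_and_merge(segments_ms, pre_ms=0, post_ms=0):
    """Widen each (start, end) segment by the given padding, then merge overlaps."""
    # Apply padding, never letting a segment start before time zero.
    padded = sorted((max(0, start - pre_ms), end + post_ms) for start, end in segments_ms)
    merged = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # This segment touches or overlaps the previous one: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

With 100 ms of padding on both sides, two words muted at 1000-1400 ms and 1500-1900 ms become a single muted span from 900 ms to 2000 ms.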

Docker

Alternatively, a Dockerfile is provided to allow you to run monkeyplug in Docker. You can pull one of the following images:

  • VOSK
    • oci.guero.org/monkeyplug:vosk-small
    • oci.guero.org/monkeyplug:vosk-large
  • Whisper
    • oci.guero.org/monkeyplug:whisper-tiny.en
    • oci.guero.org/monkeyplug:whisper-tiny
    • oci.guero.org/monkeyplug:whisper-base.en
    • oci.guero.org/monkeyplug:whisper-base
    • oci.guero.org/monkeyplug:whisper-small.en
    • oci.guero.org/monkeyplug:whisper-small
    • oci.guero.org/monkeyplug:whisper-medium.en
    • oci.guero.org/monkeyplug:whisper-medium
    • oci.guero.org/monkeyplug:whisper-large-v1
    • oci.guero.org/monkeyplug:whisper-large-v2
    • oci.guero.org/monkeyplug:whisper-large-v3
    • oci.guero.org/monkeyplug:whisper-large

Then run monkeyplug-docker.sh inside the directory where your audio files are located.

Transcript Workflow

monkeyplug supports saving and reusing transcripts to improve workflow efficiency:

Save Transcript for Later Reuse

# Generate transcript once and save it
monkeyplug -i input.mp3 -o output.mp3 --save-transcript

# This creates output.mp3 and output_transcript.json

Automatic Transcript Reuse

# Second run: Automatically detects and reuses transcript (22x faster!)
monkeyplug -i input.mp3 -o output.mp3 --save-transcript
# Finds output_transcript.json and reuses it automatically

# Force new transcription when needed
monkeyplug -i input.mp3 -o output.mp3 --save-transcript --force-retranscribe
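
The reuse behavior described above can be sketched as a simple decision. This is an approximation rather than monkeyplug's actual logic: the `_transcript.json` suffix is inferred from the `output_transcript.json` example, and both function names are hypothetical.

```python
from pathlib import Path


def transcript_path_for(output_file):
    """Derive the transcript path assumed to sit alongside the output file."""
    out = Path(output_file)
    return out.with_name(out.stem + "_transcript.json")


def should_transcribe(output_file, force_retranscribe=False):
    """Return True when speech recognition needs to run again."""
    # Recognition is needed if it is forced or no saved transcript exists.
    return force_retranscribe or not transcript_path_for(output_file).exists()
```

For `output.mp3` this looks for `output_transcript.json` in the same directory and only skips recognition when that file exists and retranscription is not forced.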

Manual Transcript Loading

# Explicitly specify transcript to load
monkeyplug -i input.mp3 -o output_strict.mp3 --input-transcript output_transcript.json -w strict_swears.txt
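
When several differently censored versions of one recording are needed, the same transcript can be reused for each word list. The sketch below drives the command line shown above from Python using only the `-i`, `-o`, `--input-transcript` and `-w` options; the helper names and the example file names are hypothetical, and it assumes `monkeyplug` is on the PATH.

```python
import subprocess


def build_command(input_file, output_file, transcript, swears):
    """Build the argument list for one monkeyplug run that reuses a transcript."""
    return [
        "monkeyplug",
        "-i", input_file,
        "-o", output_file,
        "--input-transcript", transcript,
        "-w", swears,
    ]


def recensor(input_file, transcript, variants):
    """Run monkeyplug once per (output_file, swears_file) pair."""
    for output_file, swears in variants:
        # check=True raises CalledProcessError if monkeyplug exits with an error
        subprocess.run(build_command(input_file, output_file, transcript, swears), check=True)
```

A call such as `recensor("input.mp3", "output_transcript.json", [("output_strict.mp3", "strict_swears.txt")])` would reproduce the command above.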

Contributing

If you'd like to help improve monkeyplug, pull requests are welcome!

Authors

  • Seth Grover - Initial work - mmguero

License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.
