A CLI tool to download audio from a YouTube video, transcribe it, and refine the transcription using AI.

ytdebunk

Overview

ytdebunk is a command-line tool designed to:

  • Download audio from YouTube videos.
  • Transcribe the audio content.
  • Optionally enhance the transcription using the Gemini API.
  • Optionally detect logical faults in the transcription using the Gemini API.

This tool is particularly useful for analyzing transcriptions to identify logical fallacies and incorrect claims made by YouTubers.

Installation

To avoid dependency conflicts, create a virtual environment and activate it first:

python3.11 -m venv .venv
source .venv/bin/activate

Then install from PyPI:

pip install ytdebunk

Alternatively, to get the latest updates, install directly from GitHub:

pip install git+https://github.com/hissain/youtuber-debunked.git

Usage

The ytdebunk.py script provides a command-line interface (CLI) with several options.

Arguments

  • video_url (str) – URL of the YouTube video to download audio from.

Options

Option              Description
-e, --enhance       (bool) Enhance the transcription using the Gemini API. (Default: False)
-d, --detect        (bool) Detect logical faults in the transcription using the Gemini API. (Default: False)
-v, --verbose       (bool) Increase output verbosity.
-t, --token         (str) API token for the Gemini API. Required if --enhance or --detect is enabled, unless GEMINI_API_TOKEN is set in the environment.
-st, --start_time   (float) Start time of the audio clip, in seconds.
-et, --end_time     (float) End time of the audio clip, in seconds.
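
The options above map naturally onto Python's standard argparse module. The following is a minimal sketch of such a parser, not the actual ytdebunk source: the flag names and types come from the table, while the help strings, the None defaults, and the build_parser name are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented ytdebunk options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="ytdebunk",
        description="Download, transcribe, and analyze audio from a YouTube video.",
    )
    # Positional argument: the video to process.
    parser.add_argument("video_url", type=str, help="URL of the YouTube video")
    # Boolean switches default to False and become True when the flag is present.
    parser.add_argument("-e", "--enhance", action="store_true",
                        help="Enhance the transcription using the Gemini API")
    parser.add_argument("-d", "--detect", action="store_true",
                        help="Detect logical faults using the Gemini API")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Increase output verbosity")
    # Optional values; None means "not supplied" (an assumption of this sketch).
    parser.add_argument("-t", "--token", type=str, default=None,
                        help="API token for the Gemini API")
    parser.add_argument("-st", "--start_time", type=float, default=None,
                        help="Start time of the audio clip in seconds")
    parser.add_argument("-et", "--end_time", type=float, default=None,
                        help="End time of the audio clip in seconds")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(
    ["https://www.youtube.com/watch?v=example", "-e", "-st", "30", "-et", "90"]
)
print(args.enhance, args.detect, args.start_time, args.end_time)
```

Using store_true for the boolean flags matches the documented default of False for --enhance and --detect.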

Example Usage

ytdebunk "https://www.youtube.com/watch?v=example" -e -d -v -t YOUR_GEMINI_API_TOKEN
export GEMINI_API_TOKEN="your_api_key"
ytdebunk "https://www.youtube.com/watch?v=example" -e -d -v  # when the Gemini API token is set in the environment

See the Example Notebook for detailed usage.
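
If you want to drive the CLI from a Python script, one possible approach is to assemble the argument list programmatically. The sketch below uses only the flags documented above; the build_command helper is hypothetical and is not part of the ytdebunk package.

```python
import shlex
from typing import List, Optional


def build_command(
    video_url: str,
    enhance: bool = False,
    detect: bool = False,
    verbose: bool = False,
    token: Optional[str] = None,
    start_time: Optional[float] = None,
    end_time: Optional[float] = None,
) -> List[str]:
    """Assemble a ytdebunk command line as a list of arguments (hypothetical helper)."""
    cmd = ["ytdebunk", video_url]
    if enhance:
        cmd.append("--enhance")
    if detect:
        cmd.append("--detect")
    if verbose:
        cmd.append("--verbose")
    if token:
        cmd += ["--token", token]
    # Compare against None so that a start time of 0.0 is still passed through.
    if start_time is not None:
        cmd += ["--start_time", str(start_time)]
    if end_time is not None:
        cmd += ["--end_time", str(end_time)]
    return cmd


cmd = build_command(
    "https://www.youtube.com/watch?v=example",
    enhance=True,
    detect=True,
    start_time=30.0,
    end_time=90.0,
)
# shlex.join quotes each argument so the result can be pasted into a shell.
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) once ytdebunk is installed in the active environment.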

Environment Variables

If preferred, you can set the Gemini API token as an environment variable instead of passing it as a CLI argument:

export GEMINI_API_TOKEN="your_api_key"
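
A common way to support both sources is to look at the CLI value first and fall back to the environment. The sketch below illustrates that pattern; the resolve_token helper is hypothetical, and the precedence shown (CLI argument wins over GEMINI_API_TOKEN) is an assumption, since the README does not say which source takes priority.

```python
import os
from typing import Mapping, Optional


def resolve_token(
    cli_token: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the Gemini API token from the CLI argument or the environment.

    Assumption of this sketch: an explicit --token value takes precedence
    over the GEMINI_API_TOKEN environment variable.
    """
    if cli_token:
        return cli_token
    # Treat an empty environment variable the same as a missing one.
    return env.get("GEMINI_API_TOKEN") or None


# Passing a plain dict as `env` makes the behavior easy to check without
# touching the real process environment.
print(resolve_token("from-cli", {"GEMINI_API_TOKEN": "from-env"}))
print(resolve_token(None, {"GEMINI_API_TOKEN": "from-env"}))
print(resolve_token(None, {}))
```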

Detailed Process

  1. Download Audio

    • Uses the download_audio function from ytdebunk.downloader to download audio from the given YouTube URL.
  2. Transcribe Audio

    • Uses the transcribe_audio function from ytdebunk.transcriber to generate a text transcription.
  3. Enhance Transcription (Optional)

    • If --enhance is enabled, the script uses enhance_transcription from ytdebunk.refiner to refine the transcription using the Gemini API.
    • The API token must be provided via --token or as an environment variable.
  4. Detect Logical Faults (Optional)

    • If --detect is enabled, the script uses detect_logical_faults from ytdebunk.philosopher to detect logical faults, fallacies, bias, irony, and similar issues in the transcription using the Gemini API.
    • The API token must be provided via --token or as an environment variable.
  5. Save Transcription

    • The final transcription (raw or enhanced) and any detected logical faults are saved to the ./download folder.
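
The five steps above can be sketched as a single function. This is an illustration of the control flow only: the README names the functions download_audio, transcribe_audio, enhance_transcription, and detect_logical_faults but not their signatures, so the sketch takes each step as a plain callable. The run_pipeline name, the output file names, and the choice to run detection on the enhanced text are all assumptions.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def run_pipeline(
    video_url: str,
    download: Callable[[str], str],
    transcribe: Callable[[str], str],
    enhance: Optional[Callable[[str], str]] = None,
    detect: Optional[Callable[[str], str]] = None,
    out_dir: str = "download",
) -> Dict[str, Optional[str]]:
    """Run the download -> transcribe -> enhance -> detect -> save steps in order."""
    audio_path = download(video_url)      # Step 1: download audio
    text = transcribe(audio_path)         # Step 2: transcribe audio
    if enhance is not None:               # Step 3: optional enhancement
        text = enhance(text)
    # Step 4: optional fault detection (assumed to run on the latest text)
    faults = detect(text) if detect is not None else None

    # Step 5: save results; the file names here are hypothetical.
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "transcription.txt").write_text(text, encoding="utf-8")
    if faults is not None:
        (out / "logical_faults.txt").write_text(faults, encoding="utf-8")
    return {"transcription": text, "faults": faults}
```

Passing the steps in as callables keeps the sketch independent of the real module signatures and makes it easy to try with stand-in functions before wiring in the actual ones.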

Error Handling

  • If --enhance or --detect is enabled but no Gemini API token is provided, the script prints an error message and exits.
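
A check of this kind might look like the sketch below. The require_token helper, the message wording, and the exit status of 1 are assumptions for illustration; the README states only that an error is printed and the script exits.

```python
import sys
from typing import Optional


def require_token(enhance: bool, detect: bool, token: Optional[str]) -> None:
    """Exit with an error if a Gemini feature is requested without a token."""
    if (enhance or detect) and not token:
        # Write to stderr so the message is not mixed into normal output.
        print(
            "Error: a Gemini API token is required for --enhance or --detect. "
            "Pass --token or set GEMINI_API_TOKEN.",
            file=sys.stderr,
        )
        sys.exit(1)  # Non-zero status signals failure (assumed value).


# No Gemini feature requested: the check passes without a token.
require_token(enhance=False, detect=False, token=None)
# Feature requested and a token is available: the check also passes.
require_token(enhance=True, detect=False, token="your_api_key")
```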

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contribution and Contact

You can fork this project and submit pull requests. To reach the author, email hissain.khan@gmail.com.
