
🐸 FrogBase

Create navigable knowledge from multi-media content

FrogBase (previously whisper-ui) simplifies the download-transcribe-embed-index workflow for multi-media content. It does so by linking content from various platforms (yt_dlp) with speech-to-text models (OpenAI's Whisper), image & text encoders (SentenceTransformers), and embedding stores (hnswlib).

from frogbase import FrogBase
fb = FrogBase()
fb.demo()
fb.search("What is the name of the squeaky frog?")

Full Documentation (WIP).

FrogBase also comes with a ready-to-use UI for non-technical users!

https://user-images.githubusercontent.com/6735526/216852681-53b6c3db-3e74-4c86-806f-6f6774a9003a.mp4

Features

FrogBase currently provides functionality to:

  • Download media files from a wide range of platforms (YouTube, TikTok, Vimeo, etc.) using yt_dlp
  • Transcribe audio streams for downloaded & local files using OpenAI's Whisper
  • Embed transcribed text from corresponding video segments using Sentence Transformers
  • Index & search the embedded content using hnswlib

FrogBase also includes a Streamlit UI: a simple, locally hosted GUI that exposes the functionality above interactively.
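To make the embed-index-search idea concrete, here is a minimal sketch that uses only the Python standard library. It is not how FrogBase is implemented: word counts stand in for Sentence Transformers embeddings, an exhaustive cosine-similarity scan stands in for hnswlib, and the transcript segments, `embed`, `cosine`, and `search` are all made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical transcript segments: (start time in seconds, text).
# A real pipeline would get these from a speech-to-text model.
SEGMENTS = [
    (0.0, "welcome to the pond tour"),
    (12.5, "this squeaky frog is called pip"),
    (31.0, "next we look at dragonflies near the water"),
]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a neural text encoder)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# "Index": embed every segment once, up front.
INDEX = [(start, text, embed(text)) for start, text in SEGMENTS]


def search(query: str, k: int = 1) -> list[tuple[float, str]]:
    """Return the k segments most similar to the query, best match first."""
    q = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[2]), reverse=True)
    return [(start, text) for start, text, _ in ranked[:k]]


print(search("What is the name of the squeaky frog?"))
```

Because every segment keeps its start time, a hit can point back to a position in the original media. An approximate nearest-neighbour index such as hnswlib serves the same role as the sorted scan here, but avoids comparing the query against every stored vector.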

Quickstart

Software Developers

This section is for software developers who want to use FrogBase as a Python package.

  1. Install ffmpeg and FrogBase

    sudo apt install ffmpeg
    pip install frogbase
    
  2. Import FrogBase and use it as follows:

    from frogbase import FrogBase
    
    fb = FrogBase()
    
    sources = [
       "https://www.youtube.com/watch?v=HBxn56l9WcU",
       "https://www.youtube.com/@hayabhay"
    ]
    
    # Download, transcribe, embed, and index the sources in one chained call.
    (
        fb.add(sources)
        .transcribe()
        .embed()
        .index()
    )
    
    fb.search("What is the name of the squeaky frog?")
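
The chained call above reads as a fluent pipeline. As a rough sketch of how such chaining works in general, each method can do its work and then return the object itself. `ToyPipeline` below is a hypothetical stand-in that only records step names; it is not FrogBase's actual class, and whether FrogBase is implemented this way is an assumption.

```python
class ToyPipeline:
    """Hypothetical stand-in used only to illustrate method chaining."""

    def __init__(self) -> None:
        self.sources: list[str] = []
        self.steps: list[str] = []

    def add(self, sources: list[str]) -> "ToyPipeline":
        self.sources.extend(sources)
        self.steps.append("add")
        return self  # returning self is what makes the next call chainable

    def transcribe(self) -> "ToyPipeline":
        self.steps.append("transcribe")
        return self

    def embed(self) -> "ToyPipeline":
        self.steps.append("embed")
        return self

    def index(self) -> "ToyPipeline":
        self.steps.append("index")
        return self


pipeline = ToyPipeline()
pipeline.add(["https://www.youtube.com/watch?v=HBxn56l9WcU"]).transcribe().embed().index()
print(pipeline.steps)  # ['add', 'transcribe', 'embed', 'index']
```

With this pattern the steps run left to right, in the order they are written.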
    

Non-technical Users

This section is for non-technical users who want to use FrogBase primarily through the accompanying Streamlit UI.

  1. Download the latest release of FrogBase from here and unzip it. Alternatively, clone the repository:

    git clone https://github.com/hayabhay/frogbase.git

  2. Install FrogBase dependencies manually and run the UI.

    Note: This also requires ffmpeg to be installed on your system. You can install it using sudo apt install ffmpeg on Ubuntu.

    1. Using pip

      pip install frogbase streamlit
      streamlit run ui/01_🏠_Home.py
      

[Coming soon] Installation instructions and environments for Docker & Anaconda

Download files

Source Distribution

frogbase-2.0.0a2.tar.gz (27.9 kB)

Uploaded Source

Built Distribution

frogbase-2.0.0a2-py3-none-any.whl (27.3 kB)

Uploaded Python 3

File details

Details for the file frogbase-2.0.0a2.tar.gz.

File metadata

  • Download URL: frogbase-2.0.0a2.tar.gz
  • Upload date:
  • Size: 27.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.11.4 Linux/5.19.0-46-generic

File hashes

Hashes for frogbase-2.0.0a2.tar.gz
Algorithm    Hash digest
SHA256       f135d358d77e2768c6d20891b5e61e3082a1845654632d55a243ff8c698cb324
MD5          e469d9c0f6292e0bf7b6c76adb5600b2
BLAKE2b-256  aceade3d3f5ec5ee444a9c5dc593b1cf0bc76bf6bc73c6b71494a6c50b8828cb
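
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `matches` and the local file path are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for frogbase-2.0.0a2.tar.gz.
EXPECTED_SHA256 = "f135d358d77e2768c6d20891b5e61e3082a1845654632d55a243ff8c698cb324"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published one (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    archive = Path("frogbase-2.0.0a2.tar.gz")  # assumed local download location
    if archive.exists():
        print("hash OK" if matches(archive, EXPECTED_SHA256) else "hash MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading in chunks keeps memory use flat regardless of file size. pip can also enforce hashes itself through its hash-checking mode.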

File details

Details for the file frogbase-2.0.0a2-py3-none-any.whl.

File metadata

  • Download URL: frogbase-2.0.0a2-py3-none-any.whl
  • Upload date:
  • Size: 27.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.11.4 Linux/5.19.0-46-generic

File hashes

Hashes for frogbase-2.0.0a2-py3-none-any.whl
Algorithm    Hash digest
SHA256       fac3081458f9df0a0f40602df1cb26c4602e3d00df6801f5617ae910c34f8490
MD5          40a096d202549ba9e280749bb44acb3f
BLAKE2b-256  d31a600f742dbc0830def760104e07762105761de1b53a6a4c3c84245d9f35e5
