Input a local file or URL and this service will transcribe it using Mozilla DeepSpeech.

Environment
- Console
License
- OSI Approved :: MIT License
Operating System
Programming Language
- Python :: 3.8

Project description

transcribe-anything

Input a local file or URL and this service will transcribe it using Mozilla DeepSpeech 0.9.3.

Example

Example (cmd):
- transcribe_anything <YOUTUBE_URL> > out_subtitles.txt
- transcribe_anything <LOCAL.MP4/WAV> > out_subtitles.txt

Example (api):

from transcribe_anything.api import bulk_transcribe

urls = ['https://www.youtube.com/watch?v=Erk4_jFDjzQ']

def onresolve(url, sub):
    print(url, sub)  # success callback: the URL and its subtitles

def onfail(url):
    print(f'Failed: {url}')  # failure callback: the URL that failed

bulk_transcribe(urls, onresolve=onresolve, onfail=onfail)
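Because bulk_transcribe reports its results through callbacks, it can be convenient to gather them into plain data structures. The following is a minimal sketch that assumes only the call shape shown above (a list of URLs plus onresolve and onfail keyword arguments). collect_transcripts is a hypothetical helper, not part of the package, and the transcribe function is passed in as an argument so the helper can be tried with a stand-in.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def collect_transcripts(
    urls: Iterable[str],
    transcribe_fn: Callable[..., object],
) -> Tuple[Dict[str, str], List[str]]:
    """Run a bulk_transcribe-style function and gather its callback results.

    Hypothetical helper: transcribe_fn is assumed to accept
    (urls, onresolve=..., onfail=...) as in the API example.
    """
    transcripts: Dict[str, str] = {}
    failures: List[str] = []

    def onresolve(url: str, sub: str) -> None:
        # Remember the subtitles produced for this URL.
        transcripts[url] = sub

    def onfail(url: str) -> None:
        # Remember which URLs could not be transcribed.
        failures.append(url)

    transcribe_fn(list(urls), onresolve=onresolve, onfail=onfail)
    return transcripts, failures
```

With the real package this would presumably be called as collect_transcripts(urls, bulk_transcribe), assuming bulk_transcribe returns only after all callbacks have fired.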

Quick start

Optional: Create a Python virtual environment

Works for Ubuntu/MacOS/Win32
mkdir transcribe_anything
cd transcribe_anything
Download and install virtual env:
- curl -X GET https://raw.githubusercontent.com/zackees/make_venv/main/make_venv.py -o make_env.py
- python make_env.py
Enter the environment:
- source activate.sh

The environment is now active, and the next step will install only into this local Python environment. If the terminal is closed, get back into the environment by running cd transcribe_anything and then source activate.sh.

Required: Install to current python environment

pip install transcribe-anything
- The transcribe_anything command then becomes available on the path.
transcribe_anything <YOUTUBE_URL> > out_subtitles.txt
-or- transcribe_anything <MY_LOCAL.MP4/WAV> > out_subtitles.txt
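The same command can also be driven from a script. The sketch below mirrors the shell redirection shown above by sending the command's standard output to a file. transcribe_to_file is a hypothetical helper, not something the package provides, and the run parameter exists only so that the helper can be exercised without the real command installed.

```python
import subprocess
from typing import Callable


def transcribe_to_file(
    source: str,
    out_path: str,
    run: Callable[..., object] = subprocess.run,
) -> None:
    """Run `transcribe_anything <source>` and save its stdout to out_path.

    Hypothetical helper; equivalent in spirit to
    `transcribe_anything <source> > out_path` in a shell.
    """
    with open(out_path, "w", encoding="utf-8") as out:
        # Passing the arguments as a list avoids shell quoting issues,
        # for example with spaces in file names.
        # check=True raises CalledProcessError on a non-zero exit code.
        run(["transcribe_anything", source], stdout=out, check=True)
```

For example, transcribe_to_file("my video.mp4", "out_subtitles.txt") would write the subtitles for a local file.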

How does it work?

The program uses yt-dlp to download videos from video services and then strips out the audio track.
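As an illustration of the fetching step, the sketch below builds a yt-dlp command line. The flags are an assumption chosen for the example and are not necessarily the ones transcribe-anything passes. The sketch also asks yt-dlp for an audio stream directly rather than downloading the full video first. build_ytdlp_command and the default output template are hypothetical names.

```python
from typing import List


def build_ytdlp_command(url: str, out_template: str = "download.%(ext)s") -> List[str]:
    """Build a yt-dlp command line that fetches the best available audio.

    Illustrative only: the chosen flags are an assumption, not the
    package's actual invocation.
    """
    return [
        "yt-dlp",
        "--no-playlist",         # fetch a single video even if the URL is part of a playlist
        "-f", "bestaudio/best",  # prefer an audio-only stream, else the best combined one
        "-o", out_template,      # yt-dlp output filename template
        url,
    ]
```

The resulting list can be handed to subprocess.run(cmd, check=True) on a machine where yt-dlp is installed.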

static_ffmpeg is then called to transcode the audio track into a specific format that DeepSpeech requires.
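The text does not spell out the target format. The sketch below assumes 16 kHz, mono, 16-bit PCM WAV, which is what the released DeepSpeech English models are generally described as expecting, so treat those values as an assumption rather than the package's exact settings. build_transcode_command is a hypothetical helper, and ffmpeg_exe stands for the path to whichever ffmpeg binary is in use.

```python
from typing import List


def build_transcode_command(src: str, dst_wav: str, ffmpeg_exe: str = "ffmpeg") -> List[str]:
    """Build an ffmpeg command that converts src to 16 kHz mono 16-bit PCM WAV.

    The audio parameters are an assumption about what DeepSpeech needs.
    """
    return [
        ffmpeg_exe,
        "-y",                    # overwrite the output file if it exists
        "-i", src,               # input media file
        "-vn",                   # drop any video stream
        "-ac", "1",              # one audio channel (mono)
        "-ar", "16000",          # 16 kHz sample rate
        "-acodec", "pcm_s16le",  # signed 16-bit little-endian PCM
        dst_wav,
    ]
```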

Once the audio file has been prepared, pydeepspeech is called. This utility automatically downloads the appropriate models and installs them where deepspeech can find them. It also splits the input WAV file into chunks at periods of silence, because DeepSpeech's performance degrades significantly on longer audio clips, so clips have to be kept short.
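To make the chunking idea concrete, here is a minimal sketch of splitting audio at silence. It uses a simple amplitude threshold on fixed-size frames, which is a stand-in technique chosen for brevity and not the voice activity detection that pydeepspeech or mic_vad_streaming actually use. The function name, the parameters, and their default values are all hypothetical.

```python
from typing import List, Sequence, Tuple


def split_on_silence(
    samples: Sequence[int],
    frame_len: int = 160,
    threshold: int = 500,
    min_silence_frames: int = 3,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample index ranges of the non-silent chunks.

    A frame is "loud" if any sample's absolute value reaches threshold.
    A chunk ends once min_silence_frames consecutive quiet frames follow it.
    """
    chunks: List[Tuple[int, int]] = []
    start = None         # first sample of the chunk being built, if any
    last_loud_end = 0    # end index of the most recent loud frame
    silent_run = 0       # consecutive quiet frames inside the current chunk

    for pos in range(0, len(samples), frame_len):
        frame = samples[pos:pos + frame_len]
        loud = max(abs(s) for s in frame) >= threshold
        if loud:
            if start is None:
                start = pos
            last_loud_end = pos + len(frame)
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Enough silence: close the chunk at the last loud frame.
                chunks.append((start, last_loud_end))
                start = None
                silent_run = 0

    if start is not None:
        chunks.append((start, last_loud_end))
    return chunks
```

Each returned range could then be written out as its own short WAV file and transcribed separately. Short pauses below min_silence_frames stay inside a chunk, so words are not cut apart.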

Tech Stack

Mozilla DeepSpeech: https://github.com/mozilla/DeepSpeech
pydeepspeech: https://github.com/zackees/pydeepspeech
- mic_vad_streaming: https://github.com/hadran9/DeepSpeech-examples/tree/r0.9/mic_vad_streaming
yt-dlp: https://github.com/yt-dlp/yt-dlp
static-ffmpeg
- github: https://github.com/zackees/static_ffmpeg
- pypi: https://pypi.org/project/static-ffmpeg/

Testing

All tests are run by tox. Go to the project root directory and run tox.

Versions

1.2.6: Supports spaces in file names now.
1.2.5:
- Improved handling of YouTube downloads by switching youtube-dl -> yt-dlp

Release history

2.7.36

Apr 29, 2024

2.7.35

Apr 29, 2024

2.7.34

Apr 24, 2024

2.7.33

Feb 24, 2024

2.7.32

Feb 23, 2024

2.7.31

Feb 11, 2024

2.7.30

Feb 11, 2024

2.7.29

Feb 7, 2024

2.7.28

Feb 7, 2024

2.7.27

Feb 6, 2024

2.7.26

Jan 23, 2024

2.7.25

Jan 22, 2024

2.7.24

Jan 22, 2024

2.7.23

Jan 22, 2024

2.7.22

Jan 20, 2024

2.7.19

Jan 19, 2024

2.7.18

Jan 16, 2024

2.7.17

Jan 15, 2024

2.7.16

Jan 15, 2024

2.7.15

Jan 15, 2024

2.7.13

Jan 15, 2024

2.7.12

Jan 14, 2024

2.7.10

Jan 13, 2024

2.7.9

Jan 13, 2024

2.7.8

Jan 13, 2024

2.7.7

Jan 13, 2024

2.7.6

Jan 13, 2024

2.7.5

Jan 13, 2024

2.7.4

Jan 12, 2024

2.7.3

Jan 12, 2024

2.7.2

Jan 12, 2024

2.7.1

Jan 12, 2024

2.7.0

Jan 12, 2024

2.6.0

Jan 11, 2024

2.5.0

Jan 8, 2024

2.4.0

Jan 7, 2024

2.3.9

Jan 4, 2024

2.3.8

Nov 20, 2023

2.3.7

Oct 20, 2023

2.3.6

Jun 23, 2023

2.3.5

Jun 23, 2023

2.3.4

May 8, 2023

2.3.3

May 6, 2023

2.3.2

May 5, 2023

2.3.1

May 5, 2023

2.2.1

May 5, 2023

2.2.0

May 5, 2023

2.1.2

Mar 20, 2023

2.1.1

Feb 14, 2023

2.1.0

Feb 14, 2023

2.0.13

Jan 11, 2023

2.0.12

Dec 21, 2022

2.0.11

Dec 15, 2022

2.0.10

Dec 15, 2022

2.0.9

Dec 9, 2022

2.0.8

Dec 8, 2022

2.0.7

Dec 6, 2022

2.0.6

Dec 3, 2022

2.0.5

Dec 3, 2022

2.0.4

Dec 3, 2022

2.0.2

Dec 2, 2022

2.0.1

Dec 2, 2022

2.0.0

Dec 2, 2022

This version

1.2.6.0

Jun 9, 2022

1.2.5

Mar 21, 2022

1.2.4

May 17, 2021

1.2.3

May 16, 2021

1.2.2

May 16, 2021

1.2.1

May 16, 2021

1.2.0

May 16, 2021

1.1.2

May 16, 2021

1.1.1

May 16, 2021

1.1.0

May 16, 2021

1.0.6

May 12, 2021

1.0.5

May 12, 2021

1.0.3

May 9, 2021

1.0.2

May 9, 2021

1.0.1

May 9, 2021

Download files

Source Distribution

transcribe-anything-1.2.6.0.tar.gz (8.6 kB)

Uploaded Jun 9, 2022 Source

Built Distribution

transcribe_anything-1.2.6.0-py2.py3-none-any.whl (8.1 kB)

Uploaded Jun 9, 2022 Python 2 Python 3

Hashes for transcribe-anything-1.2.6.0.tar.gz

Algorithm	Hash digest
SHA256	`f643885cc12a50502a787476b7e9cc86f117eb05dd6af8a669ca805597d66c59`
MD5	`5a7e6c0739564141e1552d50852a39ef`
BLAKE2b-256	`4111b04663c2406c4ec0fdff2590f5bb42b5ecacea218deb0a01485b1625e159`

Hashes for transcribe_anything-1.2.6.0-py2.py3-none-any.whl

Algorithm	Hash digest
SHA256	`61f48f81448cb7e709e256475182dc9af36db61627eacff735f18072fb902a44`
MD5	`ec85941115af09e005f53ac341a4c57f`
BLAKE2b-256	`1c089e68dec9cce52504e57bb58e997cfb66f5dda38dd24b4cc4da0d0fe40eb3`
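The digests above can be used to check a downloaded file. The following is a small sketch using only the standard library. sha256_of_file is a hypothetical helper name, and the constant is the SHA256 value listed above for the source distribution.

```python
import hashlib

# SHA256 digest listed above for transcribe-anything-1.2.6.0.tar.gz.
EXPECTED_SDIST_SHA256 = "f643885cc12a50502a787476b7e9cc86f117eb05dd6af8a669ca805597d66c59"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A download is intact if sha256_of_file("transcribe-anything-1.2.6.0.tar.gz") equals EXPECTED_SDIST_SHA256.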