An open-source offline speech-to-text package for the Bangla language. It uses models fine-tuned from the Whisper speech-to-text model for optimal performance.

Bangla Speech to Text

BanglaSpeech2Text: An open-source offline speech-to-text package for the Bangla language. It uses models fine-tuned from the Whisper speech-to-text model for optimal performance. Transcribe speech to text, convert voice to text, and perform speech recognition in Python with ease, even without an internet connection.

Installation

pip install banglaspeech2text

Models

Model      Size          Best (WER)
'tiny'     100-200 MB    N/A
'base'     200-300 MB    46
'small'    2-3 GB        18

NOTE: Bigger models have better accuracy but slower inference speed. A lower WER (word error rate) is better. You can view the models here.
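To make the WER column concrete, here is a minimal sketch of how a word error rate can be computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` is illustrative and not part of banglaspeech2text. It returns a fraction, so if the table values are percentages (an assumption), 18 would correspond to 0.18.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = word-level edit distance / number of reference words.

    Illustrative helper, not part of banglaspeech2text.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# One wrong word out of three reference words
print(word_error_rate("আমি ভাত খাই", "আমি ভাত খাব"))
```

Real evaluations usually normalize punctuation and spelling variants before scoring; this sketch only splits on whitespace.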

Pre-requisites

  • Python 3.6+
  • Git
  • Git LFS
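The prerequisites above can be checked from Python before installing anything. The sketch below uses only the standard library; the function name `missing_prerequisites` is illustrative and not part of banglaspeech2text, and it assumes the tools are exposed on PATH as `git` and `git-lfs`.

```python
import shutil
import sys


def missing_prerequisites(min_python=(3, 6), tools=("git", "git-lfs")):
    """Return the names of prerequisites that could not be found.

    Illustrative helper, not part of banglaspeech2text.
    """
    missing = []
    if sys.version_info < min_python:
        missing.append("Python {}.{}+".format(*min_python))
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(tool) is None:
            missing.append(tool)
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    print("Missing:", ", ".join(problems) if problems else "nothing")
```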

Download Git

Windows

  • Download Git from here
  • Download Git LFS from here

Note: Make sure the Git LFS option is checked during the Git installation. If it is not, you can install Git LFS from here.

Linux

  • Git
  • Git LFS

Ubuntu 16.04

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs

Ubuntu 18.04 and above

sudo apt-get install git-lfs

Mac

brew install git-lfs

Download Git with banglaspeech2text

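import platform

from banglaspeech2text.utils.install_packages import install_git_windows, install_git_linux

# Call only the installer that matches the current operating system
if platform.system() == "Windows":
    install_git_windows()
elif platform.system() == "Linux":
    install_git_linux()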

Usage

Download a model

from banglaspeech2text import Model, available_models

# Download a model
models = available_models()
print(models) # see the available models from different people and in different sizes

model = models[0] # select a model
model.download() # download the model

Use with file

from banglaspeech2text import Model, available_models

# Load a model
models = available_models()
model = models[0] # select a model
model = Model(model) # load the model
model.load()

# Use with file
file_name = 'test.wav'
output = model.recognize(file_name)

print(output) # output will be a dict containing text
print(output['text'])
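When there are several recordings, the same call can be wrapped in a small loop. The sketch below is illustrative and not part of banglaspeech2text: `transcribe_files` is a hypothetical helper that accepts any callable behaving like `model.recognize`, that is, taking a file path and returning a dict with a 'text' key as in the example above. Because the exceptions the package may raise are not documented here, it catches any exception and records None for that file.

```python
def transcribe_files(recognize, paths):
    """Map each audio path to its transcribed text, or None on failure.

    `recognize` is assumed to behave like model.recognize: it takes a file
    path and returns a dict containing a 'text' key.
    """
    results = {}
    for path in paths:
        try:
            output = recognize(path)
            results[path] = output["text"]
        except Exception as error:  # keep going if one file fails
            print("Could not transcribe {}: {}".format(path, error))
            results[path] = None
    return results


# Usage with a loaded model (file names are placeholders):
# texts = transcribe_files(model.recognize, ["test.wav", "test2.wav"])
```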

Use with SpeechRecognition

import speech_recognition as sr
from banglaspeech2text import Model, available_models

# Load a model
models = available_models()
model = models[0] # select a model
model = Model(model) # load the model
model.load()


r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)
    output = model.recognize(audio)

print(output) # output will be a dict containing text
print(output['text'])

Use GPU

import speech_recognition as sr
from banglaspeech2text import Model, available_models

# Load a model
models = available_models()
model = models[0] # select a model
model = Model(model,device="gpu") # load the model
model.load()


r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)
    output = model.recognize(audio)

print(output) # output will be a dict containing text
print(output['text'])

NOTE: This package uses torch as its backend, so you can use any device supported by torch. For more information, see here. To use a GPU, you first need to set up torch with GPU support from here.
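Before requesting the GPU, it can help to confirm that torch actually sees one. The sketch below is illustrative: `pick_device` is a hypothetical helper, `torch.cuda.is_available()` is the standard torch check, "gpu" is the device string used in the example above, and "cpu" as the fallback string is an assumption about what `Model` accepts.

```python
def pick_device():
    """Return "gpu" when torch reports a usable CUDA device, else "cpu".

    Illustrative helper; the "cpu" string is an assumption about the
    values accepted by Model(device=...).
    """
    try:
        import torch
    except ImportError:
        # torch is not installed, so no GPU backend is available
        return "cpu"
    return "gpu" if torch.cuda.is_available() else "cpu"


# Usage (assumes a model has been selected as in the examples above):
# model = Model(models[0], device=pick_device())
print(pick_device())
```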

Some Methods

from banglaspeech2text import Model, available_models

models = available_models()
print(models[0]) # get first model
print(models['base']) # get base models
print(models['whisper_base_bn_sifat']) # get model by name

# set download path
model = Model(models[0], download_path=r"F:\Code\Python\BanglaSpeech2Text\models") # default is home directory
model.load()

# directly load a model
model = Model('base')
model.load()
