TMH Speech package
TMH Speech
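TMH Speech is a library that gives access to open-source models for transcription and related speech tasks: voice activity detection, overlap detection, language and emotion classification, speaker embeddings, and text generation.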
Read the docs
https://tmh-docs.readthedocs.io/en/latest/docs.html#getting-started
Getting started
To get started, install tmh and then pyannote. pyannote is installed from its development branch on GitHub because the library relies on newer package versions.
pip install tmh
pip install https://github.com/pyannote/pyannote-audio/archive/develop.zip
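After running the two install commands, it can be useful to confirm that both packages are importable before trying the examples. The helper below is a minimal sketch using only the standard library; `missing_packages` is a hypothetical name, not part of tmh, and the top-level import names `tmh` and `pyannote` are assumed from the install commands above.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level import names that cannot be found in this environment."""
    # find_spec returns None when a top-level module or package is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # "tmh" and "pyannote" are assumed import names for the two installs above.
    missing = missing_packages(["tmh", "pyannote"])
    if missing:
        print("not installed:", ", ".join(missing))
    else:
        print("all packages found")
```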
Example usage
Transcription
from tmh.transcribe import transcribe_from_audio_path
file_path = "./sv.wav"
transcription = "Nu prövar vi att spela in ljud på svenska sex laxar i en laxask de finns en stor banan"
print("creating transcription")
asr_transcription = transcribe_from_audio_path(file_path)
print("output")
print(asr_transcription)
print("the transcription is", transcription)
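The example above prints the ASR output next to a reference transcription. One common way to compare the two is the word error rate (WER). The function below is a sketch in plain Python, not part of tmh; it assumes the ASR output is, or can be converted to, a plain string, and `word_error_rate` is a hypothetical helper name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# One missing word out of five reference words.
print(word_error_rate("sex laxar i en laxask", "sex laxar i laxask"))
```

A WER of 0.0 means the two word sequences match exactly. Values above 1.0 are possible when the hypothesis contains many extra words.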
Transcribe with VAD
from tmh.transcribe_with_vad import transcribe_from_audio_path_split_on_speech
file_path = "./sv.wav"
print("creating transcription")
asr_transcription_with_vad = transcribe_from_audio_path_split_on_speech(file_path)
print("transcription")
print(asr_transcription_with_vad)
Overlap detection
from tmh.overlap import overlap_detection
file_path = "./sv.wav"
overlap = overlap_detection(file_path)
print(overlap)
Language classification
from tmh.transcribe import classify_language
file_path = "./sv.wav"
print("classifying language")
language = classify_language(file_path)
print("the language is", language)
Classify emotion
from tmh.transcribe import classify_emotion
file_path = "./sv.wav"
print("classifying emotion")
emotion = classify_emotion(file_path)
print("the emotion is", emotion)
Speaker embeddings
https://huggingface.co/speechbrain/spkrec-xvect-voxceleb
Extract speaker embedding
from tmh.transcribe import extract_speaker_embedding
file_path = "./sv.wav"
print("extracting speaker embedding")
embeddings = extract_speaker_embedding(file_path)
print("the speaker embedding is", embeddings)
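Speaker embeddings are typically compared with cosine similarity: two recordings of the same speaker should score closer to 1 than recordings of different speakers. The sketch below is not part of tmh. It assumes each embedding has been flattened to a one-dimensional sequence of floats; the actual return type and shape of `extract_speaker_embedding` are not documented here, so a conversion step may be needed first.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumption: flat lists of floats).
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```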
Voice activity detection
from tmh.vad import extract_silences
file_path = "./sv.wav"
print("extracting silences")
silences = extract_silences(file_path)
print("the silences are", silences)
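Once the silences are known, the speech regions are the gaps between them. The sketch below shows that inversion under a loud assumption: silences are represented as `(start, end)` pairs in seconds. The actual return format of `extract_silences` is not documented here, so adapt the input accordingly; `speech_segments` is a hypothetical helper, not part of tmh.

```python
from typing import List, Tuple

Segment = Tuple[float, float]


def speech_segments(silences: List[Segment], total_duration: float) -> List[Segment]:
    """Return the non-silent intervals of a recording of the given duration."""
    segments: List[Segment] = []
    cursor = 0.0  # end of the latest silence seen so far
    for start, end in sorted(silences):
        if start > cursor:
            # Audio between the previous silence and this one is speech.
            segments.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < total_duration:
        segments.append((cursor, total_duration))
    return segments


# Assumed format: silences as (start, end) in seconds for a 4-second file.
print(speech_segments([(0.0, 0.5), (2.0, 2.5)], 4.0))
```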
Speech Generation
Tacotron 2
Install these packages before running Tacotron 2:
pip install numpy scipy librosa unidecode inflect
apt-get update
apt-get install -y libsndfile1
Text generation
You can use the text generation API to generate text with any pretrained model from Hugging Face.
Example Swedish
from tmh.text import generate_text
output = generate_text(model='birgermoell/swedish-gpt', prompt="AI har möjligheten att", min_length=150)
print(output)
Example GPT-Neo
from tmh.text import generate_text
output = generate_text(model='EleutherAI/gpt-neo-2.7B', prompt="EleutherAI has", min_length=150)
print(output)
Codex
Generate code from a prompt and save it to a file:
from tmh.code import generate_from_prompt, write_to_file
response = generate_from_prompt('''
A pytorch neural network model for MNIST
'''
)
write_to_file(response, "generated.py")
Build instructions
Change the version number, then build and upload the package:
python3 -m build
twine upload --skip-existing dist/*
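Each upload needs a version number that has not been published before, which is why the version is changed first. The helper below is a small sketch of incrementing the patch component of a `major.minor.patch` string. It is not part of tmh, `bump_patch` is a hypothetical name, and where the project stores its version string is not stated here, so the function only handles the string itself.

```python
def bump_patch(version: str) -> str:
    """Increment the last component of a 'major.minor.patch' version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


print(bump_patch("0.0.43"))
```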