TMH Speech package
TMH Speech is a library that gives access to open-source models for transcription and related speech tasks.
Getting started
To get started, install tmh together with the development version of pyannote-audio, which is needed because tmh uses newer packages:
pip install tmh
pip install https://github.com/pyannote/pyannote-audio/archive/develop.zip
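After installing, it can be useful to confirm that both packages are visible to the interpreter before running the examples. The following is a minimal sketch, assuming the import names are `tmh` and `pyannote.audio`; the helper names `is_installed` and `check_dependencies` are illustrative and not part of tmh.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the named module can be located by the import system."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # For dotted names, find_spec imports the parent package first and
        # raises ModuleNotFoundError (an ImportError) when it is missing.
        return False


def check_dependencies(modules=("tmh", "pyannote.audio")) -> dict:
    """Map each module name to whether it was found."""
    return {name: is_installed(name) for name in modules}


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'missing'}")
```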
Example usage
Transcription
from tmh.transcribe import transcribe_from_audio_path
file_path = "./sv.wav"
transcription = "Nu prövar vi att spela in ljud på svenska sex laxar i en laxask de finns en stor banan"
print("creating transcription")
asr_transcription = transcribe_from_audio_path(file_path)
print("output")
print(asr_transcription)
print("the reference transcription is", transcription)
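The example above prints the model output next to a hand-written reference sentence. One common way to turn that comparison into a number is the word error rate (WER). The sketch below is not part of tmh; it assumes both the reference and the model output are available as plain strings, and the function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one previous row.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

Calling it with the reference sentence and the transcription output (if that output is a string) gives the number of word edits divided by the number of reference words, so 0.0 means an exact match after lowercasing.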
Transcribe with VAD
from tmh.transcribe_with_vad import transcribe_from_audio_path_split_on_speech
file_path = "./sv.wav"
print("creating transcription")
asr_transcription_with_vad = transcribe_from_audio_path_split_on_speech(file_path)
print("transcription")
print(asr_transcription_with_vad)
Overlap detection
from tmh.overlap import overlap_detection
file_path = "./sv.wav"
overlap = overlap_detection(file_path)
print(overlap)
Language classification
from tmh.transcribe import classify_language
file_path = "./sv.wav"
print("classifying language")
language = classify_language(file_path)
print("the language is", language)
Classify emotion
from tmh.transcribe import classify_emotion
file_path = "./sv.wav"
print("classifying emotion")
emotion = classify_emotion(file_path)
print("the emotion is", emotion)
Speaker embeddings
Model: https://huggingface.co/speechbrain/spkrec-xvect-voxceleb
Extract speaker embedding
from tmh.transcribe import extract_speaker_embedding
file_path = "./sv.wav"
print("extracting speaker embedding")
embeddings = extract_speaker_embedding(file_path)
print("the speaker embedding is", embeddings)
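Speaker embeddings are typically compared with cosine similarity, where values close to 1 suggest the same speaker. The sketch below is not part of tmh. It assumes each embedding has first been converted to a flat sequence of floats (how to do that depends on the type that `extract_speaker_embedding` returns, which is not documented here), and the function name `cosine_similarity` is illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)
```

Any decision threshold on the score would need to be tuned on your own recordings.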
Voice activity detection
from tmh.vad import extract_silences
file_path = "./sv.wav"
print("extracting silences")
silences = extract_silences(file_path)
print("the silences are", silences)
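Silence intervals are often inverted to obtain the speech regions of a recording. The sketch below is not part of tmh and rests on a loud assumption: that the silences can be expressed as a list of `(start, end)` pairs in seconds. The actual return format of `extract_silences` is not documented here, so the output may need converting first. The name `speech_segments` and the `duration` parameter are illustrative.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def speech_segments(silences: List[Interval], duration: float) -> List[Interval]:
    """Return the gaps between silence intervals within [0, duration]."""
    segments: List[Interval] = []
    cursor = 0.0  # end of the last silence seen so far
    for start, end in sorted(silences):
        # Clamp each silence to the bounds of the recording.
        start = max(0.0, min(start, duration))
        end = max(0.0, min(end, duration))
        if start > cursor:
            segments.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        segments.append((cursor, duration))
    return segments
```

Overlapping or unsorted silence intervals are handled by sorting and by tracking the furthest silence end seen so far.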
Speech Generation
Tacotron 2
Make sure you install these packages before running Tacotron 2:
pip install numpy scipy librosa unidecode inflect
apt-get update
apt-get install -y libsndfile1
Text generation
You can use the text generation API to generate text with any pretrained model from Hugging Face.
Example Swedish
from tmh.text import generate_text
output = generate_text(model='birgermoell/swedish-gpt', prompt="AI har möjligheten att", min_length=150)
print(output)
Example GPT-Neo
from tmh.text import generate_text
output = generate_text(model='EleutherAI/gpt-neo-2.7B', prompt="EleutherAI has", min_length=150)
print(output)
Codex
Generate code and save it to a file. To use it:
from tmh.code import generate_from_prompt, write_to_file
response = generate_from_prompt('''
A pytorch neural network model for MNIST
'''
)
write_to_file(response, "generated.py")
Build instructions
Change the version number, then build and upload the package:
python3 -m build
twine upload --skip-existing dist/*
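Because `twine upload --skip-existing` skips files that already exist on the index, forgetting to change the version number means nothing new gets published. The version bump can be scripted. The sketch below only operates on a `MAJOR.MINOR.PATCH` string; where the version is stored in this project (for example in `setup.py` or `setup.cfg`) is not stated here, so reading and writing that file is left out. The function name `bump_patch` is illustrative.

```python
import re


def bump_patch(version: str) -> str:
    """Increment the last component of a MAJOR.MINOR.PATCH version string."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"{major}.{minor}.{patch + 1}"
```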
GitHub
Download files
Source distribution: tmh-0.0.41.tar.gz (8.3 kB)
Built distribution: tmh-0.0.41-py3-none-any.whl (10.3 kB)