Parrots, Automatic Speech Recognition(**ASR**), Text-To-Speech(**TTS**) toolkit
Parrots: ASR and TTS toolkit
Introduction
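Parrots is an Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) toolkit that supports Chinese, English, Japanese, and other languages.
parrots exposes speech recognition and speech synthesis models through one-line calls, works out of the box, and supports Chinese and English.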
Features
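- ASR: a Chinese speech recognition (ASR) model based on distilwhisper, supporting Chinese, English, and other languages
- TTS: a speech synthesis (TTS) model trained with GPT-SoVITS, supporting Chinese, English, Japanese, and other languages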
Install
pip install torch # or conda install pytorch
pip install -r requirements.txt
pip install parrots
or
pip install torch # or conda install pytorch
git clone https://github.com/shibing624/parrots.git
cd parrots
python setup.py install
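After either install route, it can help to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution name `parrots` is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("parrots")
    print("parrots version:", version if version else "not installed")
```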
Demo
- Official Demo: https://www.mulanai.com/product/asr/
- HuggingFace Demo: https://huggingface.co/spaces/shibing624/parrots
Run examples/tts_gradio_demo.py to see the demo:
python examples/tts_gradio_demo.py
Usage
ASR(Speech Recognition)
example: examples/demo_asr.py
import os
import sys

sys.path.append('..')
from parrots import SpeechRecognition

pwd_path = os.path.abspath(os.path.dirname(__file__))

if __name__ == '__main__':
    m = SpeechRecognition()
    # Transcribe the sample WAV file located next to this script
    r = m.recognize_speech_from_file(os.path.join(pwd_path, 'tushuguan.wav'))
    print('[Info] Speech recognition result:', r)
output:
{'text': '北京图书馆'}
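To transcribe several files, the call above can be wrapped in a small loop. This is a sketch under stated assumptions: the helper `transcribe_files` is hypothetical (not part of parrots), and it assumes every result is a dict with a `text` key, as in the output shown above.

```python
from typing import Dict, Iterable


def transcribe_files(recognizer, audio_paths: Iterable[str]) -> Dict[str, str]:
    """Map each audio path to its recognized text.

    `recognizer` is any object exposing recognize_speech_from_file(path),
    for example a parrots SpeechRecognition instance.
    """
    results = {}
    for path in audio_paths:
        result = recognizer.recognize_speech_from_file(path)
        # Assumption: the result is a dict shaped like {'text': '...'}.
        results[path] = result.get("text", "")
    return results
```

With parrots installed, passing `SpeechRecognition()` as `recognizer` and `['tushuguan.wav']` as `audio_paths` should return a dict keyed by that path.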
TTS(Speech Synthesis)
example: examples/demo_tts.py
import sys

sys.path.append('..')
from parrots import TextToSpeech

m = TextToSpeech(
    speaker_model_path="shibing624/parrots-gpt-sovits-speaker-maimai",
    speaker_name="MaiMai",
    device="cpu",
    half=False
)
m.predict(
    text="你好,欢迎来北京。welcome to the city.",
    text_language="auto",
    output_path="output_audio.wav"
)
output:
Save audio to output_audio.wav
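When several sentences need to be synthesized, one way is to write each to its own file. The sketch below is illustrative only: `synthesize_batch` and the `utterance_NNN.wav` naming scheme are hypothetical, and the only parrots call it relies on is `predict` with the same keyword arguments used above.

```python
import os
from typing import Iterable, List


def synthesize_batch(tts, texts: Iterable[str], output_dir: str,
                     text_language: str = "auto") -> List[str]:
    """Synthesize each text to its own WAV file and return the output paths.

    `tts` is any object exposing predict(text=..., text_language=..., output_path=...),
    for example a parrots TextToSpeech instance.
    """
    os.makedirs(output_dir, exist_ok=True)
    paths = []
    for index, text in enumerate(texts):
        # Hypothetical naming scheme: utterance_000.wav, utterance_001.wav, ...
        output_path = os.path.join(output_dir, f"utterance_{index:03d}.wav")
        tts.predict(text=text, text_language=text_language, output_path=output_path)
        paths.append(output_path)
    return paths
```

Passing the model object in as an argument means it is loaded once and reused for every sentence.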
Command Line Mode(CLI)
ASR and TTS tasks can be run from the command line; the code is in cli.py
> parrots -h
NAME
    parrots
SYNOPSIS
    parrots COMMAND
COMMANDS
    COMMAND is one of the following:
    asr
        Entry point of asr, recognize speech from file
    tts
        Entry point of tts, generate speech audio from text
run:
pip install parrots -U
# asr example
parrots asr -h
parrots asr examples/tushuguan.wav
# tts example
parrots tts -h
parrots tts "你好,欢迎来北京。welcome to the city." output_audio.wav
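- asr and tts are subcommands: asr performs speech recognition and tts performs speech synthesis; the default model is a Chinese model
- For the usage of each subcommand, see parrots asr -h
- In the example above, examples/tushuguan.wav is the audio_file_path argument of the asr command, i.e. the input audio file (required)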
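The same subcommands can also be driven from a script. The following is a rough sketch that assumes the `parrots` executable is on `PATH` after installation; the helper names are hypothetical, and since the exact format of the CLI output is not described here, `run_command` simply returns the raw standard output.

```python
import subprocess
from typing import List


def build_asr_command(audio_file_path: str) -> List[str]:
    """Argument list equivalent to: parrots asr <audio_file_path>."""
    return ["parrots", "asr", audio_file_path]


def build_tts_command(text: str, output_path: str) -> List[str]:
    """Argument list equivalent to: parrots tts "<text>" <output_path>."""
    return ["parrots", "tts", text, output_path]


def run_command(command: List[str]) -> str:
    """Run the command and return its standard output; raise on a non-zero exit."""
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout
```

Passing the arguments as a list avoids shell quoting problems with text that contains spaces or punctuation, for example `run_command(build_tts_command("你好,欢迎来北京。welcome to the city.", "output_audio.wav"))`.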
Contact
- Issues (suggestions):
- Email me: xuming624@qq.com
- WeChat: add my WeChat ID xuming624 to join the Python-NLP discussion group, with the note: name-company-NLP
Citation
If you use parrots in your research, please cite it in the following format:
@misc{parrots,
title={parrots: ASR and TTS Tool},
author={Ming Xu},
year={2024},
howpublished={\url{https://github.com/shibing624/parrots}},
}
License
The project is licensed under The Apache License 2.0 and is free for commercial use. Please include a link to parrots and the license in your product description.
Contribute
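The project code is still rough. If you have improved the code, you are welcome to contribute it back to this project. Before submitting, please note the following two points:
- Add corresponding unit tests in tests
- Run all unit tests with python -m pytest and make sure they all pass
After that, you can submit a PR.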