SDK for dolphinvoice
DolphinVoice Python SDK
DolphinVoice SDK is used for speech recognition and synthesis. This SDK provides three main modules:
- Real-time Speech Recognition (ASR)
- Audio File Transcription (FileAsr)
- Text to Speech (TTS)
Documentation
Find more detailed documentation and guides about the DolphinVoice SDK in the following resources:
For technical support or any questions, please contact our developer support team: voice.contact@dolphin-ai.jp
Installation
Install the DolphinVoice Python SDK directly with pip:
pip install dolphinvoice
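To confirm that the install succeeded, the standard library can report the installed version. This is a minimal sketch: the helper name `installed_version` is hypothetical, and only the distribution name `dolphinvoice` comes from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("dolphinvoice")
    print(found if found else "dolphinvoice is not installed")
```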
Usage
Real-time Speech Recognition
from dolphinvoice.speech_rec.callbacks import SpeechTranscriberCallback
from dolphinvoice import speech_rec
import time

# Each method is invoked by the SDK when the matching event arrives.
class Callback(SpeechTranscriberCallback):
    def started(self, message):
        print('TranscriptionStarted: %s' % message)

    def result_changed(self, message):
        print('TranscriptionResultChanged: %s' % message)

    def sentence_begin(self, message):
        print('SentenceBegin: %s' % message)

    def sentence_end(self, message):
        print('SentenceEnd: %s' % message)

    def completed(self, message):
        print('TranscriptionCompleted: %s' % message)

    def task_failed(self, message):
        print('TaskFailed: %s' % message)

    def warning_info(self, message):
        print('Warning: %s' % message)

    def channel_closed(self):
        print('TranslationChannelClosed')

audio_path = 'demo.mp3'
client = speech_rec.SpeechClient(app_id='YOUR_APP_ID', app_secret='YOUR_APP_SECRET')
with client.create_transcriber(Callback()) as transcriber:
    transcriber.set_parameter({
        "lang_type": "en-US",
        "format": "mp3",
        "sample_rate": 16000,
    })
    transcriber.start()
    # Stream the file in 7680-byte chunks, pausing between sends.
    with open(audio_path, 'rb') as f:
        audio = f.read(7680)
        while audio:
            transcriber.send(audio)
            time.sleep(0.24)
            audio = f.read(7680)
    transcriber.stop()
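The loop above sends 7680 bytes every 0.24 seconds. One reading, which is an assumption and not something this page states, is that these numbers match real-time pacing for 16 kHz, 16-bit mono PCM (16000 × 2 × 0.24 = 7680). For compressed formats such as mp3, a byte count does not map to a fixed duration. The sketch below shows the arithmetic and a chunked reader; `pcm_chunk_bytes` and `iter_chunks` are hypothetical helper names, not SDK functions.

```python
import io
from typing import BinaryIO, Iterator


def pcm_chunk_bytes(sample_rate: int, seconds: float,
                    sample_width: int = 2, channels: int = 1) -> int:
    """Bytes of uncompressed PCM audio covering `seconds` of playback."""
    return round(sample_rate * sample_width * channels * seconds)


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive blocks of at most `chunk_size` bytes until EOF."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    size = pcm_chunk_bytes(16000, 0.24)
    print(size)
    demo = io.BytesIO(b"\x00" * 20000)  # stand-in for an opened audio file
    print([len(c) for c in iter_chunks(demo, size)])
```

With such a reader, the manual `f.read(7680)` calls could be replaced by a `for audio in iter_chunks(f, 7680):` loop that calls `transcriber.send(audio)` and then sleeps.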
Audio File Transcription
from dolphinvoice import speech_rec

client = speech_rec.SpeechClient(app_id='YOUR_APP_ID', app_secret='YOUR_APP_SECRET')
asrfile = client.create_asrfile()
audio = 'demo.mp3'
data = {
    "lang_type": "en-US",
    "format": "mp3",
    "sample_rate": 16000
}
# Upload the file and print the transcription result.
result = asrfile.transcribe_file(audio, data)
print(result)
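When transcribing many files, the parameter dictionary can be built from the file name. The sketch below is an illustration only: `build_params` is a hypothetical helper, and it assumes the `"format"` value equals the file extension, which holds for `demo.mp3` in the example above but is not confirmed for other formats.

```python
from pathlib import Path
from typing import Any, Dict


def build_params(audio_path: str, lang_type: str = "en-US",
                 sample_rate: int = 16000) -> Dict[str, Any]:
    """Build a parameter dict, taking "format" from the file extension."""
    # Assumption: the service's format name matches the lowercase extension.
    suffix = Path(audio_path).suffix.lstrip(".").lower()
    if not suffix:
        raise ValueError(f"cannot infer audio format from {audio_path!r}")
    return {"lang_type": lang_type, "format": suffix, "sample_rate": sample_rate}


if __name__ == "__main__":
    print(build_params("demo.mp3"))
```

The returned dictionary has the same keys as `data` in the example above and could be passed as the second argument to `transcribe_file`.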
Text to Speech
from dolphinvoice.speech_syn.callbacks import SpeechSynthesizerCallback
from dolphinvoice import speech_syn

class MyCallback(SpeechSynthesizerCallback):
    def __init__(self, name):
        self._name = name
        # Synthesized audio is written to this file as it arrives.
        self._fout = open(name, 'wb')

    def binary_data_received(self, raw):
        self._fout.write(raw)

    def on_message(self, message):
        print('Received : %s' % message)

    def started(self, message):
        print('MyCallback.OnSynthesizerStarted: %s' % message)

    def get_Timestamp(self, message):
        print('MyCallback.OnSynthesizerGetTimestamp: %s' % message)

    def get_Duration(self, message):
        print('MyCallback.OnSynthesizerGetDuration: %s' % message)

    def completed(self, message):
        print('MyCallback.OnSynthesizerCompleted: %s' % message)
        self._fout.close()

    def channel_closed(self):
        print('MyCallback.OnSynthesizerChannelClosed')

audio_name = 'syAudio.mp3'
client = speech_syn.SpeechClient(app_id='YOUR_APP_ID', app_secret='YOUR_APP_SECRET')
callback = MyCallback(audio_name)
with client.create_synthesizer(callback) as synthesizer:
    synthesizer.set_parameter({
        "text": "The weather is nice, let's go for a walk.",
        "lang_type": "en-US",
        "format": "mp3"
    })
    synthesizer.start()
    synthesizer.wait_completed()
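In the example above the output file is opened in `__init__` and closed only in `completed`, so it would stay open if synthesis ends without that event. One possible variation is to route writes through a small sink whose `close` can be called more than once. This is a sketch under assumptions: `AudioSink` is a hypothetical class, it deliberately does not subclass `SpeechSynthesizerCallback` so that it runs without the SDK, and whether the SDK calls `channel_closed` after a failure is not stated on this page.

```python
import os
import tempfile


class AudioSink:
    """Collects binary audio chunks into a file and closes it exactly once."""

    def __init__(self, path: str) -> None:
        self._fout = open(path, "wb")
        self.bytes_written = 0

    def write(self, raw: bytes) -> None:
        # Would be called from binary_data_received(raw) in a real callback.
        self._fout.write(raw)
        self.bytes_written += len(raw)

    def close(self) -> None:
        # Idempotent, so completed() and channel_closed() may both call it.
        if not self._fout.closed:
            self._fout.close()

    def __enter__(self) -> "AudioSink":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "syAudio.mp3")
    with AudioSink(path) as sink:
        sink.write(b"abc")
        sink.write(b"de")
    print(sink.bytes_written, os.path.getsize(path))
```

A callback could hold one `AudioSink`, forward `binary_data_received` to `write`, and call `close` from every terminal handler.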
API Reference
Real-time Speech Recognition
The real-time speech recognition module is for processing real-time audio streams.
Methods
- create_transcriber(callback: SpeechTranscriberCallback) - Creates a transcriber and registers event handlers for recognition events
- set_parameter(params: Json) - Specifies parameters
  - For the complete API documentation, refer to DolphinVoice API Documentation
- start() - Starts a new recognition session
- send(stream: Bytes) - Sends an audio stream to the recognition service
- stop() - Stops the current recognition session and releases resources
Events
- TranscriptionStarted - Triggered when the recognition session starts
- SentenceBegin - Triggered when a new sentence is detected
- TranscriptionResultChanged - Triggered when intermediate results are updated
- SentenceEnd - Triggered when a sentence is completed
- TranscriptionCompleted - Triggered when the entire recognition session is completed
- Warning - Triggered when a non-fatal warning occurs
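Judging by the strings printed in the usage example, each event corresponds to one callback method (`started`, `sentence_begin`, `result_changed`, `sentence_end`, `completed`, `warning_info`), and the example also defines `task_failed` and `channel_closed`. The sketch below records events in a list instead of printing them, which can help when testing handler logic. `RecordingCallback` and `final_sentences` are hypothetical names; the class is a plain object so that it runs without the SDK, and in real use it would presumably subclass `SpeechTranscriberCallback`.

```python
from typing import Any, List, Tuple


class RecordingCallback:
    """Records (event_name, message) pairs instead of printing them."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Any]] = []

    def started(self, message):
        self.events.append(("TranscriptionStarted", message))

    def sentence_begin(self, message):
        self.events.append(("SentenceBegin", message))

    def result_changed(self, message):
        self.events.append(("TranscriptionResultChanged", message))

    def sentence_end(self, message):
        self.events.append(("SentenceEnd", message))

    def completed(self, message):
        self.events.append(("TranscriptionCompleted", message))

    def task_failed(self, message):
        self.events.append(("TaskFailed", message))

    def warning_info(self, message):
        self.events.append(("Warning", message))

    def final_sentences(self) -> List[Any]:
        """Messages from SentenceEnd events, in arrival order."""
        return [msg for name, msg in self.events if name == "SentenceEnd"]


if __name__ == "__main__":
    cb = RecordingCallback()
    cb.started("session-1")
    cb.result_changed("hel")
    cb.sentence_end("hello")
    print(cb.final_sentences())
```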
Audio File Transcription
The audio file transcription module is for processing pre-recorded audio files.
Methods
- transcribe_file(audio: String, params: Json) - Uploads and transcribes the audio file
  - For the complete API documentation, refer to DolphinVoice API Documentation
Text to Speech
The text-to-speech synthesis module is used to convert text into natural speech.
Methods
- create_synthesizer(callback: SpeechSynthesizerCallback) - Creates a synthesizer and registers event handlers for synthesis events
- set_parameter(params: Json) - Specifies parameters
  - For the complete API documentation, refer to DolphinVoice API Documentation
- start() - Starts a new synthesis session
Events
- OnSynthesizerStarted - Triggered when the synthesis process starts
- OnSynthesizerGetDuration - Provides the total duration of the synthesized audio
- OnSynthesizerGetTimestamp - Provides timestamp information for the synthesized text
- OnSynthesizerCompleted - Triggered when the synthesis process is completed
Keywords
DolphinVoice DolphinAI ASR TTS Text-to-Speech Speech-to-Text Speech-Recognition Speech-Synthesis
License
MIT
File details
Details for the file dolphinvoice-1.0.0.tar.gz.
File metadata
- Download URL: dolphinvoice-1.0.0.tar.gz
- Upload date:
- Size: 25.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6024d7b28aa8dd73cdc6078c19439f7303841cfb96919e2a96d763e56c74b867 |
| MD5 | 9b547ef2d17f67d1ddeee79e1436f761 |
| BLAKE2b-256 | 407373bc3ed5ab83adf5815a17e0f8a290c825eb31dea39d26b151910e4c2e0e |
File details
Details for the file dolphinvoice-1.0.0-py3-none-any.whl.
File metadata
- Download URL: dolphinvoice-1.0.0-py3-none-any.whl
- Upload date:
- Size: 38.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7876eab32887e0538ddafd838f87405cee71d24767582bfb7d67742e5b50da24 |
| MD5 | e2d75015855c585595a599815b8ffcf0 |
| BLAKE2b-256 | 42cca7c81d6c4beaef179d721bf558fc5eca4d2e952f33aa7fa9c24df3418a1c |