Package for using the speech-to-text API powered by AILabs.tw

Project description

AILabs ASR Python software development kit

Development Environment

  • Python 3.9
# install portaudio first if you develop on macOS
brew install portaudio

pip install --global-option='build_ext' --global-option='-I/usr/local/include' --global-option='-L/usr/local/lib' -r requirements_dev.txt

# if you encounter issues while installing PyAudio,
# please check the PyAudio site: https://people.csail.mit.edu/hubert/pyaudio/

Installation

pip install ailabs-asr

Samples

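# assumed import path (not shown in the original sample); adjust if the package layout differs
from ailabs_asr.streaming import StreamingClient

# callbacks that receive each recognition result (see Message Format below)
def on_processing_sentence(message):
  print(f'processing: {message["asr_sentence"]}')

def on_final_sentence(message):
  print(f'final: {message["asr_sentence"]}')

# init the streaming client
asr_client = StreamingClient('api-key-applied-from-devconsole')

# start streaming with a wav file
asr_client.start_streaming_wav(
  pipeline='asr-zh-en-std',
  file='voice.wav',
  verbose=False, # set to True to show detailed recognition results
  on_processing_sentence=on_processing_sentence,
  on_final_sentence=on_final_sentence)

# omit the file argument to stream from the computer's microphone
asr_client.start_streaming_wav(
  pipeline='asr-zh-en-std',
  on_processing_sentence=on_processing_sentence,
  on_final_sentence=on_final_sentence)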

:bulb: The start_streaming_wav() method lets you provide callback functions to handle the recognition results; see the result format under Message Format.

:bulb: Look up the available pipelines in the next section.

:bulb: See more samples in the sample repository.
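
The callbacks do not have to be bare functions. As a minimal sketch, assuming each callback is called with one dict shaped like the messages under Message Format (that calling convention is an assumption, and `TranscriptCollector` is a hypothetical helper, not part of the package), bound methods can collect results into a transcript:

```python
class TranscriptCollector:
    """Hypothetical helper: tracks the latest partial text and all final sentences."""

    def __init__(self):
        self.partial = ""
        self.finals = []

    def on_processing_sentence(self, message):
        # Assumption: message is a dict with an "asr_sentence" key.
        self.partial = message.get("asr_sentence", "")

    def on_final_sentence(self, message):
        # A final sentence replaces whatever partial text was pending.
        self.finals.append(message.get("asr_sentence", ""))
        self.partial = ""

    def transcript(self):
        return "\n".join(self.finals)


collector = TranscriptCollector()

# Simulated messages, so the sketch runs without an API key or audio input.
collector.on_processing_sentence({"asr_sentence": "範例"})
collector.on_final_sentence({"asr_final": True, "asr_sentence": "完整的範例句子"})
print(collector.transcript())
```

With a real client, `collector.on_processing_sentence` and `collector.on_final_sentence` would be passed as the two callback arguments of start_streaming_wav().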

Supported Languages (pipeline)

| pipeline | Info | language |
| --- | --- | --- |
| asr-zh-en-std | Use it when speakers speak Chinese more than English | Mandarin and English |
| asr-zh-tw-std | Use it when speakers speak Chinese and Taiwanese | Mandarin and Taiwanese |
| asr-en-std | English | English |
| asr-jp-std | Japanese | Japanese |

Message Format

There are two kinds of recognition results:

The Processing Sentence (Segment)

{
  "asr_sentence": "範例句子"
}

The Final Sentence (Complete Sentence)

{
  "asr_final": true,
  "asr_begin_time": 9.314,
  "asr_end_time": 11.314,
  "asr_sentence": "完整的範例句子",
  "asr_confidence": 0.5263263653207881,
  "asr_word_time_stamp": [
    {
      "word": "完整的",
      "begin_time": 9.74021875,
      "end_time": 10.100875
    },
    {
      "word": "範例句子",
      "begin_time": 10.100875,
      "end_time": 10.1664375
    }
  ],
  "text_segmented": "完整的 範例句子"
}
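
To illustrate how the two message kinds can be told apart, here is a small sketch that works on the sample messages above. It assumes a message is available as a dict (parsed here from JSON text); `is_final` and `word_durations` are hypothetical helper names, not part of the package.

```python
import json


def is_final(message):
    """True for a final (complete) sentence, False for a processing segment."""
    return message.get("asr_final") is True


def word_durations(message):
    """Return (word, seconds) pairs from asr_word_time_stamp, if present."""
    return [
        (w["word"], round(w["end_time"] - w["begin_time"], 3))
        for w in message.get("asr_word_time_stamp", [])
    ]


processing = json.loads('{"asr_sentence": "範例句子"}')
final = json.loads("""
{
  "asr_final": true,
  "asr_begin_time": 9.314,
  "asr_end_time": 11.314,
  "asr_sentence": "完整的範例句子",
  "asr_confidence": 0.5263263653207881,
  "asr_word_time_stamp": [
    {"word": "完整的", "begin_time": 9.74021875, "end_time": 10.100875},
    {"word": "範例句子", "begin_time": 10.100875, "end_time": 10.1664375}
  ],
  "text_segmented": "完整的 範例句子"
}
""")

for message in (processing, final):
    kind = "final" if is_final(message) else "processing"
    print(kind, message["asr_sentence"], word_durations(message))
```

Only the final sentence carries timing, confidence, and word-level fields, so the helpers use `.get()` and fall back to empty values for processing segments.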

Limitation

Audio Data

:warning: Send audio data as binary frames with the following spec:

  • Audio data format
    • 16 kHz, mono
    • 16 bits per sample
    • PCM
  • Sample rate: 16,000 samples per second
  • Data rate: 16000 (samples/sec) x 16/8 (2 bytes per sample) = 32000 bytes/sec ~= 32 KB/sec
  • Chunk size: 2000 bytes each, i.e. 1/16 sec (62.5 ms) of audio
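
The arithmetic above can be checked with the standard library alone. The following sketch builds one second of silence in memory, verifies that it matches the spec, and slices it into 2000-byte chunks. `check_wav_format` and `iter_chunks` are hypothetical helpers for illustration; how the client itself chunks and sends audio is not shown here.

```python
import io
import wave

SAMPLE_RATE = 16000   # samples per second
SAMPLE_WIDTH = 2      # bytes per sample (16 bits)
CHANNELS = 1          # mono
CHUNK_BYTES = 2000    # bytes per binary frame

BYTES_PER_SECOND = SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS


def check_wav_format(wav):
    """Raise ValueError if an opened wave.Wave_read does not match the spec."""
    actual = (wav.getframerate(), wav.getnchannels(), wav.getsampwidth())
    expected = (SAMPLE_RATE, CHANNELS, SAMPLE_WIDTH)
    if actual != expected:
        raise ValueError(f"expected (rate, channels, width) {expected}, got {actual}")


def iter_chunks(pcm, chunk_bytes=CHUNK_BYTES):
    """Yield consecutive slices of raw PCM bytes; the last one may be shorter."""
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# Write one second of silence to an in-memory WAV file.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(CHANNELS)
    out.setsampwidth(SAMPLE_WIDTH)
    out.setframerate(SAMPLE_RATE)
    out.writeframes(b"\x00" * BYTES_PER_SECOND)

# Read it back, validate the format, and split the PCM payload into chunks.
buffer.seek(0)
with wave.open(buffer, "rb") as wav:
    check_wav_format(wav)
    pcm = wav.readframes(wav.getnframes())

chunks = list(iter_chunks(pcm))
print(BYTES_PER_SECOND, len(chunks), CHUNK_BYTES / BYTES_PER_SECOND)
```

Running a check like this on a file before streaming it is a cheap way to catch a wrong sample rate or a stereo recording early.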
