A Python package for using the speech-to-text API powered by AILabs.tw

Project description

AILabs ASR Python software development kit

Development Environment

  • Python 3.9
# install portaudio first if you develop on macOS
brew install portaudio

pip install --global-option='build_ext' --global-option='-I/usr/local/include' --global-option='-L/usr/local/lib' -r requirements_dev.txt

# if you encounter issues while installing PyAudio,
# please check the PyAudio site: https://people.csail.mit.edu/hubert/pyaudio/

Installation

pip install ailabs-asr

Samples

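# import path is an assumption; check the installed package if it differs
from ailabs_asr.streaming import StreamingClient

# callbacks are assumed to receive one message in the format shown under Message Format
def on_processing_sentence(message):
  print(f'processing: {message["asr_sentence"]}')

def on_final_sentence(message):
  print(f'final: {message["asr_sentence"]}')

# init the streaming client
asr_client = StreamingClient('api-key-applied-from-devconsole')

# start streaming with a wav file
asr_client.start_streaming_wav(
  pipeline='asr-zh-en-std',
  file='voice.wav',
  verbose=False, # enable verbose to show detailed recognition results
  on_processing_sentence=on_processing_sentence,
  on_final_sentence=on_final_sentence)

# omit file to stream from the computer's microphone
asr_client.start_streaming_wav(
  pipeline='asr-zh-en-std',
  on_processing_sentence=on_processing_sentence,
  on_final_sentence=on_final_sentence)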

:bulb: The start_streaming_wav() method accepts callback functions that handle the recognition results; see the result format under Message Format.

:bulb: Look up the available pipelines in the next section.

:bulb: See more samples in the sample repository.

Supported Languages (pipeline)

| pipeline | Info | language |
| --- | --- | --- |
| asr-zh-en-std | Use it when speakers speak Chinese more than English | Mandarin and English |
| asr-zh-tw-std | Use it when speakers speak Chinese and Taiwanese | Mandarin and Taiwanese |
| asr-en-std | English | English |
| asr-jp-std | Japanese | Japanese |
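
The pipeline list above can be kept as a small lookup so that an unsupported language combination is caught before a stream is started. This is only a sketch: `PIPELINES` and `pipelines_for` are hypothetical helper names, not part of the ailabs-asr package, and the language sets are copied from the list above.

```python
# Hypothetical helper, not part of ailabs-asr: pipeline names and languages
# are copied from the pipeline list in this README.
PIPELINES = {
    "asr-zh-en-std": {"Mandarin", "English"},
    "asr-zh-tw-std": {"Mandarin", "Taiwanese"},
    "asr-en-std": {"English"},
    "asr-jp-std": {"Japanese"},
}


def pipelines_for(*languages: str) -> list[str]:
    """Return the pipelines that cover every requested language."""
    wanted = set(languages)
    return [name for name, langs in PIPELINES.items() if wanted <= langs]


if __name__ == "__main__":
    print(pipelines_for("Mandarin", "Taiwanese"))
```

An empty result means no listed pipeline covers the requested languages together.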

Message Format

There are two kinds of recognition results:

The Processing Sentence (Segment)

{
  "asr_sentence": "範例句子"
}

The Final Sentence (Complete Sentence)

{
  "asr_final": true,
  "asr_begin_time": 9.314,
  "asr_end_time": 11.314,
  "asr_sentence": "完整的範例句子",
  "asr_confidence": 0.5263263653207881,
  "asr_word_time_stamp": [
    {
      "word": "完整的",
      "begin_time": 9.74021875,
      "end_time": 10.100875
    },
    {
      "word": "範例句子",
      "begin_time": 10.100875,
      "end_time": 10.1664375
    }
  ],
  "text_segmented": "完整的 範例句子"
}
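
A callback usually needs to tell the two message kinds apart. The sketch below assumes each message arrives as a Python dict with exactly the fields shown above, and that a processing sentence simply lacks the `asr_final` key. `describe_result` and `word_durations` are hypothetical helper names, not part of the ailabs-asr package.

```python
def describe_result(message: dict) -> str:
    """Summarize one recognition message as a single line of text."""
    sentence = message.get("asr_sentence", "")
    # Assumption: only final sentences carry "asr_final": true.
    if not message.get("asr_final", False):
        return f"[partial] {sentence}"
    begin = message["asr_begin_time"]
    end = message["asr_end_time"]
    confidence = message["asr_confidence"]
    return f"[final {begin:.2f}s-{end:.2f}s conf={confidence:.2f}] {sentence}"


def word_durations(message: dict) -> list[tuple[str, float]]:
    """Return (word, duration in seconds) pairs from a final sentence."""
    return [
        (w["word"], w["end_time"] - w["begin_time"])
        for w in message.get("asr_word_time_stamp", [])
    ]


if __name__ == "__main__":
    print(describe_result({"asr_sentence": "範例句子"}))
```

Either helper could be called from inside on_processing_sentence or on_final_sentence, provided the callbacks receive the message in this dict form.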

Limitation

Audio Data

:warning: Send audio data as binary frames with the following spec:

  • Audio data format
    • 16kHz, mono
    • 16 bits per sample
    • PCM
  • Sample rate: 16 kHz (16000 samples per second)
  • Data rate: 16000 samples/sec x 2 bytes per sample (16 bits / 8) = 32000 bytes/sec ~= 32 KB/sec
  • Chunk size: 2000 bytes, i.e. 1/16 sec of audio
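
The arithmetic in the spec above can be checked with a short sketch that uses only the standard library. It validates a WAV file against the required format and splits raw PCM bytes into 2000-byte chunks. The helper names (`check_wav_format`, `iter_chunks`) are hypothetical and not part of the ailabs-asr package; how the chunks are actually sent is left to the client.

```python
import wave

SAMPLE_RATE = 16000   # samples per second
SAMPLE_WIDTH = 2      # bytes per sample (16 bits)
CHANNELS = 1          # mono
CHUNK_BYTES = 2000    # chunk size from the spec above


def bytes_per_second() -> int:
    """Data rate of the required PCM format."""
    return SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS


def check_wav_format(source) -> None:
    """Raise ValueError if the WAV (path or file object) does not match the spec."""
    with wave.open(source, "rb") as wav:
        actual = (wav.getframerate(), wav.getsampwidth(), wav.getnchannels())
    expected = (SAMPLE_RATE, SAMPLE_WIDTH, CHANNELS)
    if actual != expected:
        raise ValueError(f"expected (rate, width, channels) {expected}, got {actual}")


def iter_chunks(pcm: bytes, chunk_bytes: int = CHUNK_BYTES):
    """Yield consecutive chunks of raw PCM; the last one may be shorter."""
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset:offset + chunk_bytes]


if __name__ == "__main__":
    print(bytes_per_second(), CHUNK_BYTES / bytes_per_second())
```

Each full chunk holds 1000 samples, so at 32000 bytes per second it covers 1/16 of a second, matching the spec.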
