Package to utilize the speech to text API powered by AILabs.tw

AILabs ASR Python software development kit

Development Environment

  • Python 3.9
# install portaudio first if you develop on macOS
brew install portaudio

pip install --global-option='build_ext' --global-option='-I/usr/local/include' --global-option='-L/usr/local/lib' -r requirements_dev.txt

# if you encounter issues while installing PyAudio,
# check the PyAudio site: https://people.csail.mit.edu/hubert/pyaudio/

Installation

pip install ailabs-asr

Samples

# init the streaming client
asr_client = StreamingClient('api-key-applied-from-devconsole')

# on_processing_sentence and on_final_sentence are callbacks you define yourself

# start streaming from a WAV file
asr_client.start_streaming_wav(
  pipeline='asr-zh-en-std',
  file='voice.wav',
  verbose=False, # set to True to show the detailed recognition result
  on_processing_sentence=on_processing_sentence,
  on_final_sentence=on_final_sentence)

# omit `file` to stream from the computer's microphone instead
asr_client.start_streaming_wav(
  pipeline='asr-zh-en-std',
  on_processing_sentence=on_processing_sentence,
  on_final_sentence=on_final_sentence)

:bulb: The start_streaming_wav() method accepts callback functions that handle the recognition results; see the result format below.

:bulb: Look up the available pipelines in the next section.

:bulb: See more samples in the sample repository.

Supported Languages (pipeline)

| pipeline | Info | language |
| --- | --- | --- |
| asr-zh-en-std | Use it when speakers speak Chinese more than English | Mandarin and English |
| asr-zh-tw-std | Use it when speakers speak Chinese and Taiwanese | Mandarin and Taiwanese |
| asr-en-std | English | English |
| asr-jp-std | Japanese | Japanese |
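
A mistyped pipeline name is easy to miss, so the table above can be mirrored in a small lookup that is checked before streaming starts. This is only a sketch: `PIPELINES` and `check_pipeline` are hypothetical names introduced here and are not part of the ailabs-asr package.

```python
# Hypothetical helper that mirrors the pipeline table in this README.
# Neither PIPELINES nor check_pipeline is provided by ailabs-asr.
PIPELINES = {
    "asr-zh-en-std": "Mandarin and English",
    "asr-zh-tw-std": "Mandarin and Taiwanese",
    "asr-en-std": "English",
    "asr-jp-std": "Japanese",
}


def check_pipeline(name: str) -> str:
    """Return the pipeline name unchanged, or raise if it is not listed above."""
    if name not in PIPELINES:
        known = ", ".join(sorted(PIPELINES))
        raise ValueError(f"unknown pipeline {name!r}; expected one of: {known}")
    return name
```

The checked name can then be passed as the `pipeline` argument, for example `pipeline=check_pipeline('asr-zh-en-std')`.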

Message Format

There are two kinds of recognition results:

The Processing Sentence (Segment)

{
  "asr_sentence": "範例句子"
}

The Final Sentence (Complete Sentence)

{
  "asr_final": true,
  "asr_begin_time": 9.314,
  "asr_end_time": 11.314,
  "asr_sentence": "完整的範例句子",
  "asr_confidence": 0.5263263653207881,
  "asr_word_time_stamp": [
    {
      "word": "完整的",
      "begin_time": 9.74021875,
      "end_time": 10.100875
    },
    {
      "word": "範例句子",
      "begin_time": 10.100875,
      "end_time": 10.1664375
    }
  ],
  "text_segmented": "完整的 範例句子"
}
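
The two message shapes above could be handled by callbacks along these lines. This is a minimal sketch, assuming each callback receives the message as an already-parsed `dict` with the fields shown above; `format_final` is a hypothetical helper name introduced here for illustration.

```python
def format_final(message: dict) -> str:
    """Render a final-sentence message as one line with per-word timings."""
    words = ", ".join(
        f'{w["word"]}[{w["begin_time"]:.2f}-{w["end_time"]:.2f}]'
        for w in message.get("asr_word_time_stamp", [])
    )
    return (
        f'{message["asr_begin_time"]:.2f}s-{message["asr_end_time"]:.2f}s '
        f'({message["asr_confidence"]:.2f}) {message["asr_sentence"]} | {words}'
    )


def on_processing_sentence(message: dict) -> None:
    # Partial result: only "asr_sentence" is shown in the format above.
    print("partial:", message["asr_sentence"])


def on_final_sentence(message: dict) -> None:
    # Complete result: carries timing, confidence and word timestamps.
    print("final:  ", format_final(message))
```

Both functions can be passed to start_streaming_wav() as the `on_processing_sentence` and `on_final_sentence` arguments shown in the samples.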

Limitations

Audio Data

:warning: Send audio data as binary frames that meet the following spec:

  • Audio data format
    • 16kHz, mono
    • 16 bits per sample
    • PCM
  • Sample rate: 16 kHz (16,000 samples per second)
  • Data rate: 16,000 samples/sec x 2 bytes/sample (16 bits / 8) = 32,000 bytes/sec (about 32 KB/sec)
  • Chunk size: 2,000 bytes each, i.e. 1/16 sec (62.5 ms) of audio
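
The arithmetic in the spec above can be checked, and raw audio split into chunks of the stated size, with the standard-library `wave` module. This is a sketch under stated assumptions: the constants and the helpers `bytes_per_second`, `check_wav` and `iter_chunks` are hypothetical names, not part of ailabs-asr, and how the package itself chunks audio is not shown here.

```python
import wave

SAMPLE_RATE = 16000   # Hz
SAMPLE_WIDTH = 2      # bytes per sample (16 bits)
CHANNELS = 1          # mono
CHUNK_BYTES = 2000    # chunk size from the spec above


def bytes_per_second() -> int:
    """Data rate implied by the spec: 16000 x 2 x 1 = 32000 bytes/sec."""
    return SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS


def check_wav(wav: wave.Wave_read) -> None:
    """Raise ValueError if an opened WAV file does not match the spec."""
    actual = (wav.getframerate(), wav.getnchannels(), wav.getsampwidth())
    expected = (SAMPLE_RATE, CHANNELS, SAMPLE_WIDTH)
    if actual != expected:
        raise ValueError(
            f"expected (rate, channels, width) {expected}, got {actual}"
        )


def iter_chunks(pcm: bytes, size: int = CHUNK_BYTES):
    """Yield consecutive chunks of raw PCM bytes; the last may be shorter."""
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]
```

With these numbers, one second of audio (32,000 bytes) splits into 16 chunks of 2,000 bytes, which matches the 1/16 sec per chunk stated above.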
