Skip to main content

Python SDK for EnderTuring speech toolkit

Project description

Ender Turing

Ender Turing is a solution for voice content understanding, analytics and business insights. Check enderturing.com for details.

Installation

$ pip install enderturing

For using streaming speech recognition functions, you'll also need FFmpeg installed.

Ubuntu:

$ sudo apt install ffmpeg

MacOS homebrew:

$ brew install ffmpeg

For other OS, please follow FFmpeg installation guides.

Quick Start

import asyncio
from enderturing import Config, EnderTuring, RecognitionResultFormat

# create configuration
config = Config.from_url("https://admin%40local.enderturing.com:your_password@enterturing.yourcompany.com")
et = EnderTuring(config)

# access sessions list
sessions = et.sessions.list()
print(sessions)

# get recognizer for one of configured languages
recognizer = et.get_speech_recognizer(language='en')

async def run_stream_recog(f, r, result_format):
    async with r.stream_recognize(f, result_format=result_format) as rec:
        text = await rec.read()
    return text

# recognize specified file
loop = asyncio.get_event_loop()
task = loop.create_task(run_stream_recog("my_audio.mp3", recognizer, result_format=RecognitionResultFormat.text))
loop.run_until_complete(task)
print(task.result())

Usage

SDK contains two major parts:

  • Using Ender Turing REST API
  • Speech recognition

Using Ender Turing API

All API calls are accessible via an instance or EnderTuring. API methods are grouped, and each group is a property of EnderTuring. Examples:

from enderturing import Config, EnderTuring, RecognitionResultFormat

et = EnderTuring(Config.from_env())

# access sessions list
sessions = et.sessions.list()

# working with ASR
et.asr.get_instances(active_only=True)

# accessing raw json
et.raw.create_event(caller_id='1234', event_data={"type": "hold"})

Access Configuration

To access API, you need to know an authentication key (login), authentication secret (password), and installation URL (e.g. https://enderturing.yourcompany.com/)

There are multiple ways to pass config options:

  • from environmental variables (Config.from_env()).
  • creating Config with parameters (e.g. Config(auth_key="my_login", auth_secret="my_secret""))
  • using Enter Turing configuration URL (Config.from_url())

Creating Speech Recognizer

There two options to create a speech recognizer:

If you have access to API configured:

recognizer = et.get_speech_recognizer(language='en')

If you know URL and sample rate of desired ASR instance:

from enderturing import AsrConfig, SpeechRecognizer

config = AsrConfig(url="wss://enderturing", sample_rate=8000)
recognizer = SpeechRecognizer(config)

Recognizing a File

SpeechRecognizer.recognize_file method returns an async text stream. Depending on parameters, each line contains either a text of utterance or serialized JSON.

If you are only interested in results after recognition is complete, you can use the read() method. E.g.

async with recognizer.recognize_file("my_audio.wav", result_format=RecognitionResultFormat.text) as rec:
    text = await rec.read()

If you prefer getting words and phrases as soon as they are recognized - you can use the readline() method instead. E.g.

async with recognizer.recognize_file(src, result_format=RecognitionResultFormat.jsonl) as rec:
    line = await rec.readline()
    while line:
        # Now line contains a json string, you can save it or do something else with it
        line = await rec.readline()

Working With Multichannel Audio

If an audio file has more than one channel - by default system will recognize each channel and return a transcript for each channel. To change the default behavior - you can use channels parameter of SpeechRecognizer.recognize_file. Please check method documentation for details.

Sometimes an audio is stored as a file per channel, e.g., contact center call generates two files: one for a client and one for a support agent. But for analysis, it's preferable to see transcripts of the files merged as a dialog. In this scenario, you can use recognizer.recognize_joined_file([audio1, audio2]).

License

Released under the MIT license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

enderturing-0.2.0.tar.gz (16.1 kB view details)

Uploaded Source

Built Distribution

enderturing-0.2.0-py3-none-any.whl (18.6 kB view details)

Uploaded Python 3

File details

Details for the file enderturing-0.2.0.tar.gz.

File metadata

  • Download URL: enderturing-0.2.0.tar.gz
  • Upload date:
  • Size: 16.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.6 CPython/3.9.0 Darwin/20.3.0

File hashes

Hashes for enderturing-0.2.0.tar.gz
Algorithm Hash digest
SHA256 87b3cc8a3cf23eb10c004e02d7bd2c24d65dfe7a81f0a8acb6ec80078d17c819
MD5 06a96ed59aabe405ce1c004b24df626c
BLAKE2b-256 dc56a09b51572a8fb94971d9e20f4a0d9c6925d2bf355d1f0c99d8385d1cb27a

See more details on using hashes here.

File details

Details for the file enderturing-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: enderturing-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 18.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.6 CPython/3.9.0 Darwin/20.3.0

File hashes

Hashes for enderturing-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5d9585b2deb2ac6ab357adb5114a08ee74847d4f14ba56ca70a80e094859b415
MD5 32abaf233eeba1016805034b882c38c3
BLAKE2b-256 0428c57c34d056a9e58c906a6c4dc194d1ef68aa64756630b63cbd4df376f432

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page