Skip to main content

Python SDK for EnderTuring speech toolkit

Project description

Ender Turing

Ender Turing is a solution for voice content understanding, analytics and business insights. Check enderturing.com for details.

Installation

$ pip install enderturing

For using streaming speech recognition functions, you'll also need FFmpeg installed.

Ubuntu:

$ sudo apt install ffmpeg

MacOS homebrew:

$ brew install ffmpeg

For other OS, please follow FFmpeg installation guides.

Quick Start

import asyncio
from enderturing import Config, EnderTuring, RecognitionResultFormat

# create configuration
config = Config.from_url("https://admin%40local.enderturing.com:your_password@enterturing.yourcompany.com")
et = EnderTuring(config)

# access sessions list
sessions = et.sessions.list()
print(sessions)

# get recognizer for one of configured languages
recognizer = et.get_speech_recognizer(language='en')

async def run_stream_recog(f, r, result_format):
    async with r.stream_recognize(f, result_format=result_format) as rec:
        text = await rec.read()
    return text

# recognize specified file
loop = asyncio.get_event_loop()
task = loop.create_task(run_stream_recog("my_audio.mp3", recognizer, result_format=RecognitionResultFormat.text))
loop.run_until_complete(task)
print(task.result())

Usage

SDK contains two major parts:

  • Using Ender Turing REST API
  • Speech recognition

Using Ender Turing API

All API calls are accessible via an instance or EnderTuring. API methods are grouped, and each group is a property of EnderTuring. Examples:

from enderturing import Config, EnderTuring, RecognitionResultFormat

et = EnderTuring(Config.from_env())

# access sessions list
sessions = et.sessions.list()

# working with ASR
et.asr.get_instances(active_only=True)

# accessing raw json
et.raw.create_event(caller_id='1234', event_data={"type": "hold"})

Access Configuration

To access API, you need to know an authentication key (login), authentication secret (password), and installation URL (e.g. https://enderturing.yourcompany.com/)

There are multiple ways to pass config options:

  • from environmental variables (Config.from_env()).
  • creating Config with parameters (e.g. Config(auth_key="my_login", auth_secret="my_secret""))
  • using Enter Turing configuration URL (Config.from_url())

Creating Speech Recognizer

There two options to create a speech recognizer:

If you have access to API configured:

recognizer = et.get_speech_recognizer(language='en')

If you know URL and sample rate of desired ASR instance:

from enderturing import AsrConfig, SpeechRecognizer

config = AsrConfig(url="wss://enderturing", sample_rate=8000)
recognizer = SpeechRecognizer(config)

Recognizing a File

SpeechRecognizer.recognize_file method returns an async text stream. Depending on parameters, each line contains either a text of utterance or serialized JSON.

If you are only interested in results after recognition is complete, you can use the read() method. E.g.

async with recognizer.recognize_file("my_audio.wav", result_format=RecognitionResultFormat.text) as rec:
    text = await rec.read()

If you prefer getting words and phrases as soon as they are recognized - you can use the readline() method instead. E.g.

async with recognizer.recognize_file(src, result_format=RecognitionResultFormat.jsonl) as rec:
    line = await rec.readline()
    while line:
        # Now line contains a json string, you can save it or do something else with it
        line = await rec.readline()

Working With Multichannel Audio

If an audio file has more than one channel - by default system will recognize each channel and return a transcript for each channel. To change the default behavior - you can use channels parameter of SpeechRecognizer.recognize_file. Please check method documentation for details.

Sometimes an audio is stored as a file per channel, e.g., contact center call generates two files: one for a client and one for a support agent. But for analysis, it's preferable to see transcripts of the files merged as a dialog. In this scenario, you can use recognizer.recognize_joined_file([audio1, audio2]).

License

Released under the MIT license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

enderturing-0.4.0.tar.gz (18.0 kB view details)

Uploaded Source

Built Distribution

enderturing-0.4.0-py3-none-any.whl (21.3 kB view details)

Uploaded Python 3

File details

Details for the file enderturing-0.4.0.tar.gz.

File metadata

  • Download URL: enderturing-0.4.0.tar.gz
  • Upload date:
  • Size: 18.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.6 CPython/3.9.5 Darwin/20.1.0

File hashes

Hashes for enderturing-0.4.0.tar.gz
Algorithm Hash digest
SHA256 8d3d5c0469e1d78f1bd65f46536c1985126383a9a4c71eeb733f7f3463c06a9c
MD5 eb85f3c2c7e884f1953ce5b2e768c0f0
BLAKE2b-256 fd71b3d88b3a2d8369ad0754ae245ce0fca2c98c922a78a7deab62c132d49bac

See more details on using hashes here.

File details

Details for the file enderturing-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: enderturing-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 21.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.6 CPython/3.9.5 Darwin/20.1.0

File hashes

Hashes for enderturing-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 56f2a3d32b6571f74270908518a79b181c6d5606b06aeced9adb4c01a368f625
MD5 ce5e722c2e566efa09b7806a43a65018
BLAKE2b-256 aae4feac45470d4a171ec343c545bf3230933f1750831d2f050cfc94fadd084d

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page