Python SDK for EnderTuring speech toolkit

Ender Turing

Ender Turing is a solution for voice content understanding, analytics and business insights. Check enderturing.com for details.

Installation

$ pip install enderturing

To use the streaming speech recognition functions, you also need FFmpeg installed.

Ubuntu:

$ sudo apt install ffmpeg

macOS (Homebrew):

$ brew install ffmpeg

For other operating systems, follow the FFmpeg installation guides.
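
Before calling the streaming functions, it can help to confirm that FFmpeg is actually reachable. The following is a minimal sketch using only the Python standard library; the helper name `ffmpeg_available` is hypothetical and not part of the SDK, and it assumes the SDK looks for an `ffmpeg` executable on `PATH`.

```python
import shutil


def ffmpeg_available():
    """Return True if an `ffmpeg` executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


if not ffmpeg_available():
    print("FFmpeg not found on PATH; streaming recognition may not work.")
```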

Quick Start

import asyncio
from enderturing import Config, EnderTuring, RecognitionResultFormat

# create configuration
config = Config.from_url("https://admin%40local.enderturing.com:your_password@enderturing.yourcompany.com")
et = EnderTuring(config)

# access sessions list
sessions = et.sessions.list()
print(sessions)

# get recognizer for one of configured languages
recognizer = et.get_speech_recognizer(language='en')

async def run_stream_recog(f, r, result_format):
    async with r.stream_recognize(f, result_format=result_format) as rec:
        text = await rec.read()
    return text

# recognize the specified file; asyncio.run creates and closes the event loop
text = asyncio.run(run_stream_recog("my_audio.mp3", recognizer, result_format=RecognitionResultFormat.text))
print(text)

Usage

The SDK contains two major parts:

  • Using the Ender Turing REST API
  • Speech recognition

Using Ender Turing API

All API calls are accessible via an instance of EnderTuring. API methods are grouped, and each group is a property of EnderTuring. Examples:

from enderturing import Config, EnderTuring, RecognitionResultFormat

et = EnderTuring(Config.from_env())

# access sessions list
sessions = et.sessions.list()

# working with ASR
et.asr.get_instances(active_only=True)

# accessing raw json
et.raw.create_event(caller_id='1234', event_data={"type": "hold"})

Access Configuration

To access the API, you need an authentication key (login), an authentication secret (password), and the installation URL (e.g. https://enderturing.yourcompany.com/).

There are multiple ways to pass config options:

  • from environment variables (Config.from_env())
  • by creating Config with parameters (e.g. Config(auth_key="my_login", auth_secret="my_secret"))
  • by using an Ender Turing configuration URL (Config.from_url())
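
The configuration URL in the Quick Start percent-encodes the `@` in the login (`admin%40local.enderturing.com`), because a raw `@` would be read as the separator between credentials and host. The sketch below shows one way to assemble such a URL with the standard library. The helper name `build_config_url` is hypothetical, and the `https://<login>:<password>@<host>` shape is inferred from the Quick Start example rather than from SDK documentation.

```python
from urllib.parse import quote


def build_config_url(login, password, host):
    """Hypothetical helper: assemble a URL of the shape used in the Quick Start."""
    # safe="" makes quote() escape every reserved character, including "@", ":" and "/"
    return f"https://{quote(login, safe='')}:{quote(password, safe='')}@{host}"


url = build_config_url("admin@local.enderturing.com", "your_password", "enderturing.yourcompany.com")
print(url)
# prints: https://admin%40local.enderturing.com:your_password@enderturing.yourcompany.com
```

The resulting string can then be passed to Config.from_url().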

Creating Speech Recognizer

There are two options to create a speech recognizer:

If you have access to API configured:

recognizer = et.get_speech_recognizer(language='en')

If you know the URL and sample rate of the desired ASR instance:

from enderturing import AsrConfig, SpeechRecognizer

config = AsrConfig(url="wss://enderturing", sample_rate=8000)
recognizer = SpeechRecognizer(config)

Recognizing a File

The SpeechRecognizer.recognize_file method returns an async text stream. Depending on the parameters, each line contains either the text of an utterance or serialized JSON.

If you are only interested in the results after recognition is complete, use the read() method. For example:

async with recognizer.recognize_file("my_audio.wav", result_format=RecognitionResultFormat.text) as rec:
    text = await rec.read()

If you prefer to get words and phrases as soon as they are recognized, use the readline() method instead. For example:

async with recognizer.recognize_file(src, result_format=RecognitionResultFormat.jsonl) as rec:
    line = await rec.readline()
    while line:
        # Now line contains a json string, you can save it or do something else with it
        line = await rec.readline()
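
To illustrate what "do something else with it" might look like, here is a minimal sketch that parses each JSON line as it arrives. It assumes only what the loop above shows: an object with an awaitable readline() that returns an empty value at end of stream. `FakeStream` is a hypothetical stand-in so the example runs without an ASR instance, and the `text` field is an assumed key, not the documented output schema.

```python
import asyncio
import json


async def collect_jsonl(rec):
    """Read lines until the stream is exhausted and parse each one as JSON."""
    results = []
    line = await rec.readline()
    while line:
        stripped = line.strip()
        if stripped:  # skip blank lines between records
            results.append(json.loads(stripped))
        line = await rec.readline()
    return results


class FakeStream:
    """Hypothetical stand-in for the recognizer stream, used only for this demo."""

    def __init__(self, lines):
        self._lines = list(lines)

    async def readline(self):
        # An empty string signals end of stream, mirroring the loop above.
        return self._lines.pop(0) if self._lines else ""


demo = FakeStream(['{"text": "hello"}\n', '{"text": "world"}\n'])
parsed = asyncio.run(collect_jsonl(demo))
print(parsed)
```

With a real recognizer, `collect_jsonl(rec)` would be awaited inside the `async with` block in place of the manual loop.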

Working With Multichannel Audio

If an audio file has more than one channel, by default the system recognizes each channel and returns a transcript for each one. To change the default behavior, use the channels parameter of SpeechRecognizer.recognize_file. Please check the method documentation for details.

Sometimes audio is stored as one file per channel, e.g., a contact center call generates two files: one for the client and one for the support agent. For analysis, however, it is preferable to see the transcripts of the files merged into a dialog. In this scenario, you can use recognizer.recognize_joined_file([audio1, audio2]).
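
To make the idea of merging per-channel transcripts into a dialog concrete, here is a conceptual sketch. It does not show how recognize_joined_file works internally; the utterance tuples (start time, speaker, text) and their contents are hypothetical data invented for illustration.

```python
import heapq

# Hypothetical per-channel transcripts: (start time in seconds, speaker, text)
client = [(0.0, "client", "Hi, I need help with my order."), (6.5, "client", "Yes, that's the one.")]
agent = [(3.2, "agent", "Sure, can you confirm the order number?"), (9.0, "agent", "It ships tomorrow.")]

# Each list is already in time order, so heapq.merge interleaves them by start time.
dialog = list(heapq.merge(client, agent))
for start, speaker, text in dialog:
    print(f"{start:>5.1f}s {speaker}: {text}")
```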

License

Released under the MIT license.
