Python SDK for EnderTuring speech toolkit

Ender Turing

Ender Turing is a solution for voice content understanding, analytics and business insights. Check enderturing.com for details.

Installation

$ pip install enderturing

To use the streaming speech recognition functions, you also need FFmpeg installed.

Ubuntu:

$ sudo apt install ffmpeg

macOS (Homebrew):

$ brew install ffmpeg

For other operating systems, follow the FFmpeg installation guides.

Quick Start

import asyncio
from enderturing import Config, EnderTuring, RecognitionResultFormat

# create configuration
config = Config.from_url("https://admin%40local.enderturing.com:your_password@enderturing.yourcompany.com")
et = EnderTuring(config)

# access sessions list
sessions = et.sessions.list()
print(sessions)

# get recognizer for one of configured languages
recognizer = et.get_speech_recognizer(language='en')

async def run_stream_recog(f, r, result_format):
    async with r.stream_recognize(f, result_format=result_format) as rec:
        text = await rec.read()
    return text

# recognize specified file; asyncio.run creates and closes the event loop for us
text = asyncio.run(run_stream_recog("my_audio.mp3", recognizer, result_format=RecognitionResultFormat.text))
print(text)

Usage

The SDK contains two major parts:

  • Using Ender Turing REST API
  • Speech recognition

Using Ender Turing API

All API calls are accessible via an instance of EnderTuring. API methods are grouped, and each group is a property of EnderTuring. Examples:

from enderturing import Config, EnderTuring

et = EnderTuring(Config.from_env())

# access sessions list
sessions = et.sessions.list()

# working with ASR
et.asr.get_instances(active_only=True)

# accessing raw json
et.raw.create_event(caller_id='1234', event_data={"type": "hold"})

Access Configuration

To access the API, you need an authentication key (login), an authentication secret (password), and the installation URL (e.g. https://enderturing.yourcompany.com/).

There are multiple ways to pass config options:

  • from environment variables (Config.from_env())
  • creating Config with parameters (e.g. Config(auth_key="my_login", auth_secret="my_secret"))
  • using an Ender Turing configuration URL (Config.from_url())
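The Quick Start URL carries the login and password in the user-info part of the URL, with the "@" of the login written as %40. Below is a minimal sketch of building such a URL with the standard library, so that special characters in credentials are percent-encoded. The helper name build_config_url is hypothetical and not part of the SDK; only the resulting string would be passed to Config.from_url().

```python
from urllib.parse import quote


def build_config_url(login: str, password: str, host: str) -> str:
    """Hypothetical helper (not part of the SDK): build a configuration URL.

    Percent-encodes the login and password so characters such as "@" or ":"
    cannot be confused with the URL's own separators.
    """
    user = quote(login, safe="")
    secret = quote(password, safe="")
    return f"https://{user}:{secret}@{host}"


url = build_config_url(
    "admin@local.enderturing.com",
    "your_password",
    "enderturing.yourcompany.com",
)
print(url)
```

The printed string has the same shape as the URL in the Quick Start and could then be handed to Config.from_url(url).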

Creating Speech Recognizer

There are two options to create a speech recognizer:

If you have access to API configured:

recognizer = et.get_speech_recognizer(language='en')

If you know the URL and sample rate of the desired ASR instance:

from enderturing import AsrConfig, SpeechRecognizer

config = AsrConfig(url="wss://enderturing", sample_rate=8000)
recognizer = SpeechRecognizer(config)

Recognizing a File

The SpeechRecognizer.recognize_file method returns an asynchronous text stream. Depending on the parameters, each line contains either the text of an utterance or serialized JSON.

If you are only interested in the results after recognition is complete, you can use the read() method. For example:

async with recognizer.recognize_file("my_audio.wav", result_format=RecognitionResultFormat.text) as rec:
    text = await rec.read()

If you prefer to get words and phrases as soon as they are recognized, you can use the readline() method instead. For example:

async with recognizer.recognize_file(src, result_format=RecognitionResultFormat.jsonl) as rec:
    line = await rec.readline()
    while line:
        # Now line contains a json string, you can save it or do something else with it
        line = await rec.readline()
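The readline() loop above can be wrapped in a small helper that parses each line as JSON. This is a sketch under assumptions: FakeStream is a hypothetical stand-in for the object returned by recognize_file, so the example runs without an ASR server, and the "text" field in the sample lines is made up for illustration and is not the SDK's documented schema.

```python
import asyncio
import json


async def collect_jsonl(rec):
    """Read lines from rec until the stream ends, parsing each one as JSON."""
    results = []
    line = await rec.readline()
    while line:
        if line.strip():  # skip blank lines between records
            results.append(json.loads(line))
        line = await rec.readline()
    return results


class FakeStream:
    """Hypothetical stand-in for the recognition stream (illustration only)."""

    def __init__(self, lines):
        self._lines = list(lines)

    async def readline(self):
        # An empty string signals the end of the stream.
        return self._lines.pop(0) if self._lines else ""


# The "text" key is an assumed field name, not the SDK's documented output.
sample = ['{"text": "hello"}\n', '{"text": "world"}\n']
records = asyncio.run(collect_jsonl(FakeStream(sample)))
print(records)
```

With a real recognizer, the same helper would be called inside the async with block, passing rec in place of the fake stream.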

Working With Multichannel Audio

If an audio file has more than one channel, by default the system recognizes each channel and returns a transcript for each one. To change the default behavior, use the channels parameter of SpeechRecognizer.recognize_file. Please check the method documentation for details.

Sometimes audio is stored as one file per channel. For example, a contact center call may generate two files: one for the client and one for the support agent. For analysis, it is preferable to see the transcripts of those files merged as a dialog. In this scenario, you can use recognizer.recognize_joined_file([audio1, audio2]).
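A sketch of how the per-channel scenario might look in code. It assumes that recognize_joined_file can be used as an async context manager with a read() method, in the same way as recognize_file; the source does not confirm this, so check the method documentation. FakeRecognizer is a hypothetical stub that only makes the example runnable without an ASR server, and the file names are placeholders.

```python
import asyncio
from contextlib import asynccontextmanager


async def transcribe_dialog(recognizer, channel_files):
    """Return the merged transcript for a list of per-channel audio files.

    Assumption: recognize_joined_file behaves like recognize_file, i.e. it is
    an async context manager whose result offers read().
    """
    async with recognizer.recognize_joined_file(channel_files) as rec:
        return await rec.read()


class _FakeResult:
    """Hypothetical result object exposing read() (illustration only)."""

    def __init__(self, text):
        self._text = text

    async def read(self):
        return self._text


class FakeRecognizer:
    """Hypothetical stub standing in for SpeechRecognizer."""

    @asynccontextmanager
    async def recognize_joined_file(self, files):
        # A real recognizer would return a dialog transcript; the stub just
        # echoes the file names so the control flow can be exercised.
        yield _FakeResult(" | ".join(files))


dialog = asyncio.run(transcribe_dialog(FakeRecognizer(), ["client.wav", "agent.wav"]))
print(dialog)
```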

License

Released under the MIT license.
