Python SDK for EnderTuring speech toolkit

Ender Turing

Ender Turing is a solution for voice content understanding, analytics and business insights. Check enderturing.com for details.

Installation

$ pip install enderturing

To use the streaming speech recognition functions, you also need FFmpeg installed.

Ubuntu:

$ sudo apt install ffmpeg

macOS (Homebrew):

$ brew install ffmpeg

For other operating systems, follow the FFmpeg installation guides.
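
Before calling the streaming functions, it can help to confirm that FFmpeg is actually reachable. The following is a minimal sketch using only the Python standard library; the helper name ffmpeg_available is hypothetical and not part of the SDK, and it assumes the SDK looks for an executable named ffmpeg on PATH.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable can be found on PATH.

    Hypothetical helper, not part of the enderturing package.
    """
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("FFmpeg found on PATH.")
    else:
        print("FFmpeg not found; install it before using streaming recognition.")
```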

Quick Start

import asyncio
from enderturing import Config, EnderTuring, RecognitionResultFormat

# create configuration (the "@" in the login is URL-encoded as %40)
config = Config.from_url("https://admin%40local.enderturing.com:your_password@enderturing.yourcompany.com")
et = EnderTuring(config)

# access sessions list
sessions = et.sessions.list()
print(sessions)

# get recognizer for one of configured languages
recognizer = et.get_speech_recognizer(language='en')

async def run_stream_recog(f, r, result_format):
    async with r.stream_recognize(f, result_format=result_format) as rec:
        text = await rec.read()
    return text

# recognize specified file
text = asyncio.run(run_stream_recog("my_audio.mp3", recognizer, result_format=RecognitionResultFormat.text))
print(text)

Usage

The SDK contains two major parts:

  • Using Ender Turing REST API
  • Speech recognition

Using Ender Turing API

All API calls are accessible via an instance of EnderTuring. API methods are grouped, and each group is a property of EnderTuring. Examples:

from enderturing import Config, EnderTuring, RecognitionResultFormat

et = EnderTuring(Config.from_env())

# access sessions list
sessions = et.sessions.list()

# working with ASR
et.asr.get_instances(active_only=True)

# accessing raw json
et.raw.create_event(caller_id='1234', event_data={"type": "hold"})

Access Configuration

To access the API, you need an authentication key (login), an authentication secret (password), and the installation URL (e.g. https://enderturing.yourcompany.com/).

There are multiple ways to pass config options:

  • from environment variables (Config.from_env())
  • creating Config with parameters (e.g. Config(auth_key="my_login", auth_secret="my_secret"))
  • using an Ender Turing configuration URL (Config.from_url())
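
The configuration URL in the Quick Start carries the login and password in its userinfo part, with the "@" of the login written as %40. The sketch below uses only the standard library to show how such a URL can be built and taken apart; the helper names build_config_url and split_config_url are hypothetical, and how Config.from_url itself parses the URL is an assumption, not something this README states.

```python
from urllib.parse import quote, unquote, urlsplit


def build_config_url(login: str, password: str, host: str) -> str:
    """Percent-encode the credentials and embed them in an https URL.

    Hypothetical helper; reserved characters such as "@" and ":" in the
    login or password must be encoded so the URL stays unambiguous.
    """
    return f"https://{quote(login, safe='')}:{quote(password, safe='')}@{host}"


def split_config_url(url: str) -> dict:
    """Split a credential-bearing URL back into its decoded parts."""
    parts = urlsplit(url)
    return {
        "auth_key": unquote(parts.username or ""),
        "auth_secret": unquote(parts.password or ""),
        "base_url": f"{parts.scheme}://{parts.hostname}",
    }


if __name__ == "__main__":
    url = build_config_url(
        "admin@local.enderturing.com", "your_password", "enderturing.yourcompany.com"
    )
    print(url)
    print(split_config_url(url))
```

Encoding the credentials first avoids a common failure mode where a literal "@" in the login is mistaken for the separator between the credentials and the host.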

Creating Speech Recognizer

There are two options to create a speech recognizer:

If you have API access configured:

recognizer = et.get_speech_recognizer(language='en')

If you know the URL and sample rate of the desired ASR instance:

from enderturing import AsrConfig, SpeechRecognizer

config = AsrConfig(url="wss://enderturing", sample_rate=8000)
recognizer = SpeechRecognizer(config)

Recognizing a File

The SpeechRecognizer.recognize_file method returns an async text stream. Depending on the parameters, each line contains either the text of an utterance or serialized JSON.

If you are only interested in the results after recognition is complete, use the read() method, e.g.:

async with recognizer.recognize_file("my_audio.wav", result_format=RecognitionResultFormat.text) as rec:
    text = await rec.read()

If you prefer to get words and phrases as soon as they are recognized, use the readline() method instead, e.g.:

async with recognizer.recognize_file(src, result_format=RecognitionResultFormat.jsonl) as rec:
    line = await rec.readline()
    while line:
        # Now line contains a json string, you can save it or do something else with it
        line = await rec.readline()
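
The readline() loop above can be wrapped in a small helper that parses each JSON line as it arrives. This is a minimal sketch, not SDK code: FakeStream is a hypothetical stand-in for the object yielded by recognize_file so the example runs without a server, it assumes readline() returns an empty string at the end of the stream, and the "text" field in the sample lines is made up for illustration.

```python
import asyncio
import json


class FakeStream:
    """Hypothetical stand-in for the recognizer's async text stream."""

    def __init__(self, lines):
        self._lines = list(lines)

    async def readline(self):
        # Return the next line, or an empty string once the stream is exhausted.
        return self._lines.pop(0) if self._lines else ""


async def collect_jsonl(stream):
    """Read the stream line by line and parse every non-blank line as JSON."""
    results = []
    line = await stream.readline()
    while line:
        if line.strip():
            results.append(json.loads(line))
        line = await stream.readline()
    return results


if __name__ == "__main__":
    demo = FakeStream(['{"text": "hello"}\n', '{"text": "world"}\n'])
    print(asyncio.run(collect_jsonl(demo)))
```

With a real recognizer, the same helper would be called inside the async with block, passing rec in place of the fake stream.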

Working With Multichannel Audio

If an audio file has more than one channel, the system by default recognizes each channel and returns a separate transcript for each. To change this behavior, use the channels parameter of SpeechRecognizer.recognize_file. Check the method documentation for details.

Sometimes audio is stored as one file per channel, e.g. a contact center call generates two files: one for the client and one for the support agent. For analysis, however, it is preferable to see the transcripts of both files merged into a single dialog. In this scenario, you can use recognizer.recognize_joined_file([audio1, audio2]).
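
To illustrate what "merged as a dialog" means, the sketch below interleaves two per-channel transcripts by utterance start time. It is only an illustration of the idea and makes no claim about how recognize_joined_file works internally; the Utterance type, its fields, and merge_as_dialog are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical transcript item; not an SDK type."""

    speaker: str
    start: float  # seconds from the beginning of the call
    text: str


def merge_as_dialog(*channels):
    """Interleave utterances from all channels in order of start time."""
    merged = sorted((u for ch in channels for u in ch), key=lambda u: u.start)
    return [f"{u.speaker}: {u.text}" for u in merged]


if __name__ == "__main__":
    client = [
        Utterance("client", 0.0, "Hi, I need help."),
        Utterance("client", 5.2, "My order is late."),
    ]
    agent = [Utterance("agent", 2.1, "Sure, what happened?")]
    for line in merge_as_dialog(client, agent):
        print(line)
```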

License

Released under the MIT license.
