Python package for AssemblyAI

assemblyai

Accurately recognize speech in your application with AssemblyAI.

You can also train custom models to more accurately recognize speech in your application, and expand vocabulary with custom words like product/person names.

Documentation: https://docs.assemblyai.com

Slack community: https://docs.assemblyai.com/help/#slack

Issues: https://github.com/assemblyai/assemblyai-python-sdk

Getting started

Run pip install and email support@assemblyai.com for an API token (we reply within a few hours at most).

pip install assemblyai

Quickstart

Start transcribing:

import assemblyai

aai = assemblyai.Client(token='your-secret-api-token')

transcript = aai.transcribe(filename='/path/to/example.wav')

Get the completed transcript. Transcripts take about half the duration of the audio to complete.

while transcript.status != 'completed':
    transcript = transcript.get()

text = transcript.text
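The loop above calls get() as fast as it can. A minimal sketch of a gentler polling helper follows, assuming only what this README shows: the object has a status attribute and a get() method that returns a refreshed object. The name wait_for_status and its interval, timeout and sleep parameters are hypothetical and not part of the assemblyai package.

```python
import time


def wait_for_status(resource, target, interval=5.0, timeout=3600.0,
                    sleep=time.sleep):
    """Poll resource.get() until resource.status equals target.

    Hypothetical helper, not part of the assemblyai package. It relies
    only on the .status attribute and .get() method shown in this README.
    Raises TimeoutError if the target status is not reached in time.
    """
    deadline = time.monotonic() + timeout
    while resource.status != target:
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "status is still %r after %s seconds"
                % (resource.status, timeout))
        sleep(interval)              # pause between requests
        resource = resource.get()    # fetch the refreshed object
    return resource


# Usage with the quickstart objects (not executed here):
#   transcript = wait_for_status(transcript, 'completed')
#   text = transcript.text
```

The sleep function is a parameter only so the helper can be exercised without real delays.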

Instead of a local file, you can also specify a URL for the audio file:

transcript = aai.transcribe(audio_url='https://example.com/example.wav')
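If your application receives a mix of local paths and URLs, one option is to pick the keyword argument from the string itself. This is a sketch under the assumption that only http and https sources should be sent as audio_url; the helper name transcribe_kwargs is hypothetical and not part of the assemblyai package.

```python
from urllib.parse import urlparse


def transcribe_kwargs(source):
    """Return {'audio_url': ...} for http(s) sources, else {'filename': ...}.

    Hypothetical helper; the two keyword names come from the examples
    in this README.
    """
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return {"audio_url": source}
    # Anything else (including Windows drive letters) is treated as a path.
    return {"filename": source}


# Usage with the quickstart client (not executed here):
#   transcript = aai.transcribe(**transcribe_kwargs(source))
```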

Custom models

The quickstart example transcribes audio using a generic English model.

To boost accuracy and recognize custom words, you can create a custom model. You can read more about how custom models work in the docs.

Create a custom model.

import assemblyai

aai = assemblyai.Client(token='your-secret-api-token')

# phrases is a list of words (real or made up) and sentences that you want to recognize
phrases = ["foobar", "Dirk Gently", "electric monk", "yourLingoHere",
           "perhaps a common phrase here", "and a common response"]

model = aai.train(phrases)
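Phrase lists collected from spreadsheets or logs often contain blanks and repeats. This README does not say how the service treats those, so the sketch below tidies the list before training. The helper name clean_phrases is hypothetical and not part of the assemblyai package.

```python
def clean_phrases(raw_phrases):
    """Collapse whitespace, drop empty entries, and remove exact duplicates.

    Hypothetical helper; the original order of first occurrences is kept.
    """
    seen = set()
    cleaned = []
    for phrase in raw_phrases:
        phrase = " ".join(phrase.split())   # trim and collapse inner spaces
        if phrase and phrase not in seen:
            seen.add(phrase)
            cleaned.append(phrase)
    return cleaned


# Usage with the client above (not executed here):
#   model = aai.train(clean_phrases(phrases))
```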

Check to see that the model has finished training -- models take about six minutes to complete.

while model.status != 'trained':
    model = model.get()

Reference the model when creating a transcript.

transcript = aai.transcribe(filename='/path/to/example.wav', model=model)

Transcribing stereo audio with two speakers on different channels

For stereo audio with two speakers on separate channels, you can get better accuracy and formatting by setting speaker_count to 2.

transcript = aai.transcribe('example.wav', speaker_count=2)

Transcribing without formatted text

To receive transcript text without formatting or punctuation, set the option format_text to False (default is True).

transcript = aai.transcribe('example.wav', format_text=False)

Model and Transcript attributes

Prior models and transcripts can be retrieved by ID.

model = aai.model.get(id=<id>)
transcript = aai.transcript.get(id=<id>)

To inspect additional attributes, use props():

>>> model.props()
['headers',
 'id',
 'status',
 'name',
 'phrases',
 'warning',
 'dict']

>>> transcript.props()
['headers',
 'id',
 'audio_url',
 'model',
 'status',
 'warning',
 'text',
 'text_raw',
 'confidence',
 'segments',
 'speaker_count',
 'format_text',
 'dict']

The dict attribute contains the raw API response:

model.dict
transcript.dict
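To keep a copy of the raw response for later inspection, it can be serialized with the standard json module. This sketch assumes the dict attribute is a plain dictionary of JSON-serializable values, which this README does not state explicitly. The helper name raw_response_json is hypothetical.

```python
import json


def raw_response_json(resource):
    """Return resource.dict as an indented JSON string.

    Hypothetical helper; assumes .dict holds JSON-serializable values.
    """
    return json.dumps(resource.dict, indent=2, sort_keys=True)


# Usage (not executed here):
#   with open('transcript.json', 'w') as handle:
#       handle.write(raw_response_json(transcript))
```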

For additional background see: https://docs.assemblyai.com

Troubleshooting

Turn on verbose logging with the Client debug option:

import assemblyai

aai = assemblyai.Client(debug=True)

More options to get unstuck:

Development

Install dev requirements, install from source and run tests.

pip install -r requirements_dev.txt
python setup.py install
tox

Contributing

Bug reports and pull requests welcome.

Release notes

0.2.4 - Added examples for speaker_count and format_text options.
