llama-index readers assemblyai integration

These details have not been verified by PyPI

Project description

AssemblyAI Audio Transcript Loader

pip install llama-index-readers-assemblyai

The AssemblyAI Audio Transcript Loader transcribes audio files with the AssemblyAI API and loads the transcribed text into documents.

To use it, install the assemblyai Python package and set the environment variable ASSEMBLYAI_API_KEY to your API key. Alternatively, you can pass the API key as an argument.
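The lookup order described above can be sketched as a small pre-flight check that fails early, before any transcription starts. Note that resolve_api_key is a hypothetical helper written for illustration and is not part of the package; only the variable name ASSEMBLYAI_API_KEY comes from this description.

```python
import os


def resolve_api_key(explicit_key=None):
    """Return the explicit key if given, else the ASSEMBLYAI_API_KEY variable.

    Hypothetical helper for illustration; not part of the package.
    """
    key = explicit_key or os.environ.get("ASSEMBLYAI_API_KEY")
    if not key:
        raise RuntimeError(
            "No API key found: set ASSEMBLYAI_API_KEY or pass the key explicitly."
        )
    return key
```

The returned value could then be handed to the reader through its api_key argument.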

More info about AssemblyAI:

Usage

The AssemblyAIAudioTranscriptReader needs at least the file_path argument. Audio files can be specified as a URL or a local file path.

from llama_index.readers.assemblyai import AssemblyAIAudioTranscriptReader

audio_file = "https://storage.googleapis.com/aai-docs-samples/nbc.mp3"
# or a local file path: audio_file = "./nbc.mp3"

reader = AssemblyAIAudioTranscriptReader(file_path=audio_file)

docs = reader.load_data()

Note: Calling reader.load_data() blocks until the transcription is finished.
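Because the call blocks, one option is to run it in a worker thread so the rest of the program can continue in the meantime. The following is a minimal sketch using only the standard library. It assumes nothing about the reader beyond a load_data() method, and _FakeReader is a stand-in so the example runs without network access or an API key.

```python
from concurrent.futures import ThreadPoolExecutor


def load_in_background(reader, executor):
    """Submit reader.load_data() to the executor and return a Future."""
    return executor.submit(reader.load_data)


class _FakeReader:
    """Stand-in for illustration only; a real reader would call the API."""

    def load_data(self):
        return ["transcript document"]


with ThreadPoolExecutor(max_workers=1) as pool:
    future = load_in_background(_FakeReader(), pool)
    # Other work can happen here while the transcription runs.
    docs = future.result()  # blocks only at this point
```

With a real reader, the same pattern applies: pass the AssemblyAIAudioTranscriptReader instance in place of _FakeReader().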

The transcribed text is available in the text attribute:

docs[0].text
# "Load time, a new president and new congressional makeup. Same old ..."

The metadata contains the full JSON response with more meta information:

docs[0].metadata
# {'language_code': <LanguageCode.en_us: 'en_us'>,
#  'audio_url': 'https://storage.googleapis.com/aai-docs-samples/nbc.mp3',
#  'punctuate': True,
#  'format_text': True,
#   ...
# }

Transcript Formats

You can specify the transcript_format argument for different formats.

Depending on the format, one or more documents are returned. These are the different TranscriptFormat options:

TEXT: One document with the transcription text
SENTENCES: Multiple documents, splits the transcription by each sentence
PARAGRAPHS: Multiple documents, splits the transcription by each paragraph
SUBTITLES_SRT: One document with the transcript exported in SRT subtitles format
SUBTITLES_VTT: One document with the transcript exported in VTT subtitles format

from llama_index.readers.assemblyai import TranscriptFormat

reader = AssemblyAIAudioTranscriptReader(
    file_path="./your_file.mp3",
    transcript_format=TranscriptFormat.SENTENCES,
)

docs = reader.load_data()
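Since the number of returned documents depends on the format, code that saves a transcript has to handle both the single-document and multi-document cases. The sketch below joins the text of all documents and picks a file extension from the format name. The EXTENSIONS mapping and save_transcript are hypothetical, and _Doc is a stand-in that only assumes documents expose a text attribute.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path

# Hypothetical mapping from TranscriptFormat option names to file extensions.
EXTENSIONS = {
    "TEXT": ".txt",
    "SENTENCES": ".txt",
    "PARAGRAPHS": ".txt",
    "SUBTITLES_SRT": ".srt",
    "SUBTITLES_VTT": ".vtt",
}


def save_transcript(docs, format_name, stem):
    """Write the text of all documents to '<stem><extension>' and return the path."""
    path = Path(f"{stem}{EXTENSIONS[format_name]}")
    # Paragraphs are separated by a blank line, everything else by a newline.
    separator = "\n\n" if format_name == "PARAGRAPHS" else "\n"
    path.write_text(separator.join(doc.text for doc in docs), encoding="utf-8")
    return path


@dataclass
class _Doc:
    """Stand-in document for illustration; only the text attribute is used."""

    text: str


with tempfile.TemporaryDirectory() as tmp:
    sample = [_Doc("First sentence."), _Doc("Second sentence.")]
    out = save_transcript(sample, "SENTENCES", Path(tmp) / "nbc")
    saved = out.read_text(encoding="utf-8")
```

For the single-document formats (TEXT, SUBTITLES_SRT, SUBTITLES_VTT) the join is a no-op and the file contains the text of the one document.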

Transcription Config

You can also specify the config argument to use different audio intelligence models.

Visit the AssemblyAI API Documentation to get an overview of all available models!

import assemblyai as aai

config = aai.TranscriptionConfig(
    speaker_labels=True, auto_chapters=True, entity_detection=True
)

reader = AssemblyAIAudioTranscriptReader(
    file_path="./your_file.mp3", config=config
)

Pass the API Key as argument

In addition to setting the ASSEMBLYAI_API_KEY environment variable, you can pass the API key as an argument.

reader = AssemblyAIAudioTranscriptReader(
    file_path="./your_file.mp3", api_key="YOUR_KEY"
)

Project details

Release history

0.5.0	Mar 12, 2026
0.4.1	Sep 8, 2025 (this version)
0.4.0	Jul 30, 2025
0.3.0	Nov 18, 2024
0.2.0	Aug 22, 2024
0.1.3	Feb 21, 2024
0.1.2	Feb 13, 2024
0.1.1	Feb 12, 2024
0.1.0	Feb 10, 2024
0.0.1	Feb 4, 2024

Download files

Source Distribution

llama_index_readers_assemblyai-0.4.1.tar.gz (5.0 kB)

Uploaded Sep 8, 2025 Source

Built Distribution

llama_index_readers_assemblyai-0.4.1-py3-none-any.whl (4.9 kB)

Uploaded Sep 8, 2025 Python 3

File details

Details for the file llama_index_readers_assemblyai-0.4.1.tar.gz.

File metadata

Download URL: llama_index_readers_assemblyai-0.4.1.tar.gz
Upload date: Sep 8, 2025
Size: 5.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.13

File hashes

Hashes for llama_index_readers_assemblyai-0.4.1.tar.gz
Algorithm	Hash digest
SHA256	`9b6ba133de55f6ff782aa4c74607e6f250523f0643247053332a6d390cf16203`
MD5	`774ea55c7bf38c536c2a5b522a0369b4`
BLAKE2b-256	`8c97e5d3e6bf1d23fe440fbf6b38f1349f9117caaa85bd96b882f08037144230`
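A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch; sha256_of and matches are hypothetical helper names, and the file is assumed to already be on disk.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with the published one (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


# Example call, assuming the source distribution was downloaded to the
# current directory:
# matches(
#     "llama_index_readers_assemblyai-0.4.1.tar.gz",
#     "9b6ba133de55f6ff782aa4c74607e6f250523f0643247053332a6d390cf16203",
# )
```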

File details

Details for the file llama_index_readers_assemblyai-0.4.1-py3-none-any.whl.

File metadata

Download URL: llama_index_readers_assemblyai-0.4.1-py3-none-any.whl
Upload date: Sep 8, 2025
Size: 4.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.13

File hashes

Hashes for llama_index_readers_assemblyai-0.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`85beff37002db3b27767197f244b3209d0d6c6492d0cf53cee2e2129ccc638ab`
MD5	`1a7730a1d4024835733b42475f2804b1`
BLAKE2b-256	`e40e68ce22136ffb2e9fc370c235c3fb52a2346e7199e25efd687e599c25de46`
