Skip to main content

Utilities for transcribing audio files using the Whisper API.

Project description

Transcription with OpenAI's Whisper is very accurate, but it doesn't natively support speaker labelling (diarisation).

Existing libraries for diarisation like pyannote rely on audio features to separate and identify speakers, but are computationally expensive and often inaccurate. A common failure mode arises when the speaker changes their audio quality, such as when they move closer to or further from the microphone. This can cause the diarisation algorithm to incorrectly identify the speaker as a new person.

I had a simple hypothesis: the cues from transcribed speech are sufficient to identify speakers. I developed a pipeline which passes the transcribed text to GPT-4o with a prompt asking it to identify the speaker.

flowchart LR
    A[Input file #40;audio, video#41;]
    B[Whisper transcription]
    C[Text output]
    D[Label and tidy with GPT-4o]
    E[Output in user-defined format]
    A-- Convert to WAV -->B
    B-->C
    C-->D
    D-->C
    D-->E

Installation

pip install precisetranscribe

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

precisetranscribe-0.2.1.tar.gz (7.7 kB view details)

Uploaded Source

Built Distribution

precisetranscribe-0.2.1-py3-none-any.whl (8.2 kB view details)

Uploaded Python 3

File details

Details for the file precisetranscribe-0.2.1.tar.gz.

File metadata

  • Download URL: precisetranscribe-0.2.1.tar.gz
  • Upload date:
  • Size: 7.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.19

File hashes

Hashes for precisetranscribe-0.2.1.tar.gz
Algorithm Hash digest
SHA256 b7dc2ef8ee4262fef44909ace7c39610ab8934f579a4f7d4f206a53b4c8e4e2f
MD5 53b388a92054c2ac1d7a8bc16db30426
BLAKE2b-256 ae7bb3ced5996b86562dbcb1ded3e42657651c4dbfa5ed0a4a40a0dd5aad62e2

See more details on using hashes here.

File details

Details for the file precisetranscribe-0.2.1-py3-none-any.whl.

File metadata

File hashes

Hashes for precisetranscribe-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 4b6bbe41ff29cd7b992196970f79b176557f4a0b511d734b16e918c6370de70b
MD5 4f84035126efee289711141e3a40f36b
BLAKE2b-256 ee0f9e0792831fc2e0f0e8332d02a9a36f1351a678ee6efacb6f8911afc605f2

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page