Skip to main content

Utilities for transcribing audio files using the Whisper API.

Project description

Transcription with OpenAI's Whisper is very accurate, but it doesn't natively support speaker labelling (diarisation).

Existing libraries for diarisation like pyannote rely on audio features to separate and identify speakers, but are computationally expensive and often inaccurate. A common failure mode arises when the speaker changes their audio quality, such as when they move closer to or further from the microphone. This can cause the diarisation algorithm to incorrectly identify the speaker as a new person.

I had a simple hypothesis: the cues from transcribed speech are sufficient to identify speakers. I developed a pipeline which passes the transcribed text to GPT-4o with a prompt asking it to identify the speaker.

flowchart LR
    A[Input file #40;audio, video#41;]
    B[Whisper transcription]
    C[Text output]
    D[Label and tidy with GPT-4o]
    E[Output in user-defined format]
    A-- Convert to WAV -->B
    B-->C
    C-->D
    D-->C
    D-->E

Installation

pip install precisetranscribe

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

precisetranscribe-0.1.3.tar.gz (6.5 kB view details)

Uploaded Source

Built Distribution

precisetranscribe-0.1.3-py3-none-any.whl (7.1 kB view details)

Uploaded Python 3

File details

Details for the file precisetranscribe-0.1.3.tar.gz.

File metadata

  • Download URL: precisetranscribe-0.1.3.tar.gz
  • Upload date:
  • Size: 6.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.19

File hashes

Hashes for precisetranscribe-0.1.3.tar.gz
Algorithm Hash digest
SHA256 d96783d9ae101f670b51f41d4f7a0631983be96c12ab77726915ccb6b6057af4
MD5 d86ce8139ef46f1bab4a9056a56819e4
BLAKE2b-256 dada425f3bc08ca1d8b70c25921086dbe138870868b95f2f7143b7c83764637a

See more details on using hashes here.

File details

Details for the file precisetranscribe-0.1.3-py3-none-any.whl.

File metadata

File hashes

Hashes for precisetranscribe-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 07c2b1bca2eef1d20974c7b789e1547598aa0ff301cfdcc0ce36dfef3bb46b6f
MD5 6caf08b45a9cdf824e12280f246a4cb4
BLAKE2b-256 2864f770c06fa4130d219e5b6fadf1686eb99818dc1f2d33e8e35e65bbd085b2

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page