Skip to main content

Utilities for transcribing audio files using the Whisper API.

Project description

Transcription with OpenAI's Whisper is very accurate, but it doesn't natively support speaker labelling (diarisation).

Existing libraries for diarisation like pyannote rely on audio features to separate and identify speakers, but are computationally expensive and often inaccurate. A common failure mode arises when the speaker changes their audio quality, such as when they move closer to or further from the microphone. This can cause the diarisation algorithm to incorrectly identify the speaker as a new person.

I had a simple hypothesis: the cues from transcribed speech are sufficient to identify speakers. I developed a pipeline which passes the transcribed text to GPT-4o with a prompt asking it to identify the speaker.

flowchart LR
    A[Input file #40;audio, video#41;]
    B[Whisper transcription]
    C[Text output]
    D[Label and tidy with GPT-4o]
    E[Output in user-defined format]
    A-- Convert to WAV -->B
    B-->C
    C-->D
    D-->C
    D-->E

Installation

pip install precisetranscribe

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

precisetranscribe-0.1.2-py3-none-any.whl (7.1 kB view details)

Uploaded Python 3

File details

Details for the file precisetranscribe-0.1.2-py3-none-any.whl.

File metadata

File hashes

Hashes for precisetranscribe-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 c7cd0a9056606ff9e5d3a85cbb904d3af3b32d2260d8f536caa49d10b43fdce6
MD5 3ee3b2450c50f492a2f733b9a7f8e6aa
BLAKE2b-256 cd7252a5959bc179aa4fd7201f0bb401617344c410b7dc8222dbfd2147a6661f

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page