Skip to main content

Utilities for transcribing audio files using the Whisper API.

Project description

Transcription with OpenAI's Whisper is very accurate, but it doesn't natively support speaker labelling (diarisation).

Existing libraries for diarisation like pyannote rely on audio features to separate and identify speakers, but are computationally expensive and often inaccurate. A common failure mode arises when the speaker changes their audio quality, such as when they move closer to or further from the microphone. This can cause the diarisation algorithm to incorrectly identify the speaker as a new person.

I had a simple hypothesis: the cues from transcribed speech are sufficient to identify speakers. I developed a pipeline which passes the transcribed text to GPT-4o with a prompt asking it to identify the speaker.

flowchart LR
    A[Input file #40;audio, video#41;]
    B[Whisper transcription]
    C[Text output]
    D[Label and tidy with GPT-4o]
    E[Output in user-defined format]
    A-- Convert to WAV -->B
    B-->C
    C-->D
    D-->C
    D-->E

Installation

pip install precisetranscribe

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

precisetranscribe-0.1.7.tar.gz (6.9 kB view details)

Uploaded Source

Built Distribution

precisetranscribe-0.1.7-py3-none-any.whl (7.4 kB view details)

Uploaded Python 3

File details

Details for the file precisetranscribe-0.1.7.tar.gz.

File metadata

  • Download URL: precisetranscribe-0.1.7.tar.gz
  • Upload date:
  • Size: 6.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.19

File hashes

Hashes for precisetranscribe-0.1.7.tar.gz
Algorithm Hash digest
SHA256 d0ceae486c421374f5af263a893fb9269ee621f8b5440294d7d12c862cdce6c1
MD5 7d3fa35163ecda422f7911e1f5d11f9a
BLAKE2b-256 20d4a493e26d5fb328ed6716de7ea6974d5e4a057af2278957021387929ecb27

See more details on using hashes here.

File details

Details for the file precisetranscribe-0.1.7-py3-none-any.whl.

File metadata

File hashes

Hashes for precisetranscribe-0.1.7-py3-none-any.whl
Algorithm Hash digest
SHA256 9102efe7793d7ec1bc5a370c346c14133f9c252a8d1038ed64e61f20bf5b2eed
MD5 d22ee5690b387bf922c96577c4354859
BLAKE2b-256 18453233474ccb371f5baf23da65b80b03c0aba301e3134addd04e35c636a4f8

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page