Skip to main content

Utilities for transcribing audio files using the Whisper API.

Project description

Transcription with OpenAI's Whisper is very accurate, but it doesn't natively support speaker labelling (diarisation).

Existing libraries for diarisation like pyannote rely on audio features to separate and identify speakers, but are computationally expensive and often inaccurate. A common failure mode arises when the speaker changes their audio quality, such as when they move closer to or further from the microphone. This can cause the diarisation algorithm to incorrectly identify the speaker as a new person.

I had a simple hypothesis: the cues from transcribed speech are sufficient to identify speakers. I developed a pipeline which passes the transcribed text to GPT-4o with a prompt asking it to identify the speaker.

flowchart LR
    A[Input file #40;audio, video#41;]
    B[Whisper transcription]
    C[Text output]
    D[Label and tidy with GPT-4o]
    E[Output in user-defined format]
    A-- Convert to WAV -->B
    B-->C
    C-->D
    D-->C
    D-->E

Installation

pip install precisetranscribe

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

precisetranscribe-0.1.6.tar.gz (6.9 kB view details)

Uploaded Source

Built Distribution

precisetranscribe-0.1.6-py3-none-any.whl (7.4 kB view details)

Uploaded Python 3

File details

Details for the file precisetranscribe-0.1.6.tar.gz.

File metadata

  • Download URL: precisetranscribe-0.1.6.tar.gz
  • Upload date:
  • Size: 6.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.19

File hashes

Hashes for precisetranscribe-0.1.6.tar.gz
Algorithm Hash digest
SHA256 1d146a55ef8766c49bfbb98716fae37df9e86d0b4b72d2e554d4b5e32cb19274
MD5 d29d97e1ce9e6b1e3e8c761ecc4d34e5
BLAKE2b-256 5df81aa67a47a362078bcc132684fdd66fe1bc9df7184567be4544cd7921bf66

See more details on using hashes here.

File details

Details for the file precisetranscribe-0.1.6-py3-none-any.whl.

File metadata

File hashes

Hashes for precisetranscribe-0.1.6-py3-none-any.whl
Algorithm Hash digest
SHA256 2241aeb25d414b3273ed3a445b5132113973101b618a8fb3b275116102d2baf0
MD5 49e3266f4c93e8506a95e43a0356b90b
BLAKE2b-256 d5b29d5f494817dd5418d3670bcfb5bed27c94beed5eb173ad4cf28e1890d9c5

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page