Skip to main content

Utilities for transcribing audio files using the Whisper API.

Project description

Transcription with OpenAI's Whisper is very accurate, but it doesn't natively support speaker labelling (diarisation).

Existing libraries for diarisation like pyannote rely on audio features to separate and identify speakers, but are computationally expensive and often inaccurate. A common failure mode arises when the speaker changes their audio quality, such as when they move closer to or further from the microphone. This can cause the diarisation algorithm to incorrectly identify the speaker as a new person.

I had a simple hypothesis: the cues from transcribed speech are sufficient to identify speakers. I developed a pipeline which passes the transcribed text to GPT-4o with a prompt asking it to identify the speaker.

flowchart LR
    A[Input file #40;audio, video#41;]
    B[Whisper transcription]
    C[Text output]
    D[Label and tidy with GPT-4o]
    E[Output in user-defined format]
    A-- Convert to WAV -->B
    B-->C
    C-->D
    D-->C
    D-->E

Installation

pip install precisetranscribe

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

precisetranscribe-0.1.1.tar.gz (6.5 kB view details)

Uploaded Source

Built Distribution

precisetranscribe-0.1.1-py3-none-any.whl (7.1 kB view details)

Uploaded Python 3

File details

Details for the file precisetranscribe-0.1.1.tar.gz.

File metadata

  • Download URL: precisetranscribe-0.1.1.tar.gz
  • Upload date:
  • Size: 6.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.19

File hashes

Hashes for precisetranscribe-0.1.1.tar.gz
Algorithm Hash digest
SHA256 40e7b0c7064f8d3594266a53da38c4fc743b57748ccc1711cdc9cd5eb453a099
MD5 798b0c7a01e3edab533dfd41eb96a742
BLAKE2b-256 8a12b46683bb7dace06e80b66da96c133f1d86e8a0964ad240cad02f3a464725

See more details on using hashes here.

File details

Details for the file precisetranscribe-0.1.1-py3-none-any.whl.

File metadata

File hashes

Hashes for precisetranscribe-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 8aef2ce78d371dc8050623c54b0d58c91d0a088a1916603c1ea2bebb160cbc3e
MD5 a84dd4571b402820d6bf025531419117
BLAKE2b-256 67b42d91cde302a192d4370bd4238f04d3d432c0c7910062a3681a04d339aefd

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page