Utilities for transcribing audio files using the Whisper API.
Project description
Transcription with OpenAI's Whisper is very accurate, but it doesn't natively support speaker labelling (diarisation).
Existing libraries for diarisation like pyannote rely on audio features to separate and identify speakers, but are computationally expensive and often inaccurate. A common failure mode arises when the speaker changes their audio quality, such as when they move closer to or further from the microphone. This can cause the diarisation algorithm to incorrectly identify the speaker as a new person.
I had a simple hypothesis: the cues from transcribed speech are sufficient to identify speakers. I developed a pipeline which passes the transcribed text to GPT-4o with a prompt asking it to identify the speaker.
flowchart LR
A[Input file #40;audio, video#41;]
B[Whisper transcription]
C[Text output]
D[Label and tidy with GPT-4o]
E[Output in user-defined format]
A-- Convert to WAV -->B
B-->C
C-->D
D-->C
D-->E
Installation
pip install precisetranscribe
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file precisetranscribe-0.1.5.tar.gz
.
File metadata
- Download URL: precisetranscribe-0.1.5.tar.gz
- Upload date:
- Size: 6.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.9.19
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 9b39a111d6a5ef1bf34dfcaadbc7fc6245c20a41883141452be49328fd5c6d63 |
|
MD5 | ada9f82ba9dccbca1ce675bedabd7126 |
|
BLAKE2b-256 | 4c2a7e472573dc6ca736ab29ac161b9b054eb674b86a509e04716361bc941544 |
File details
Details for the file precisetranscribe-0.1.5-py3-none-any.whl
.
File metadata
- Download URL: precisetranscribe-0.1.5-py3-none-any.whl
- Upload date:
- Size: 7.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.9.19
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 49b3ae1f5a92e8895258415e65b6ea2f88f14f7f7e83b962a692716899bd7977 |
|
MD5 | 65666340602ed60ada4b70727fba3e99 |
|
BLAKE2b-256 | b4b4fb4f58d9629118ad711d9dd699c6a1f7fe47f945e497ca89d477b7913572 |