Transcribes audio files

pip install audiotranser

Tested against Windows 10 / Python 3.10 / Anaconda

Uses the models from https://huggingface.co/ggerganov/whisper.cpp/tree/main

    Args:
        inputfile: path to the input audio file
        small_large: model size (small or large)
        blas: use BLAS library for faster decoding
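        silence_threshold: silence threshold (the usage example passes -30, which suggests a level in dBFS rather than milliseconds); silence removal is skipped if 0 or None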
        min_silence_len: minimum silence length in milliseconds
        keep_silence: minimum silence length to keep after silence removal
        threads: number of threads to use
        processors: number of processors to use
        offset_t: time offset in milliseconds
        offset_n: segment index offset
        duration: duration of audio to process in milliseconds
        max_context: maximum number of text context tokens to store
        max_len: maximum segment length in characters
        best_of: number of best candidates to keep
        beam_size: beam size for beam search
        word_thold: word timestamp probability threshold
        entropy_thold: entropy threshold for decoder fail
        logprob_thold: log probability threshold for decoder fail
        speed_up: speed up audio by x2 (reduced accuracy)
        translate: translate from source language to english
        diarize: stereo audio diarization
        language: spoken language ('auto' for auto_detect)

    Returns:
        A pandas DataFrame with the inference results, or the path to the output CSV file if pd.read_csv fails.
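
Since offset_t and duration are both given in milliseconds, a small conversion helper can make call sites easier to read. The sketch below is not part of audiotranser; the name to_milliseconds and the example values are hypothetical.

```python
from datetime import timedelta


def to_milliseconds(hours=0, minutes=0, seconds=0.0):
    """Convert a clock position to whole milliseconds."""
    delta = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return round(delta.total_seconds() * 1000)


# Hypothetical values: start 1 min 30 s into the file, process 45 s of audio.
window = {
    "offset_t": to_milliseconds(minutes=1, seconds=30),
    "duration": to_milliseconds(seconds=45),
}
print(window)  # {'offset_t': 90000, 'duration': 45000}
```

The resulting dict could then be unpacked into the call, for example transcribe_audio(inputfile=..., **window), alongside the other arguments.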

from audiotranser import transcribe_audio
df = transcribe_audio(
    inputfile=r"C:\untitled.wav",
    small_large="large",
    blas=True,
    silence_threshold=-30,  # ignored if == 0 or None
    min_silence_len=500,  # ignored if silence_threshold == 0 or None
    keep_silence=1000,  # ignored if silence_threshold == 0 or None
    threads=3,  # number of threads to use during computation
    processors=1,  # number of processors to use during computation
    offset_t=0,  # time offset in milliseconds
    offset_n=0,  # segment index offset
    duration=0,  # duration of audio to process in milliseconds
    max_context=-1,  # maximum number of text context tokens to store
    max_len=0,  # maximum segment length in characters
    best_of=5,  # number of best candidates to keep
    beam_size=-1,  # beam size for beam search
    word_thold=0.01,  # word timestamp probability threshold
    entropy_thold=2.40,  # entropy threshold for decoder fail
    logprob_thold=-1.00,  # log probability threshold for decoder fail
    speed_up=True,  # speed up audio by x2 (reduced accuracy)
    translate=False,  # translate from source language to english
    diarize=False,  # stereo audio diarization
    language="en",  # spoken language ('auto' for auto_detect)
)
print(df)
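
Because the return value may be either a DataFrame or a CSV path, calling code may want to normalize it before use. The following is a minimal sketch, not part of audiotranser: the helper name rows_from_result is hypothetical, and it assumes the fallback CSV is comma-delimited with a header row, which the description does not confirm.

```python
import csv
import os


def rows_from_result(result):
    """Return a list of row dicts whether `result` is a table or a CSV path."""
    if isinstance(result, (str, os.PathLike)):
        # Fallback case: a path to the CSV file was returned instead of a table.
        # Assumption: comma-delimited text with a header row.
        with open(result, newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))
    # Otherwise assume a pandas DataFrame, which provides to_dict("records").
    return result.to_dict("records")
```

With the example above, rows_from_result(df) would give one dict per row in either case. If pd.read_csv failed because the file is malformed, the csv module may also struggle with it, so the fallback rows should be checked by hand.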

Release history

0.10, released Aug 6, 2023

Download files

Source Distribution

audiotranser-0.10.tar.gz (14.0 MB)

Uploaded Aug 6, 2023 Source

Built Distribution

audiotranser-0.10-py3-none-any.whl (14.2 MB)

Uploaded Aug 6, 2023 Python 3

File details

Details for the file audiotranser-0.10.tar.gz.

File metadata

Download URL: audiotranser-0.10.tar.gz
Upload date: Aug 6, 2023
Size: 14.0 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.10

File hashes

Hashes for audiotranser-0.10.tar.gz
Algorithm	Hash digest
SHA256	`f60c1f2b32d281365efbcb1ee8a01f2788eb61f6b4e004e6ffb659952a2b4253`
MD5	`258564d7a0a32b48ab05b29dd42089d3`
BLAKE2b-256	`29959de2149217988d09d7836d3aa2f695f02ae85df1e003e671361083fece2b`
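
A downloaded archive can be checked against the SHA256 digest listed above. This is a minimal sketch using only the standard library; the function names are hypothetical, and the commented-out call assumes the archive was saved to the current directory under its published name.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Digest copied from the table above for audiotranser-0.10.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "f60c1f2b32d281365efbcb1ee8a01f2788eb61f6b4e004e6ffb659952a2b4253"
)
# Assumes the archive sits in the current directory:
# print(matches_published_hash("audiotranser-0.10.tar.gz", EXPECTED_SDIST_SHA256))
```

Reading in chunks keeps memory use flat for the 14 MB archives; the same helper works for the wheel if its SHA256 digest is passed instead.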

File details

Details for the file audiotranser-0.10-py3-none-any.whl.

File metadata

Download URL: audiotranser-0.10-py3-none-any.whl
Upload date: Aug 6, 2023
Size: 14.2 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.10

File hashes

Hashes for audiotranser-0.10-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5e6d51355d5086f44ce3f2b0a43faffb7af58a2c3de06a960d21a6862dd7d765`
MD5	`36dbed10f37d5af710339c97600a26c9`
BLAKE2b-256	`380f8c0a2ec09dd91caf445112310574ddd0e217598ccff9a166ff4c66ed37e1`
