Simple and easy to use realtime speech to text
livestt
Installation
pip install livestt # this could take a while
Usage
livestt exposes three main entry points: the wait function, the Recorder class, and the transcribe function.
Wait for the wake word
from livestt import wait
def callback_func():
    print("Wakeword said!")
wait(callback=callback_func)
The wait function takes these args:
- callback (Callable): The function to be called when the wake word is detected.
- args (tuple[any] | None): The arguments to be passed to the callback function. The default is None.
- wake_word (str): The wake word that the function is waiting for. The default is "Sheila".
- prob_threshold (float): The probability threshold for the wake word detection. The default is 0.5.
- chunk_length_s (float): The length of the audio chunk to be processed at a time, in seconds. The default is 2.0.
- stream_chunk_s (float): The length of the audio stream chunk to be processed at a time, in seconds. The default is 0.25.
- debug (bool): If True, debug information will be printed. The default is True.
Raises:
- ValueError: If the wake word is not in the set of valid class labels.
Returns:
- None
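The arguments listed above can be combined in a single call. The following is a minimal sketch, not the library's reference usage: the names on_wake and listen_for_wake_word are hypothetical helpers, and it assumes that the args tuple is unpacked into the callback as positional arguments, which the description implies but does not state outright. The import is placed inside the function so the module can be loaded on a machine without livestt or a microphone.

```python
def on_wake(message: str, log: list) -> None:
    """Hypothetical callback: record and print a message when the wake word fires."""
    log.append(message)
    print(message)


def listen_for_wake_word(log: list, wake_word: str = "Sheila") -> None:
    """Wait for the wake word using the documented keyword arguments."""
    # Lazy import: livestt is only needed when we actually listen.
    from livestt import wait

    try:
        wait(
            callback=on_wake,
            # Assumption: these values are passed to on_wake as positional args.
            args=("Wake word heard!", log),
            wake_word=wake_word,
            prob_threshold=0.7,   # stricter than the documented default of 0.5
            chunk_length_s=2.0,
            stream_chunk_s=0.25,
            debug=False,          # the documented default is True
        )
    except ValueError as err:
        # Documented behaviour: raised when wake_word is not a valid class label.
        print(f"Unsupported wake word {wake_word!r}: {err}")
```

Raising prob_threshold should make detection less prone to false triggers at the cost of missing quieter utterances; the value 0.7 here is only an illustration.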
Record audio
from livestt import Recorder
import time
recorder = Recorder("test.wav")
recorder.start() # Starts recorder thread
time.sleep(5) # Waits before ending thread
recorder.end() # Writes recording to "test.wav"
When started, the Recorder class launches a new recorder thread that listens to the audio until the thread is ended. When the thread ends, the recording is saved to a file. The Recorder class takes these args:
- chunk (int): The number of audio frames per buffer.
- format (int): The sample format for the recording.
- channels (int): The number of channels for the recording.
- fs (int): The sample rate of the recording.
- filename (str): The name of the output file where the recording will be saved. The file must currently be a .wav file.
- listening (bool): A flag indicating whether the recorder is currently recording.
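Because the recording is only written when the thread ends, it can help to make sure end() always runs. The sketch below wraps the start/sleep/end sequence in a hypothetical helper, record_for, which is not part of livestt; it checks the .wav requirement up front and uses try/finally so the file is still written if the wait is interrupted. It assumes, as in the usage above, that the filename can be passed as the first positional argument.

```python
import time


def record_for(seconds: float, filename: str = "test.wav") -> str:
    """Record audio for a fixed duration and return the output filename."""
    # The README states the output file must currently be a .wav file.
    if not filename.lower().endswith(".wav"):
        raise ValueError("Recorder currently only supports .wav output files")

    # Lazy import: livestt (and an audio device) is only needed when recording.
    from livestt import Recorder

    recorder = Recorder(filename)
    recorder.start()          # starts the recorder thread
    try:
        time.sleep(seconds)   # keep recording for the requested duration
    finally:
        recorder.end()        # stops the thread and writes the file
    return filename
```

A call such as record_for(5, "test.wav") would then mirror the five-second example above.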
Transcribe a given audio file
from livestt import transcribe
transcription = transcribe("test.wav")
for t in transcription:
    print(t.text)
The transcribe function transcribes the given audio file and outputs the transcribed text along with other information. The transcribe function takes these args:
- input_file (str): The path to the audio file to be transcribed.
- language (str): The language of the audio file. The default is "en" (English).
- model_name (str): The name of the model to be used for transcription. The default is "tiny.en".
This function yields a tuple with the following fields:
- text (str): The transcribed text.
- language_probability (float): The probability of the detected language.
- language (str): The detected language.
- segment_end (float): The end time of the transcribed segment.
- segment_start (float): The start time of the transcribed segment.
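The yielded fields make it straightforward to print a timestamped transcript. The following sketch assumes each yielded item exposes the fields above as attributes (as t.text does in the usage example) and that language and model_name can be passed as keyword arguments; format_segment and print_transcript are hypothetical helper names, not part of livestt.

```python
def format_segment(segment) -> str:
    """Render one transcribed segment as '[start -> end] text'."""
    return (
        f"[{segment.segment_start:6.2f}s -> {segment.segment_end:6.2f}s] "
        f"{segment.text.strip()}"
    )


def print_transcript(path: str, language: str = "en", model_name: str = "tiny.en") -> None:
    """Transcribe an audio file and print each segment with its timestamps."""
    # Lazy import: the model is only loaded when a transcript is requested.
    from livestt import transcribe

    for segment in transcribe(path, language=language, model_name=model_name):
        print(format_segment(segment))
```

Keeping the formatting in its own function means it can be checked with any object that has the same three attributes, without loading a model.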
Examples
For a full example, check out the file example/main.py.
Tech stack
- PyAudio for recording audio.
- faster-whisper for transcription.
- openWakeWord for wakeword detection.
Acknowledgments
Thanks to Kolja for the inspiration. I couldn't figure out how to use his library so I made my own. Check this out here.
Contribution
Contributions are always welcome! Open an issue or make a PR, or just contact me on Discord: @a3l6