A Python package for recording, transcribing, and converting audio

AudioProcessor

AudioProcessor is a Python class for recording audio from the microphone, transcribing audio files to text, and converting text to audio.

Public Methods

`__init__(self, frame_duration_ms=30, sample_rate=16000, chunk_size=1024, vad_mode=2)`

Create an AudioProcessor instance.

Parameters:

- `frame_duration_ms`: Frame duration in milliseconds (default: 30 ms)
- `sample_rate`: Audio sample rate (default: 16000 Hz)
- `chunk_size`: Audio chunk size for processing (default: 1024)
- `vad_mode`: Voice Activity Detection (VAD) mode (default: 2)
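
To make the defaults concrete, the sketch below works out how many samples and bytes one frame holds. It assumes 16-bit mono PCM, which this description does not state, and `frame_size` is a hypothetical helper rather than part of the package.

```python
def frame_size(sample_rate=16000, frame_duration_ms=30, bytes_per_sample=2):
    """Return (samples, bytes) held by one frame of mono PCM audio.

    bytes_per_sample=2 assumes 16-bit samples (an assumption, not
    something the package description confirms).
    """
    samples = (sample_rate * frame_duration_ms) // 1000
    return samples, samples * bytes_per_sample


samples, n_bytes = frame_size()
print(samples, n_bytes)  # 480 960
```

With the default values, a 30 ms frame at 16000 Hz therefore covers 480 samples.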

`record_audio(self, output_file_path)`

Record audio from the microphone and save it to a .wav file.

Parameters:

- `output_file_path`: Output file path for the recorded audio

`audio_to_text(self, audio_file_path, **kwargs)`

Transcribe an audio file.

Parameters:

- `audio_file_path`: Input audio file path
- `**kwargs`: Keyword arguments for preprocessing and other options

Returns:

Transcribed text as a string

`text_to_audio(self, text, audio_file_path, lang='en', volume=1.0, sample_rate=44100, bit_depth=16)`

Convert text to an audio file and play it.

Parameters:

- `text`: Input text to convert to audio
- `audio_file_path`: Output audio file path
- `lang`: Language of the text (default: 'en')
- `volume`: Volume level of the output audio (default: 1.0)
- `sample_rate`: Audio sample rate (default: 44100 Hz)
- `bit_depth`: Audio bit depth (default: 16)
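
The three public methods can be chained into a record, transcribe, and speak round trip. The sketch below is only an illustration: `echo_back` is a hypothetical helper, and it takes an already constructed AudioProcessor as an argument so that no import path (which this description does not give) has to be assumed.

```python
def echo_back(processor, recording_path, reply_path):
    """Record speech, transcribe it, and speak the transcript back.

    `processor` is expected to be an AudioProcessor instance. Only the
    public methods documented above are called, with the documented
    argument names.
    """
    # Capture microphone input into a .wav file.
    processor.record_audio(recording_path)
    # Turn the recording into text.
    text = processor.audio_to_text(recording_path)
    # Synthesize the transcript to a new file; the method also plays it.
    processor.text_to_audio(text, reply_path, lang="en", volume=0.8)
    return text
```

Passing the instance in also makes the helper easy to exercise with a stand-in object when no microphone is available.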

Private Methods

These methods are used internally by the class and should not be called directly by the user.

`_save_audio_to_file(self, audio, file_path)`

Save the audio data to a .wav file.
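
As a rough idea of what saving audio data to a .wav file involves, here is a minimal sketch using the standard-library `wave` module. It is not the package's implementation: `save_pcm_to_wav` and its parameters are hypothetical, and 16-bit mono PCM input is assumed.

```python
import wave


def save_pcm_to_wav(pcm_bytes, file_path, sample_rate=16000,
                    channels=1, sample_width=2):
    """Write raw PCM bytes to a .wav file.

    Defaults assume 16-bit (2 bytes per sample) mono audio at 16000 Hz,
    which is an assumption made for this sketch.
    """
    with wave.open(file_path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)      # header is finalized on close
```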

`_process_audio_text(self, audio_text, **kwargs)`

Process `audio_text` and return the transcribed text.

`_parse_recognition_response(self, response, show_all)`

Parse the response from the speech recognizer.

`_process_recognized_audio(self, audio)`

Process the recognized audio and return the transcribed text.

`_play_audio_file(self, audio_file_path, volume=1.0, normalize=False)`

Play the audio file with the specified volume and optional normalization.
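
How volume scaling and normalization might act on the samples can be sketched as follows. This is an assumption about the general technique, not the package's code: `apply_volume` is a hypothetical helper that works on 16-bit PCM in the machine's native byte order and does not play anything.

```python
from array import array


def apply_volume(pcm_bytes, volume=1.0, normalize=False):
    """Return 16-bit PCM bytes scaled by `volume`.

    If `normalize` is true, the samples are first scaled so that the
    loudest one reaches full scale. Results are clipped to the valid
    16-bit range.
    """
    samples = array("h", pcm_bytes)  # "h" = signed 16-bit, native byte order
    if normalize and samples:
        peak = max(abs(s) for s in samples)
        if peak:
            gain = 32767 / peak
            samples = array(
                "h", (max(-32768, min(32767, round(s * gain))) for s in samples)
            )
    scaled = (max(-32768, min(32767, round(s * volume))) for s in samples)
    return array("h", scaled).tobytes()
```

Clipping keeps a `volume` above 1.0 from overflowing the sample range, at the cost of distortion on loud passages.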

`_generate_frames(self, audio, sample_rate, frame_duration_ms)`

Generate audio frames from raw audio data.
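
Frame generation of this kind usually means slicing the raw byte stream into fixed-length pieces. The sketch below shows one way to do that, assuming 16-bit mono PCM. `generate_frames` is a hypothetical stand-alone function, not the private method itself.

```python
def generate_frames(audio, sample_rate, frame_duration_ms, bytes_per_sample=2):
    """Yield consecutive fixed-length frames from raw mono PCM bytes.

    A trailing partial frame is dropped. bytes_per_sample=2 assumes
    16-bit audio.
    """
    frame_bytes = ((sample_rate * frame_duration_ms) // 1000) * bytes_per_sample
    for start in range(0, len(audio) - frame_bytes + 1, frame_bytes):
        yield audio[start:start + frame_bytes]


# 2000 bytes at 16000 Hz with 30 ms frames: two full 960-byte frames.
frames = list(generate_frames(bytes(2000), 16000, 30))
print(len(frames), len(frames[0]))  # 2 960
```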

`_collect_voiced_segments(self, frames, vad, sample_rate, ratio=2)`

Collect voiced segments from audio frames using the Voice Activity Detection (VAD) algorithm.
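
The general idea of collecting voiced segments is to group runs of frames that a detector marks as speech. The sketch below illustrates that grouping only: `is_speech` is a hypothetical callable standing in for the VAD, and the `ratio` parameter is left out because this description does not say how it is used.

```python
def collect_voiced_segments(frames, is_speech):
    """Join runs of consecutive voiced frames into byte segments.

    `is_speech(frame)` is a stand-in for a real voice activity detector
    and must return True for frames that contain speech.
    """
    segments, current = [], []
    for frame in frames:
        if is_speech(frame):
            current.append(frame)
        elif current:
            # An unvoiced frame ends the current run.
            segments.append(b"".join(current))
            current = []
    if current:
        segments.append(b"".join(current))
    return segments
```

A real detector would typically smooth its decisions over several frames so that brief pauses do not split a sentence.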

Release history

- 0.2.1: May 7, 2023
- 0.2.0: Apr 30, 2023
- 0.1.4: Apr 28, 2023 (this version)
- 0.1.3: Apr 28, 2023
- 0.1.2: Apr 28, 2023
- 0.1.1: Apr 28, 2023 (yanked; reason: "Deleted wrong thing")
- 0.1.0: Apr 28, 2023

Download files

Source Distribution

AudioProcessor-0.1.4.tar.gz (4.5 kB), uploaded Apr 28, 2023

Built Distribution

AudioProcessor-0.1.4-py3-none-any.whl (5.1 kB), uploaded Apr 28, 2023, Python 3

Hashes for AudioProcessor-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`6220acd31bf13a2421ebd9c03feae96792b139947399134bddfe37262a8dde96`
MD5	`49f19dabaadc6e8f1e932314b5f3d16b`
BLAKE2b-256	`442e2278e14f76b2dc9fdaff055766195102a3869e2a85e6137e0547ff2492ca`

Hashes for AudioProcessor-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5abfcc3b19c93109cd24ae07e564cde09fd118ed6f4871368b30a8c59fe6e4f3`
MD5	`c4a56bcddeb521d20e77bf7cef29c71b`
BLAKE2b-256	`00c8a2a3565cf0c57abf349a6f4c6bc2679b2391fa97870e7e9a83426381c158`
