SONATA: SOund and Narrative Advanced Transcription Assistant
SONATA is an advanced Automatic Speech Recognition (ASR) system that captures the symphony of human expression by recognizing and transcribing both verbal content and emotive sounds.
Features
- High-accuracy speech-to-text transcription
- Recognition of emotive sounds and non-verbal cues
- Support for tags like <laugh>, <sigh>, <yawn>, <surprise>, <inhale>, <groan>, <cough>, <sneeze>, <sniffle>
- Open-source and extensible architecture
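As a rough illustration of working with these tags, the sketch below separates inline tags from the spoken text of a transcript string. It assumes tags appear inline in the form `<name>`; the helper `split_tags` is hypothetical and not part of the SONATA API.

```python
import re

# Tag names taken from the feature list above.
EMOTIVE_TAGS = ("laugh", "sigh", "yawn", "surprise", "inhale",
                "groan", "cough", "sneeze", "sniffle")

# Matches any of the known tags, e.g. "<laugh>".
TAG_PATTERN = re.compile(r"<(" + "|".join(EMOTIVE_TAGS) + r")>")


def split_tags(tagged_text):
    """Hypothetical helper: return (plain_text, tags) for a tagged transcript."""
    tags = TAG_PATTERN.findall(tagged_text)
    # Drop the tags, then collapse the whitespace they leave behind.
    plain = " ".join(TAG_PATTERN.sub(" ", tagged_text).split())
    return plain, tags


plain, tags = split_tags("That was great <laugh> but I'm tired <yawn>")
print(plain)
print(tags)
```

Unknown tags are left untouched in the text, so the set of recognized names stays in one place (`EMOTIVE_TAGS`).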
Installation
Install the package from PyPI:
pip install sonata-asr
Or install from source:
git clone https://github.com/hwk06023/SONATA.git
cd SONATA
pip install -e .
Usage Examples
Basic Transcription
from sonata import Transcriber
# Initialize the transcriber
transcriber = Transcriber()
# Transcribe an audio file
result = transcriber.transcribe("path/to/audio.wav")
print(result)
Detecting Emotive Sounds
from sonata.core import EmotiveDetector
# Initialize the emotive detector
detector = EmotiveDetector(threshold=0.6)
# Detect emotive events in an audio file
events = detector.detect_events("path/to/audio.wav")
# Print the detected events
for event in events:
    print(f"{event.type}: {event.start_time:.2f}s - {event.end_time:.2f}s (confidence: {event.confidence:.2f})")
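Once events are detected, a common next step is to summarize them. The sketch below counts events per type after a confidence cut-off. The `Event` dataclass is a hypothetical stand-in that only mirrors the four attributes printed above, and the sample values are made up for illustration; the real objects returned by `detect_events` may differ.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical stand-in with the attributes used in the example above."""
    type: str
    start_time: float
    end_time: float
    confidence: float


def summarize(events, min_confidence=0.6):
    """Count events per type, keeping those at or above min_confidence."""
    kept = [e for e in events if e.confidence >= min_confidence]
    return Counter(e.type for e in kept)


# Made-up sample data, not real detector output.
events = [
    Event("laugh", 1.20, 2.05, 0.91),
    Event("cough", 4.00, 4.30, 0.55),
    Event("laugh", 7.10, 7.80, 0.72),
]
print(summarize(events))
```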
Full Pipeline
from sonata import Sonata
# Initialize SONATA with default settings
sonata = Sonata()
# Process audio file - transcribes speech and detects emotive sounds
result = sonata.process("path/to/audio.wav")
# Print the text with emotive tags
print(result.text_with_tags)
# Save the result
sonata.save_output(result, "output.json")
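To make the idea of "text with emotive tags" concrete, here is a minimal sketch of one way timed words and timed events could be merged into a single tagged string. This is an illustration under stated assumptions, not SONATA's actual implementation: the function `interleave` and the tuple layout are hypothetical, and the sample timings are invented.

```python
def interleave(words, events):
    """Merge timed words and emotive events into one tagged string.

    words:  list of (text, start_time) tuples
    events: list of (tag, start_time) tuples
    """
    items = [(start, text) for text, start in words]
    items += [(start, f"<{tag}>") for tag, start in events]
    # Sort by start time only; list.sort() is stable, so on a tie
    # the word (added first) stays ahead of the tag.
    items.sort(key=lambda item: item[0])
    return " ".join(text for _, text in items)


# Invented sample data for illustration.
words = [("well", 0.0), ("that", 0.4), ("was", 0.6), ("fun", 0.9)]
events = [("laugh", 1.3), ("sigh", 0.2)]
print(interleave(words, events))
```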
Command Line Interface
SONATA also provides a CLI for quick transcription:
# Basic usage
sonata-asr path/to/audio.wav
# Save output to specific file
sonata-asr path/to/audio.wav --output result.json
# Set threshold for emotive detection
sonata-asr path/to/audio.wav --threshold 0.7
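For several files, the CLI can be driven from a short script. The sketch below only builds the command lines, using the `--output` and `--threshold` flags shown above; combining both flags in one call is an assumption, and `build_command` is a hypothetical helper.

```python
import shlex
from pathlib import Path


def build_command(audio_path, output_dir, threshold=None):
    """Hypothetical helper: build the sonata-asr argument list for one file."""
    audio = Path(audio_path)
    # Write one JSON file per input, named after the audio file.
    out = Path(output_dir) / (audio.stem + ".json")
    cmd = ["sonata-asr", str(audio), "--output", str(out)]
    if threshold is not None:
        # Assumption: --threshold can be combined with --output.
        cmd += ["--threshold", str(threshold)]
    return cmd


for name in ["a.wav", "b.wav"]:
    cmd = build_command(name, "results", threshold=0.7)
    # Print the command; to execute it, pass cmd to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems with file names that contain spaces.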
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the GNU General Public License v3.0; see the LICENSE file for details. Under this license, derivative works that are distributed must also be released as open source under the same license.