SONATA: SOund and Narrative Advanced Transcription Assistant

SONATA is an Automatic Speech Recognition (ASR) system that transcribes both spoken words and emotive, non-verbal sounds such as laughter and sighs.

Features

  • High-accuracy speech-to-text transcription
  • Recognition of emotive sounds and non-verbal cues
  • Support for tags like <laugh>, <sigh>, <yawn>, <surprise>, <inhale>, <groan>, <cough>, <sneeze>, <sniffle>
  • Open-source and extensible architecture
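
For downstream processing it can help to separate the emotive tags from the spoken words. The sketch below assumes the tags appear inline in the transcript exactly as written in the list above (for example `<laugh>`); the helper name `split_tags` and the sample sentence are hypothetical and not part of SONATA.

```python
import re

# Tag names copied from the feature list above.
EMOTIVE_TAGS = ("laugh", "sigh", "yawn", "surprise", "inhale",
                "groan", "cough", "sneeze", "sniffle")
TAG_PATTERN = re.compile(r"<(" + "|".join(EMOTIVE_TAGS) + r")>")


def split_tags(tagged_text):
    """Hypothetical helper: return (plain_text, tags) for a tagged transcript."""
    # Collect tag names in the order they occur.
    tags = TAG_PATTERN.findall(tagged_text)
    # Drop the tags, then collapse the whitespace they leave behind.
    plain = " ".join(TAG_PATTERN.sub(" ", tagged_text).split())
    return plain, tags


# Made-up sample sentence, assuming inline tags.
plain, tags = split_tags("That was great <laugh> I needed that <sigh>")
print(plain)
print(tags)
```

Only the nine listed tag names are matched, so any other angle-bracketed text is left in the transcript untouched.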

Installation

Install the package from PyPI:

pip install sonata-asr

Or install from source:

git clone https://github.com/hwk06023/SONATA.git
cd SONATA
pip install -e .
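
To confirm that the install worked, the distribution version can be looked up with the standard library. This is a minimal sketch: `installed_version` is a hypothetical helper, and it only checks that a distribution named `sonata-asr` is registered in the current environment, not that the package imports cleanly.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="sonata-asr"):
    """Hypothetical helper: return the installed version string, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


print(installed_version())
```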

Usage Examples

Basic Transcription

from sonata import Transcriber

# Initialize the transcriber
transcriber = Transcriber()

# Transcribe an audio file
result = transcriber.transcribe("path/to/audio.wav")
print(result)

Detecting Emotive Sounds

from sonata.core import EmotiveDetector

# Initialize the emotive detector
detector = EmotiveDetector(threshold=0.6)

# Detect emotive events in an audio file
events = detector.detect_events("path/to/audio.wav")

# Print the detected events
for event in events:
    print(f"{event.type}: {event.start_time:.2f}s - {event.end_time:.2f}s (confidence: {event.confidence:.2f})")
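
Detected events can then be filtered and summarized. The sketch below uses a stand-in `Event` dataclass so that it runs without SONATA installed; it assumes only the four attributes used in the loop above (`type`, `start_time`, `end_time`, `confidence`). The `summarize` helper and the sample values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    # Stand-in for a detected event; mirrors the attributes printed above.
    type: str
    start_time: float
    end_time: float
    confidence: float


def summarize(events, min_confidence=0.6):
    """Hypothetical helper: total duration per event type above a confidence cutoff."""
    totals = defaultdict(float)
    for event in events:
        if event.confidence >= min_confidence:
            totals[event.type] += event.end_time - event.start_time
    return dict(totals)


# Made-up sample data for illustration only.
sample = [
    Event("laugh", 1.0, 2.5, 0.9),
    Event("cough", 4.0, 4.5, 0.4),
    Event("laugh", 6.0, 7.0, 0.7),
]
print(summarize(sample))
```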

Full Pipeline

from sonata import Sonata

# Initialize SONATA with default settings
sonata = Sonata()

# Process audio file - transcribes speech and detects emotive sounds
result = sonata.process("path/to/audio.wav")

# Print the text with emotive tags
print(result.text_with_tags)

# Save the result
sonata.save_output(result, "output.json")
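
One way to picture how a tagged transcript comes together is to interleave timestamped words with timestamped events. The sketch below illustrates that idea only; it is not necessarily how SONATA builds `text_with_tags`, and the `interleave` function, its input format, and the sample data are all assumptions.

```python
def interleave(words, events):
    """Hypothetical helper: merge words and emotive tags by start time.

    words:  list of (start_time, text) pairs
    events: list of (start_time, tag_name) pairs
    """
    # The middle element breaks ties so a word precedes a tag at the same time.
    items = [(start, 0, text) for start, text in words]
    items += [(start, 1, f"<{tag}>") for start, tag in events]
    items.sort(key=lambda item: (item[0], item[1]))
    return " ".join(text for _, _, text in items)


# Made-up timings for illustration only.
words = [(0.0, "That"), (0.3, "was"), (0.6, "great"), (2.0, "thanks")]
events = [(1.1, "laugh")]
print(interleave(words, events))
```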

Command Line Interface

SONATA also provides a CLI for quick transcription:

# Basic usage
sonata-asr path/to/audio.wav

# Save output to specific file
sonata-asr path/to/audio.wav --output result.json

# Set threshold for emotive detection
sonata-asr path/to/audio.wav --threshold 0.7
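
The CLI can also be driven from a script, for example to process a folder of recordings. The sketch below uses only the `--output` and `--threshold` flags shown above; combining both flags in one call is an assumption, and `build_command` and `transcribe_folder` are hypothetical helpers.

```python
import subprocess
from pathlib import Path


def build_command(audio_path, output_path=None, threshold=None):
    """Hypothetical helper: build an argument list for the sonata-asr CLI."""
    cmd = ["sonata-asr", str(audio_path)]
    if output_path is not None:
        cmd += ["--output", str(output_path)]
    if threshold is not None:
        cmd += ["--threshold", str(threshold)]
    return cmd


def transcribe_folder(folder, threshold=0.7):
    """Run the CLI on every .wav file in a folder, writing a .json next to each."""
    for wav in sorted(Path(folder).glob("*.wav")):
        out = wav.with_suffix(".json")
        # check=True raises CalledProcessError if the CLI exits with an error.
        subprocess.run(build_command(wav, out, threshold), check=True)


print(build_command("path/to/audio.wav", "result.json", 0.7))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.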

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the GNU General Public License v3.0; see the LICENSE file for details. Under this license, derivative works that are distributed must also be released as open source under the same license.
