Audio/lyric synchronisation — dual transcription, word-level alignment, gap detection, timeline validation

lyric-sync

Audio/lyric synchronisation for the AIMOScript video generation pipeline. It grew out of real-world debugging of the MusicVideoCreator project and is designed to eliminate the negative durations, backwards timestamps, and reused-word problems of the earlier implementation.

What it solves

Previous problem	Solution
Negative durations (`-42.37s`)	`TimelineValidator` 5-pass repair
Backwards timestamps (end before start)	`ExclusionPoolMatcher` prevents word reuse
Same transcription words matched to multiple lines	Exclusion pool — once a word is used, it's gone
Swedish encoding garbage (`Ã¤` instead of `ä`)	`fix_swedish_encoding()` in all text paths
ElevenLabs timestamps ignored in favour of fixed durations	Consensus builder uses ElevenLabs timing as ground truth
Excessive instrumental segments	`min_gap_duration` threshold (default 1.5s)
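
To illustrate the exclusion-pool idea from the table, here is a minimal sketch. It is not the package's `ExclusionPoolMatcher`; the names `similar`, `match_line` and `used`, the tuple layout of the transcript, and the threshold values are all assumptions made for the example. Fuzzy comparison uses `difflib.SequenceMatcher` from the standard library.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.70):
    """Case-insensitive fuzzy comparison of two words."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def match_line(line_words, transcript, used, min_confidence=0.55):
    """Return (start, end) of the best unused window, or None.

    transcript: list of (word, start, end) tuples in time order (assumed layout).
    used: set of transcript indices already consumed by earlier lines.
    """
    n = len(line_words)
    if n == 0:
        return None
    best_score, best_start = 0.0, None
    for i in range(len(transcript) - n + 1):
        window = range(i, i + n)
        if any(j in used for j in window):
            continue  # exclusion pool: a consumed word can never match again
        hits = sum(similar(w, transcript[j][0]) for w, j in zip(line_words, window))
        score = hits / n
        if score > best_score:
            best_score, best_start = score, i
    if best_start is None or best_score < min_confidence:
        return None
    used.update(range(best_start, best_start + n))
    return transcript[best_start][1], transcript[best_start + n - 1][2]


# A chorus line that is sung twice, then requested a third time.
transcript = [
    ("la", 0.0, 0.4), ("la", 0.5, 0.9), ("hey", 1.0, 1.4),
    ("la", 2.0, 2.4), ("la", 2.5, 2.9), ("hey", 3.0, 3.4),
]
used = set()
first = match_line(["la", "la", "hey"], transcript, used)   # (0.0, 1.4)
second = match_line(["la", "la", "hey"], transcript, used)  # (2.0, 3.4)
third = match_line(["la", "la", "hey"], transcript, used)   # None
print(first, second, third)
```

Because the second call cannot reuse indices 0 to 2, the repeated line lands on the later occurrence instead of the same timestamps, which is the behaviour the table describes.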

Installation

pip install lyric-sync                      # core only
pip install "lyric-sync[openai]"            # + Whisper transcription
pip install "lyric-sync[elevenlabs]"        # + ElevenLabs Scribe transcription
pip install "lyric-sync[all]"               # all providers + pydub

Set API keys:

OPENAI_API_KEY=sk-...
ELEVENLABS_API_KEY=...

Quick start

from lyric_sync import LyricSyncer

syncer = LyricSyncer()
result = syncer.sync(
    audio_path  = "conny.mp3",
    lyrics_path = "conny.txt",
)

for seg in result.segments:
    print(f"{seg.start_time:.2f}–{seg.end_time:.2f}  {seg.text}")

# Export for video renderer
from lyric_sync.exporter.json_exporter import JSONExporter
JSONExporter().export(result, "output/conny/")

The pipeline

audio + lyrics
      │
  ① Transcribe
      ├─ OpenAI Whisper (verbose_json, word timestamps)
      └─ ElevenLabs Scribe (word timestamps, 99 languages)
      │
  ② Consensus merge
      └─ ElevenLabs preferred for Swedish; fills gaps from Whisper
      │
  ③ Align (ExclusionPoolMatcher)
      └─ Sliding window fuzzy match, exclusion pool prevents reuse
      │
  ④ Interpolate missing
      └─ Linear interpolation between anchors for unmatched lines
      │
  ⑤ Detect instrumental gaps
      └─ [Intro] / [Instrumental] / [Outro] for gaps ≥ min_gap
      │
  ⑥ Validate timeline
      └─ Fix negatives, overlaps, enforce min duration, redistribute
      │
   SyncResult → JSON
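
Step ④ can be pictured with a short sketch of linear interpolation between anchors. This is an assumption about the general technique, not the package's own code: the function name `interpolate_missing` and the representation of a line as a `(start, end)` tuple or `None` are hypothetical.

```python
def interpolate_missing(spans, audio_duration):
    """Fill None entries by evenly dividing the time between matched anchors.

    spans: list with one entry per lyric line, either (start, end) or None.
    """
    result = list(spans)
    i = 0
    while i < len(result):
        if result[i] is not None:
            i += 1
            continue
        # Find the end of this run of unmatched lines.
        j = i
        while j < len(result) and result[j] is None:
            j += 1
        # Anchors: end of the previous matched line, start of the next one.
        left = result[i - 1][1] if i > 0 else 0.0
        right = result[j][0] if j < len(result) else audio_duration
        step = (right - left) / (j - i)
        for k in range(i, j):
            result[k] = (left + (k - i) * step, left + (k - i + 1) * step)
        i = j
    return result


filled = interpolate_missing([(0.0, 2.0), None, None, (8.0, 10.0)], audio_duration=12.0)
print(filled)  # [(0.0, 2.0), (2.0, 5.0), (5.0, 8.0), (8.0, 10.0)]
```

If the surrounding anchors are themselves out of order, the interpolated spans inherit the problem, which is one reason a validation pass runs afterwards.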

Configuration

from lyric_sync import LyricSyncer, SyncConfig

config = SyncConfig(
    # Transcription
    use_openai             = True,
    use_elevenlabs         = True,
    openai_model           = "whisper-1",
    language               = "sv",          # ISO 639-1; Swedish default
    prefer_elevenlabs      = True,          # ElevenLabs timing = ground truth

    # Alignment
    match_min_confidence   = 0.55,          # minimum word-match ratio
    word_similarity_threshold = 0.70,       # fuzzy word similarity threshold

    # Gap detection
    min_gap_duration       = 1.5,           # seconds; shorter gaps ignored
    min_instrumental_duration = 1.0,

    # Validation
    fix_negative_durations = True,
    fix_overlaps           = True,
    redistribute_on_violation = True,
    min_segment_duration   = 0.5,
)

syncer = LyricSyncer(config=config)
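
To show what a threshold like `min_gap_duration` does, here is a hedged sketch of gap detection. It only illustrates the idea; `detect_gaps` and its tuple-based input are hypothetical and do not reflect the internals of `gap_detector.py`.

```python
def detect_gaps(lyric_spans, audio_duration, min_gap_duration=1.5):
    """Return (label, start, end) for each silent stretch >= min_gap_duration.

    lyric_spans: list of (start, end) tuples sorted by start time.
    """
    gaps = []
    cursor = 0.0  # end of the audio covered so far
    for index, (start, end) in enumerate(lyric_spans):
        if start - cursor >= min_gap_duration:
            label = "[Intro]" if index == 0 else "[Instrumental]"
            gaps.append((label, cursor, start))
        cursor = max(cursor, end)
    if audio_duration - cursor >= min_gap_duration:
        gaps.append(("[Outro]", cursor, audio_duration))
    return gaps


spans = [(3.13, 8.119), (8.5, 12.0), (20.0, 25.0)]
gaps = detect_gaps(spans, audio_duration=30.0)
print(gaps)
```

With the default of 1.5 seconds, the 0.381 s pause between the first two lines is ignored, while the 8 s stretch before the third line becomes an `[Instrumental]` segment. Raising the threshold yields fewer instrumental segments.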

Output format

video_project_final.json (compatible with MusicVideoCreator / CineForge):

{
  "version": "1.0",
  "song_name": "Conny",
  "audio_duration": 195.3,
  "language": "sv",
  "stats": {
    "segment_count": 28,
    "lyric_count": 22,
    "instrumental_count": 6,
    "interpolated_count": 2,
    "mean_confidence": 0.847,
    "redistributed_count": 0
  },
  "segments": [
    {
      "index": 0,
      "text": "[Intro]",
      "start_time": 0.0,
      "end_time": 3.13,
      "duration": 3.13,
      "has_lyrics": false,
      "segment_type": "intro",
      "confidence": 1.0,
      "is_interpolated": false
    },
    {
      "index": 1,
      "text": "Min handledare hette Conny han var rak som ett vattenpass",
      "start_time": 3.13,
      "end_time": 8.119,
      "duration": 4.989,
      "has_lyrics": true,
      "segment_type": "lyric",
      "confidence": 0.982,
      "is_interpolated": false
    }
  ]
}
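
A consumer of this file may want to sanity-check the timeline before rendering. The sketch below relies only on the field names shown in the example above; the function `timeline_problems` and the tolerance value are hypothetical and not part of the package.

```python
import json


def timeline_problems(segments, tolerance=1e-3):
    """Return a list of human-readable problems found in a segment list."""
    problems = []
    previous_end = None
    for seg in segments:
        start, end = seg["start_time"], seg["end_time"]
        if end <= start:
            problems.append(f"segment {seg['index']}: non-positive duration")
        if abs((end - start) - seg["duration"]) > tolerance:
            problems.append(f"segment {seg['index']}: duration field mismatch")
        if previous_end is not None and start < previous_end - tolerance:
            problems.append(f"segment {seg['index']}: overlaps previous segment")
        previous_end = end
    return problems


# The two segments from the example output, reduced to the timing fields.
document = json.loads("""
{
  "segments": [
    {"index": 0, "start_time": 0.0,  "end_time": 3.13,  "duration": 3.13},
    {"index": 1, "start_time": 3.13, "end_time": 8.119, "duration": 4.989}
  ]
}
""")
problems = timeline_problems(document["segments"])
print(problems)  # []
```

In practice the segment list would come from `json.load` on the exported file rather than from an inline string.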

Testing without API calls

from lyric_sync import LyricSyncer
from lyric_sync.models import TimedWord

words = [
    TimedWord(word="Min", start=3.13, end=3.5),
    TimedWord(word="handledare", start=3.6, end=4.2),
    TimedWord(word="hette", start=4.3, end=4.7),
    TimedWord(word="Conny", start=4.8, end=5.4),
]

lines = ["Min handledare hette Conny"]

result = LyricSyncer().sync_from_text(words, lines, audio_duration=60.0)
print(result.segments[0].start_time)   # 3.13
print(result.segments[0].end_time)     # 5.4

CLI

lyric-sync conny.mp3 --lyrics conny.txt --output output/conny/ --language sv
lyric-sync conny.mp3 --lyrics conny.txt --no-openai   # ElevenLabs only
lyric-sync conny.mp3 --lyrics conny.txt --min-gap 2.0 --verbose

Package structure

lyric_sync/
├── __init__.py                   ← LyricSyncer + re-exports
├── syncer.py                     ← Main pipeline orchestrator
├── models.py                     ← TimedWord, LyricSegment, SyncResult, SyncConfig
├── utils.py                      ← Swedish encoding fix, normalise, fuzzy match
├── cli.py                        ← lyric-sync CLI
├── transcriber/
│   ├── openai_transcriber.py     ← Whisper word timestamps
│   ├── elevenlabs_transcriber.py ← Scribe word timestamps
│   └── consensus.py              ← Merge two transcriptions
├── aligner/
│   ├── exclusion_pool.py         ← Core word-matching engine
│   └── aligner.py                ← Align lines + interpolation
├── detector/
│   └── gap_detector.py           ← Intro/Instrumental/Outro detection
├── validator/
│   └── timeline_validator.py     ← Fix negatives, overlaps, redistribute
└── exporter/
    └── json_exporter.py          ← video_project_final.json
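
The encoding fix listed under `utils.py` addresses text such as `Ã¤` appearing where `ä` was meant, which typically happens when UTF-8 bytes are decoded as Latin-1. A common way to reverse that is sketched below. This is an assumption about the general technique, not the actual body of `fix_swedish_encoding()`; the name `repair_mojibake` is hypothetical.

```python
def repair_mojibake(text):
    """Undo a UTF-8-read-as-Latin-1 round trip; return text unchanged on failure."""
    try:
        # Recover the original bytes, then decode them as the UTF-8 they were.
        return text.encode("latin-1").decode("utf-8")
    except UnicodeError:
        # Text was already correct, or was damaged in some other way.
        return text


garbled = "Sm\u00c3\u00b6rg\u00c3\u00a5s"  # "SmÃ¶rgÃ¥s" written with escapes
print(repair_mojibake(garbled))            # Smörgås
print(repair_mojibake("Smörgås"))          # already correct, left unchanged
```

The sketch leaves a string untouched if it mixes correct and garbled characters, and it does not handle text that was mis-decoded as Windows-1252, so a production version would likely need more cases.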

Project details

This version

1.0.0

May 15, 2026

Download files

Source Distribution

lyric_sync-1.0.0.tar.gz (23.0 kB view details)

Uploaded May 15, 2026 Source

Built Distribution

lyric_sync-1.0.0-py3-none-any.whl (27.7 kB view details)

Uploaded May 15, 2026 Python 3

File details

Details for the file lyric_sync-1.0.0.tar.gz.

File metadata

Download URL: lyric_sync-1.0.0.tar.gz
Upload date: May 15, 2026
Size: 23.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for lyric_sync-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`7e0f2209583615c6a6db2a1ba21d0f36c5c85061fc04091e20ccaa76b45297fc`
MD5	`2475425d0ed2067b1229c7c796c594a9`
BLAKE2b-256	`917bdefe6ff7788ee4a31303e9b095125ba096ac5fa2e736f707abb117a72b3e`

File details

Details for the file lyric_sync-1.0.0-py3-none-any.whl.

File metadata

Download URL: lyric_sync-1.0.0-py3-none-any.whl
Upload date: May 15, 2026
Size: 27.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for lyric_sync-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d5c163739f3622d9f201fdfaa06034783d475428962d9b949776731aa2d3f451`
MD5	`9db2ee919d1dee099410432532daaf69`
BLAKE2b-256	`ae674f3fe34072b7b2b1d958136e3f6e567067e9ee460f0097b4582766699957`
