Align kanji lyrics with romaji karaokes

kara-kanji-sync

From Japanese lyrics and an already timed romaji karaoke in an Aegisub subtitle file (.ass), generate a new subtitle file timed in Japanese.

Getting Started

Notebook

Open the notebook in Google Colab, save a copy and follow the instructions.

PyPI

pip install kara-kanji-sync

Methodology

Algorithm

  1. The algorithm starts by looking for the hiragana, katakana and English words of the original lyrics in the romaji lyrics of the already timed karaoke. Whatever lies between those matches pairs each group of kanji with a group of romaji syllables.
  2. To associate each kanji of a group with its own syllables, the algorithm tries the pronunciations of every possible combination of kanji until it finds one that matches. Pronunciations for individual kanji come from Jisho, and pronunciations for groups come from JmdictFurigana.
  3. It rebuilds the line with the punctuation and special characters from the lyrics.
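
Step 2 can be illustrated with a small backtracking search. This is only a sketch of the idea, not the package's actual implementation: the `READINGS` table and the `align` function are hypothetical, and the readings are hand-written in romaji instead of being looked up in Jisho or JmdictFurigana.

```python
from typing import Optional

# Hypothetical, hand-written readings table (in romaji), for illustration only.
READINGS = {
    "世": ["yo", "se", "sei"],
    "界": ["kai"],
    "今": ["ima", "kon"],
    "日": ["hi", "nichi", "ka", "bi"],
    "今日": ["kyou"],
}


def align(kanji: str, syllables: list[str]) -> Optional[list[tuple[str, list[str]]]]:
    """Return [(kanji chunk, its syllables), ...] covering both inputs, or None."""
    if not kanji:
        # Success only if every syllable has been consumed as well.
        return [] if not syllables else None
    # Try longer kanji chunks first so that compound readings are preferred.
    for size in range(len(kanji), 0, -1):
        chunk = kanji[:size]
        for reading in READINGS.get(chunk, []):
            # Consume syllables until they spell out the candidate reading.
            for n in range(1, len(syllables) + 1):
                joined = "".join(syllables[:n])
                if joined == reading:
                    rest = align(kanji[size:], syllables[n:])
                    if rest is not None:
                        return [(chunk, syllables[:n])] + rest
                if not reading.startswith(joined):
                    break  # this reading cannot match, try the next one
    return None


print(align("世界", ["se", "ka", "i"]))
print(align("今日", ["kyo", "u"]))
```

When no combination of readings spells out the syllables, the sketch returns `None`, which corresponds to the situation where a line has to be fixed by hand.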

Caveats

  • Each lyric line has to line up strictly with the corresponding line of the ASS file.
  • Numbers are not handled; they may have to be replaced by kanji or edited manually in the result file.
  • Some words composed of multiple kanji followed by hiragana can be missed.
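
The first two caveats can be checked before running the sync. The helper below is a hypothetical pre-flight check, not part of the package: it assumes the lyrics and the karaoke lines have already been read into two lists of strings, and it only reports problems without fixing them.

```python
def preflight(lyrics_lines: list[str], karaoke_lines: list[str]) -> list[str]:
    """Return human-readable warnings about the two inputs (hypothetical helper)."""
    problems = []
    # Caveat 1: the lyrics and the karaoke must have the same number of lines.
    if len(lyrics_lines) != len(karaoke_lines):
        problems.append(
            f"line count mismatch: {len(lyrics_lines)} lyric lines "
            f"vs {len(karaoke_lines)} karaoke lines"
        )
    # Caveat 2: digits (ASCII or full-width) are not handled by the algorithm.
    for number, line in enumerate(lyrics_lines, start=1):
        if any(ch.isdigit() for ch in line):
            problems.append(f"line {number} contains digits: {line}")
    return problems


for warning in preflight(["君と2人", "世界"], ["kimi to futari"]):
    print(warning)
```

Kanji numerals such as 一 or 二 are not flagged, because `str.isdigit` returns `False` for them.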

Cases where the input file may need to be modified:

  • "Japenglish" words written in English in the romaji file won't be recognised if the Japanese lyrics write the same word in katakana. Example: "Asphalt", pronounced "ASUFARUTO".
  • Some karaoke timers put apostrophes on muted vowels. Those will cause errors during the first phase.
  • Unusual characters or punctuation signs may cause issues; removing them when a sync error is raised is recommended.
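
Lines with apostrophes can be located before running the sync. The sketch below is an illustration under assumptions, not part of the package: it takes the raw text of each karaoke line as a string, strips ASS override blocks such as `{\k20}`, and flags what remains if it contains an apostrophe. The name `lines_with_apostrophes` is hypothetical.

```python
import re

OVERRIDE_BLOCK = re.compile(r"\{[^}]*\}")  # ASS override blocks such as {\k20}
APOSTROPHES = "'’`"


def lines_with_apostrophes(karaoke_lines):
    """Return (line number, visible text) for each line containing an apostrophe."""
    flagged = []
    for number, line in enumerate(karaoke_lines, start=1):
        text = OVERRIDE_BLOCK.sub("", line)  # keep only the sung text
        if any(ch in APOSTROPHES for ch in text):
            flagged.append((number, text))
    return flagged


sample = [r"{\k20}su{\k15}ki", r"{\k20}de{\k10}s'"]
print(lines_with_apostrophes(sample))
```

English words such as "don't" are flagged too, so each hit still has to be reviewed by hand.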

Recommended workflow

  1. Get the lyrics (preferably on a reliable website like Lyrical Nonsense).
  2. Get the .ass file.

Using the notebook

Just follow the instructions, which broadly consist of:

  1. Installing the package.
  2. Uploading the file.
  3. Entering the lyrics in an interface that shows the lines from the uploaded .ass file, which makes this step easier.
  4. Launching a lyrics check.
  5. Launching the main algorithm.
  6. Downloading the result.

Using the package

Here is a code snippet that generates the kanji subtitle file from the lyrics and the romaji subtitle file:

import pysubs2
from kara_kanji_sync import KanjiSyncer

# Load the timed romaji karaoke and the Japanese lyrics.
subs = pysubs2.load("path_to_ass_file.ass")
# The lyrics file is assumed to be UTF-8 encoded.
with open("path_to_lyrics.txt", encoding="utf-8") as lyrics_file:
    lyrics = lyrics_file.read()

kanji_syncer = KanjiSyncer()
kanji_syncer.subtitles_file = subs
kanji_syncer.lyrics = lyrics.splitlines()

# Use "Kanji Bottom" instead to place the subtitles at the bottom.
kanji_file = kanji_syncer.sync_subs("Kanji Top")
kanji_file.save("result.ass")
print(kanji_syncer.errors)  # Shows all the potential errors

In Aegisub, on the result file

  1. Load the .ass file and the video.
  2. Modify the styles Kanji Top and Kanji Top - Right (or Kanji Bottom and Kanji Bottom - Right if you chose this option in the sync_subs function).
  3. Automation -> Apply karaoke template
