Novel Translation utility using Sugoi Translator

Project description

TenshiTranslator

Sugoi Toolkit's Sugoi Translator is very effective for translating ACG (Anime, Comics, Games) media because its model is trained on data from the same medium. However, the toolkit lacks automation support: every feature requires manual control, which makes translating large files daunting. This project implements an automation utility that interfaces with the translator to both automate the translation process and improve translation accuracy. It has since been adopted by over 10 novel series to generate preliminary machine translations for new novels.

Getting Started

You can install the project using pip install TenshiTranslator
For more information, visit the documentation here

Translator Options

Online Translator

This translator automates Sugoi Toolkit's web translator with zero extra setup required. However, it is both the slowest and the least accurate option because it uses an older model and is subject to API limits and a character limit.
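One common way to work within a character limit is to group lines into chunks that each fit under it. The following is a minimal sketch of that idea, not the project's actual implementation; `chunk_lines` and `max_chars` are hypothetical names, and the real limit of the web translator is not stated here, so it is left as a parameter.

```python
def chunk_lines(lines: list[str], max_chars: int) -> list[str]:
    """Group lines into newline-joined chunks of at most max_chars characters.

    A single line longer than max_chars is emitted as its own oversized
    chunk, because this sketch never splits inside a line.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in lines:
        # One extra character for the newline joining this line to the chunk.
        extra = len(line) + (1 if current else 0)
        if current and size + extra > max_chars:
            # Adding this line would exceed the limit, so close the chunk.
            chunks.append("\n".join(current))
            current, size = [], 0
            extra = len(line)
        current.append(line)
        size += extra
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Chunking on line boundaries keeps each sentence intact, which matters because the translated output has to be matched back to the source lines.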

Offline Translator

This translator uses Sugoi Toolkit’s offline translation server to perform translations. This translator requires Sugoi Toolkit but is faster than the online translator. It is also more accurate as it uses a newer model and has no character limits. The speed of this translator is dependent on your computer’s hardware, and is generally recommended if you don’t have an Nvidia GPU.

Batch Translator

This translator uses Sugoi Toolkit’s offline translation server to perform translations. Files are translated in batches, which shortens translation time by maximizing GPU utilization. This translator requires Sugoi Toolkit and an Nvidia GPU to be useful, but is orders of magnitude faster than the other translators. You will have to install CUDA and run the setup script to allow Sugoi Toolkit to accept batch translation requests. This translator is recommended if you have an Nvidia GPU.
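The batching idea can be sketched as follows. This is an illustration under assumptions, not the package's API: `batched`, `translate_in_batches`, and the `translate_batch` callable are hypothetical names, and the default batch size of 64 is arbitrary. The callable stands in for whatever sends one batch to the translation server.

```python
from typing import Callable, Iterator


def batched(lines: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of lines, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(lines), batch_size):
        yield lines[start:start + batch_size]


def translate_in_batches(
    lines: list[str],
    translate_batch: Callable[[list[str]], list[str]],
    batch_size: int = 64,
) -> list[str]:
    """Translate lines batch by batch, preserving the original order."""
    results: list[str] = []
    for batch in batched(lines, batch_size):
        translated = translate_batch(batch)
        # Guard against a backend that drops or merges lines.
        if len(translated) != len(batch):
            raise ValueError("backend returned a different number of lines")
        results.extend(translated)
    return results
```

Sending many lines per request lets the GPU process them together instead of sitting idle between single-line requests, which is where the speedup comes from.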

Features

Multiple format support

The translator offers two output formats: English only, where the original structure of the file is preserved, and line by line, where each line of Japanese is followed by its translation to make checking the translation faster. The package also provides an abstraction over outputs, so you are free to implement your own formats.

Example English only format

Example line by line format
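To illustrate what an abstraction over output formats might look like, here is a minimal sketch. The class names `OutputFormat`, `EnglishOnly`, and `LineByLine` and the `render` method are hypothetical and are not taken from the package; consult the documentation for the real interface.

```python
from abc import ABC, abstractmethod


class OutputFormat(ABC):
    """Hypothetical base class: turns (japanese, english) pairs into text."""

    @abstractmethod
    def render(self, pairs: list[tuple[str, str]]) -> str:
        ...


class EnglishOnly(OutputFormat):
    """Emit one translated line per source line, keeping the line layout."""

    def render(self, pairs: list[tuple[str, str]]) -> str:
        return "\n".join(english for _, english in pairs)


class LineByLine(OutputFormat):
    """Emit each Japanese line followed directly by its translation."""

    def render(self, pairs: list[tuple[str, str]]) -> str:
        blocks = [f"{japanese}\n{english}" for japanese, english in pairs]
        # A blank line separates consecutive source/translation blocks.
        return "\n\n".join(blocks)
```

With this shape, a custom format is just another subclass that implements `render`.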

High level glossary

You can specify translations for specific phrases and also apply corrections to the translated text to improve translation accuracy. This is commonly used for names and other jargon that may not be translated correctly.

Example replacement & correction with regex
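The two glossary steps can be sketched roughly as below: phrase replacement on the source text, then regex corrections on the translated text. The function names and the data shapes (a dict of phrases, a list of pattern/replacement pairs) are assumptions made for illustration and may differ from the package's glossary format.

```python
import re


def apply_replacements(text: str, replacements: dict[str, str]) -> str:
    """Substitute fixed phrases, such as names, in the source text."""
    # Longest phrases first, so a short entry cannot clobber part of a longer one.
    for source in sorted(replacements, key=len, reverse=True):
        text = text.replace(source, replacements[source])
    return text


def apply_corrections(text: str, corrections: list[tuple[str, str]]) -> str:
    """Apply regex corrections, in order, to the translated text."""
    for pattern, replacement in corrections:
        text = re.sub(pattern, replacement, text)
    return text
```

For example, a name that the model tends to mistranslate can be pinned with a replacement entry, while a correction such as `(r"\bangel\b", "Tenshi")` cleans up any instances that slip through.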

Requirements

To run the program, you need Python >= 3.10
To use the offline and batch translators, you need Windows and a copy of Sugoi Toolkit, which you can download from here
To use the batch translator, you need a computer with an Nvidia GPU and CUDA

Benchmarks

The benchmark measures the time taken to translate 125 lines. It was run on an Intel i7-13700K and an Nvidia RTX 3060 Ti 8 GB.
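A comparable measurement on your own hardware could be taken with a small timing helper like the one below. This is a sketch, not the script used for the benchmark above: `time_translation` is a hypothetical name, `translate` stands for any callable that translates a list of lines, and taking the best of several runs is a choice made here to reduce noise.

```python
import time
from typing import Callable


def time_translation(
    translate: Callable[[list[str]], object],
    lines: list[str],
    repeats: int = 3,
) -> float:
    """Return the best wall-clock time in seconds over several runs."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        translate(lines)
        best = min(best, time.perf_counter() - start)
    return best
```

Passing the same 125 lines to each translator keeps the comparison consistent between runs.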

Credits

The CUDA installation script is adapted from the work by Tenerezza from the Sugoi Toolkit Discord
Batch translation was first implemented by @EagleEye17, who also gave many suggestions for the overall project

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tenshitranslator-1.0.3.tar.gz (34.0 kB)

Uploaded Source

Built Distribution

tenshitranslator-1.0.3-py3-none-any.whl (39.1 kB)

Uploaded Python 3
