Extracts kanji sentences using OCR for automatic Anki flashcards.

⛶ Japtoanki ⛶

Automated pipeline that turns Japanese text from files or images into high-quality Anki flashcards.

Extracts sentences containing kanji from manga, screenshots, and other images with Mokuro, then automatically creates Anki flashcards for later study.

Uses Mokuro OCR to read text from images, and MeCab (via Fugashi) to filter out text with bad grammar or readings.


Features:

  • Morphological analysis (MeCab / Fugashi) to discard OCR hallucinations and junk.

  • Generates standard furigana readings.

  • Optional translation using the Google Translate API.

  • Users can provide a .txt, .json, or .md file containing kanji they have already mastered. Sentences containing only mastered kanji are filtered out.

  • Automatically links every unmastered kanji to the Hochanh RTK Guide.

  • Pushes cards directly to Anki when AnkiConnect is enabled; otherwise generates .csv decks to import into Anki manually.

  • Easy file navigation.
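
The mastered-kanji filter described above can be sketched in a few lines. This is a minimal illustration, not japtoanki's actual code: the function names (`kanji_in`, `load_mastered`, `needs_study`) are hypothetical, and treating the mastered-kanji file as free text is an assumption made so that .txt, .md, and .json inputs can be handled the same way.

```python
import re

# CJK Unified Ideographs (basic block) plus the iteration mark 々.
KANJI_RE = re.compile(r"[\u4e00-\u9fff々]")


def kanji_in(text: str) -> set[str]:
    """Return the set of kanji characters found in text."""
    return set(KANJI_RE.findall(text))


def load_mastered(file_contents: str) -> set[str]:
    """Collect every kanji that appears in a mastered-kanji file.

    Assumption of this sketch: the file is scanned as plain text,
    so surrounding JSON or Markdown syntax is simply ignored.
    """
    return kanji_in(file_contents)


def needs_study(sentence: str, mastered: set[str]) -> bool:
    """True if the sentence contains at least one unmastered kanji."""
    return bool(kanji_in(sentence) - mastered)


mastered = load_mastered('["今", "日"]')
sentences = ["今日はいい天気ですね", "今日は今日", "きょうはいいね"]

# Keep only sentences that still contain something new to learn.
to_study = [s for s in sentences if needs_study(s, mastered)]
print(to_study)
```

In this sketch, a sentence with no kanji at all is dropped as well, which matches the goal of extracting sentences that contain kanji.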

Installation

pip install japtoanki

Requirements:

Usage

Command Line Interface

Running japtoanki opens a file navigator.

japtoanki

Using the AnkiConnect add-on in Anki is highly recommended. Anki must be open while japtoanki runs.

japtoanki /path/to/directory --deck Kanji_Sentences --tag manga --no-furigana --translate en --mastered-kanji /path/to/file
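
For readers curious about what pushing a card through AnkiConnect involves, here is a rough sketch of an `addNote` request. It is not taken from japtoanki's source: the helper names `build_add_note` and `send` and the note type name "Japtoanki" are assumptions for illustration. The URL is AnkiConnect's default local address, and the request only succeeds while Anki is open with the add-on installed.

```python
import json
import urllib.request

ANKI_CONNECT_URL = "http://127.0.0.1:8765"  # AnkiConnect's default address


def build_add_note(deck, model, fields, tags):
    """Build the JSON body for AnkiConnect's addNote action."""
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": deck,
                "modelName": model,
                "fields": fields,
                "tags": tags,
            }
        },
    }


def send(payload, url=ANKI_CONNECT_URL, timeout=5):
    """POST a request to AnkiConnect and return the decoded reply.

    Raises urllib.error.URLError if Anki is closed or the add-on is missing.
    """
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


payload = build_add_note(
    deck="Kanji_Sentences",
    model="Japtoanki",  # hypothetical note type name
    fields={
        "Front": "今日はいい天気ですね",
        "Back": "今日[きょう]はいい天気[てんき]ストですね".replace("スト", ""),
    },
    tags=["japtoanki", "manga"],
)

# Print the request instead of sending it, so the sketch runs without Anki.
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Calling `send(payload)` would submit the note; it is left out of the example run so that the sketch does not depend on a running Anki instance.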

Run activate-global-python-argcomplete --user in the terminal so that flags autocomplete when TAB is pressed.

Flags:

--deck Name of the Anki deck in which to store the generated cards. If the deck doesn't exist, it will be created.

--tags Tags to apply to each generated card. By default every card is tagged "japtoanki".

--translate Translate each sentence into the desired language using the Google Translate API (default: en, English).

--no-furigana Furigana is generated for kanji by default. Use this flag to disable it.

--mastered-kanji Provide a document containing kanji you have already mastered to update the known set. The list can also be edited manually at ~/.mastered_kanji_list(japtoanki).txt.
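
As a rough picture of how these flags fit together, the sketch below declares them with Python's standard argparse module (the library that argcomplete hooks into). The parser layout, the `dest` names, and the defaults are assumptions, not japtoanki's real implementation; in particular, this sketch replaces the default tag when --tags is given, which may differ from the actual behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented above."""
    parser = argparse.ArgumentParser(prog="japtoanki")
    # Optional positional: with no path, the tool opens its file navigator.
    parser.add_argument("path", nargs="?", help="file or directory to scan")
    parser.add_argument("--deck", help="Anki deck that receives the cards")
    parser.add_argument("--tags", nargs="+", default=["japtoanki"],
                        help="tags applied to every generated card")
    # A bare --translate falls back to English via const.
    parser.add_argument("--translate", nargs="?", const="en", default=None,
                        help="target language for translation")
    parser.add_argument("--no-furigana", dest="furigana", action="store_false",
                        help="disable furigana generation")
    parser.add_argument("--mastered-kanji",
                        help="file listing kanji you already know")
    return parser


argv = ("/path/to/directory --deck Kanji_Sentences --tag manga "
        "--no-furigana --translate en --mastered-kanji /path/to/file").split()
args = build_parser().parse_args(argv)
print(args)
```

With argparse's default settings, an unambiguous prefix such as --tag is accepted as an abbreviation of --tags, so both spellings parse to the same option in this sketch.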


Note Model

Japtoanki creates a custom model in Anki with the following fields:

~ Front: Japanese sentence with kanji (e.g., 今日はいい天気ですね)

~ Back: Same sentence with furigana readings above kanji (e.g., 今日[きょう]はいい天気[てんき]ですね)

~ Use --no-furigana to show plain kanji

~ HochanhLinks: Clickable links to the Hochanh RTK guide for each kanji (only unmastered kanji are linked)

~ Translation: Optional automatic translation via Google Translate (--translate en)
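
The bracket notation in the Back field (kanji followed by its reading in square brackets) is easy to process with a regular expression. The sketch below shows one way to turn it into HTML ruby markup or strip it back to plain kanji. How japtoanki itself renders the field is not stated here, so the function names `to_ruby` and `strip_furigana` and the choice of ruby tags are assumptions.

```python
import re

# A run of kanji followed by a reading in square brackets, e.g. 今日[きょう].
FURIGANA_RE = re.compile(r"([\u4e00-\u9fff々]+)\[([^\]]+)\]")


def to_ruby(text: str) -> str:
    """Convert 漢字[かんじ] notation into HTML <ruby> markup."""
    return FURIGANA_RE.sub(r"<ruby>\1<rt>\2</rt></ruby>", text)


def strip_furigana(text: str) -> str:
    """Drop the bracketed readings, leaving the plain kanji sentence."""
    return FURIGANA_RE.sub(r"\1", text)


back = "今日[きょう]はいい天気[てんき]ですね"
print(to_ruby(back))
print(strip_furigana(back))
```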

Examples

CLI file navigation:

[Screenshot: file navigator]

Anki cards:

Front: [screenshot]

Back: [screenshot]

Without any flags: [four screenshots of Anki cards]

When the --translate and --no-furigana flags are used: [two screenshots of Anki cards]
