Automatically make Anki Decks for Chinese text

Project description

autoanki

Tool for generating Chinese flashcards for Anki

About

When learning Chinese, a common piece of advice is to learn the top X most common words. This is good advice, and it will get you pretty far, but it's not perfect.

Take Harry Potter, for example. Most words in the book appear about as often as they would in any other text, but there is a heavy emphasis on a specialized subset of words such as wand, robe, wizard, and broomstick. These words show up far more often than they would otherwise.

The intention of this package is to help Chinese learners move from beginner books to more advanced material. I found there was a gap in knowledge going from beginner learning books (where there is little specialized terminology) to teen novels, where each novel generally has its own specialized terminology, which makes the transition tedious. autoanki addresses this by automatically making Anki decks that contain this specialized terminology, so that you can memorize these words while continuing to make progress.

With autoanki, you selectively add words to an Anki file so you can keep progressing with your language learning.

Usage

To get started, run pip install autoanki. This should install all the requirements. Then, in a Python file, add from autoanki import AutoAnki.

Next, create an AutoAnki instance with the 2-letter code of the language you want to use:

aa = AutoAnki('zh')

Optionally, include a path to a database file you want to use:

db_path = "AutoAnki.db"
if not AutoAnki.is_database(db_path):
    AutoAnki.create_database(db_path)

Add whatever books you want in your deck. A book can be a single file or a string:

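bookpath = 'short-story.txt'
# Add a book directly from a string of text
aa.add_book_from_string("...", 'My first book🍎')
# Add a book from a file path (call shown as in the original README)
aa.add_book_from_string(bookpath, 'My first book🍎')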

Once all of your books are added, the definitions need to be found, and then you can create a deck!

aa.complete_unfinished_definitions()
aa.create_deck("AutoAnki Deck", "output")

The output file automatically gets the .apkg extension, which Anki uses. Import this file into Anki, and you're all set.

Other commands

If you want to see information about a database, use:

aa.print_database_info()

If you would like to create and use your own dictionary, you can pass it in:

aa = AutoAnki(db_path, dictionary=CustomDictionary())

This dictionary must implement the methods of the abstract class defined in autoanki/Dictionary.py.

Some settings control how cards are formatted and what is shown on them:

aa.deck_settings(
    include_traditional=True,
    include_part_of_speech=True,
    word_frequency_filter=1e-05,  # Filters using this library: https://pypi.org/project/wordfreq/
)

The filter is the percentage of words less frequent: 的 shows up about 6% of the time in text, so a value of 7 will omit it.

How it works

AutoAnki has the following components on the back end:

  1. DatabaseManager: Takes the cleaned input and puts it into the database
  2. Dictionary: Finds definitions for words in the database
  3. DeckManager: Creates Decks
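
The flow through the three components listed above can be sketched in plain Python. This is only an illustration of the division of labour, not the package's implementation: the function names `store_words`, `find_definitions`, and `build_cards` are hypothetical, the input is assumed to be already segmented into words, and the toy dictionary stands in for a real definition source.

```python
from collections import Counter

# Hypothetical stand-ins for the three components; none of these names
# come from the autoanki package itself.

def store_words(tokens):
    """DatabaseManager role: record each word and how often it appears."""
    return Counter(tokens)

def find_definitions(word_counts, dictionary):
    """Dictionary role: attach a definition to every word that has one."""
    return {word: dictionary[word] for word in word_counts if word in dictionary}

def build_cards(word_counts, definitions):
    """DeckManager role: turn defined words into (front, back) pairs,
    most frequent first."""
    return [
        (word, definitions[word])
        for word, _count in word_counts.most_common()
        if word in definitions
    ]

# Assumed input: text that has already been cleaned and split into words.
tokens = ["魔杖", "巫师", "魔杖", "的", "扫帚", "魔杖", "巫师"]
toy_dictionary = {"魔杖": "magic wand", "巫师": "wizard", "扫帚": "broom"}

counts = store_words(tokens)
definitions = find_definitions(counts, toy_dictionary)
cards = build_cards(counts, definitions)
print(cards)
```

Words without a definition (here 的) stay in the counts but never become cards, which mirrors the idea that definitions have to be found before a deck is created.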

Dictionary

This is an abstract class that can be implemented with the following methods

  • __init__(debug_level)
  • find_word(word) - Returns None, or a list of parameters that match the input of DatabaseManager.update_definition()
  • size() - Number of entries in the dictionary
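
A custom dictionary with these three methods might look roughly like the sketch below. It is self-contained and does not import autoanki: `DictionarySketch` is a hypothetical mirror of the interface described above, `InMemoryDictionary` is an invented example class, and the order of fields returned by `find_word` is an assumption, since the real list has to match whatever `DatabaseManager.update_definition()` expects.

```python
from abc import ABC, abstractmethod

class DictionarySketch(ABC):
    """Hypothetical mirror of the interface described above; the real base
    class lives in autoanki/Dictionary.py and may differ."""

    def __init__(self, debug_level=0):
        self.debug_level = debug_level

    @abstractmethod
    def find_word(self, word):
        """Return None, or a list of definition fields for the word."""

    @abstractmethod
    def size(self):
        """Return the number of entries in the dictionary."""

class InMemoryDictionary(DictionarySketch):
    """Example implementation backed by a plain dict."""

    def __init__(self, entries, debug_level=0):
        super().__init__(debug_level)
        self._entries = dict(entries)

    def find_word(self, word):
        entry = self._entries.get(word)
        if entry is None:
            return None
        # Field order is an assumption; the real list must match the
        # parameters of DatabaseManager.update_definition().
        return [word, entry["traditional"], entry["pinyin"], entry["definition"]]

    def size(self):
        return len(self._entries)

toy = InMemoryDictionary(
    {"巫师": {"traditional": "巫師", "pinyin": "wū shī", "definition": "wizard"}}
)
print(toy.find_word("巫师"), toy.size())
```

In the real package, the custom class would subclass the abstract class from autoanki/Dictionary.py instead of the stand-in base used here.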

There is one dictionary included as the default: an endpoint to CC-CEDICT. I have local versions of other dictionaries with copyrighted data, which I cannot upload.

Database

There are 3 different types of tables in the DB:

  • dictionary contains information about each word, including the pinyin, traditional characters, and a definition
  • book_list contains the book name, table name, and language for each book added
  • book contains the book table id, dictionary word id, and the number of appearances for each word in the book
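
To make the relationship between the three table types concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names, types, and the `book_1` table name are assumptions chosen to match the descriptions above; the schema the package actually creates may differ.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Column names below are illustrative guesses, not the package's schema.
conn.executescript("""
CREATE TABLE dictionary (
    word_id     INTEGER PRIMARY KEY,
    word        TEXT UNIQUE NOT NULL,
    traditional TEXT,
    pinyin      TEXT,
    definition  TEXT
);
CREATE TABLE book_list (
    book_id    INTEGER PRIMARY KEY,
    book_name  TEXT NOT NULL,
    table_name TEXT NOT NULL,
    language   TEXT NOT NULL
);
-- One table of this shape per added book, named in book_list.table_name.
CREATE TABLE book_1 (
    book_table_id         INTEGER PRIMARY KEY,
    dictionary_word_id    INTEGER NOT NULL REFERENCES dictionary(word_id),
    number_of_appearances INTEGER NOT NULL
);
""")

conn.execute(
    "INSERT INTO dictionary (word, traditional, pinyin, definition) VALUES (?, ?, ?, ?)",
    ("巫师", "巫師", "wū shī", "wizard"),
)
conn.execute(
    "INSERT INTO book_list (book_name, table_name, language) VALUES (?, ?, ?)",
    ("My first book", "book_1", "zh"),
)
conn.execute(
    "INSERT INTO book_1 (dictionary_word_id, number_of_appearances) VALUES (?, ?)",
    (1, 2),
)

# Join a book's word counts back to the shared dictionary entries.
rows = conn.execute("""
    SELECT d.word, d.definition, b.number_of_appearances
    FROM book_1 AS b
    JOIN dictionary AS d ON d.word_id = b.dictionary_word_id
    ORDER BY b.number_of_appearances DESC
""").fetchall()
print(rows)
```

Keeping definitions in one shared dictionary table means a word that appears in several books only needs its definition looked up once.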

Planned features

  • See ROADMAP.md

Other Info

If you would like to get involved or learn more, reading the Anki documentation is really important, especially the Getting Started section.

To get definitions, autoanki uses CC-CEDICT, which is available under a Creative Commons licence.
