Skip to main content

Simple library and command line tool for loading flash cards into anki.

Project description

hanky: A CLI Application For Generating Anki Decks

Hatch project

Hanky is an extendable cli tool which loads flash cards from files, transforms them, then adds them to Anki.

For example, a single hanky pipeline might take an English word and:

  1. scrape a French translation and example sentence from a dictionary site, then
  2. generate spoken audio of that translation with a text-to-speech service,

turning a one-column spreadsheet into rich, audio-enabled cards.

Hanky is not affiliated or associated with the Anki application/org.

Table of Contents

Installation

pip install hanky

Quickstart

The idea is simple. You extend hanky in your own python script my_script.py by writing functions called card processors. These functions contain the logic to transform cards. Hanky then handles the cli interface, calling your processors, and finally adding the cards to anki.

To illustrate what this looks like in practice, we will create a simple hanky script which ensures all the text on a card is lower case.

See Card Processors & Pipelines or the demo folder for more complex, real world examples.

# my_script.py
from hanky import Hanky


# instantiate the hanky app
hanky = Hanky()


@hanky.card_processor("basic", expected_args=[], card_fields=[])
def lowercase_card(card: dict):
    """Lower-case the text on every field of the card."""
    return {field: value.lower() for field, value in card.items()}


# run the hanky cli application by running this python file, for example:
#   python3 my_script.py load basic words.csv -d english::vocab
hanky.run()

Running your hanky script

You run a hanky script like you would any other python script.

Given words.csv:

Front,Back
Serendipity,A Pleasant Surprise
Ephemeral,Lasting A VERY Short Time

We would run our new script like so

python3 my_script.py load basic words.csv --deck english::vocab

Here we tell hanky to load each line of the words.csv, transform each line of the csv with the lowercase_card card processor, before finally adding each card, of type basic, into the anki deck english::vocab.

This would leave us with a english::vocab deck containing the following cards:

Front Back
serendipity a pleasant surprise
ephemeral lasting a very short time

Note: hanky only adds cards, media, and decks. The Anki models/note types you load into must already exist in your collection — create them in Anki's UI first. See Adding a Note Type.

Configuration

There are two ways to configure Hanky; via a configuration file or via a configuration object.

You will only need configuration if you do not use the default Anki profile (User 1).

1. Via TOML file at ~/.config/hanky/hanky.toml** (loaded automatically if present):

# Path to the Anki collection (the sqlite db Anki stores cards in).
# Required only if you do NOT use the default profile, "User 1".
# Defaults:
#   Linux:  ~/.local/share/Anki2/User 1/collection.anki2
#   macOS:  ~/Library/Application Support/Anki2/User 1/collection.anki2
ANKI_DB_PATH = "~/.local/share/Anki2/User 1/collection.anki2"

# Check whether another process (e.g. Anki itself) is using the collection
# before opening it. Default: true. Disabling this risks database corruption.
DO_SAFETY_CHECK = true

# Allow cards whose fields duplicate an existing card. Default: false.
ALLOW_DUPLICATES = false

# Where hanky writes a backup of the collection before modifying it.
# Default: ~/.local/share/hanky/backups
BACKUP_FOLDER = "~/.local/share/hanky/backups"

2. A Config object passed to Hanky(...) in your script (takes precedence over the file). Useful if you want different config for different scripts:

from hanky import Hanky
from hanky.config import Config

hanky = Hanky(config=Config(ALLOW_DUPLICATES=True))

Card Processors & Pipelines

A card processor is a Python function that runs on each card before it is added to Anki. You could use one to enrich a card (fetch a translation, generate audio) or transform it (lower-case a field, render LaTeX).

Multiple processors can be registered on a Hanky app to create a pipeline.

The processor contract

A processor is registered with three things:

@hanky.card_processor(model, expected_args=[...], card_fields=[...])
def my_processor(card: dict, **expected_args):
    ...
Part Meaning
model The Anki model/note-type name. The processor runs on every card of this model.
card_fields Fields that must already be present on the card when this processor runs. Hanky checks this and raises a clear error if one is missing. It lets a processor declare what an earlier step must have produced.
expected_args Names of CLI arguments the processor needs. They are passed in from the command line via --args key=value and arrive as keyword arguments. For example, you might have the same pipeline for different languages, so you would pass in lang=german or lang=french

When hanky calls your processors, the first argument is always the card; a plain dict representing a a cards fields. Any declared expected_args are then passed in as key word arguments. So if you declared

@hanky.card_processor(model, expected_args=["lang"], card_fields=[...])
def my_processor(card: dict, lang):
    ...

Your processor could be called like:

my_processor(card, lang="german")

A processor must return one of:

  • card — the (modified) dict, when it adds no media;
  • (card, media) — a card plus a single CardMedia;
  • (card, [media, ...]) — a card plus a list of media.

Whatever changes a processor makes to a card become visible to every processor that runs after it.

Pipelines with multiple processors

If you register more than one processor they form a pipeline. They run in the order you registered them, each receiving the card in the state the previous one returned it.

This example below builds flash cards for learning french vocabulary as an english speaker. Given a spreadsheet whose only column is word:

Word
thanks
hello
...
goodbye

We want to creates flash cards with the french translation/definition, an example sentence (in both french and english), and french speech audio for the example sentence.

The two processors in our pipeline will be:

  1. scrape_translation: takes an english word and scrapes a website to produce a translation and an example.
  2. add_audio — takes the translation and example produced in step 1 then produces French audio.

Note that for brevity the scrape_wordreference and generate_neural_speech functions are not included. See the full demo file for the complete code.

from hanky import CardMedia, Hanky

def scrape_wordreference(word: str, lang_pair: str) -> tuple[str, str]:
    ...

def generate_neural_speech(utf_8_str: str, voice: str) -> bytes:
    ...

hanky = Hanky()

# Stage 1: requires `word` (this comes straight from the csv), produces `translation` + `example`.
@hanky.card_processor("lang-vocab", expected_args=[], card_fields=["word"])
def scrape_translation(card: dict):
    translation, example = scrape_wordreference(card["word"], "enfr")
    card["translation"] = translation
    card["example"] = example
    return card

# Stage 2: requires `translation` (from stage 1), attaches audio media.
@hanky.card_processor("lang-vocab", expected_args=[], card_fields=["translation"])
def add_audio(card: dict):
    speech = generate_neural_speech(card["translation"], voice="Lea")
    audio = CardMedia(speech, ".mp3")
    card["translation-audio"] = audio.media_ref
    return card, [audio]

hanky.run()

Then we would run the script like normal. python3 my_script.py load lang-vocab words.xlsx -d french::vocab

Note that we are no longer using a basic model. That means we are assuming that a model called lang-vocab has already been created in the anki ui. See the Anki documentation for adding a note type to learn how this is done.

Attaching media

Anki handles media seperately to flash cards, so Hanky does as well.

To add audio (or other media), you create CardMedia objects, add a reference to the media via .media_ref onto your card before finally returning the CardMedia object from the processor.

The CardMedia objects are created from raw bytes and a file extension CardMedia(audio_bytes, ".mp3")or from an existing file CardMedia.from_file("my_sound_file.mp3").

audio = CardMedia(mp3_bytes, ".mp3")
card["Speech"] = audio.media_ref
return card, [audio]

media_ref is how anki knows where the media should go, so it must be placed somewhere on the card. Hanky will make sure that the data will end up in Anki's media store.

Supported media is currently only audio: .mp3, .oga, .opus, .wav, .weba, .aac.

Custom file loaders

To load formats beyond CSV/JSON, register a loader against a file extension. A loader takes an open file object and yields one dict per card. Once registered, that extension works everywhere — load, load-dir, and through your pipelines.

import pandas

def excel_loader(f_obj):
    df = pandas.read_excel(f_obj)
    for _, row in df.iterrows():
        yield row.to_dict()

# is_text=False because .xlsx must be opened in binary mode
hanky.register_loader(".xlsx", excel_loader, is_text=False)

Loading from non-file sources

The CLI loads cards from files, but you can also build cards in your own script and add them by calling load_cards directly. It takes any iterable of dicts, one dict per card, so the source can be a list, a generator, rows from an API, or anything else.

import random

from hanky import Hanky

DICTIONARY = {
    "ephemeral": "lasting for a very short time",
    "serendipity": "the occurrence of happy events by chance",
    "petrichor": "the smell of rain falling on dry ground",
    "lucid": "expressed clearly and easy to understand",
    "gregarious": "fond of the company of others; sociable",
    # ...
}

hanky = Hanky()


def random_word_cards(n):
    for word in random.sample(list(DICTIONARY), n):
        yield {"Front": word, "Back": DICTIONARY[word]}


# add 20 cards straight from the generator, with no file involved
report = hanky.load_cards(random_word_cards(20), "basic", "english::vocab")
print(f"Added {report.added}, skipped {report.skipped}, failed {report.failed}.")

load_cards runs the same processor pipeline as the CLI and returns a LoadReport with counts of the cards that were added, skipped, and failed.

Examples

Complete, runnable scripts live in the demo/ directory, ordered here from simplest to most involved. Install their dependencies with pip install -r demo/requirements.txt.

  • demo_lowercase.py — The minimal example: a single, dependency-free processor that lower-cases every field on a card.
  • demo_random_words.py — A non-file source: builds cards from an in-script word list and adds them with load_cards.
  • demo_define.py — A single processor that fills the back of each card with a dictionary definition of the word on its front, looked up offline via WordNet (NLTK).
  • demo_audio.py — Registers a custom .xlsx loader and uses AWS Polly to attach text-to-speech audio, choosing the language from a CLI argument (--args lang=french).
  • demo_scrape.py — A two-stage English→French pipeline that scrapes a translation and example sentence from WordReference, then voices the translation with AWS Polly.
  • demo_example_sentences.py — A three-stage English→French pipeline that scrapes a translation from WordReference, asks Claude for an example sentence at a given CEFR level, then adds French audio for both the word and the sentence with AWS Polly.

CLI Reference

Both the hanky command and your own scripts share the same interface:

[hanky | python3 my_script.py] <operation> <model> <file|dir> [pattern] [options]

load — load cards from a single file:

hanky load [-h] [-d DECK] [--fail-fast] [--args K=V ...] model file
  model         Anki model/note-type name to create cards with.
  file          File to load from (.csv, .json, or a registered extension).
  -d, --deck    Destination deck. Defaults to the filename without extension.
  --fail-fast   Stop and raise on the first card that can't be added, instead
                of skipping it and reporting it at the end.
  --args        key=value args forwarded to your card processors (scripts only).

load-dir — load many files from a directory, deriving deck names from paths:

hanky load-dir [-h] [-r] [--fail-fast] [--args K=V ...] model dir pattern
  model            Anki model/note-type name to create cards with.
  dir              Directory to load from.
  pattern          Glob selecting files, e.g. "*.csv".
  -r, --recursive  Also descend into sub-directories.
  --fail-fast      Stop and raise on the first card that can't be added, instead
                   of skipping it and reporting it at the end.
  --args           key=value args forwarded to your card processors (scripts only).

For example, to load every CSV under french/ while building deck names from the folder structure:

hanky load-dir basic "french/" "*.csv" -r
french/                        decks created
├── animals.csv          ->    french::animals
├── bodies.csv           ->    french::bodies
└── grammar/
    └── passe_compose.csv ->   french::grammar::passe_compose

CSV (.csv) and JSON (.json) files work with no setup. The column/key names in your file must match the field names of the Anki model you target.

Development

uv sync                    # install dependencies (incl. dev + demo groups)
uv run pytest              # run tests (CI runs Python 3.11, 3.12, 3.13)
uv run ruff format .       # format
uv run ruff check .        # lint
uv run mypy src/hanky      # type check

Publishing

  1. Bump __version__ in src/hanky/__about__.py.

  2. Build the distributions:

    rm -rf dist && uv build
    
  3. Publish (needs a PyPI API token in UV_PUBLISH_TOKEN):

    uv publish
    
  4. Sanity check:

    uv run --with hanky --refresh-package hanky --no-project -- python -c "import hanky"
    

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hanky-0.1.1.tar.gz (108.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hanky-0.1.1-py3-none-any.whl (21.4 kB view details)

Uploaded Python 3

File details

Details for the file hanky-0.1.1.tar.gz.

File metadata

  • Download URL: hanky-0.1.1.tar.gz
  • Upload date:
  • Size: 108.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.24 {"installer":{"name":"uv","version":"0.11.24","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Arch Linux","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for hanky-0.1.1.tar.gz
Algorithm Hash digest
SHA256 4cc2139a0df6849442ae796f85036b76576ba0245fcb9910d01f06ea39309726
MD5 640a5b7f83e1a2fdcea1939d0f496a9a
BLAKE2b-256 492006a20c4c29cdea4c2c19fe976b3f8dc5ee4ea9c1a1225f3a2712b5c4ae1f

See more details on using hashes here.

File details

Details for the file hanky-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: hanky-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 21.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.24 {"installer":{"name":"uv","version":"0.11.24","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Arch Linux","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for hanky-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 5c328336693e74a12bd7d6c3d79e22667c53c4095ebdf2d4d5a2d50e16bef3d0
MD5 7b085e02aee3c98e05914123557295b9
BLAKE2b-256 f305416e4d7638983ae34593964ff8c5085e5b0971288408a6880abb97ae313f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page