A language learning utility with Anki integration
Ankipan
Ankipan is a project to democratize language learning in a decentralized way.
It allows you to choose which domains you want to be more fluent in, and creates a custom learning curriculum that aims to get you to your goal as effectively and efficiently as possible.
Workflow
First, choose which fields you want on your flashcards: which online dictionaries to use for definitions, which sources to draw example sentences from (e.g. YouTube subtitles, particular YouTubers, Wikipedia, open news corpora), and other fields such as statistically frequent contexts of the word or GPT-explained comparisons to synonyms and similar words.
Then parse any text you like (plain text, subtitles, PDF, HTML, or Ankipan DB sources). Ankipan generates a frequency-sorted list of lemmas, lets you pick which words to learn, and creates the corresponding Anki flashcards, which can be synced directly with Anki using the AnkiConnect extension.
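The ranking step described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not Ankipan's actual implementation: the function name `rank_new_lemmas` and its parameters are hypothetical, and it assumes the text has already been lemmatized.

```python
from collections import Counter


def rank_new_lemmas(lemmas, known_words=frozenset()):
    """Return (lemma, count) pairs for unknown lemmas, most frequent first.

    Hypothetical helper for illustration; not part of the ankipan API.
    """
    # Count only the lemmas the learner does not already know.
    counts = Counter(lemma for lemma in lemmas if lemma not in known_words)
    # Sort by descending count, breaking ties alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


lemmas = ["gehen", "Haus", "gehen", "und", "Haus", "gehen", "Baum"]
ranked = rank_new_lemmas(lemmas, known_words={"und"})
print(ranked)  # [('gehen', 3), ('Haus', 2), ('Baum', 1)]
```

Filtering out known words before sorting keeps the list focused on vocabulary that is both new and frequent in the chosen source.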
Inside Anki, you can color-tag useful example sentences. The cards remember these tags and automatically expand the tagged sentences on the next review, so that when you see the word on the front of the flashcard you can try to recall actual sentences you would use instead of just a translation.
Example sentences include GPT-generated translations and explanations, which by default are generated by the ankipan_default server hosted in Germany. If that server is too busy or you prefer a local solution, you can use your own Google Gemini API key, which has a free quota of 1000 requests per day, or a local Ollama setup (see ankipan/gpt_base.py).
Over time, you can use this tool to track which words you already know and to filter new sources you are interested in by their most relevant new words, so that you can get started quickly and focus your progress on the areas that interest you personally. In the long term, we aim to move beyond flashcards for single words and give users a more holistic way to engage with a learning curriculum through full sentences and other useful tasks that are adapted to their individual learning style and language goals.
Getting started
1. Prerequisites
- Download and install Anki from https://apps.ankiweb.net/
- Create an account on their website
- Install the AnkiConnect plugin
- Open the app and log in; keep Anki open while syncing databases
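To check that AnkiConnect is reachable before syncing, a small standard-library script can query its API version. This is a sketch under stated assumptions: it uses AnkiConnect's default address `http://127.0.0.1:8765` and its JSON request format, and the helper names `build_request` and `invoke` are illustrative, not part of Ankipan.

```python
import json
import urllib.request

# Default address AnkiConnect listens on; adjust if you changed its config.
ANKI_CONNECT_URL = "http://127.0.0.1:8765"


def build_request(action, **params):
    """Encode an AnkiConnect request body (API version 6) as UTF-8 JSON."""
    payload = {"action": action, "version": 6, "params": params}
    return json.dumps(payload).encode("utf-8")


def invoke(action, **params):
    """Send one request to AnkiConnect and return its result field."""
    request = urllib.request.Request(ANKI_CONNECT_URL, data=build_request(action, **params))
    with urllib.request.urlopen(request, timeout=5) as response:
        reply = json.load(response)
    # AnkiConnect reports failures in the "error" field of the reply.
    if reply.get("error") is not None:
        raise RuntimeError(reply["error"])
    return reply["result"]
```

With Anki open and the plugin installed, `invoke("version")` should return the API version number. A connection error usually means Anki is not running or the plugin is not installed.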
2. Installation
- Using pip:
pip install ankipan
- From source:
git clone git@gitlab.com:ankipan/ankipan.git
cd ankipan
pip install .
3. (Optional) Install lemmatizers to parse your own texts
- Install PyTorch, which stanza needs for lemma parsing
- Install dependencies:
pip install stanza
pip install HanTa # optional but recommended for German, allows for more accurate lemmatization
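Once stanza is installed, a quick way to confirm that lemmatization works is to run it directly on a short sentence. The sketch below is independent of Ankipan's reader.py: the helper names `extract_lemmas` and `lemmatize` are hypothetical, and it assumes the default stanza pipeline for the chosen language includes the POS tagger and lemmatizer.

```python
def extract_lemmas(doc):
    """Collect lemmas from a parsed stanza Document, skipping punctuation."""
    return [
        word.lemma
        for sentence in doc.sentences
        for word in sentence.words
        if word.lemma and word.upos != "PUNCT"
    ]


def lemmatize(text, lang="de"):
    """Fetch the stanza models for `lang` and return the lemmas found in `text`."""
    import stanza  # imported lazily so extract_lemmas is usable without stanza

    stanza.download(lang)        # fetches the language models
    nlp = stanza.Pipeline(lang)  # default processors for the language (assumed to include lemma)
    return extract_lemmas(nlp(text))
```

Calling `lemmatize("Die Kinder gingen nach Hause.")` should return one lemma per word. The first call can take a while because the models are downloaded.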
Usage
See notebooks in /examples.
Development
The goal of this project is a library that is highly modular and scalable, and flexible enough to adapt to the needs of any particular language or language learning style. New flashcard fields can be added by creating a new file in the ankipan/flashcard_fields directory.
We try to be as decentralized as possible by allowing anyone to privately or publicly host a server with text data that might be interesting for language learners, or to provide resources for generating and caching translations and explanations with GPTs.
Users can connect to any number of servers. The default list can be found in /servers.yaml. You can also add custom servers with ankipan.Config.add_server('custom_server_name', url), or your own local server with ankipan.Config.add_server('my_local_server', 'http://127.0.0.1:5701') (the default local address when you launch ankipan_db/server.py on the same computer).
To initialize ankipan_db as a submodule, run the following commands in your shell:
git submodule update --init
cd ankipan_db
git fetch --unshallow || true
git config remote.origin.fetch "+refs/heads/*:refs/remotes/origin/*"
git fetch origin
cd .. && git submodule update --remote ankipan_db
If you would like your own public server added to the standard servers.yaml list, feel free to create an issue.
Notes
- Lemmatization is currently done with the stanza library in the reader.py module. While this mostly works well, the library relies on a statistical model to estimate the likely word roots (lemmas) of the tokens in a sentence. It sometimes makes mistakes or produces lemmas that make no sense, so users need to filter these out manually in the select_new_words overview or suspend the card later in Anki.