NIOBIUM: Nadia's Image Occlusion Booster Is UnManned - A CLI tool for extracting text and image-occlusion-style notes from images and PDFs for Anki
NIOBIUM is a small CLI tool for extracting text and image-occlusion-style notes from images and PDFs, and for preparing Anki-compatible outputs (via AnkiConnect or by creating an .apkg). This README shows common usages and examples for the command-line interface.
Trivia: What the actual heck is niobium?
Niobium is a stealthy gray metal that is absurdly strong, feather‑light and allergic to corrosion. Mostly mined in Brazil and Canada, it moonlights in super‑alloys for jet engines and superconducting MRI magnets. It even hides in the capacitors inside your phone and laptop.
So next time you're on a flight, fiddling with your phone on the way to an MRI conference, tip your hat to niobium, OR just give this repo a ⭐️.
Installation
Using pip
pip install nb41
Using uv (faster alternative)
uv pip install nb41
From source
git clone https://github.com/agahkarakuzu/niobium.git
cd niobium
pip install -e .
Requirements
- Python 3.8 or higher
- All dependencies are automatically installed with the package
Quick overview
The main entry point is the niobium command. It exposes a few mutually-exclusive input modes and a few mutually-exclusive output modes.
Inputs (one required):
- -i, --image: absolute path to a single image file
- -dir, --directory: directory containing multiple images
- -pin, --single-pdf: absolute path to a single PDF
Outputs (one required):
- -deck, --deck-name: name of the Anki deck where notes will be pushed (requires AnkiConnect)
- -pout, --pdf-img-out: output directory where images extracted from a PDF will be saved
- -apkg, --apkg-out: output directory where a generated .apkg will be saved
Other useful flags:
- -ioid, --io-model-id: ID of the built-in Image Occlusion model in Anki (optional, used with --apkg-out)
- -m, --merge-rects: whether to merge nearby detected rectangles (default: True)
- -mx, --merge-lim-x: horizontal merging threshold in pixels (default: 10)
- -my, --merge-lim-y: vertical merging threshold in pixels (default: 10)
- -l, --langs: comma-separated OCR languages (default: en)
- -g, --gpu: GPU index to use, or -1 for CPU only (default: -1)
- -hdr, --add-header: add filename as a header (default: False)
- -basic, --basic-type: create basic Anki cards instead of image-occlusion notes (default: False)
- -c, --config: path to a custom config file (see Configuration below)
- --smart: use Claude Vision to intelligently filter OCR results (see Niobium Smart below)
Run niobium -h to see the help text with the current arguments.
Configuration
Niobium uses a JSON config file to control how OCR results are filtered before creating Anki notes. Without any configuration, a sensible bundled default is used automatically.
Getting started
Generate your own config file:
niobium --init-config
This copies the default template to ~/.config/niobium/config.json. To open it in your editor:
niobium --edit-config
This creates the config if it doesn't exist yet, then opens it with $EDITOR (falls back to vi).
Niobium will tell you which config file it is using every time it runs.
Config resolution order
- --config path/to/config.json: explicit path passed via CLI (highest priority)
- ~/.config/niobium/config.json: user-level config
- Bundled default inside the package (lowest priority)
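The lookup order above can be sketched in a few lines of Python. This is only an illustration of the priority rule; the function name `resolve_config` and the `BUNDLED_DEFAULT` placeholder are hypothetical and are not taken from niobium's source.

```python
from pathlib import Path
from typing import Optional

# Hypothetical names for illustration; niobium's internals may differ.
USER_CONFIG = Path.home() / ".config" / "niobium" / "config.json"
BUNDLED_DEFAULT = Path("bundled-default-config.json")  # stand-in for the packaged template


def resolve_config(cli_path: Optional[str],
                   user_config: Path = USER_CONFIG,
                   bundled: Path = BUNDLED_DEFAULT) -> Path:
    """Return the config that applies, from highest to lowest priority."""
    if cli_path is not None:      # 1. explicit --config path
        return Path(cli_path)
    if user_config.is_file():     # 2. user-level config, if it exists
        return user_config
    return bundled                # 3. bundled default
```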
Config file format
{
  "langs": "en",
  "gpu": -1,
  "merge": {
    "enabled": true,
    "limit_x": 10,
    "limit_y": 10
  },
  "exclude": {
    "exact": ["A", "B", "Reproductive system"],
    "regex": ["(Figure|Fig\\.|Fig\\:)\\s+(\\d+[-\\w]*).*"]
  },
  "extra": [
    {"Ductus deferens": "Ductus deferens is a.k.a <span style=\"color:red;\">Vas deferens</span>"}
  ],
  "llm": {
    "api_key": null,
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "temperature": 0.2,
    "instructions": null
  }
}
| Key | What it does |
|---|---|
| langs | Comma-separated OCR languages (default: "en"). E.g. "en,fr" for English and French. |
| gpu | GPU index to use for OCR, or -1 for CPU only (default: -1). |
| merge.enabled | Whether to merge nearby OCR bounding boxes before creating occlusions (default: true). |
| merge.limit_x | Horizontal merge threshold in pixels; boxes closer than this are merged (default: 10). |
| merge.limit_y | Vertical merge threshold in pixels; boxes closer than this are merged (default: 10). |
| exclude.exact | OCR text matching any of these strings (case-insensitive) is discarded and won't become an occlusion. Useful for filtering out labels like "A", "B", or section headings that appear in images. |
| exclude.regex | OCR text matching any of these regular expressions is discarded. Useful for filtering out figure captions (e.g., "Figure 1", "Fig. 2a"). |
| extra | A list of key-value objects. When OCR detects text matching a key (case-insensitive), the corresponding value is appended to the note's "Back Extra" field as HTML. Useful for adding supplementary information to specific terms. |
| llm.api_key | Anthropic API key for Smart mode. If null, falls back to ANTHROPIC_API_KEY env var. See Niobium Smart. |
| llm.model | Claude model to use (default: claude-sonnet-4-6). |
| llm.max_tokens | Maximum response length from Claude (default: 1024). |
| llm.temperature | Response variability; lower is more consistent (default: 0.2). |
| llm.instructions | Custom instructions appended to the built-in prompt to steer Smart mode for specific disciplines. |
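To make the exclude and extra keys concrete, here is a minimal sketch of how such filtering could work. The function name `filter_ocr_texts` is hypothetical, and anchoring each regex at the start of the text with `re.match` is an assumption; niobium's actual matching rules may differ.

```python
import re


def filter_ocr_texts(texts, exclude_exact, exclude_regex, extra):
    """Drop excluded OCR strings and pair the rest with their Back Extra HTML."""
    exact = {s.lower() for s in exclude_exact}
    patterns = [re.compile(p) for p in exclude_regex]
    # Flatten the list of key-value objects into one case-insensitive lookup.
    extra_lookup = {k.lower(): v for item in extra for k, v in item.items()}

    kept = []
    for text in texts:
        key = text.strip().lower()
        if key in exact:
            continue  # exclude.exact, compared case-insensitively
        if any(p.match(text) for p in patterns):
            continue  # exclude.regex (assumed to match from the start of the text)
        kept.append((text, extra_lookup.get(key, "")))
    return kept


kept = filter_ocr_texts(
    ["a", "Fig. 2a", "Ductus deferens", "Glomerulus"],
    exclude_exact=["A", "B"],
    exclude_regex=[r"(Figure|Fig\.|Fig\:)\s+(\d+[-\w]*).*"],
    extra=[{"Ductus deferens": "a.k.a. Vas deferens"}],
)
# kept == [("Ductus deferens", "a.k.a. Vas deferens"), ("Glomerulus", "")]
```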
CLI flags (--langs, --gpu, --merge-rects, --merge-lim-x, --merge-lim-y) override the config file values when provided.
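The override rule can be pictured as follows. This sketch assumes that a flag the user did not pass arrives as `None`; the function name `apply_cli_overrides` is hypothetical and niobium may detect "provided" differently.

```python
import copy


def apply_cli_overrides(config, langs=None, gpu=None, merge_rects=None,
                        merge_lim_x=None, merge_lim_y=None):
    """Return a copy of config where every provided CLI value wins."""
    result = copy.deepcopy(config)
    merge = result.setdefault("merge", {})
    if langs is not None:
        result["langs"] = langs
    if gpu is not None:
        result["gpu"] = gpu
    if merge_rects is not None:
        merge["enabled"] = merge_rects
    if merge_lim_x is not None:
        merge["limit_x"] = merge_lim_x
    if merge_lim_y is not None:
        merge["limit_y"] = merge_lim_y
    return result


file_config = {"langs": "en", "gpu": -1,
               "merge": {"enabled": True, "limit_x": 10, "limit_y": 10}}
effective = apply_cli_overrides(file_config, langs="en,fr", merge_lim_x=20)
# effective["langs"] is "en,fr", effective["merge"]["limit_x"] is 20,
# and every other value still comes from the config file.
```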
Examples
Below are some concrete example commands (assumes you're in the project root and using zsh/bash):
- ⭐️ Run OCR and push image-occlusion notes to an Anki deck (via AnkiConnect)
This processes all images under a directory and pushes notes to the Anki deck named MyStudyDeck.
niobium --directory /absolute/path/to/images --deck-name MyStudyDeck
Notes:
- You may specify a deck name that doesn't yet exist; you'll be prompted to create it.
- Anki must be running with the AnkiConnect add-on enabled.
- The tool will detect text and create image-occlusion notes from detected regions.
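If a push fails, it can help to check whether AnkiConnect is reachable before running niobium. The sketch below is independent of niobium: it uses only the Python standard library and assumes AnkiConnect is listening on its default address, http://127.0.0.1:8765. The helper names are illustrative.

```python
import json
import urllib.request

ANKI_CONNECT_URL = "http://127.0.0.1:8765"  # AnkiConnect's default address (assumed unchanged)


def build_request(action, **params):
    """Build an AnkiConnect request body (API version 6)."""
    return {"action": action, "version": 6, "params": params}


def invoke(action, **params):
    """Send one request to AnkiConnect and return its result."""
    payload = json.dumps(build_request(action, **params)).encode("utf-8")
    request = urllib.request.Request(ANKI_CONNECT_URL, data=payload)
    with urllib.request.urlopen(request, timeout=5) as response:
        reply = json.load(response)
    if reply.get("error") is not None:
        raise RuntimeError(reply["error"])
    return reply["result"]


def anki_is_reachable():
    """Return True if Anki is running with AnkiConnect enabled."""
    try:
        invoke("version")
        return True
    except OSError:  # connection refused, timeout, and similar network errors
        return False
```

Calling `anki_is_reachable()` while Anki is closed should return False, which is a quick way to tell a connection problem apart from an OCR problem.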
- Extract images from a single PDF
This extracts embedded images from lecture.pdf into the out_images directory.
niobium --single-pdf /absolute/path/to/lecture.pdf --pdf-img-out /absolute/path/to/out_images
Important: --single-pdf is required when using --pdf-img-out.
- Produce an .apkg file (offline export, no AnkiConnect needed)
This processes a directory and writes an .apkg bundle suitable for import into Anki without requiring AnkiConnect at runtime. Uses genanki under the hood.
niobium --directory /absolute/path/to/images --apkg-out /absolute/path/to/output_dir
You can also use a single image or a PDF as input:
niobium --image /absolute/path/to/image.png --apkg-out /absolute/path/to/output_dir
niobium --single-pdf /absolute/path/to/lecture.pdf --apkg-out /absolute/path/to/output_dir
- Create basic (front/back) Anki cards instead of image-occlusion notes
niobium --directory /absolute/path/to/images --deck-name MyStudyDeck --basic-type True
This comes in handy when you have a folder of images (for example, extracted from a PDF as in the "Extract images from a single PDF" example above) and would like to create a Q&A card for each of them.
- Tweak rectangle merging and OCR languages
If bounding boxes are too fragmented, increase the merge thresholds. To OCR multiple languages, provide a comma-separated list.
niobium --directory /absolute/path/to/images --deck-name MyStudyDeck --merge-lim-x 20 --merge-lim-y 20 --langs en,fr
Note: Rectangle merging and other heuristics are experimental. Nearby occlusion boxes may be merged unintentionally, or distinct boxes may remain separate. Adjust --merge-lim-x/--merge-lim-y, or disable merging with --merge-rects False, to change the behavior.
If you come up with a more robust approach to this, feel free to send a PR!
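For intuition about what the thresholds do, here is one simple way to merge boxes by proximity. It is a sketch, not niobium's actual heuristic: boxes are assumed to be (x1, y1, x2, y2) tuples in pixels, and two boxes are merged when their horizontal gap is below limit_x and their vertical gap is below limit_y.

```python
def gap(a_min, a_max, b_min, b_max):
    """Distance between two 1-D intervals; 0 if they overlap."""
    return max(0, max(a_min, b_min) - min(a_max, b_max))


def should_merge(a, b, limit_x, limit_y):
    """True if boxes a and b are closer than both thresholds."""
    return (gap(a[0], a[2], b[0], b[2]) < limit_x
            and gap(a[1], a[3], b[1], b[3]) < limit_y)


def merge_boxes(boxes, limit_x=10, limit_y=10):
    """Repeatedly fuse the first pair of nearby boxes until none remain."""
    boxes = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                a, b = boxes[i], boxes[j]
                if should_merge(a, b, limit_x, limit_y):
                    # Replace the pair with its bounding rectangle.
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    changed = True
                    break
            if changed:
                break
    return boxes


words = [(0, 0, 40, 10), (45, 0, 80, 10), (200, 0, 240, 10)]
merged = merge_boxes(words, limit_x=10, limit_y=10)
# merged == [(0, 0, 80, 10), (200, 0, 240, 10)]
```

The first two boxes are 5 px apart, so they fuse at the default threshold of 10 but would stay separate at a threshold of 3. Raising the limits trades fragmented labels for the risk of fusing unrelated ones.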
- GPU usage
Pass --gpu 0 to attempt to use GPU 0. The default -1 runs on CPU.
niobium --directory /abs/path/to/images --deck-name MyStudyDeck --gpu 0
Niobium Smart
Niobium Smart uses Claude's vision capabilities to intelligently decide what to occlude in your images. Instead of relying on manual exclude rules, Claude analyzes each image and makes semantic decisions about what's worth studying.
What it does
When you pass the --smart flag, the pipeline becomes:
Image → OCR (detects all text + bounding boxes) → Claude Vision (curates results) → Anki card
Claude sees the full image alongside the OCR-detected text and:
- Decides what to occlude — key terms, anatomical labels, drug names, disease names, important values
- Decides what to skip — figure labels (A, B, Fig. 1), publisher info, copyright notices, page numbers, OCR noise
- Corrects OCR errors — compares garbled OCR text against what it actually sees in the image (e.g., "Glcmerulus" → "Glomerulus")
- Generates study hints — clinical correlations, functional notes, alternative names, mnemonics — added to the Back Extra field
- Describes the image — a one-line context description appears at the top of Back Extra
OCR still handles precise bounding box coordinates (its strength), while Claude handles semantic understanding (its strength).
Usage
Add --smart to any existing niobium command:
# Single image
niobium --image /path/to/anatomy.png --apkg-out ./output --smart
# Directory of images
niobium --directory /path/to/slides --deck-name Pharmacology --smart
# PDF
niobium --single-pdf /path/to/lecture.pdf --apkg-out ./output --smart
Without --smart, niobium works exactly as before — pure OCR with rule-based filtering.
API key
Niobium Smart requires an Anthropic API key. You can provide it in two ways:
- Environment variable (recommended for security):
  export ANTHROPIC_API_KEY=sk-ant-...
- Config file (convenient for personal use):
  "llm": { "api_key": "sk-ant-..." }
Config takes priority over the environment variable. If no key is found, niobium falls back to rule-based filtering with a warning.
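The precedence described above amounts to a two-step lookup. The sketch below uses a hypothetical helper, `resolve_api_key`, to show it; a return value of None stands for the "no key found" case in which niobium falls back to rule-based filtering.

```python
import os


def resolve_api_key(config, environ=os.environ):
    """Return the key from the config if set, else from the environment, else None."""
    key = (config.get("llm") or {}).get("api_key")
    if key:
        return key  # the config file wins
    return environ.get("ANTHROPIC_API_KEY")  # may be None


# Passing a plain dict as the environment keeps the example self-contained.
env = {"ANTHROPIC_API_KEY": "key-from-env"}
from_config = resolve_api_key({"llm": {"api_key": "key-from-config"}}, env)
from_env = resolve_api_key({"llm": {"api_key": None}}, env)
missing = resolve_api_key({"llm": {"api_key": None}}, {})
```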
Configuration
The llm section in your config file controls Smart mode behavior:
{
  "llm": {
    "api_key": null,
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "temperature": 0.2,
    "instructions": null
  }
}
| Key | What it does |
|---|---|
| api_key | Anthropic API key. If null, falls back to ANTHROPIC_API_KEY environment variable. |
| model | Claude model to use (default: claude-sonnet-4-6). |
| max_tokens | Maximum response length from Claude (default: 1024). |
| temperature | Controls response variability. Lower = more consistent, higher = more creative hints (default: 0.2). |
| instructions | Custom instructions appended to the built-in prompt. This is where you tailor Smart mode to your study goals (see below). |
Custom instructions
The instructions field is the most powerful configuration option. It lets you steer Claude's decisions for a specific study context without replacing the underlying logic. Set it in your config file:
"instructions": "Focus on pharmacology. Occlude drug names, mechanisms of action, and side effects. Skip anatomical terms unless they are drug targets."
Examples for different disciplines:
Pharmacology:
"instructions": "I'm studying pharmacology. Prioritize drug names, drug classes, mechanisms of action, receptor types, and side effects. Add hints about drug interactions and clinical indications."
Histology:
"instructions": "These are histology slides. Occlude tissue types, cell types, staining characteristics, and structural features. Add hints about how to distinguish similar-looking tissues."
Pathology:
"instructions": "Focus on pathological findings. Occlude disease names, morphological descriptions, and diagnostic features. Add hints about epidemiology and clinical presentation."
Step 1/USMLE prep:
"instructions": "I'm preparing for USMLE Step 1. Add high-yield clinical correlations and First Aid-style memory aids in the hints."
Text-heavy slides:
"instructions": "These images contain mostly text paragraphs. Occlude only the most important medical terms, numerical values, and key facts. Skip filler words and context sentences."
Set instructions to null (or remove it) to use the default general-purpose behavior.
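If you switch between subjects often, editing the field by hand gets tedious. The following sketch updates llm.instructions in an existing config file using only the standard library; the helper name `set_instructions` is hypothetical and not part of niobium.

```python
import json
from pathlib import Path


def set_instructions(config_path, instructions):
    """Update llm.instructions in an existing config file and return the new config."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Pass None to write null and restore the default behavior.
    config.setdefault("llm", {})["instructions"] = instructions
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `set_instructions(Path.home() / ".config" / "niobium" / "config.json", "These are histology slides.")` would edit the user-level config, assuming it was created with niobium --init-config first.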
Cost
Claude Sonnet processes each image for roughly $0.005–$0.01 depending on image size and number of text regions. A batch of 50 images costs approximately $0.25–$0.50.
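The batch figure is just the per-image range multiplied by the number of images, so you can scale it to your own deck. A small sketch using the rough per-image estimates quoted above (not measured prices):

```python
def batch_cost(n_images, low_per_image=0.005, high_per_image=0.01):
    """Return the (low, high) estimated cost in USD for a batch of images."""
    return n_images * low_per_image, n_images * high_per_image


low, high = batch_cost(50)
print(f"${low:.2f} to ${high:.2f}")  # prints "$0.25 to $0.50"
```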
Fallback behavior
If anything goes wrong during a Smart mode run (API error, network timeout, malformed response), niobium automatically falls back to rule-based filtering for that image and continues processing. You always get your cards.
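The per-image fallback can be pictured as a try/except around the Smart step. This is a sketch of the pattern only; `smart_filter` and `rule_filter` are hypothetical callables standing in for niobium's internal steps.

```python
import logging

logger = logging.getLogger("niobium_sketch")


def filter_with_fallback(image_path, ocr_results, smart_filter, rule_filter):
    """Try the Smart step; on any failure, use rule-based filtering instead."""
    try:
        return smart_filter(image_path, ocr_results)
    except Exception as exc:  # API error, network timeout, malformed response, ...
        logger.warning("Smart mode failed for %s (%s); using rule-based filtering",
                       image_path, exc)
        return rule_filter(ocr_results)


def broken_smart_filter(image_path, ocr_results):
    raise TimeoutError("network timeout")


def simple_rule_filter(ocr_results):
    return [text for text in ocr_results if text not in {"A", "B"}]


cards = filter_with_fallback("slide.png", ["A", "Glomerulus"],
                             broken_smart_filter, simple_rule_filter)
# cards == ["Glomerulus"]
```

Because the exception is handled per image, one failed API call does not stop the rest of the batch.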
Common workflows
- Automatic creation of image-occlusion notes and push to Anki:
  - --directory + --deck-name (Anki must be running with AnkiConnect installed)
  - --single-pdf + --deck-name (Anki must be running with AnkiConnect installed)
- Quick extraction from a PDF for manual review:
  - --single-pdf + --pdf-img-out
Troubleshooting
- If AnkiConnect calls fail, confirm Anki is running and AnkiConnect is installed and enabled.
- If OCR quality is poor, try adding the proper language code with --langs (e.g., en,es) and ensure Tesseract language packs are installed.
- If many small boxes are produced, increase --merge-lim-x/--merge-lim-y. If boxes are merged too aggressively, lower the thresholds or set --merge-rects False to disable merging.
Development
Setting up for development
git clone https://github.com/agahkarakuzu/niobium.git
cd niobium
pip install -e .
Running tests
The package includes automated tests that run on each push via GitHub Actions. You can test locally:
# Test the CLI is available
niobium -h
# Test import
python -c "from niobium.cli import main; print('Import successful')"
Project structure
- niobium/cli.py - Main CLI entry point with argument parsing
- niobium/io.py - Core I/O helpers and OCR functionality
- pyproject.toml - Package configuration and dependencies
Contributing
If you'd like to contribute, open an issue or submit a pull request.