NIOBIUM: Nadia's Image Occlusion Booster Is UnManned - A CLI tool for extracting text and image-occlusion-style notes from images and PDFs for Anki

NIOBIUM: Nadia's Image Occlusion Booster Is UnManned

NIOBIUM is a small CLI tool for extracting text and image-occlusion-style notes from images and PDFs, and for preparing Anki-compatible outputs (via AnkiConnect or by creating an .apkg). This README shows common usages and examples for the command-line interface.

Trivia: What the actual heck is niobium?

Niobium is a stealthy gray metal that is absurdly strong, feather‑light and allergic to corrosion. Mostly mined in Brazil and Canada, it moonlights in super‑alloys for jet engines and superconducting MRI magnets. It even hides in the capacitors inside your phone and laptop.

So next time you're on a flight, fiddling with your phone on the way to an MRI conference, tip your hat to niobium, OR just give this repo a ⭐️.

Installation

Using pip

pip install nb41

Using uv (faster alternative)

uv pip install nb41

From source

git clone https://github.com/agahkarakuzu/niobium.git
cd niobium
pip install -e .

Requirements

  • Python 3.8 or higher
  • All dependencies are automatically installed with the package

Quick overview

The main entry point is the niobium command. It exposes a few mutually-exclusive input modes and a few mutually-exclusive output modes.

Inputs (one required):

  • -i, --image — absolute path to a single image file
  • -dir, --directory — directory containing multiple images
  • -pin, --single-pdf — absolute path to a single PDF

Outputs (one required):

  • -deck, --deck-name — name of the Anki deck where notes will be pushed (requires AnkiConnect)
  • -pout, --pdf-img-out — output directory where images extracted from a PDF will be saved
  • -apkg, --apkg-out — output directory where a generated .apkg will be saved

Other useful flags:

  • -ioid, --io-model-id — ID of the built-in Image Occlusion model in Anki (optional, used with --apkg-out)
  • -m, --merge-rects — whether to merge nearby detected rectangles (default: True)
  • -mx, --merge-lim-x — horizontal merging threshold in pixels (default: 10)
  • -my, --merge-lim-y — vertical merging threshold in pixels (default: 10)
  • -l, --langs — comma-separated OCR languages (default: en)
  • -g, --gpu — GPU index to use, or -1 for CPU only (default: -1)
  • -hdr, --add-header — add filename as a header (default: False)
  • -basic, --basic-type — create basic Anki cards instead of image-occlusion notes (default: False)
  • -c, --config — path to a custom config file (see Configuration below)
  • --smart — use Claude Vision to intelligently filter OCR results (see Niobium Smart below)

Run niobium -h to see the help text with the current arguments.
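As an illustration of how "one input required, one output required" can be enforced, here is a minimal argparse sketch covering a subset of the flags listed above. It is not Niobium's actual parser: `build_parser` and the `niobium-sketch` program name are hypothetical, and only the flag names come from this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring a subset of the documented flags."""
    parser = argparse.ArgumentParser(prog="niobium-sketch")

    # Exactly one input mode must be given.
    inputs = parser.add_mutually_exclusive_group(required=True)
    inputs.add_argument("-i", "--image")
    inputs.add_argument("-dir", "--directory")
    inputs.add_argument("-pin", "--single-pdf")

    # Exactly one output mode must be given.
    outputs = parser.add_mutually_exclusive_group(required=True)
    outputs.add_argument("-deck", "--deck-name")
    outputs.add_argument("-pout", "--pdf-img-out")
    outputs.add_argument("-apkg", "--apkg-out")

    # Two of the optional flags, with the defaults documented above.
    parser.add_argument("-l", "--langs", default="en")
    parser.add_argument("-g", "--gpu", type=int, default=-1)
    return parser


args = build_parser().parse_args(
    ["--directory", "/abs/images", "--deck-name", "MyStudyDeck", "--langs", "en,fr"]
)
print(args.directory, args.deck_name, args.langs.split(","))
```

With this layout, passing two input flags (or none) makes argparse exit with a usage error before any processing starts.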

Configuration

Niobium uses a JSON config file to control how OCR results are filtered before creating Anki notes. Without any configuration, a sensible bundled default is used automatically.

Getting started

Generate your own config file:

niobium --init-config

This copies the default template to ~/.config/niobium/config.json. To open it in your editor:

niobium --edit-config

This creates the config if it doesn't exist yet, then opens it with $EDITOR (falls back to vi).

Niobium will tell you which config file it is using every time it runs.

Config resolution order

  1. --config path/to/config.json — explicit path passed via CLI (highest priority)
  2. ~/.config/niobium/config.json — user-level config
  3. Bundled default inside the package (lowest priority)
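The resolution order above can be sketched in a few lines. This is an illustration of the documented priority only; `resolve_config` and the `"<bundled default>"` placeholder are hypothetical names, not part of Niobium's API.

```python
from pathlib import Path
from typing import Optional

USER_CONFIG = Path.home() / ".config" / "niobium" / "config.json"


def resolve_config(cli_path: Optional[str],
                   user_config: Path = USER_CONFIG,
                   bundled: str = "<bundled default>") -> str:
    """Return the config source to use, following the documented priority."""
    if cli_path:                  # 1. explicit --config wins
        return str(cli_path)
    if user_config.is_file():     # 2. user-level config, if it exists
        return str(user_config)
    return bundled                # 3. bundled default inside the package


print(resolve_config("path/to/config.json"))
```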

Config file format

{
    "langs": "en",
    "gpu": -1,
    "merge": {
        "enabled": true,
        "limit_x": 10,
        "limit_y": 10
    },
    "exclude": {
        "exact": ["A", "B", "Reproductive system"],
        "regex": ["(Figure|Fig\\.|Fig\\:)\\s+(\\d+[-\\w]*).*"]
    },
    "extra": [
        {"Ductus deferens": "Ductus deferens is a.k.a <span style=\"color:red;\">Vas deferens</span>"}
    ],
    "llm": {
        "api_key": null,
        "model": "claude-sonnet-4-6",
        "max_tokens": 1024,
        "temperature": 0.2,
        "instructions": null
    }
}

| Key | What it does |
|---|---|
| langs | Comma-separated OCR languages (default: "en"). E.g. "en,fr" for English and French. |
| gpu | GPU index to use for OCR, or -1 for CPU only (default: -1). |
| merge.enabled | Whether to merge nearby OCR bounding boxes before creating occlusions (default: true). |
| merge.limit_x | Horizontal merge threshold in pixels; boxes closer than this are merged (default: 10). |
| merge.limit_y | Vertical merge threshold in pixels; boxes closer than this are merged (default: 10). |
| exclude.exact | OCR text matching any of these strings (case-insensitive) is discarded and won't become an occlusion. Useful for filtering out labels like "A", "B", or section headings that appear in images. |
| exclude.regex | OCR text matching any of these regular expressions is discarded. Useful for filtering out figure captions (e.g., "Figure 1", "Fig. 2a"). |
| extra | A list of key-value objects. When OCR detects text matching a key (case-insensitive), the corresponding value is appended to the note's "Back Extra" field as HTML. Useful for adding supplementary information to specific terms. |
| llm.api_key | Anthropic API key for Smart mode. If null, falls back to the ANTHROPIC_API_KEY environment variable. See Niobium Smart. |
| llm.model | Claude model to use (default: claude-sonnet-4-6). |
| llm.max_tokens | Maximum response length from Claude (default: 1024). |
| llm.temperature | Response variability; lower is more consistent (default: 0.2). |
| llm.instructions | Custom instructions appended to the built-in prompt to steer Smart mode for specific disciplines. |
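To make the exclude and extra rules concrete, here is a small sketch of how they could be applied to a list of OCR strings. It is an approximation, not Niobium's code: `is_excluded` and `back_extra` are hypothetical helpers, and two details are assumptions (regexes are matched from the start of the text, and keys in extra must match the whole OCR string).

```python
import re

config = {
    "exclude": {
        "exact": ["A", "B", "Reproductive system"],
        "regex": [r"(Figure|Fig\.|Fig\:)\s+(\d+[-\w]*).*"],
    },
    "extra": [{"Ductus deferens": "a.k.a. Vas deferens"}],
}


def is_excluded(text: str, exclude: dict) -> bool:
    """True if the OCR text matches an exact entry or any regex."""
    needle = text.strip().lower()
    if needle in {s.lower() for s in exclude.get("exact", [])}:
        return True
    # Assumption: patterns are anchored at the start of the text (re.match).
    return any(re.match(p, text.strip()) for p in exclude.get("regex", []))


def back_extra(text: str, extra: list) -> str:
    """Return the HTML for a matching key (case-insensitive), else ''."""
    needle = text.strip().lower()
    for entry in extra:
        for key, html in entry.items():
            # Assumption: the key must equal the whole OCR string.
            if key.lower() == needle:
                return html
    return ""


ocr_texts = ["a", "Figure 1-2 Kidney", "Ductus deferens", "Glomerulus"]
kept = [t for t in ocr_texts if not is_excluded(t, config["exclude"])]
print(kept)  # ['Ductus deferens', 'Glomerulus']
print(back_extra("ductus deferens", config["extra"]))  # a.k.a. Vas deferens
```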

CLI flags (--langs, --gpu, --merge-rects, --merge-lim-x, --merge-lim-y) override the config file values when provided.
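The override rule can be pictured as a shallow merge in which only the flags that were actually passed replace config values. The sketch below is hypothetical: `apply_cli_overrides` is not a Niobium function, and the mapping from flag names to config keys is inferred from the descriptions above.

```python
import copy


def apply_cli_overrides(config: dict, cli: dict) -> dict:
    """Return a copy of config with provided (non-None) CLI values applied."""
    merged = copy.deepcopy(config)
    # Top-level keys share their name with the flag.
    for flag in ("langs", "gpu"):
        if cli.get(flag) is not None:
            merged[flag] = cli[flag]
    # Assumed mapping of the merge-related flags onto the "merge" section.
    for flag, key in (("merge_rects", "enabled"),
                      ("merge_lim_x", "limit_x"),
                      ("merge_lim_y", "limit_y")):
        if cli.get(flag) is not None:
            merged["merge"][key] = cli[flag]
    return merged


file_config = {"langs": "en", "gpu": -1,
               "merge": {"enabled": True, "limit_x": 10, "limit_y": 10}}
effective = apply_cli_overrides(file_config, {"langs": "en,fr", "merge_lim_x": 20})
print(effective)
```

Flags left at None keep whatever the config file says, so `limit_y` stays at 10 in this example.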

Examples

Below are some concrete example commands (assumes you're in the project root and using zsh/bash):

  1. ⭐️ Run OCR and push image-occlusion notes to an Anki deck (via AnkiConnect)

This processes all images under a directory and pushes notes to the Anki deck named MyStudyDeck.

niobium --directory /absolute/path/to/images --deck-name MyStudyDeck

Notes:

  • You may specify a deck name that doesn't yet exist; you'll be prompted to create it.
  • Anki must be running with the AnkiConnect add-on enabled.
  • The tool will detect text and create image-occlusion notes from detected regions.

  2. Extract images from a single PDF

This extracts the images embedded in lecture.pdf into the out_images directory.

niobium --single-pdf /absolute/path/to/lecture.pdf --pdf-img-out /absolute/path/to/out_images

Important: --single-pdf is required when using --pdf-img-out.

  3. Produce an .apkg file (offline export, no AnkiConnect needed)

This processes a directory and writes an .apkg bundle suitable for import into Anki without requiring AnkiConnect at runtime. Uses genanki under the hood.

niobium --directory /absolute/path/to/images --apkg-out /absolute/path/to/output_dir

You can also use a single image or a PDF as input:

niobium --image /absolute/path/to/image.png --apkg-out /absolute/path/to/output_dir
niobium --single-pdf /absolute/path/to/lecture.pdf --apkg-out /absolute/path/to/output_dir

  4. Create basic (front/back) Anki cards instead of image-occlusion notes

niobium --directory /absolute/path/to/images --deck-name MyStudyDeck --basic-type True

This comes in handy when you have a folder of images (perhaps extracted from a PDF, as in example 2 above) and would like to create a Q&A card for each of them.

  5. Tweak rectangle merging and OCR languages

If bounding boxes are too fragmented, increase the merge thresholds. To OCR multiple languages, provide a comma-separated list.

niobium --directory /absolute/path/to/images --deck-name MyStudyDeck --merge-lim-x 20 --merge-lim-y 20 --langs en,fr

Note: Rectangle merging and other heuristics are experimental. Nearby occlusion boxes may be merged unintentionally, or boxes that belong together may remain separate. Adjust --merge-lim-x/--merge-lim-y, or disable merging with --merge-rects False, to change the behavior.

If you come up with a more robust approach to this, feel free to send a PR!
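For readers who want to experiment with the thresholds, here is one simple way such merging can work. It is a sketch of the general idea, not Niobium's implementation: boxes are `(x_min, y_min, x_max, y_max)` tuples, two boxes are joined when both their horizontal and vertical gaps are below the limits, and the process repeats until nothing changes. All names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def gap(a_min: int, a_max: int, b_min: int, b_max: int) -> int:
    """Distance between two 1-D intervals; 0 if they overlap."""
    return max(0, max(a_min, b_min) - min(a_max, b_max))


def is_close(a: Box, b: Box, lim_x: int, lim_y: int) -> bool:
    """True if both the horizontal and vertical gaps are below the limits."""
    return (gap(a[0], a[2], b[0], b[2]) < lim_x
            and gap(a[1], a[3], b[1], b[3]) < lim_y)


def merge_boxes(boxes: List[Box], lim_x: int = 10, lim_y: int = 10) -> List[Box]:
    """Repeatedly replace any close pair with its bounding box."""
    boxes = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                a, b = boxes[i], boxes[j]
                if is_close(a, b, lim_x, lim_y):
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    changed = True
                    break
            if changed:
                break
    return boxes


words = [(0, 0, 50, 20), (56, 2, 120, 20), (300, 0, 360, 20)]
print(merge_boxes(words))             # [(0, 0, 120, 20), (300, 0, 360, 20)]
print(merge_boxes(words, lim_x=200))  # [(0, 0, 360, 20)]
```

The second call shows why a large threshold can merge labels that should stay separate: every box on the same text line ends up in one occlusion.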

  6. GPU usage

Pass --gpu 0 to attempt to use GPU 0. The default -1 runs on CPU.

niobium --directory /abs/path/to/images --deck-name MyStudyDeck --gpu 0

Niobium Smart

Niobium Smart uses Claude's vision capabilities to intelligently decide what to occlude in your images. Instead of relying on manual exclude rules, Claude analyzes each image and makes semantic decisions about what's worth studying.

What it does

When you pass the --smart flag, the pipeline becomes:

Image → OCR (detects all text + bounding boxes) → Claude Vision (curates results) → Anki card

Claude sees the full image alongside the OCR-detected text and:

  • Decides what to occlude — key terms, anatomical labels, drug names, disease names, important values
  • Decides what to skip — figure labels (A, B, Fig. 1), publisher info, copyright notices, page numbers, OCR noise
  • Corrects OCR errors — compares garbled OCR text against what it actually sees in the image (e.g., "Glcmerulus" → "Glomerulus")
  • Generates study hints — clinical correlations, functional notes, alternative names, mnemonics — added to the Back Extra field
  • Describes the image — a one-line context description appears at the top of Back Extra

OCR still handles precise bounding-box coordinates (its strength), while Claude handles semantic understanding (its strength).

Usage

Add --smart to any existing niobium command:

# Single image
niobium --image /path/to/anatomy.png --apkg-out ./output --smart

# Directory of images
niobium --directory /path/to/slides --deck-name Pharmacology --smart

# PDF
niobium --single-pdf /path/to/lecture.pdf --apkg-out ./output --smart

Without --smart, niobium works exactly as before — pure OCR with rule-based filtering.

API key

Niobium Smart requires an Anthropic API key. You can provide it in two ways:

  1. Environment variable (recommended for security):

    export ANTHROPIC_API_KEY=sk-ant-...
    
  2. Config file (convenient for personal use):

    "llm": {
        "api_key": "sk-ant-..."
    }
    

Config takes priority over the environment variable. If no key is found, niobium falls back to rule-based filtering with a warning.
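The lookup order described above (config first, then environment, then no key) can be sketched as follows. `resolve_api_key` is a hypothetical helper written for illustration; only the `llm.api_key` config key and the `ANTHROPIC_API_KEY` variable name come from this README.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(config: dict,
                    env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the API key from config, else from the environment, else None."""
    key = (config.get("llm") or {}).get("api_key")
    if key:                                   # config takes priority
        return key
    return env.get("ANTHROPIC_API_KEY") or None


# Passing a plain dict as `env` keeps the example independent of your shell.
print(resolve_api_key({"llm": {"api_key": None}}, env={"ANTHROPIC_API_KEY": "sk-ant-env"}))
```

A None result corresponds to the case where niobium warns and uses rule-based filtering.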

Configuration

The llm section in your config file controls Smart mode behavior:

{
    "llm": {
        "api_key": null,
        "model": "claude-sonnet-4-6",
        "max_tokens": 1024,
        "temperature": 0.2,
        "instructions": null
    }
}

| Key | What it does |
|---|---|
| api_key | Anthropic API key. If null, falls back to the ANTHROPIC_API_KEY environment variable. |
| model | Claude model to use (default: claude-sonnet-4-6). |
| max_tokens | Maximum response length from Claude (default: 1024). |
| temperature | Controls response variability. Lower = more consistent, higher = more creative hints (default: 0.2). |
| instructions | Custom instructions appended to the built-in prompt. This is where you tailor Smart mode to your study goals (see below). |

Custom instructions

The instructions field is the most powerful configuration option. It lets you steer Claude's decisions for a specific study context without replacing the underlying logic. Set it in your config file:

"instructions": "Focus on pharmacology. Occlude drug names, mechanisms of action, and side effects. Skip anatomical terms unless they are drug targets."

Examples for different disciplines:

Pharmacology:

"instructions": "I'm studying pharmacology. Prioritize drug names, drug classes, mechanisms of action, receptor types, and side effects. Add hints about drug interactions and clinical indications."

Histology:

"instructions": "These are histology slides. Occlude tissue types, cell types, staining characteristics, and structural features. Add hints about how to distinguish similar-looking tissues."

Pathology:

"instructions": "Focus on pathological findings. Occlude disease names, morphological descriptions, and diagnostic features. Add hints about epidemiology and clinical presentation."

Step 1/USMLE prep:

"instructions": "I'm preparing for USMLE Step 1. Add high-yield clinical correlations and First Aid-style memory aids in the hints."

Text-heavy slides:

"instructions": "These images contain mostly text paragraphs. Occlude only the most important medical terms, numerical values, and key facts. Skip filler words and context sentences."

Set instructions to null (or remove it) to use the default general-purpose behavior.

Cost

Claude Sonnet processes each image for roughly $0.005–$0.01 depending on image size and number of text regions. A batch of 50 images costs approximately $0.25–$0.50.
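For other batch sizes the same arithmetic applies. The helper below is a back-of-the-envelope sketch that uses the per-image range quoted above; `estimate_cost` is a hypothetical name and actual charges depend on your images and current pricing.

```python
from typing import Tuple


def estimate_cost(n_images: int, low: float = 0.005, high: float = 0.01) -> Tuple[float, float]:
    """Return a (low, high) cost estimate in USD for a batch of images."""
    return round(n_images * low, 2), round(n_images * high, 2)


print(estimate_cost(50))   # (0.25, 0.5)
print(estimate_cost(200))  # (1.0, 2.0)
```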

Fallback behavior

If anything goes wrong during a Smart mode run (API error, network timeout, malformed response), niobium automatically falls back to rule-based filtering for that image and continues processing. You always get your cards.
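The per-image fallback can be pictured as a try/except around the Smart step. This is a sketch of the pattern only; `filter_with_fallback` and the two stand-in filters are hypothetical and do not call any real API.

```python
from typing import Callable, List


def filter_with_fallback(image: str,
                         ocr_results: List[str],
                         smart_filter: Callable[[str, List[str]], List[str]],
                         rule_filter: Callable[[List[str]], List[str]]) -> List[str]:
    """Try the smart filter; on any error, fall back to rule-based filtering."""
    try:
        return smart_filter(image, ocr_results)
    except Exception as exc:  # API error, timeout, malformed response, ...
        print(f"Smart mode failed for {image} ({exc}); using rule-based filtering")
        return rule_filter(ocr_results)


def failing_smart_filter(image: str, ocr_results: List[str]) -> List[str]:
    """Stand-in for a Smart mode call that times out."""
    raise TimeoutError("network timeout")


def simple_rule_filter(ocr_results: List[str]) -> List[str]:
    """Stand-in for rule-based filtering: drop single-letter labels."""
    return [t for t in ocr_results if len(t.strip()) > 1]


cards = filter_with_fallback("kidney.png", ["A", "Glomerulus"],
                             failing_smart_filter, simple_rule_filter)
print(cards)  # ['Glomerulus']
```

Because the exception is handled per image, one failed request does not stop the rest of a batch.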

Common workflows

  • Automatic creation of image-occlusion notes and push to Anki:
    • --directory + --deck-name (Anki must be running with AnkiConnect installed)
    • --single-pdf + --deck-name (Anki must be running with AnkiConnect installed)
  • Quick extraction from a PDF for manual review:
    • --single-pdf + --pdf-img-out

Troubleshooting

  • If AnkiConnect calls fail, confirm Anki is running and AnkiConnect is installed and enabled.
  • If OCR quality is poor, try adding the proper language code with --langs (e.g., en,es) and ensure Tesseract language packs are installed.
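  • If many small, fragmented boxes are produced, make sure merging is enabled (--merge-rects True, the default) and increase --merge-lim-x/--merge-lim-y. If unrelated boxes are being merged, lower the thresholds or set --merge-rects False to disable merging.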
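To check the first point independently of niobium, you can ask AnkiConnect for its version over its local HTTP API. The sketch below assumes AnkiConnect is listening on its default address, `http://127.0.0.1:8765`; adjust the URL if you changed the add-on's settings. `build_request` and `anki_connect_available` are hypothetical helper names.

```python
import json
import urllib.request

ANKI_CONNECT_URL = "http://127.0.0.1:8765"  # assumed default AnkiConnect address


def build_request(action: str, **params) -> bytes:
    """Encode an AnkiConnect request body (API version 6)."""
    return json.dumps({"action": action, "version": 6, "params": params}).encode("utf-8")


def anki_connect_available(url: str = ANKI_CONNECT_URL, timeout: float = 2.0) -> bool:
    """Return True if AnkiConnect answers a 'version' request without error."""
    request = urllib.request.Request(url, data=build_request("version"))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            reply = json.load(response)
    except (OSError, ValueError):  # connection refused, timeout, bad JSON
        return False
    return isinstance(reply, dict) and reply.get("error") is None


if __name__ == "__main__":
    print("AnkiConnect reachable:", anki_connect_available())
```

If this prints False while Anki is open, the add-on is probably not installed or not enabled.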

Development

Setting up for development

git clone https://github.com/agahkarakuzu/niobium.git
cd niobium
pip install -e .

Running tests

The package includes automated tests that run on each push via GitHub Actions. You can test locally:

# Test the CLI is available
niobium -h

# Test import
python -c "from niobium.cli import main; print('Import successful')"

Project structure

  • niobium/cli.py - Main CLI entry point with argument parsing
  • niobium/io.py - Core I/O helpers and OCR functionality
  • pyproject.toml - Package configuration and dependencies

Contributing

If you'd like to contribute, open an issue or submit a pull request.
