
Image → natural language summary with heuristics, thumbnails, and LLM transport hints


img2nl

AI Cost Tracking


  • 🤖 LLM usage: $0.8469 (2 commits)
  • 👤 Human dev: ~$300 (3.0h @ $100/h, 30min dedup)

Generated on 2026-06-09 using openrouter/qwen/qwen3-coder-next


Heuristic image → natural language summary for transport to LLM and other services.

No vision LLM is required for the core path: it uses layered heuristics (Pillow, optional OpenCV, perceptual hash, conditional OCR/QR/YOLO).
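
The "conditional" part of the layering can be pictured as a gate: cheap layers run first, and their signals decide whether the expensive detectors run at all. The sketch below is an illustration under stated assumptions, not the package's actual code: `CheapSignals`, `plan_expensive_layers`, and the 0.5 threshold are hypothetical names and values; only the extra names (`ocr`, `scan`, `detect`) and the `enable_detect` flag come from this README.

```python
from dataclasses import dataclass


@dataclass
class CheapSignals:
    """Hypothetical outputs of the cheap layers (not the real img2nl schema)."""
    text_likelihood: float      # 0..1 estimate that the image contains text
    maybe_has_code: bool        # hint that a QR code or barcode may be present


def plan_expensive_layers(signals, installed_extras, enable_detect=False,
                          text_threshold=0.5):
    """Return the optional layers worth running, in order."""
    plan = []
    # OCR only when the extra is installed and text looks likely.
    if "ocr" in installed_extras and signals.text_likelihood >= text_threshold:
        plan.append("ocr")
    # QR/barcode scan only when a cheap hint suggests a code is present.
    if "scan" in installed_extras and signals.maybe_has_code:
        plan.append("scan")
    # YOLO is opt-in: it never runs unless explicitly requested.
    if "detect" in installed_extras and enable_detect:
        plan.append("detect")
    return plan


signals = CheapSignals(text_likelihood=0.8, maybe_has_code=False)
print(plan_expensive_layers(signals, {"ocr", "scan", "detect"}))
```

The design point is that a missing extra or a weak signal simply removes a layer from the plan, so the core path stays fast.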

Features

| Layer | Module | Extra | What it detects |
|---|---|---|---|
| 0 | colors, dynamics, noise, objects, patterns | [analyze] | palette, contrast, flat regions, UI blocks |
| 1 | edges | [opencv] | blur, edge density, text likelihood |
| 2 | fingerprint, similarity | [similarity] | pHash/dHash/wHash, screen match |
| 3 | special_hits | [scan] / [ocr] | QR/barcode, OCR (conditional) |
| 4 | semantic_hits | [detect] | YOLO labels (opt-in) |

Output includes features.scene.scene_class, llm_hint, and optional similarity when reference_fingerprint is passed.
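
To make the fingerprint and similarity idea concrete, here is a minimal pure-Python sketch of a difference hash (dHash) and a Hamming-distance similarity score. It is a generic illustration of the technique, not img2nl's implementation: the function names are hypothetical, and the input is assumed to be an 8-row by 9-column grayscale grid (with Pillow such a grid could come from `Image.open(path).convert("L").resize((9, 8))`).

```python
def dhash_bits(gray, hash_size=8):
    """Difference hash: one bit per horizontally adjacent pixel pair.

    gray is a list of hash_size rows, each with hash_size + 1 brightness values.
    """
    if len(gray) != hash_size or any(len(row) != hash_size + 1 for row in gray):
        raise ValueError("expected a hash_size x (hash_size + 1) grid")
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness drops from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def similarity(a, b, n_bits=64):
    """1.0 for identical hashes, 0.0 when every bit differs."""
    return 1.0 - hamming(a, b) / n_bits


ascending = [list(range(9)) for _ in range(8)]   # brightness rises left to right
descending = [row[::-1] for row in ascending]    # mirrored gradient
h_up, h_down = dhash_bits(ascending), dhash_bits(descending)
print(similarity(h_up, h_up), similarity(h_up, h_down))
```

Because the hash encodes only relative brightness between neighbours, small changes in exposure or compression tend to flip few bits, which is what makes it usable for "same screen or not" checks.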

Full architecture: docs/detection-pipeline.md

Install

pip install -e ".[analyze]"              # core (Pillow + NumPy)
pip install -e ".[full]"                 # analyze + opencv + similarity + scan
pip install -e ".[analyze,translate]"    # + argostranslate offline
bash install-dev.sh                        # full *2img2nl stack

Optional extras: opencv, similarity, scan, ocr, detect — see pyproject.toml.

CLI

img2nl analyze photo.png --json
img2nl analyze photo.png --locale de --translate-mode offline
dsl2img2nl -c "ANALYZE photo.png" --json
uri2img2nl query "img2nl://analyze?path=photo.png&locale=pl"
python -c "from img2nl.i18n import supported_locales; print(supported_locales())"
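
For readers wondering how an `img2nl://` URI breaks down, the following sketch splits the example above with the standard library. It only illustrates the URI shape shown in this README; `parse_img2nl_uri` is a hypothetical helper, and how `uri2img2nl` actually parses and dispatches URIs is not documented here.

```python
from urllib.parse import parse_qs, urlsplit


def parse_img2nl_uri(uri):
    """Split an img2nl:// URI into (action, params)."""
    parts = urlsplit(uri)
    if parts.scheme != "img2nl":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # parse_qs returns lists; keep the last value given for each key.
    params = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    # The action sits in the authority position, right after "img2nl://".
    return parts.netloc, params


action, params = parse_img2nl_uri("img2nl://analyze?path=photo.png&locale=pl")
print(action, params)
```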

Offline translation (argostranslate)

The static catalog covers 38 European languages; for coverage that scales and is easier to update, use neural offline translation:

pip install "img2nl[analyze,translate]"
img2nl translate-install en pl
img2nl analyze photo.png --locale de --translate-mode offline

Modes: auto (use the catalog for pl/en, otherwise argos), offline (requires argos), catalog (JSON catalog only).
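
The three modes amount to a small decision rule. The sketch below restates it in code as read from the mode list above; `choose_backend`, `CATALOG_FIRST_LOCALES`, and the error raised when argos is missing are assumptions made for illustration, not the package's API.

```python
# Locales that "auto" serves from the static catalog, per the mode list above.
CATALOG_FIRST_LOCALES = {"pl", "en"}


def choose_backend(mode, locale, argos_available=True):
    """Return "catalog" or "argos" for a translate mode and target locale."""
    if mode == "catalog":
        return "catalog"            # JSON catalog only, never argos
    if mode == "offline":
        if not argos_available:
            # Assumption: offline mode fails loudly instead of falling back.
            raise RuntimeError("offline mode requires argostranslate")
        return "argos"
    if mode == "auto":
        return "catalog" if locale in CATALOG_FIRST_LOCALES else "argos"
    raise ValueError(f"unknown translate mode: {mode!r}")


for mode, locale in [("auto", "pl"), ("auto", "de"), ("offline", "de"), ("catalog", "de")]:
    print(mode, locale, "->", choose_backend(mode, locale))
```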

Python API

from img2nl import analyze_image

result = analyze_image("screen.png", skip_thumbnail=True)
print(result.text, result.features["scene"]["scene_class"], result.llm_hint)

# compare with previous capture
prev = analyze_image("screen_a.png", skip_thumbnail=True)
cur = analyze_image(
    "screen_b.png",
    skip_thumbnail=True,
    reference_fingerprint=prev.features["fingerprint"],
    enable_detect=False,  # True → YOLO (heavy)
)
print(cur.features.get("similarity", {}))
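
Since the summary is meant for transport to an LLM, one plausible next step is to fold the fields into a compact text block. This is a sketch only: `build_llm_context` is a hypothetical helper, `llm_hint` is assumed to be a string, and the stand-in object mirrors just the three attributes used above (`text`, `features`, `llm_hint`) so the example runs without img2nl installed.

```python
from types import SimpleNamespace


def build_llm_context(result, max_chars=500):
    """Fold an analysis result into a short text block for a prompt."""
    scene = result.features.get("scene", {}).get("scene_class", "unknown")
    lines = [
        f"Scene class: {scene}",
        f"Summary: {result.text[:max_chars]}",  # cap length to bound token cost
    ]
    if result.llm_hint:
        lines.append(f"Hint: {result.llm_hint}")
    return "\n".join(lines)


# Stand-in for the object returned by analyze_image (made-up example values).
fake_result = SimpleNamespace(
    text="Dark window with a sidebar and a large text area.",
    features={"scene": {"scene_class": "ui"}},
    llm_hint="Likely a desktop application screenshot.",
)
print(build_llm_context(fake_result))
```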

Packages

| Package | Role |
|---|---|
| img2nl | Core heuristics + describe + thumbnail + layered detection |
| uri2img2nl | img2nl:// URI layer |
| dsl2img2nl | DSL bus (ANALYZE, QUERY, LLM_HINT) |
| cli2img2nl | Shell adapter |

See packages/README.md.

VQL integration

Layered pipeline with img2vql in oqlos/vql:

pip install -e ".[analyze,similarity,opencv,scan]"
pip install -e ~/github/oqlos/vql/packages/img2vql

# adopt → metadata (fingerprint, special_hits, scene_class)
uri2vql analyze-window --image capture.png --out app.vql.json

# run the same command again: smart skip when the screen is unchanged
uri2vql analyze-window --image capture.png --out app.vql.json

uri2vql refresh-window --vql-program app.vql.json --image capture.png
uri2vql compare-window --vql-program app.vql.json --image capture.png
img2vql diagnose capture.png --vql-program app.vql.json --save
uri2vql resolve "odśwież metadata vql" --file app.vql.json --image capture.png

# end-to-end demo
bash ~/github/oqlos/vql/examples/img2nl-vql-flow.sh capture.png app.vql.json
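
The "smart skip" step can be thought of as a fingerprint-keyed cache: if the new capture's fingerprint is close enough to the stored one, the previous metadata is reused. The sketch below shows that idea in isolation and makes several assumptions: fingerprints are hex strings of equal length, the cache is a plain dict with hypothetical `fingerprint` and `metadata` keys (not the real `app.vql.json` schema), and the distance threshold of 4 bits is arbitrary.

```python
def hamming_hex(a, b):
    """Count differing bits between two hex-encoded hashes."""
    return bin(int(a, 16) ^ int(b, 16)).count("1")


def refresh_if_changed(cache, fingerprint_hex, analyze, max_distance=4):
    """Return (metadata, skipped); call analyze() only when the screen changed."""
    old = cache.get("fingerprint")
    unchanged = (
        old is not None
        and len(old) == len(fingerprint_hex)
        and hamming_hex(old, fingerprint_hex) <= max_distance
    )
    if unchanged:
        return cache["metadata"], True
    metadata = analyze()                 # the expensive path
    cache["fingerprint"] = fingerprint_hex
    cache["metadata"] = metadata
    return metadata, False


calls = []

def fake_analyze():
    """Stand-in for a real analysis; records that it was invoked."""
    calls.append(1)
    return {"scene_class": "ui"}


cache = {}
print(refresh_if_changed(cache, "ffff0000ffff0000", fake_analyze))  # first run analyzes
print(refresh_if_changed(cache, "ffff0000ffff0001", fake_analyze))  # 1 bit differs: skipped
print(refresh_if_changed(cache, "0000ffff0000ffff", fake_analyze))  # very different: analyzes
```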

Docs

| Doc | Content |
|---|---|
| docs/detection-pipeline.md | Layers 0–4, JSON schema, VQL cache |
| CHANGELOG.md | Change history |
| TODO.md | Backlog |

License

Licensed under Apache-2.0.
