
Image → natural language summary with heuristics, thumbnails, and LLM transport hints


img2nl

AI Cost Tracking


  • 🤖 LLM usage: $1.3073 (5 commits)
  • 👤 Human dev: ~$350 (3.5h @ $100/h, 30min dedup)

Generated on 2026-06-09 using openrouter/qwen/qwen3-coder-next


Heuristic image → natural language summary for transport to LLM and other services.

No vision LLM required for the core path — uses layered heuristics (Pillow, optional OpenCV, perceptual hash, conditional OCR/QR/YOLO).

Features

| Layer | Module | Extra | What it detects |
|-------|--------|-------|-----------------|
| 0 | colors, dynamics, noise, objects, patterns | [analyze] | palette, contrast, flat regions, UI blocks |
| 1 | edges | [opencv] | blur, edge density, text likelihood |
| 2 | fingerprint, similarity | [similarity] | pHash/dHash/wHash, screen match |
| 3 | special_hits | [scan] / [ocr] | QR/barcode, OCR (conditional) |
| 4 | semantic_hits | [detect] | YOLO labels (opt-in) |

Output includes features.scene.scene_class, llm_hint, and optional similarity when reference_fingerprint is passed.
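
To illustrate what a layer 2 fingerprint comparison involves, here is a minimal difference-hash (dHash) sketch in pure Python. It is not img2nl's implementation: the function names `dhash_bits` and `hamming` and the tiny pixel grids are assumptions made for illustration, and a real pipeline would first downscale the image to a small grayscale grid with an imaging library such as Pillow.

```python
from typing import List


def dhash_bits(gray: List[List[int]]) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    `gray` is a small grayscale grid (rows of 0-255 values) that has
    already been downscaled, e.g. 8 rows x 9 columns for a 64-bit hash.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness drops from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Two tiny 2x3 grids standing in for downscaled screenshots (hypothetical data).
screen_a = [[10, 20, 30], [30, 20, 10]]
screen_b = [[10, 20, 30], [30, 20, 25]]

hash_a = dhash_bits(screen_a)
hash_b = dhash_bits(screen_b)
distance = hamming(hash_a, hash_b)
print(f"{hash_a:04b} vs {hash_b:04b}: {distance} bit(s) differ")
```

A small Hamming distance means the two captures are visually close, which is the general idea behind a "screen match" check.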

Full architecture: docs/detection-pipeline.md

Install

pip install -e ".[analyze]"              # core (Pillow + NumPy)
pip install -e ".[full]"                 # analyze + opencv + similarity + scan
pip install -e ".[analyze,translate]"    # + argostranslate offline
bash install-dev.sh                        # full *2img2nl stack

Optional extras: opencv, similarity, scan, ocr, detect — see pyproject.toml.

CLI

img2nl analyze photo.png --json
img2nl analyze photo.png --locale de --translate-mode offline
dsl2img2nl -c "ANALYZE photo.png" --json
uri2img2nl query "img2nl://analyze?path=photo.png&locale=pl"
python -c "from img2nl.i18n import supported_locales; print(supported_locales())"
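
The `img2nl://` URI shown above follows the usual scheme/host/query layout, so it can be taken apart with the standard library. The sketch below only illustrates that structure; `parse_img2nl_uri` and the returned dictionary shape are hypothetical and do not reflect how uri2img2nl resolves URIs internally.

```python
from urllib.parse import parse_qs, urlsplit


def parse_img2nl_uri(uri: str) -> dict:
    """Split an img2nl:// URI into an action name and its query parameters."""
    parts = urlsplit(uri)
    if parts.scheme != "img2nl":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # parse_qs returns a list per key; keep the last value for each parameter.
    params = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    # The part after "img2nl://" (the netloc) is treated as the action here.
    return {"action": parts.netloc, "params": params}


request = parse_img2nl_uri("img2nl://analyze?path=photo.png&locale=pl")
print(request)
```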

Offline translation (argostranslate)

The static catalog covers 38 European languages; for updates that scale beyond it, use neural offline translation:

pip install "img2nl[analyze,translate]"
img2nl translate-install en pl
img2nl analyze photo.png --locale de --translate-mode offline

Modes: auto (catalog for pl/en, otherwise argos), offline (requires argos), catalog (JSON catalog only).
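
The mode rules above can be read as a small decision function. The following is a sketch of that reading only: `pick_backend`, the backend labels, and the default `catalog_locales` set are assumptions drawn from the one-line description, not img2nl's actual code.

```python
def pick_backend(mode: str, locale: str, catalog_locales=frozenset({"pl", "en"})) -> str:
    """Return which translation backend a given mode would select.

    Hypothetical helper mirroring the documented modes:
      catalog -> always the static JSON catalog
      offline -> always argostranslate (it must be installed)
      auto    -> catalog when the locale is in `catalog_locales`, else argos
    """
    if mode == "catalog":
        return "catalog"
    if mode == "offline":
        return "argos"
    if mode == "auto":
        return "catalog" if locale in catalog_locales else "argos"
    raise ValueError(f"unknown translate mode: {mode!r}")


for mode, locale in [("auto", "pl"), ("auto", "de"), ("offline", "pl"), ("catalog", "de")]:
    print(mode, locale, "->", pick_backend(mode, locale))
```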

Python API

from img2nl import analyze_image

result = analyze_image("screen.png", skip_thumbnail=True)
print(result.text, result.features["scene"]["scene_class"], result.llm_hint)

# compare with previous capture
prev = analyze_image("screen_a.png", skip_thumbnail=True)
cur = analyze_image(
    "screen_b.png",
    skip_thumbnail=True,
    reference_fingerprint=prev.features["fingerprint"],
    enable_detect=False,  # True → YOLO (heavy)
)
print(cur.features.get("similarity", {}))

Packages

| Package | Role |
|---------|------|
| img2nl | Core heuristics + describe + thumbnail + layered detection |
| uri2img2nl | img2nl:// URI layer |
| dsl2img2nl | DSL bus (ANALYZE, QUERY, LLM_HINT) |
| cli2img2nl | Shell adapter |

See packages/README.md.

VQL integration

Layered pipeline with img2vql in oqlos/vql:

pip install -e ".[analyze,similarity,opencv,scan]"
pip install -e ~/github/oqlos/vql/packages/img2vql

# adopt → metadata (fingerprint, special_hits, scene_class)
uri2vql analyze-window --image capture.png --out app.vql.json

# run the same command again: smart skip when the screen is unchanged
uri2vql analyze-window --image capture.png --out app.vql.json

uri2vql refresh-window --vql-program app.vql.json --image capture.png
uri2vql compare-window --vql-program app.vql.json --image capture.png
img2vql diagnose capture.png --vql-program app.vql.json --save
uri2vql resolve "odśwież metadata vql" --file app.vql.json --image capture.png

# end-to-end demo
bash ~/github/oqlos/vql/examples/img2nl-vql-flow.sh capture.png app.vql.json
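
The "smart skip" step can be thought of as comparing the stored fingerprint with the fingerprint of the new capture and skipping re-analysis when they are close enough. The sketch below assumes hex-encoded 64-bit hashes and a hypothetical `max_distance` threshold; the actual metadata format and threshold used by uri2vql are not documented here.

```python
def should_skip(prev_hex: str, cur_hex: str, max_distance: int = 4) -> bool:
    """Return True when two hex fingerprints differ in at most `max_distance` bits."""
    if len(prev_hex) != len(cur_hex):
        # Fingerprints of different sizes are not comparable: do not skip.
        return False
    distance = bin(int(prev_hex, 16) ^ int(cur_hex, 16)).count("1")
    return distance <= max_distance


stored = "ff00ff00ff00ff00"        # hypothetical fingerprint from a previous run
nearly_same = "ff00ff00ff00ff01"   # one bit differs
different = "00ff00ff00ff00ff"     # every bit differs

print(should_skip(stored, nearly_same))
print(should_skip(stored, different))
```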

Docs

| Doc | Content |
|-----|---------|
| docs/detection-pipeline.md | Layers 0–4, JSON schema, VQL cache |
| CHANGELOG.md | Change history |
| TODO.md | Backlog |

License

Licensed under Apache-2.0.
