
Image → natural language summary with heuristics, thumbnails, and LLM transport hints


img2nl

AI Cost Tracking


  • 🤖 LLM usage: $1.1910 (4 commits)
  • 👤 Human dev: ~$350 (3.5h @ $100/h, 30min dedup)

Generated on 2026-06-09 using openrouter/qwen/qwen3-coder-next


Heuristic image → natural language summary for transport to LLM and other services.

No vision LLM required for the core path — uses layered heuristics (Pillow, optional OpenCV, perceptual hash, conditional OCR/QR/YOLO).
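The layering idea can be sketched roughly as follows: cheap measurements run first, and costlier stages run only when an earlier signal suggests they are worthwhile. This is an illustration only, not img2nl's implementation; `run_layers`, the stand-in layers, the `text_likelihood` key, and the 0.5 threshold are all hypothetical.

```python
from typing import Callable

# Hypothetical layer type: takes the features gathered so far and returns
# new features to merge in. None of these names come from img2nl itself.
Layer = Callable[[dict], dict]
Gate = Callable[[dict], bool]


def run_layers(cheap: list[Layer], gated: list[tuple[Gate, Layer]]) -> dict:
    """Run cheap layers unconditionally, then gated layers whose predicate passes."""
    features: dict = {}
    for layer in cheap:
        features.update(layer(features))
    for should_run, layer in gated:
        if should_run(features):
            features.update(layer(features))
    return features


# Stand-in layers with fixed outputs, purely for demonstration.
def edges(_: dict) -> dict:
    return {"text_likelihood": 0.8}


def ocr(_: dict) -> dict:
    return {"ocr_text": "(OCR would run here)"}


features = run_layers(
    cheap=[edges],
    # Assumed gate: attempt OCR only when the edge layer suggests text is present.
    gated=[(lambda f: f.get("text_likelihood", 0.0) > 0.5, ocr)],
)
print(sorted(features))
```

The point of the gate is that an image with little text-like structure never pays the OCR cost.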

Features

Layer | Module | Extra | What it detects
0 | colors, dynamics, noise, objects, patterns | [analyze] | palette, contrast, flat regions, UI blocks
1 | edges | [opencv] | blur, edge density, text likelihood
2 | fingerprint, similarity | [similarity] | pHash/dHash/wHash, screen match
3 | special_hits | [scan] / [ocr] | QR/barcode, OCR (conditional)
4 | semantic_hits | [detect] | YOLO labels (opt-in)
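To give a feel for what a layer-2 fingerprint is, here is a minimal pure-Python sketch of a difference hash (dHash). It assumes the image has already been converted to greyscale and downscaled (conventionally to 9×8 pixels, which yields 64 bits); img2nl presumably delegates this to a library, so the function below is a hypothetical illustration rather than its code.

```python
def dhash(gray: list[list[int]]) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    A bit is 1 when the left pixel is brighter than its right neighbour.
    `gray` is a grid of greyscale values, assumed already downscaled.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


# Tiny 3x2 grid: comparisons give 0, 1 (first row) and 1, 0 (second row).
print(bin(dhash([[10, 20, 15], [30, 5, 5]])))  # 0b110
```

Because only the relative brightness of neighbours matters, uniform brightness shifts leave the hash unchanged, which is what makes it useful for matching screens.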

The output includes features.scene.scene_class and llm_hint; features.similarity is added when reference_fingerprint is passed.

Full architecture: docs/detection-pipeline.md

Install

pip install -e ".[analyze]"              # core (Pillow + NumPy)
pip install -e ".[full]"                 # analyze + opencv + similarity + scan
pip install -e ".[analyze,translate]"    # + argostranslate offline
bash install-dev.sh                        # full *2img2nl stack

Optional extras: opencv, similarity, scan, ocr, detect — see pyproject.toml.

CLI

img2nl analyze photo.png --json
img2nl analyze photo.png --locale de --translate-mode offline
dsl2img2nl -c "ANALYZE photo.png" --json
uri2img2nl query "img2nl://analyze?path=photo.png&locale=pl"
python -c "from img2nl.i18n import supported_locales; print(supported_locales())"
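The `img2nl://` URI in the example above follows the usual scheme://action?query shape, so it can be taken apart with the standard library. The helper below is a hypothetical sketch of that decomposition; it is not how uri2img2nl necessarily parses its input.

```python
from urllib.parse import parse_qs, urlsplit


def parse_img2nl_uri(uri: str) -> tuple[str, dict[str, str]]:
    """Split an img2nl:// URI into (action, parameters).

    Hypothetical helper for illustration; repeated query keys keep the last value.
    """
    parts = urlsplit(uri)
    if parts.scheme != "img2nl":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    params = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return parts.netloc, params


action, params = parse_img2nl_uri("img2nl://analyze?path=photo.png&locale=pl")
print(action, params)  # analyze {'path': 'photo.png', 'locale': 'pl'}
```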

Offline translation (argostranslate)

The static catalog covers 38 European languages; for coverage that scales beyond the catalog, use neural offline translation via argostranslate:

pip install "img2nl[analyze,translate]"   # quoted so shells such as zsh do not expand the brackets
img2nl translate-install en pl
img2nl analyze photo.png --locale de --translate-mode offline

Modes: auto (use the catalog for pl/en, otherwise argos), offline (require argos), catalog (JSON catalog only).
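One way to read the three modes is as a small decision function. The sketch below is an interpretation of the description above, not img2nl's code: `pick_backend` and its parameters are hypothetical, and falling back to the catalog in auto mode when argos is missing is an assumption.

```python
def pick_backend(
    mode: str,
    locale: str,
    argos_available: bool,
    catalog_locales: frozenset[str] = frozenset({"pl", "en"}),
) -> str:
    """Return "catalog" or "argos" for a given --translate-mode (hypothetical helper)."""
    if mode == "catalog":
        return "catalog"  # JSON catalog only, never argos
    if mode == "offline":
        if not argos_available:
            raise RuntimeError("offline mode requires argostranslate")
        return "argos"
    if mode == "auto":
        if locale in catalog_locales:
            return "catalog"
        # Assumption: without argos installed, auto degrades to the catalog.
        return "argos" if argos_available else "catalog"
    raise ValueError(f"unknown translate mode: {mode!r}")


print(pick_backend("auto", "pl", argos_available=True))  # catalog
print(pick_backend("auto", "de", argos_available=True))  # argos
```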

Python API

from img2nl import analyze_image

result = analyze_image("screen.png", skip_thumbnail=True)
print(result.text, result.features["scene"]["scene_class"], result.llm_hint)

# compare with previous capture
prev = analyze_image("screen_a.png", skip_thumbnail=True)
cur = analyze_image(
    "screen_b.png",
    skip_thumbnail=True,
    reference_fingerprint=prev.features["fingerprint"],
    enable_detect=False,  # True → YOLO (heavy)
)
print(cur.features.get("similarity", {}))
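Comparing two perceptual hashes usually comes down to a Hamming distance. The sketch below shows that comparison on hex-encoded hashes; the hex encoding, the `hamming_similarity` helper, and the 0.95 threshold are assumptions for illustration, since the actual format of `features["fingerprint"]` and `features["similarity"]` is not described here.

```python
def hamming_similarity(hash_a: str, hash_b: str) -> float:
    """Return 1.0 for identical hex hashes and 0.0 when every bit differs."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    total_bits = len(hash_a) * 4  # each hex digit encodes 4 bits
    differing = bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")
    return 1.0 - differing / total_bits


score = hamming_similarity("ff00", "ff01")  # one differing bit out of 16
print(score)  # 0.9375

# Hypothetical threshold for treating two captures as the same screen.
unchanged = score >= 0.95
print(unchanged)  # False
```

A caller could use such a score to skip re-analysis when consecutive captures are effectively identical.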

Packages

Package | Role
img2nl | Core heuristics + describe + thumbnail + layered detection
uri2img2nl | img2nl:// URI layer
dsl2img2nl | DSL bus (ANALYZE, QUERY, LLM_HINT)
cli2img2nl | Shell adapter

See packages/README.md.

VQL integration

Layered pipeline with img2vql in oqlos/vql:

pip install -e ".[analyze,similarity,opencv,scan]"
pip install -e ~/github/oqlos/vql/packages/img2vql

# adopt → metadata (fingerprint, special_hits, scene_class)
uri2vql analyze-window --image capture.png --out app.vql.json

# run again with the same capture: smart skip when the screen is unchanged
uri2vql analyze-window --image capture.png --out app.vql.json

uri2vql refresh-window --vql-program app.vql.json --image capture.png
uri2vql compare-window --vql-program app.vql.json --image capture.png
img2vql diagnose capture.png --vql-program app.vql.json --save
# natural-language request; the Polish phrase means "refresh VQL metadata"
uri2vql resolve "odśwież metadata vql" --file app.vql.json --image capture.png

# end-to-end demo
bash ~/github/oqlos/vql/examples/img2nl-vql-flow.sh capture.png app.vql.json

Docs

Doc | Content
docs/detection-pipeline.md | Layers 0–4, JSON schema, VQL cache
CHANGELOG.md | Change history
TODO.md | Backlog

License

Licensed under Apache-2.0.
