Image → natural language summary with heuristics, thumbnails, and LLM transport hints

img2nl

AI Cost Tracking

  • 🤖 LLM usage: $0.6951 (1 commit)
  • 👤 Human dev: ~$200 (2.0h @ $100/h, 30min dedup)

Generated on 2026-06-09 using openrouter/qwen/qwen3-coder-next


Heuristic image → natural language summary for transport to LLM and other services.

The core path requires no vision LLM: it uses layered heuristics (Pillow, optional OpenCV, perceptual hashing, and conditional OCR/QR/YOLO).

Features

| Layer | Module | Extra | What it detects |
|-------|--------|-------|-----------------|
| 0 | colors, dynamics, noise, objects, patterns | [analyze] | palette, contrast, flat regions, UI blocks |
| 1 | edges | [opencv] | blur, edge density, text likelihood |
| 2 | fingerprint, similarity | [similarity] | pHash/dHash/wHash, screen match |
| 3 | special_hits | [scan] / [ocr] | QR/barcode, OCR (conditional) |
| 4 | semantic_hits | [detect] | YOLO labels (opt-in) |

Output includes features.scene.scene_class, llm_hint, and optional similarity when reference_fingerprint is passed.
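The fingerprint layer lists dHash among its perceptual hashes. As a rough illustration of the idea only (this is not img2nl's implementation, and the names `dhash` and `hamming_similarity` are hypothetical), the sketch below computes a difference hash from a grayscale grid that is assumed to be already downscaled to 9×8, then scores two hashes by the fraction of matching bits.

```python
from typing import Sequence


def dhash(gray: Sequence[Sequence[int]]) -> int:
    """Difference hash of a grayscale grid (rows of equal length, width >= 2).

    Each bit records whether a pixel is brighter than its right neighbour.
    A grid with 9 columns and 8 rows yields the usual 64-bit hash.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_similarity(a: int, b: int, n_bits: int = 64) -> float:
    """Fraction of matching bits between two hashes (1.0 means identical)."""
    return 1.0 - bin(a ^ b).count("1") / n_bits


# Hypothetical 9x8 grids: a descending gradient, and a copy with one pixel darkened.
base = [[(8 - x) * 30 for x in range(9)] for _ in range(8)]
edited = [row[:] for row in base]
edited[0][0] = 0

h1, h2 = dhash(base), dhash(edited)
print(f"{h1:016x} {h2:016x} similarity={hamming_similarity(h1, h2):.3f}")
```

In practice the 9×8 grid would come from a grayscale, downscaled copy of the image (for example via Pillow); that step is left out here to keep the sketch dependency-free.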

Full architecture: docs/detection-pipeline.md

Install

pip install -e ".[analyze]"              # core (Pillow + NumPy)
pip install -e ".[full]"                 # analyze + opencv + similarity + scan
pip install -e ".[analyze,translate]"    # + argostranslate offline
bash install-dev.sh                        # full *2img2nl stack

Optional extras: opencv, similarity, scan, ocr, detect — see pyproject.toml.

CLI

img2nl analyze photo.png --json
img2nl analyze photo.png --locale de --translate-mode offline
dsl2img2nl -c "ANALYZE photo.png" --json
uri2img2nl query "img2nl://analyze?path=photo.png&locale=pl"
python -c "from img2nl.i18n import supported_locales; print(supported_locales())"
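To make the shape of the `img2nl://` URI above concrete, here is a minimal sketch of how such a URI decomposes with Python's standard library. It is not the uri2img2nl implementation; the helper name `parse_img2nl_uri` and the choice to treat the host part as the action are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def parse_img2nl_uri(uri: str) -> tuple[str, dict[str, str]]:
    """Split an img2nl:// URI into (action, params) using only the stdlib."""
    parts = urlsplit(uri)
    if parts.scheme != "img2nl":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    # For "img2nl://analyze?...", the action lands in the netloc component.
    action = parts.netloc
    # parse_qs returns a list per key; keep the last value given for each key.
    params = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return action, params


action, params = parse_img2nl_uri("img2nl://analyze?path=photo.png&locale=pl")
print(action, params)
```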

Offline translation (argostranslate)

The static catalog covers 38 European languages; for scalable updates, use neural offline translation:

pip install "img2nl[analyze,translate]"
img2nl translate-install en pl
img2nl analyze photo.png --locale de --translate-mode offline

Modes: auto (use the catalog for pl/en, otherwise argostranslate), offline (require argostranslate), catalog (JSON catalog only).
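One way to read the three modes is as a small decision function. The sketch below is an interpretation of the description above, not the library's code: `pick_backend`, `CATALOG_LOCALES`, and the returned labels are hypothetical names.

```python
# Assumption: the locales that "auto" serves straight from the static catalog.
CATALOG_LOCALES = {"pl", "en"}


def pick_backend(mode: str, locale: str, argos_available: bool = True) -> str:
    """Return "catalog" or "argos" for a translate mode, per the description above."""
    if mode == "catalog":
        return "catalog"  # JSON catalog only
    if mode == "offline":
        if not argos_available:
            raise RuntimeError("offline mode requires argostranslate")
        return "argos"
    if mode == "auto":
        return "catalog" if locale in CATALOG_LOCALES else "argos"
    raise ValueError(f"unknown translate mode: {mode!r}")


for mode, locale in [("auto", "pl"), ("auto", "de"), ("offline", "de"), ("catalog", "de")]:
    print(mode, locale, "->", pick_backend(mode, locale))
```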

Python API

from img2nl import analyze_image

result = analyze_image("screen.png", skip_thumbnail=True)
print(result.text, result.features["scene"]["scene_class"], result.llm_hint)

# compare with previous capture
prev = analyze_image("screen_a.png", skip_thumbnail=True)
cur = analyze_image(
    "screen_b.png",
    skip_thumbnail=True,
    reference_fingerprint=prev.features["fingerprint"],
    enable_detect=False,  # True → YOLO (heavy)
)
print(cur.features.get("similarity", {}))
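Since the summary is meant for transport to an LLM, the fields used above (`text`, `features["scene"]["scene_class"]`, `llm_hint`) can be folded into a text-only prompt. The sketch below is one possible way to do that; `build_prompt` is a hypothetical helper, `llm_hint` is assumed to be string-like, and a `SimpleNamespace` with made-up values stands in for a real `analyze_image()` result so the example runs without img2nl installed.

```python
from types import SimpleNamespace


def build_prompt(result, question: str) -> str:
    """Combine the summary fields into a text-only prompt for a non-vision LLM."""
    scene = result.features.get("scene", {}).get("scene_class", "unknown")
    lines = [
        f"Image summary: {result.text}",
        f"Scene class: {scene}",
    ]
    if result.llm_hint:
        lines.append(f"Hint: {result.llm_hint}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Stand-in for an analyze_image() result; every value here is made up.
fake = SimpleNamespace(
    text="Dark UI with a sidebar.",
    features={"scene": {"scene_class": "ui"}},
    llm_hint="Likely an application window.",
)
prompt = build_prompt(fake, "Which panel is focused?")
print(prompt)
```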

Packages

| Package | Role |
|---------|------|
| img2nl | Core heuristics + describe + thumbnail + layered detection |
| uri2img2nl | img2nl:// URI layer |
| dsl2img2nl | DSL bus (ANALYZE, QUERY, LLM_HINT) |
| cli2img2nl | Shell adapter |

See packages/README.md.

VQL integration

Layered pipeline with img2vql in oqlos/vql:

pip install -e ".[analyze,similarity,opencv,scan]"
pip install -e ~/github/oqlos/vql/packages/img2vql

# adopt → metadata (fingerprint, special_hits, scene_class)
uri2vql analyze-window --image capture.png --out app.vql.json

# run again: smart skip when the screen is unchanged
uri2vql analyze-window --image capture.png --out app.vql.json

uri2vql refresh-window --vql-program app.vql.json --image capture.png
uri2vql compare-window --vql-program app.vql.json --image capture.png
img2vql diagnose capture.png --vql-program app.vql.json --save
uri2vql resolve "odśwież metadata vql" --file app.vql.json --image capture.png

# end-to-end demo
bash ~/github/oqlos/vql/examples/img2nl-vql-flow.sh capture.png app.vql.json
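The "smart skip" step above can be pictured as a fingerprint comparison against cached metadata. The following sketch illustrates that idea under several assumptions: the fingerprint is taken to be a hex string, the cache is a standalone JSON file with a `fingerprint` key (not the real VQL program format), and `needs_refresh`, `remember`, and the `max_distance` threshold are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def hamming(hex_a: str, hex_b: str) -> int:
    """Number of differing bits between two hex-encoded hashes."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def needs_refresh(cache_path: Path, current_fp: str, max_distance: int = 4) -> bool:
    """True when nothing is cached or the screen changed by more than max_distance bits."""
    if not cache_path.exists():
        return True
    cached = json.loads(cache_path.read_text(encoding="utf-8")).get("fingerprint")
    return cached is None or hamming(cached, current_fp) > max_distance


def remember(cache_path: Path, fingerprint: str) -> None:
    """Store the fingerprint of the capture that was just analyzed."""
    cache_path.write_text(json.dumps({"fingerprint": fingerprint}), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "capture.cache.json"
    first = needs_refresh(cache, "ffff0000ffff0000")    # nothing cached yet
    remember(cache, "ffff0000ffff0000")
    same = needs_refresh(cache, "ffff0000ffff0001")     # one bit differs
    changed = needs_refresh(cache, "0000ffff0000ffff")  # every bit differs

print(first, same, changed)
```

A small tolerance (`max_distance`) keeps minor rendering noise from forcing a re-analysis; the right value depends on the hash and the kind of captures being compared.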

Docs

| Doc | Content |
|-----|---------|
| docs/detection-pipeline.md | Layers 0–4, JSON schema, VQL cache |
| CHANGELOG.md | Change history |
| TODO.md | Backlog |

License

Licensed under Apache-2.0.
