Image → natural language summary with heuristics, thumbnails, and LLM transport hints

img2nl

AI Cost Tracking

  • 🤖 LLM usage: $1.9425 (7 commits)
  • 👤 Human dev: ~$387 (3.9h @ $100/h, 30min dedup)

Generated on 2026-06-09 using openrouter/qwen/qwen3-coder-next


Heuristic image → natural-language summary for transport to LLMs and other services.

No vision LLM required for the core path — uses layered heuristics (Pillow, optional OpenCV, perceptual hash, conditional OCR/QR/YOLO).
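The layering idea can be sketched as a list of gated steps, where cheap layers always run and expensive ones run only when earlier features justify it. This is a minimal illustration, not img2nl's implementation: `run_layers`, the `LAYERS` list, and the feature names `edge_density`, `text_likelihood`, `ocr_ran`, and `yolo_ran` are hypothetical, and the "analyzers" return made-up values instead of inspecting pixels.

```python
from typing import Any, Callable, Dict, List, Tuple

Features = Dict[str, Any]
# (layer name, gate deciding whether to run, analyzer returning new features)
Layer = Tuple[str, Callable[[Features], bool], Callable[[Features], Features]]


def run_layers(layers: List[Layer], features: Features) -> Features:
    """Run each layer whose gate accepts the features collected so far."""
    out = dict(features)
    out["layers_run"] = []
    for name, gate, analyze in layers:
        if gate(out):
            out.update(analyze(out))
            out["layers_run"].append(name)
    return out


# Stand-in layers (assumptions for illustration only).
LAYERS: List[Layer] = [
    # Always runs; the "x2" text-likelihood formula is invented for the demo.
    ("edges", lambda f: True,
     lambda f: {"text_likelihood": f.get("edge_density", 0.0) * 2}),
    # Conditional: only worth running when text seems likely.
    ("ocr", lambda f: f.get("text_likelihood", 0.0) >= 0.5,
     lambda f: {"ocr_ran": True}),
    # Opt-in: runs only when the caller asks for it.
    ("detect", lambda f: f.get("enable_detect", False),
     lambda f: {"yolo_ran": True}),
]

result = run_layers(LAYERS, {"edge_density": 0.4, "enable_detect": False})
print(result["layers_run"])
```

Keeping the gate separate from the analyzer means the cost decision is visible in one place, which is the property the "conditional" and "opt-in" labels describe.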

Features

| Layer | Module | Extra | What it detects |
|---|---|---|---|
| 0 | colors, dynamics, noise, objects, patterns | [analyze] | palette, contrast, flat regions, UI blocks |
| 1 | edges | [opencv] | blur, edge density, text likelihood |
| 2 | fingerprint, similarity | [similarity] | pHash/dHash/wHash, screen match |
| 3 | special_hits | [scan] / [ocr] | QR/barcode, OCR (conditional) |
| 4 | semantic_hits | [detect] | YOLO labels (opt-in) |

Output includes features.scene.scene_class, llm_hint, and optional similarity when reference_fingerprint is passed.
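To make the fingerprint and similarity fields more concrete, here is a generic difference hash (dHash) in plain Python. It is a sketch of the general technique, not img2nl's code: `dhash`, `hamming`, and `similarity` are illustrative names, and the input is assumed to be an 8-row by 9-column grayscale grid (with Pillow, such a grid could come from an image converted with `convert("L")` and shrunk with `resize((9, 8))`).

```python
from typing import List


def dhash(gray: List[List[int]]) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    An 8x9 grid (8 rows, 9 columns) yields 8 * 8 = 64 bits.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def similarity(a: int, b: int, n_bits: int = 64) -> float:
    """1.0 for identical hashes, 0.0 when every bit differs."""
    return 1.0 - hamming(a, b) / n_bits


# Synthetic grids: each row gets darker from left to right.
base = [[90 - 10 * c for c in range(9)] for _ in range(8)]
# Uniformly brighter copy: neighbor comparisons are unchanged.
brighter = [[v + 20 for v in row] for row in base]
# Copy with the first row mirrored: 8 of 64 comparisons flip.
changed = [list(reversed(base[0]))] + [list(row) for row in base[1:]]

print(similarity(dhash(base), dhash(brighter)))
print(similarity(dhash(base), dhash(changed)))
```

Because the hash only records whether each pixel is brighter than its right neighbor, a uniform brightness shift leaves it unchanged, which is why this family of hashes suits "is this the same screen?" checks.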

Full architecture: docs/detection-pipeline.md

Install

pip install -e ".[analyze]"              # core (Pillow + NumPy)
pip install -e ".[full]"                 # analyze + opencv + similarity + scan
pip install -e ".[analyze,translate]"    # + argostranslate offline
bash install-dev.sh                        # full *2img2nl stack

Optional extras: opencv, similarity, scan, ocr, detect — see pyproject.toml.

CLI

img2nl analyze photo.png --json
img2nl analyze photo.png --locale de --translate-mode offline
dsl2img2nl -c "ANALYZE photo.png" --json
uri2img2nl query "img2nl://analyze?path=photo.png&locale=pl"
python -c "from img2nl.i18n import supported_locales; print(supported_locales())"
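The `img2nl://` query above has the shape of an ordinary URI, so its parts can be inspected with the standard library. The helper below is a hypothetical sketch of how such a string might be split into an action and parameters; `parse_img2nl_uri` is not part of the package, and the real uri2img2nl layer may parse things differently.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qs, urlsplit


def parse_img2nl_uri(uri: str) -> Tuple[str, Dict[str, str]]:
    """Split an img2nl:// URI into (action, params).

    Assumption: the action sits in the host position and options are
    passed as query parameters, as in the CLI example.
    """
    parts = urlsplit(uri)
    if parts.scheme != "img2nl":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # parse_qs returns a list per key; keep the last value of each.
    params = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return parts.netloc, params


action, params = parse_img2nl_uri("img2nl://analyze?path=photo.png&locale=pl")
print(action, params)
```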

Offline translation (argostranslate)

The static catalog covers 38 European languages; for coverage that is easier to extend and update, use neural offline translation (argostranslate):

pip install "img2nl[analyze,translate]"
img2nl translate-install en pl
img2nl analyze photo.png --locale de --translate-mode offline

Modes: auto (static catalog for pl/en, otherwise argostranslate), offline (requires argostranslate), catalog (static JSON catalog only).
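The three modes amount to a small decision rule, sketched below. `pick_backend` and `CATALOG_LOCALES` are hypothetical names used only for illustration; the fallback for `auto` when argostranslate is missing is an assumption, since the text above does not specify it.

```python
CATALOG_LOCALES = {"pl", "en"}  # locales served by the catalog in auto mode


def pick_backend(mode: str, locale: str, argos_available: bool) -> str:
    """Return "catalog" or "argos" for the given translate mode."""
    if mode == "catalog":
        return "catalog"
    if mode == "offline":
        if not argos_available:
            raise RuntimeError("offline mode requires argostranslate")
        return "argos"
    if mode == "auto":
        if locale in CATALOG_LOCALES:
            return "catalog"
        # Assumption: fall back to the catalog when argos is not installed.
        return "argos" if argos_available else "catalog"
    raise ValueError(f"unknown translate mode: {mode!r}")


print(pick_backend("auto", "pl", argos_available=False))
print(pick_backend("auto", "de", argos_available=True))
```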

Python API

from img2nl import analyze_image

result = analyze_image("screen.png", skip_thumbnail=True)
print(result.text, result.features["scene"]["scene_class"], result.llm_hint)

# compare with previous capture
prev = analyze_image("screen_a.png", skip_thumbnail=True)
cur = analyze_image(
    "screen_b.png",
    skip_thumbnail=True,
    reference_fingerprint=prev.features["fingerprint"],
    enable_detect=False,  # True → YOLO (heavy)
)
print(cur.features.get("similarity", {}))

Packages

| Package | Role |
|---|---|
| img2nl | Core heuristics + describe + thumbnail + layered detection |
| uri2img2nl | img2nl:// URI layer |
| dsl2img2nl | DSL bus (ANALYZE, QUERY, LLM_HINT) |
| cli2img2nl | Shell adapter |

See packages/README.md.

VQL integration

Layered pipeline with img2vql in oqlos/vql:

pip install -e ".[analyze,similarity,opencv,scan]"
pip install -e ~/github/oqlos/vql/packages/img2vql

# adopt → metadata (fingerprint, special_hits, scene_class)
uri2vql analyze-window --image capture.png --out app.vql.json

# run again: the analysis is skipped when the screen is unchanged
uri2vql analyze-window --image capture.png --out app.vql.json

uri2vql refresh-window --vql-program app.vql.json --image capture.png
uri2vql compare-window --vql-program app.vql.json --image capture.png
img2vql diagnose capture.png --vql-program app.vql.json --save
# natural-language command (Polish for "refresh VQL metadata")
uri2vql resolve "odśwież metadata vql" --file app.vql.json --image capture.png

# end-to-end demo
bash ~/github/oqlos/vql/examples/img2nl-vql-flow.sh capture.png app.vql.json
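The "smart skip" step in the commands above can be pictured as a fingerprint-keyed cache: re-run the analysis only when the new capture's fingerprint differs enough from the stored one. The sketch below is an assumption-laden illustration, not the uri2vql logic: it assumes fingerprints are equal-length hex strings, and `bit_distance`, `analyze_if_changed`, the cache keys, and the `max_distance` threshold are all hypothetical.

```python
from typing import Callable, Dict, List


def bit_distance(hex_a: str, hex_b: str) -> int:
    """Count differing bits between two hex-encoded hashes."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def analyze_if_changed(cache: Dict[str, str], fingerprint: str,
                       analyze: Callable[[], str],
                       max_distance: int = 4) -> str:
    """Reuse the cached summary unless the screen changed noticeably."""
    old = cache.get("fingerprint")
    if old is not None and bit_distance(old, fingerprint) <= max_distance:
        return cache["summary"]  # smart skip: screen is (nearly) unchanged
    cache["fingerprint"] = fingerprint
    cache["summary"] = analyze()
    return cache["summary"]


calls: List[int] = []


def fake_analyze() -> str:
    """Stand-in for an expensive analysis; counts how often it runs."""
    calls.append(1)
    return f"summary {len(calls)}"


cache: Dict[str, str] = {}
first = analyze_if_changed(cache, "ffff0000ffff0000", fake_analyze)
near = analyze_if_changed(cache, "ffff0000ffff0001", fake_analyze)   # 1 bit off
far = analyze_if_changed(cache, "0000ffff0000ffff", fake_analyze)    # all bits off
print(first, near, far, len(calls))
```

A small non-zero threshold tolerates minor rendering noise such as a blinking cursor, at the cost of occasionally missing a very small change.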

Docs

| Doc | Content |
|---|---|
| docs/detection-pipeline.md | Layers 0–4, JSON schema, VQL cache |
| CHANGELOG.md | Change history |
| TODO.md | Backlog |

License

Licensed under Apache-2.0.
