Image → natural language summary with heuristics, thumbnails, and LLM transport hints
img2nl
AI Cost Tracking
- 🤖 LLM usage: $1.7439 (6 commits)
- 👤 Human dev: ~$382 (3.8h @ $100/h, 30min dedup)
Generated on 2026-06-09 using openrouter/qwen/qwen3-coder-next
Heuristic image-to-natural-language summaries for transport to LLMs and other services.
The core path needs no vision LLM: it uses layered heuristics (Pillow, optional OpenCV, perceptual hash, conditional OCR/QR/YOLO).
Features
| Layer | Module | Extra | What it detects |
|---|---|---|---|
| 0 | colors, dynamics, noise, objects, patterns | [analyze] | palette, contrast, flat regions, UI blocks |
| 1 | edges | [opencv] | blur, edge density, text likelihood |
| 2 | fingerprint, similarity | [similarity] | pHash/dHash/wHash, screen match |
| 3 | special_hits | [scan] / [ocr] | QR/barcode, OCR (conditional) |
| 4 | semantic_hits | [detect] | YOLO labels (opt-in) |
Output includes features.scene.scene_class, llm_hint, and optional similarity when reference_fingerprint is passed.
Full architecture: docs/detection-pipeline.md
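To make the layer 2 row more concrete, here is a minimal sketch of a difference hash (dHash) and a Hamming-distance comparison. It is not img2nl's implementation: the function names `resize_nearest`, `dhash`, and `hamming` are hypothetical, and the image is assumed to be a plain 2D list of grayscale values so the example runs with the standard library only.

```python
def resize_nearest(gray, width, height):
    """Nearest-neighbour downscale of a 2D grid of grayscale values."""
    src_h, src_w = len(gray), len(gray[0])
    return [
        [gray[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def dhash(gray, hash_size=8):
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    # hash_size + 1 columns give hash_size comparisons per row.
    small = resize_nearest(gray, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


# Illustrative inputs: a flat grey frame and a left-to-right dark ramp.
flat = [[128] * 18 for _ in range(16)]
ramp = [[255 - x for x in range(18)] for _ in range(16)]
print(hamming(dhash(flat), dhash(ramp)))  # 64: every bit differs
```

A small distance suggests that two captures show the same screen; the cut-off is a tuning choice and is not specified here.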
Install
pip install -e ".[analyze]" # core (Pillow + NumPy)
pip install -e ".[full]" # analyze + opencv + similarity + scan
pip install -e ".[analyze,translate]" # + argostranslate offline
bash install-dev.sh # full *2img2nl stack
Optional extras: opencv, similarity, scan, ocr, detect — see pyproject.toml.
CLI
img2nl analyze photo.png --json
img2nl analyze photo.png --locale de --translate-mode offline
dsl2img2nl -c "ANALYZE photo.png" --json
uri2img2nl query "img2nl://analyze?path=photo.png&locale=pl"
python -c "from img2nl.i18n import supported_locales; print(supported_locales())"
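The img2nl:// form used by uri2img2nl above looks like an ordinary URI with the action in the host position and the options in the query string. As a rough sketch of how such a URI could be split with the standard library (the helper name `parse_img2nl_uri` is hypothetical and not part of the package):

```python
from urllib.parse import parse_qs, urlsplit


def parse_img2nl_uri(uri):
    """Split an img2nl:// URI into (action, params). Illustrative only."""
    parts = urlsplit(uri)
    if parts.scheme != "img2nl":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # parse_qs returns lists; keep the last value given for each key.
    params = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return parts.netloc, params


action, params = parse_img2nl_uri("img2nl://analyze?path=photo.png&locale=pl")
print(action, params)
```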
Offline translation (argostranslate)
The static catalog covers 38 European languages; for scalable updates, use neural offline translation:
pip install "img2nl[analyze,translate]"
img2nl translate-install en pl
img2nl analyze photo.png --locale de --translate-mode offline
Modes: auto (catalog pl/en, else argos), offline (require argos), catalog (JSON only).
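The three modes can be read as a small decision table. The sketch below restates that reading in code; `pick_backend`, `CATALOG_LOCALES`, and the `argos_available` flag are hypothetical names, and the fallback to the catalog when argostranslate is missing in auto mode is an assumption, not documented behaviour.

```python
# Assumption: taken from the "catalog pl/en" note in the mode list.
CATALOG_LOCALES = {"pl", "en"}


def pick_backend(locale, mode="auto", argos_available=False):
    """Return "catalog" or "argos" for a locale and translate mode."""
    if mode == "catalog":
        return "catalog"  # JSON catalog only
    if mode == "offline":
        if not argos_available:
            raise RuntimeError("offline mode requires argostranslate")
        return "argos"
    if mode == "auto":
        if locale in CATALOG_LOCALES:
            return "catalog"
        # Assumed fallback when argostranslate is not installed.
        return "argos" if argos_available else "catalog"
    raise ValueError(f"unknown translate mode: {mode!r}")


print(pick_backend("pl"), pick_backend("de", argos_available=True))
```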
Python API
from img2nl import analyze_image
result = analyze_image("screen.png", skip_thumbnail=True)
print(result.text, result.features["scene"]["scene_class"], result.llm_hint)
# compare with previous capture
prev = analyze_image("screen_a.png", skip_thumbnail=True)
cur = analyze_image(
    "screen_b.png",
    skip_thumbnail=True,
    reference_fingerprint=prev.features["fingerprint"],
    enable_detect=False,  # True → YOLO (heavy)
)
print(cur.features.get("similarity", {}))
Packages
| Package | Role |
|---|---|
| img2nl | Core heuristics + describe + thumbnail + layered detection |
| uri2img2nl | img2nl:// URI layer |
| dsl2img2nl | DSL bus (ANALYZE, QUERY, LLM_HINT) |
| cli2img2nl | Shell adapter |
See packages/README.md.
VQL integration
Layered pipeline with img2vql in oqlos/vql:
pip install -e ".[analyze,similarity,opencv,scan]"
pip install -e ~/github/oqlos/vql/packages/img2vql
# adopt → metadata (fingerprint, special_hits, scene_class)
uri2vql analyze-window --image capture.png --out app.vql.json
# smart skip when screen unchanged
uri2vql analyze-window --image capture.png --out app.vql.json
uri2vql refresh-window --vql-program app.vql.json --image capture.png
uri2vql compare-window --vql-program app.vql.json --image capture.png
img2vql diagnose capture.png --vql-program app.vql.json --save
uri2vql resolve "odśwież metadata vql" --file app.vql.json --image capture.png
# end-to-end demo
bash ~/github/oqlos/vql/examples/img2nl-vql-flow.sh capture.png app.vql.json
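The "smart skip" step above presumably compares the stored fingerprint with the one from the new capture. A minimal sketch of such a check follows, assuming fingerprints are hex strings of a perceptual hash; `hex_hamming`, `should_skip`, and the default threshold of 4 bits are hypothetical and not taken from img2nl or img2vql.

```python
def hex_hamming(a, b):
    """Hamming distance between two hashes given as hex strings."""
    return bin(int(a, 16) ^ int(b, 16)).count("1")


def should_skip(cached_fp, current_fp, max_distance=4):
    """True when the new capture is close enough to the cached one."""
    if cached_fp is None:
        return False  # nothing cached yet, so a full analysis is needed
    return hex_hamming(cached_fp, current_fp) <= max_distance


print(should_skip("ff00ff00", "ff00ff01"))  # True: only one bit differs
```

A stricter threshold re-analyzes more often; a looser one risks missing small UI changes.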
Docs
| Doc | Content |
|---|---|
| docs/detection-pipeline.md | Layers 0–4, JSON schema, VQL cache |
| CHANGELOG.md | Change history |
| TODO.md | Backlog |
License
Licensed under Apache-2.0.
File details
Details for the file img2nl-0.1.7.tar.gz.
File metadata
- Download URL: img2nl-0.1.7.tar.gz
- Upload date:
- Size: 721.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 15e30c3af899399f11f58eb3110624197fea367e36ba9970a6e2806319220ccc |
| MD5 | 8b246d5f668b1b49210476b5e6d95b26 |
| BLAKE2b-256 | 213e42095dce24b9e4fe0b80fdaa770ffdc433e5df2da1d306eee9cf30feed66 |
File details
Details for the file img2nl-0.1.7-py3-none-any.whl.
File metadata
- Download URL: img2nl-0.1.7-py3-none-any.whl
- Upload date:
- Size: 56.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e3917ef2c183346bacc0451c0c7d8344ce5968e256ab8174ffffdde8a17c2ad9 |
| MD5 | 4e7d89a3930c87b00ae672cd68169e13 |
| BLAKE2b-256 | ba3874f1f2ba458f376c53fcf7b1cd458b91732ff2e6af0edcca30aa92c94177 |