
Batch catalogue physical collections using visual dividers (QR codes) and automated image processing

Project description

visual-cataloguer

ai cataloging cli-tool computer-vision image-processing inventory-management machine-learning python qr-code web-app

The Problem

You have thousands of items (retro games, books, vinyl, trading cards) stored in boxes. You need them in a searchable database so you can find things and list them on eBay. Manual entry would take weeks.

The Solution

  1. Print QR code dividers (one per storage location)
  2. Photograph items one at a time, using dividers to mark location changes
  3. Run viscatalog process ./photos (AI identification runs by default)
  4. Search and browse your collection via CLI or web UI

Scope & Philosophy

This tool does one thing well: Process a folder of photographs into a searchable inventory database.

What it handles:

  • One item per photograph (a "complete in box" game with box + manual + cartridge counts as one item)
  • QR code dividers to track storage locations (BOX-1, SHELF-A3, etc.)
  • Black frame images to end location sequences
  • AI identification of items (title, platform, condition, completeness)
  • Reference-quality images for verification (not eBay listing photos)

What it doesn't do:

  • Multi-item segmentation (photographing 5 cartridges and splitting them)
  • Image preprocessing (rotation, cropping, alignment)
  • eBay-ready photo processing

The images stored are for verifying the database is correct, not for direct use in listings. When you're ready to list an item, you'll retrieve it from storage and take proper detailed photos.

Workflow

1. Prepare Dividers

Option A: QR codes (recommended)
Print QR codes containing location IDs:

BOX-1    BOX-2    SHELF-A1    GARAGE-BIN-3

Option B: Hand-written/printed text
Write the location ID on white paper. When AI mode is enabled, the tool reads text dividers automatically. Keep the paper clean: just the location text on a white background.

Black frames: Put your hand over the lens to create a black image. This signals the end of a location sequence.
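As a sketch of how a black-frame divider might be detected (the exact method and threshold the tool uses are not documented here; both are assumptions for illustration), a hand-over-the-lens shot is almost uniformly dark, so a mean-brightness check is enough:

```python
def is_black_frame(gray_pixels, threshold=16):
    """Heuristic: treat an image as a black-frame divider when its mean
    brightness (0-255 grayscale) falls below `threshold`.
    The threshold value is illustrative, not the tool's actual setting."""
    return sum(gray_pixels) / len(gray_pixels) < threshold

hand_over_lens = [3, 5, 2, 4, 6] * 20      # almost uniformly dark pixels
normal_photo = [120, 180, 90, 200] * 25    # a typical well-lit item photo
```

A normal photo of an item on a table will have a far higher mean, so false positives are unlikely unless you shoot in near-darkness.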

2. Photograph Your Collection

[QR: BOX-1] → [Item] → [Item] → [Item] → [BLACK] → [QR: BOX-2] → [Item] → ...

Rules:

  • One item per photo
  • Complete sets (box + game + manual) in one photo = one item
  • QR code starts a new location
  • Black frame ends the current location (optional, but helps with organization)
  • Multiple cameras OK - images merge by EXIF timestamp
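The multi-camera merge described above can be sketched as a simple sort on capture time (in the real tool this comes from EXIF DateTimeOriginal; the filenames and timestamps here are made up):

```python
from datetime import datetime

def merge_by_timestamp(camera_rolls):
    """Merge photo lists from several cameras into one stream ordered by
    capture time. Each photo is a (filename, capture_time) pair; in
    practice capture_time would be read from the EXIF data."""
    merged = [photo for roll in camera_rolls for photo in roll]
    merged.sort(key=lambda p: p[1])
    return [name for name, _ in merged]

cam_a = [("a_001.jpg", datetime(2024, 5, 1, 10, 0)),
         ("a_002.jpg", datetime(2024, 5, 1, 10, 5))]
cam_b = [("b_001.jpg", datetime(2024, 5, 1, 10, 2))]
```

With these timestamps, `b_001.jpg` correctly slots between the two shots from camera A, so the divider/item ordering is preserved across cameras.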

3. Process Images

# Default: Auto-detects AI (tries Ollama first, then Claude)
viscatalog process ./photos

# Force a specific provider
viscatalog process ./photos --provider ollama
viscatalog process ./photos --provider claude

# Offline mode: QR/OCR only, no AI
viscatalog process ./photos --offline

# Use a specific model
viscatalog process ./photos --provider ollama --model llava:13b

The tool auto-detects available AI providers: Ollama (free, local) is preferred, with Claude as fallback if ANTHROPIC_API_KEY is set.
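A minimal sketch of that detection order (the check for a running Ollama server is simplified to a boolean here; the real tool presumably probes the local server):

```python
import os

def pick_provider(ollama_running: bool) -> str:
    """Auto-detection order described above: prefer a local Ollama
    server; fall back to Claude when ANTHROPIC_API_KEY is set;
    otherwise run offline (QR/OCR only)."""
    if ollama_running:
        return "ollama"
    if os.environ.get("ANTHROPIC_API_KEY"):
        return "claude"
    return "offline"
```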

4. Review & Correct

# View stats
viscatalog stats

# List items needing review
viscatalog list --unknown           # No title identified
viscatalog list --low-confidence    # AI uncertain
viscatalog list --needs-review      # Flagged for review

# View item details
viscatalog show 42

# Manually correct
viscatalog edit 42 --title "Super Mario Bros." --platform NES

# Re-run AI identification
viscatalog reidentify 42 --provider claude

5. Browse & Search

# Search
viscatalog search "zelda"

# List by location
viscatalog list --location BOX-1

# Web interface
viscatalog serve --port 8000
# Open http://localhost:8000

How It Works

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Load Image  │────▶│     AI      │────▶│   Process   │
│ (ARW/JPG)   │     │  Classify   │     │ Accordingly │
└─────────────┘     └─────────────┘     └─────────────┘
                           │
           ┌───────────────┼───────────────┐
           ▼               ▼               ▼
     ┌──────────┐    ┌──────────┐    ┌──────────┐
     │ LOCATION │    │  BLACK   │    │   ITEM   │
     │ DIVIDER  │    │  FRAME   │    │          │
     └──────────┘    └──────────┘    └──────────┘
          │               │               │
          ▼               ▼               ▼
     Set current     Clear current    Store item
     location        location         in database

AI-First Architecture: A single AI call classifies the image type AND identifies item details. This handles QR codes, handwritten text, stylized logos, and Japanese text seamlessly.
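The state machine in the diagram can be sketched in a few lines (frame classification is stubbed out as pre-labelled tuples here; the real tool gets the labels from the AI call):

```python
def process_stream(frames):
    """Diagram logic: a LOCATION frame sets the current location,
    a BLACK frame clears it, and an ITEM frame stores the item under
    whatever location is current."""
    location = None
    catalogue = []
    for kind, payload in frames:
        if kind == "LOCATION":
            location = payload
        elif kind == "BLACK":
            location = None
        elif kind == "ITEM":
            catalogue.append((payload, location))
    return catalogue

frames = [("LOCATION", "BOX-1"), ("ITEM", "Super Mario Bros."),
          ("BLACK", None), ("LOCATION", "BOX-2"), ("ITEM", "Zelda")]
```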

AI Identification

By default, the tool uses vision AI (Ollama) to classify and identify items:

Field            Example
Title            "Super Mario Bros. 3"
Platform         "NES"
Item Type        game, console, controller, book, vinyl, etc.
Completeness     loose, boxed, complete_set, sealed
Brand            "Nintendo"
Region           NTSC-U, PAL, NTSC-J
Year             "1990"
Condition        mint, good, fair, poor
Condition Notes  "Box has shelf wear on corners, manual has light creasing"
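A record with these fields might look like the dataclass below (field names are illustrative assumptions, not the tool's actual schema):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Identification:
    """One AI identification result, mirroring the fields above.
    Names and defaults are hypothetical."""
    title: Optional[str] = None
    platform: Optional[str] = None
    item_type: str = "game"
    completeness: str = "loose"
    brand: Optional[str] = None
    region: Optional[str] = None
    year: Optional[str] = None
    condition: Optional[str] = None
    condition_notes: Optional[str] = None

smb3 = Identification(title="Super Mario Bros. 3", platform="NES",
                      completeness="complete_set", condition="good")
```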

Supports:

  • Ollama (preferred) - local and free; auto-detected when the server is running
  • Claude (fallback) - requires the ANTHROPIC_API_KEY environment variable; select explicitly with --provider claude

Features

  • Multi-camera support: Merges photos from multiple cameras by EXIF timestamp
  • RAW file support: Sony ARW, Canon CR2/CR3, Nikon NEF, and more via rawpy
  • QR code detection: OpenCV-based with OCR fallback
  • SQLite database: Single portable file with images stored as BLOBs
  • Resume capability: SHA256 deduplication skips already-processed files
  • Web interface: Browse, search, edit, and manage your collection
  • REST API: Integrate with other tools
  • Duplicate divider handling: Multiple QR codes or black frames in a row are handled gracefully
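The SHA256-based resume capability boils down to skipping any file whose content digest already appears in the processing log. A sketch (operating on in-memory bytes for brevity; the tool hashes files on disk):

```python
import hashlib

def digest(data: bytes) -> str:
    """SHA256 hex digest of an image's raw bytes."""
    return hashlib.sha256(data).hexdigest()

def unprocessed(images, seen_digests):
    """Resume: skip any image whose digest is already in the log.
    `images` is a list of (name, bytes) pairs."""
    return [name for name, data in images if digest(data) not in seen_digests]

images = [("a.jpg", b"raw-bytes-1"), ("b.jpg", b"raw-bytes-2")]
seen = {digest(b"raw-bytes-1")}   # a.jpg was processed in an earlier run
```

Because the digest is over content rather than filename, re-importing the same photos from a different folder still deduplicates correctly.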

Installation

# From PyPI
pip install visual-cataloguer

# With web interface
pip install visual-cataloguer[web]

# Or with uv
uv pip install visual-cataloguer[web]

System dependencies:

  • Tesseract OCR: brew install tesseract (macOS) or apt install tesseract-ocr (Linux)

For AI identification:

  • Claude: Set ANTHROPIC_API_KEY environment variable
  • Ollama: Install from ollama.ai and run ollama pull llava

CLI Reference

# Processing (auto-detects AI provider)
viscatalog process <input-dir> [--provider auto|ollama|claude] [--model MODEL] [--offline]

# Viewing
viscatalog stats
viscatalog list [--location LOC] [--platform PLAT] [--unknown] [--low-confidence] [--needs-review]
viscatalog search <query>
viscatalog show <item-id> [--json]

# Editing
viscatalog edit <item-id> [--title T] [--platform P] [--completeness C] [--notes N]
viscatalog reidentify <item-id> [--provider P] [--model M] [--image PATH]
viscatalog review <item-id> [--done | --flag --reason R]

# Export
viscatalog export <output-dir>                    # Export images by location
viscatalog export ./data.csv --format csv         # Export metadata as CSV
viscatalog export ./data.json --format json       # Export metadata as JSON
viscatalog export ./images --include-metadata     # Images + JSON sidecar files

# Web server
viscatalog serve [--port 8000] [--host 0.0.0.0]

API Endpoints

# Items
GET    /api/items                    # List items (with filters)
GET    /api/items/{id}               # Get item
PATCH  /api/items/{id}               # Update item
DELETE /api/items/{id}               # Delete item
POST   /api/items/{id}/reidentify    # Re-run AI identification

# Images
GET    /api/items/{id}/images        # List item images
GET    /api/items/{id}/image/thumb   # Get thumbnail
GET    /api/items/{id}/image/full    # Get full image
POST   /api/items/{id}/images        # Upload image

# Other
GET    /api/locations                # List locations
GET    /api/search?q=query           # Search items
GET    /api/stats                    # Collection statistics

Full OpenAPI docs at http://localhost:8000/docs
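For example, the search endpoint above takes its query as a `q` parameter; a small helper to build such request URLs (the base URL and any extra filter names are assumptions):

```python
from urllib.parse import urlencode

def search_url(base, query, **filters):
    """Build a URL for GET /api/search, e.g. /api/search?q=zelda.
    Extra keyword arguments become additional query parameters."""
    params = {"q": query, **filters}
    return f"{base}/api/search?{urlencode(params)}"
```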

Database Schema

SQLite database with tables:

  • items - Catalogued items with metadata
  • item_images - Images stored as JPEG BLOBs (one-to-many with items)
  • locations - Storage locations (BOX-1, SHELF-A3, etc.)
  • processing_log - Tracks processed files for resume
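Since the database is a single portable SQLite file, you can query it directly. A sketch of a location lookup (column names here are illustrative guesses, not the tool's actual schema):

```python
import sqlite3

# In-memory stand-in for the real database file; the real schema's
# table and column names may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locations (id INTEGER PRIMARY KEY, code TEXT UNIQUE);
CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT,
                    location_id INTEGER REFERENCES locations(id));
""")
conn.execute("INSERT INTO locations (code) VALUES ('BOX-1')")
conn.execute("INSERT INTO items (title, location_id) VALUES ('Super Mario Bros. 3', 1)")

# Everything stored in BOX-1
rows = conn.execute("""
    SELECT items.title, locations.code
    FROM items JOIN locations ON items.location_id = locations.id
    WHERE locations.code = 'BOX-1'
""").fetchall()
```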

Development

git clone https://github.com/retroverse-studios/visual-cataloguer.git
cd visual-cataloguer
uv sync --extra web --extra dev

# Run tests
uv run pytest

# Type checking
uv run mypy cataloguer

# Linting
uv run ruff check cataloguer

License

MIT License - see LICENSE

Download files

Download the file for your platform.

Source Distribution

visual_cataloguer-0.9.9.tar.gz (15.9 MB)

Built Distribution


visual_cataloguer-0.9.9-py3-none-any.whl (128.3 kB)

File details

Details for the file visual_cataloguer-0.9.9.tar.gz.

File metadata

  • Download URL: visual_cataloguer-0.9.9.tar.gz
  • Size: 15.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for visual_cataloguer-0.9.9.tar.gz
Algorithm Hash digest
SHA256 a455dee2eca8c4e3afcb1ce9dfa48a59a32c27b84ea4164702916ea9635b0963
MD5 f43ab010203c3ea323632e6c4fd66f14
BLAKE2b-256 7f6ad818a73335284de7f55cc899b911133e440019ab7607e8137f30cc8d227e


File details

Details for the file visual_cataloguer-0.9.9-py3-none-any.whl.

File hashes

Hashes for visual_cataloguer-0.9.9-py3-none-any.whl
Algorithm Hash digest
SHA256 1e6d886ddcbd1b8474ff0ebeadf9b1c0771bf2d8f71b6d68b0f0df240e988811
MD5 cc714c422fbf4b11d834806172edcb70
BLAKE2b-256 4c43bf09cf9acb2590537b1ce5be38e3adc8d5ef473a8d329c9fe0327e80b126

