LLM-based image categorization tool with focus on book cover detection

LLM Book Cover Detector

A Python package for detecting and analyzing book covers using the Qwen vision-language model.

Features

  • Accurate book cover detection
  • Similarity scoring (0-100%)
  • Concise reasoning
  • Beautiful CLI interface
  • JSON response format
  • Raw API response display
  • Rich output formatting

Installation

  1. Clone the repository:
git clone <repository-url>
cd llm_image_categorizator
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Configure the environment:
cp .env.template .env
# Edit .env and set DASHSCOPE_API_KEY

Usage

CLI Usage

The simplest way to use the book cover detector is through the CLI:

python scripts/llm_img_cat_cli.py path/to/image.jpg

This will:

  1. Analyze if the image is a book cover
  2. Provide a similarity score (0-100%)
  3. Give concise reasoning (about five words)
  4. Show raw API response

Python API Usage

from llm_img_cat.categorizer import llm_img_cat

# Analyze an image
result = llm_img_cat("path/to/image.jpg")

print(f"Is book cover: {result['is_category']}")
print(f"Similarity score: {result['confidence']}%")
print(f"Reasoning: {result['reasoning']}")
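
To screen several images at once, the call above can be wrapped in a small helper. This is a minimal sketch and not part of the package: `find_book_covers` and `min_confidence` are hypothetical names, and the classifier is passed in as a parameter so the helper relies only on the `is_category` and `confidence` keys used above. It also assumes `confidence` is numeric (or convertible with `float`).

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def find_book_covers(
    paths: Iterable[str],
    classify: Callable[[str], Dict[str, Any]],
    min_confidence: float = 80.0,
) -> List[Tuple[str, Dict[str, Any]]]:
    """Return (path, result) pairs that the classifier accepts as book covers.

    `classify` is expected to behave like `llm_img_cat`: take an image path
    and return a dict with `is_category`, `confidence` and `reasoning`.
    """
    accepted = []
    for path in paths:
        result = classify(path)
        # Assumption: `confidence` is a 0-100 number or a numeric string.
        score = float(result["confidence"])
        # Keep only positive detections at or above the chosen threshold.
        if result["is_category"] and score >= min_confidence:
            accepted.append((path, result))
    return accepted
```

With the package installed, the real function can be passed directly, for example `find_book_covers(["a.jpg", "b.jpg"], llm_img_cat)`. Taking the classifier as an argument also makes the helper easy to exercise with a stub, without spending API calls.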

Example Output

╭── Book Cover Detection Results ───╮
│ Is Book Cover    │ Yes           │
│ Similarity Score │ 90%           │
╰────────────────────────────────╯
╭── Reasoning ──────────────────────╮
│ Text and design typical of books  │
╰────────────────────────────────╯
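
Before a result is passed on to other code, its fields can be sanity-checked. The sketch below is illustrative only: `validate_result` is a hypothetical helper, and the expected types (a boolean `is_category`, a numeric `confidence` in the 0-100 range, a non-empty `reasoning` string) are assumptions inferred from the output above, not a documented schema.

```python
from typing import Any, Dict, List


def validate_result(result: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a result dict (empty if none)."""
    problems = []

    # Assumption: the detection flag is a real boolean, not "Yes"/"No".
    if not isinstance(result.get("is_category"), bool):
        problems.append("is_category should be a bool")

    # bool is a subclass of int in Python, so exclude it explicitly.
    confidence = result.get("confidence")
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        problems.append("confidence should be a number")
    elif not 0 <= confidence <= 100:
        problems.append("confidence should be between 0 and 100")

    reasoning = result.get("reasoning")
    if not isinstance(reasoning, str) or not reasoning.strip():
        problems.append("reasoning should be a non-empty string")

    return problems
```

An empty list means the result matches the assumed shape; otherwise each entry names one field that needs attention.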

Configuration

Required environment variables in .env:

  • DASHSCOPE_API_KEY: Your DashScope API key, used to access the Qwen models
  • DEFAULT_MODEL: Name of the model to use; the default is "qwen2.5-vl-3b-instruct"
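
To illustrate how these two settings might be read, here is a standard-library-only sketch. It is not the package's actual loader, which may work differently (for example through a dotenv library). The names `parse_env_text` and `load_settings` are hypothetical, and two behaviors are assumptions: values in the process environment take precedence over `.env`, and `DEFAULT_MODEL` falls back to the default named above when unset.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

# Default model name as stated in this README.
FALLBACK_MODEL = "qwen2.5-vl-3b-instruct"


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def load_settings(
    env_path: str = ".env",
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Merge .env values with the environment (assumed: environment wins)."""
    environ = os.environ if environ is None else environ
    path = Path(env_path)
    file_values = parse_env_text(path.read_text(encoding="utf-8")) if path.is_file() else {}
    merged = {**file_values, **environ}

    api_key = merged.get("DASHSCOPE_API_KEY", "")
    if not api_key:
        raise RuntimeError("DASHSCOPE_API_KEY is not set in .env or the environment")
    return {"api_key": api_key, "model": merged.get("DEFAULT_MODEL") or FALLBACK_MODEL}
```

Failing early with a clear message when the key is missing is usually easier to diagnose than an authentication error returned later by the API.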

Development

  • Run tests: ./run_qwen_tests.sh
  • Check code: scripts/lint.sh
  • Build docs: scripts/build_docs.sh

License

MIT License

Contributing

See CONTRIBUTING.md for guidelines.
