LLM-based image categorization tool with focus on book cover detection

LLM Book Cover Detector

A Python package for detecting and analyzing book covers using a Qwen vision-language model, accessed through DashScope.

Features

  • Accurate book cover detection
  • Similarity scoring (0-100%)
  • Concise reasoning
  • Beautiful CLI interface
  • JSON response format
  • Raw API response display
  • Rich output formatting
  • Comprehensive error handling

Installation

pip install llm_img_cat

API Key Setup

You need a DashScope API key to use this package. Here are three ways to set it up:

  1. Using Environment Variable (Recommended for Development)

    export DASHSCOPE_API_KEY="your-api-key-here"
    
  2. Using .env File (Recommended for Projects)

    Create a .env file in your project directory:

    DASHSCOPE_API_KEY=your-api-key-here
    DEFAULT_MODEL=qwen-vl-plus  # Optional
    

    The package will automatically load the API key from this file.

  3. Setting Programmatically (For Testing)

    import os
    os.environ['DASHSCOPE_API_KEY'] = 'your-api-key-here'
    from llm_img_cat import ImageCategorizer
    

⚠️ Security Note: Never commit your API key to version control. Prefer environment variables or a .env file, add .env to your .gitignore, and keep the key out of source code.
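
A missing key is easiest to diagnose before any request is made. The following is a minimal sketch of a fail-fast check, assuming only the DASHSCOPE_API_KEY variable name from this page; the helper name require_api_key is hypothetical and not part of llm_img_cat.

```python
import os


def require_api_key(env=None):
    """Return the DashScope API key, or raise a clear error if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    This helper is illustrative only and is not provided by llm_img_cat.
    """
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set; export it or add it to a .env file"
        )
    return key
```

Calling require_api_key() at program start, before creating an ImageCategorizer, turns a missing key into a single readable error.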

Usage

CLI Usage

The simplest way to use the book cover detector is through the CLI:

python scripts/llm_img_cat_cli.py path/to/image.jpg

This will:

  1. Analyze whether the image is a book cover
  2. Provide a similarity score (0-100%)
  3. Give a concise five-word reasoning
  4. Show the raw API response

Python API Usage

from llm_img_cat import ImageCategorizer

# Initialize the categorizer (will automatically load API key from environment or .env)
categorizer = ImageCategorizer()

# Analyze an image
result = categorizer.categorize_image("path/to/image.jpg")

print(f"Is book cover: {result['is_category']}")
print(f"Similarity score: {result['confidence']}%")
print(f"Reasoning: {result['reasoning']}")
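
The result keys shown above (is_category, confidence, reasoning) can feed a simple accept/reject decision. This is a sketch under stated assumptions: confidence is taken to be a number (or numeric string) on the 0-100 scale, and both the helper name accept_as_book_cover and the threshold of 80 are illustrative choices, not package defaults.

```python
def accept_as_book_cover(result, min_confidence=80):
    """Accept an image only if it is flagged as a book cover with enough confidence.

    `result` is a dict with the keys used in the example above.
    `min_confidence` is an illustrative threshold, not a package default.
    """
    return bool(result["is_category"]) and float(result["confidence"]) >= min_confidence


# Hand-written sample with the same shape as the example above (no API call is made).
sample = {
    "is_category": True,
    "confidence": 95,
    "reasoning": "Classic book cover design",
}

if accept_as_book_cover(sample):
    print(f"Accepted: {sample['reasoning']}")
```

With a real categorizer, the same helper could be applied to the return value of categorizer.categorize_image("path/to/image.jpg").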

Rich Console Demo

For a more sophisticated example with beautiful console output and error handling, check out the rich demo:

# Install required packages
pip install llm_img_cat rich

# Run the demo
python examples/rich_demo.py

The rich demo showcases:

  • Beautiful console output with colors and panels
  • Proper environment setup
  • Comprehensive error handling
  • Test summary reporting
  • Multiple image processing

Example output:

╭───────────────────────────────╮
│ Environment Setup             │
│ API Key: ✓ Set                │
│ Model: qwen2.5-vl-3b-instruct │
╰───────────────────────────────╯

Analyzing image: Book cover example
╭──────────────────────────────────────╮
│ Analysis Results                     │
│ Is Book Cover: True                  │
│ Confidence: 95%                      │
│ Reasoning: Classic book cover design │
╰──────────────────────────────────────╯
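
Processing several images with per-image error handling and a closing summary, as the demo describes, can be sketched as follows. This is not the code of examples/rich_demo.py; the function names categorize_many and summarize are hypothetical, and the categorize argument is any callable that takes a path and returns a result dict, for example categorizer.categorize_image.

```python
def categorize_many(paths, categorize):
    """Run `categorize` on every path, collecting results and errors separately.

    One failing image does not stop the batch; its error message is recorded instead.
    """
    results, errors = {}, {}
    for path in paths:
        try:
            results[path] = categorize(path)
        except Exception as exc:  # record the failure and keep going
            errors[path] = f"{type(exc).__name__}: {exc}"
    return results, errors


def summarize(results, errors):
    """Return simple counts suitable for an end-of-run summary."""
    covers = sum(1 for r in results.values() if r.get("is_category"))
    return {
        "processed": len(results),
        "failed": len(errors),
        "book_covers": covers,
    }
```

A typical call would be results, errors = categorize_many(paths, ImageCategorizer().categorize_image), followed by summarize(results, errors) for the report.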

Example Output

╭── Book Cover Detection Results ──╮
│ Is Book Cover    │ Yes           │
│ Similarity Score │ 90%           │
╰──────────────────────────────────╯
╭── Reasoning ─────────────────────╮
│ Text and design typical of books │
╰──────────────────────────────────╯

Configuration

Environment variables:

  • DASHSCOPE_API_KEY: Your DashScope API key (required)
  • DEFAULT_MODEL: Model to use (optional; default: "qwen-vl-plus")
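
To illustrate how these two variables relate, here is a minimal sketch that reads them into one settings object and falls back to "qwen-vl-plus" when DEFAULT_MODEL is unset or empty. The Settings class and load_settings function are hypothetical names for illustration; how llm_img_cat reads its configuration internally is not documented here.

```python
import os
from dataclasses import dataclass

DEFAULT_MODEL_NAME = "qwen-vl-plus"  # default stated in this README


@dataclass(frozen=True)
class Settings:
    api_key: str
    model: str = DEFAULT_MODEL_NAME


def load_settings(env=None):
    """Build Settings from environment variables (or from a dict, for testing)."""
    env = os.environ if env is None else env
    return Settings(
        api_key=env.get("DASHSCOPE_API_KEY", ""),
        # An unset or empty DEFAULT_MODEL falls back to the documented default.
        model=env.get("DEFAULT_MODEL") or DEFAULT_MODEL_NAME,
    )
```

Passing a dict instead of relying on os.environ keeps the function easy to exercise in tests without touching the real environment.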

Development

  • Run tests: ./run_qwen_tests.sh
  • Check code: scripts/lint.sh
  • Build docs: scripts/build_docs.sh

License

MIT License

Contributing

See CONTRIBUTING.md for guidelines.
