LLM-based image categorization tool with focus on book cover detection
LLM Book Cover Detector
A Python package for detecting and analyzing book covers using the Qwen vision-language model.
Features
- Accurate book cover detection
- Similarity scoring (0-100%)
- Concise reasoning
- Beautiful CLI interface
- JSON response format
- Raw API response display
- Rich output formatting
- Comprehensive error handling
Installation
pip install llm_img_cat
API Key Setup
You need a DashScope API key to use this package. Here are three ways to set it up:
- Using an environment variable (recommended for development):
export DASHSCOPE_API_KEY="your-api-key-here"
- Using a .env file (recommended for projects). Create a .env file in your project directory:
DASHSCOPE_API_KEY=your-api-key-here
DEFAULT_MODEL=qwen-vl-plus # Optional
The package will automatically load the API key from this file.
- Setting it programmatically (for testing):
import os
os.environ['DASHSCOPE_API_KEY'] = 'your-api-key-here'
from llm_img_cat import ImageCategorizer
⚠️ Security Note: Never commit your API key to version control. Always use environment variables or .env files, and keep your API key secure.
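To make the lookup order concrete, here is a minimal standard-library sketch of reading the key from the environment first and falling back to a .env file. It is not the package's own loader, whose internals are not shown here; the helper names `read_dotenv` and `resolve_api_key` are hypothetical, and the parser handles only simple KEY=VALUE lines.

```python
import os
from pathlib import Path


def read_dotenv(path):
    """Parse simple KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, full-line comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "qwen-vl-plus # Optional",
        # then strip surrounding whitespace and quotes.
        value = value.split(" #", 1)[0].strip().strip("'\"")
        values[key.strip()] = value
    return values


def resolve_api_key(environ=None, dotenv_path=".env"):
    """Return the key from the environment if set, otherwise from the .env file."""
    environ = os.environ if environ is None else environ
    key = environ.get("DASHSCOPE_API_KEY") or read_dotenv(dotenv_path).get("DASHSCOPE_API_KEY")
    if not key:
        raise RuntimeError("DASHSCOPE_API_KEY is not set in the environment or in .env")
    return key
```

Failing early with a clear message like this tends to be easier to debug than an authentication error returned later by the API.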
Usage
CLI Usage
The simplest way to use the book cover detector is through the CLI:
python scripts/llm_img_cat_cli.py path/to/image.jpg
This will:
- Analyze if the image is a book cover
- Provide a similarity score (0-100%)
- Give a concise 5-word reasoning
- Show raw API response
Python API Usage
from llm_img_cat import ImageCategorizer
# Initialize the categorizer (will automatically load API key from environment or .env)
categorizer = ImageCategorizer()
# Analyze an image
result = categorizer.categorize_image("path/to/image.jpg")
print(f"Is book cover: {result['is_category']}")
print(f"Similarity score: {result['confidence']}%")
print(f"Reasoning: {result['reasoning']}")
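When several images need checking, the single call above can be wrapped in a small loop. The sketch below is an illustration, not part of the package: `categorize_many` and its `min_confidence` threshold are hypothetical, and it relies only on the `is_category` and `confidence` keys shown above. Because the exceptions the package may raise are not documented here, it catches `Exception` broadly.

```python
def categorize_many(categorize, image_paths, min_confidence=80):
    """Run `categorize` on each path and sort paths into accepted, rejected, and failed.

    `categorize` is any callable that takes a path and returns a dict with
    'is_category' and 'confidence' keys.
    """
    accepted, rejected, failed = [], [], []
    for path in image_paths:
        try:
            result = categorize(path)
            is_cover = bool(result["is_category"])
            confidence = float(result["confidence"])
        except Exception as exc:  # API errors, unreadable files, malformed responses
            failed.append((path, str(exc)))
            continue
        # Accept only positive results at or above the (arbitrary) threshold.
        if is_cover and confidence >= min_confidence:
            accepted.append(path)
        else:
            rejected.append(path)
    return accepted, rejected, failed
```

With the package installed, the callable would be `ImageCategorizer().categorize_image`. Passing it in as an argument keeps the loop testable without network access.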
Rich Console Demo
For a more sophisticated example with beautiful console output and error handling, check out the rich demo:
# Install required packages
pip install llm_img_cat rich
# Run the demo
python examples/rich_demo.py
The rich demo showcases:
- Beautiful console output with colors and panels
- Proper environment setup
- Comprehensive error handling
- Test summary reporting
- Multiple image processing
Example output:
╭───────────────────────────────╮
│ Environment Setup             │
│ API Key: ✓ Set                │
│ Model: qwen2.5-vl-3b-instruct │
╰───────────────────────────────╯
Analyzing image: Book cover example
╭──────────────────────────────────────╮
│ Analysis Results                     │
│ Is Book Cover: True                  │
│ Confidence: 95%                      │
│ Reasoning: Classic book cover design │
╰──────────────────────────────────────╯
Example Output
╭── Book Cover Detection Results ───╮
│ Is Book Cover    │ Yes            │
│ Similarity Score │ 90%            │
╰───────────────────────────────────╯
╭── Reasoning ──────────────────────╮
│ Text and design typical of books  │
╰───────────────────────────────────╯
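For readers who want similar panels without installing rich, the layout above can be approximated in plain Python. This is only a stand-in sketch: the CLI and demo produce their panels with rich, and `render_panel` is a hypothetical helper that assumes every character is one column wide.

```python
def render_panel(title, rows):
    """Draw a rounded box with the title in the top border and one 'label: value' line per row."""
    lines = [f"{label}: {value}" for label, value in rows]
    # Inner width: wide enough for the longest line or the title, plus one space of padding per side.
    inner = max([len(title) + 4] + [len(line) for line in lines]) + 2
    top = "╭─ " + title + " " + "─" * (inner - len(title) - 3) + "╮"
    body = ["│ " + line.ljust(inner - 2) + " │" for line in lines]
    bottom = "╰" + "─" * inner + "╯"
    return "\n".join([top, *body, bottom])


if __name__ == "__main__":
    print(render_panel("Book Cover Detection Results",
                       [("Is Book Cover", "Yes"), ("Similarity Score", "90%")]))
```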
Configuration
Environment variables:
- DASHSCOPE_API_KEY: Your DashScope API key (required)
- DEFAULT_MODEL: Model to use (optional; default: "qwen-vl-plus")
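The two variables can be thought of as one required setting and one optional setting with a fallback. The sketch below illustrates that rule only. The `Settings` class and `load_settings` function are hypothetical names, not part of the package, and the default model string is the one documented above.

```python
import os
from dataclasses import dataclass

DEFAULT_MODEL_NAME = "qwen-vl-plus"  # default documented in this README


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two documented environment variables."""
    api_key: str
    model: str = DEFAULT_MODEL_NAME


def load_settings(environ=None):
    """Read DASHSCOPE_API_KEY (required) and DEFAULT_MODEL (optional) from a mapping."""
    environ = os.environ if environ is None else environ
    api_key = environ.get("DASHSCOPE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("DASHSCOPE_API_KEY is required")
    # An unset or empty DEFAULT_MODEL falls back to the documented default.
    model = environ.get("DEFAULT_MODEL", "").strip() or DEFAULT_MODEL_NAME
    return Settings(api_key=api_key, model=model)
```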
Development
- Run tests: ./run_qwen_tests.sh
- Check code: scripts/lint.sh
- Build docs: scripts/build_docs.sh
License
MIT License
Contributing
See CONTRIBUTING.md for guidelines.
File details
Details for the file llm_img_cat-0.1.2.tar.gz.
File metadata
- Download URL: llm_img_cat-0.1.2.tar.gz
- Upload date:
- Size: 2.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 221083b97ae86e654c806496912dc31d3cb5e1b9acf5ca730289e57d62e96297 |
| MD5 | cd54b9478930481bc1c6228dca8fe645 |
| BLAKE2b-256 | 8b92e5b4bf7090f99ec6097485d9c0689987b1555a0ed8cf8c22c214ee54f8e3 |
File details
Details for the file llm_img_cat-0.1.2-py3-none-any.whl.
File metadata
- Download URL: llm_img_cat-0.1.2-py3-none-any.whl
- Upload date:
- Size: 5.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 29ef0132498ab6868733a24a045784d4ee6ecbb7a518ae6bc9bb0ba8a2dab37c |
| MD5 | c56f674f108ef089d0308bfb87af13db |
| BLAKE2b-256 | 37414a2a55d1d02f55e26e99eb133321d46b2181950c6530e805931a669ae0b8 |
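A downloaded file can be checked against the SHA256 digests listed above using only the standard library. This is a minimal sketch with hypothetical helper names (`sha256_of`, `matches_published_hash`); the file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("llm_img_cat-0.1.2.tar.gz", "221083b97ae86e654c806496912dc31d3cb5e1b9acf5ca730289e57d62e96297")` should return True for an intact copy of the source distribution.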