LLM-based image categorization tool with a focus on book cover detection
LLM Book Cover Detector
A Python package for detecting and analyzing book covers using the Qwen vision-language model.
Features
- Accurate book cover detection
- Similarity scoring (0-100%)
- Concise reasoning
- Beautiful CLI interface
- JSON response format
- Raw API response display
- Rich output formatting
Installation
- Clone the repository:
git clone <repository-url>
cd llm_image_categorizator
- Create and activate virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Configure environment:
cp .env.template .env
# Edit .env with your DASHSCOPE_API_KEY
Usage
CLI Usage
The simplest way to use the book cover detector is through the CLI:
python scripts/llm_img_cat_cli.py path/to/image.jpg
This will:
- Analyze if the image is a book cover
- Provide a similarity score (0-100%)
- Give a concise 5-word reasoning
- Show raw API response
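To run the CLI over several images, one option is a small wrapper script. The sketch below assumes only the invocation documented above (the script path plus a single positional image path) and adds no extra flags. The helper names `find_images`, `build_command`, and `run_batch`, and the set of file extensions, are illustrative and not part of the package.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: only the documented invocation is used,
# `python scripts/llm_img_cat_cli.py <image>`; no extra flags are assumed.
CLI_SCRIPT = "scripts/llm_img_cat_cli.py"
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # illustrative choice


def find_images(folder):
    """Return image files in `folder` (non-recursive), sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def build_command(image_path):
    """Build the argument list for one CLI call."""
    return [sys.executable, CLI_SCRIPT, str(image_path)]


def run_batch(folder):
    """Run the CLI once per image and return {image name: exit code}."""
    results = {}
    for image in find_images(folder):
        completed = subprocess.run(build_command(image))
        results[image.name] = completed.returncode
    return results
```

Calling `run_batch("path/to/folder")` from the repository root runs the analysis once per image. Each call prints its own output, so the returned mapping is mainly useful for spotting failed runs.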
Python API Usage
from llm_img_cat.categorizer import llm_img_cat
# Analyze an image
result = llm_img_cat("path/to/image.jpg")
print(f"Is book cover: {result['is_category']}")
print(f"Similarity score: {result['confidence']}%")
print(f"Reasoning: {result['reasoning']}")
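The returned dictionary can be post-processed like any other Python mapping. The sketch below assumes the three keys shown above (`is_category`, `confidence`, `reasoning`) and that `confidence` is a number on the 0-100 scale. The `accept_as_cover` helper and the cutoff of 80 are illustrative, not part of the package.

```python
import json


def accept_as_cover(result, min_confidence=80):
    """Return True when the result is a book cover with enough confidence.

    Assumes `result` has the keys shown in the usage example and that
    `confidence` is numeric (0-100). `min_confidence` is an arbitrary cutoff.
    """
    is_cover = bool(result["is_category"])
    return is_cover and float(result["confidence"]) >= min_confidence


# A hand-written stand-in for a real result, so the sketch runs without an API key.
sample = {
    "is_category": True,
    "confidence": 90,
    "reasoning": "Text and design typical of books",
}

print(accept_as_cover(sample))
print(json.dumps(sample, indent=2))
```

In real use, `sample` would be replaced by the value returned from `llm_img_cat("path/to/image.jpg")`.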
Example Output
╭── Book Cover Detection Results ───╮
│ Is Book Cover │ Yes │
│ Similarity Score │ 90% │
╰────────────────────────────────╯
╭── Reasoning ──────────────────────╮
│ Text and design typical of books │
╰────────────────────────────────╯
Configuration
Required environment variables in .env:
- DASHSCOPE_API_KEY: Your Qwen API key
- DEFAULT_MODEL: Default is "qwen2.5-vl-3b-instruct"
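A script that calls the package may want to fail early when the API key is missing. The sketch below reads the two variables named above using only the standard library. It assumes the values from `.env` have already been exported or loaded into the process environment, and `load_settings` is a hypothetical helper, not part of the package.

```python
import os

# Default model name as stated in this README.
DEFAULT_MODEL_FALLBACK = "qwen2.5-vl-3b-instruct"


def load_settings(environ=None):
    """Return (api_key, model) from the environment; raise if the key is missing.

    `environ` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if environ is None else environ
    api_key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("DASHSCOPE_API_KEY is not set; see .env.template")
    # Fall back to the documented default when DEFAULT_MODEL is unset or empty.
    model = env.get("DEFAULT_MODEL", "").strip() or DEFAULT_MODEL_FALLBACK
    return api_key, model
```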
Development
- Run tests:
./run_qwen_tests.sh
- Check code:
scripts/lint.sh
- Build docs:
scripts/build_docs.sh
License
MIT License
Contributing
See CONTRIBUTING.md for guidelines.
File details
Details for the file llm_img_cat-0.1.0.tar.gz.
File metadata
- Download URL: llm_img_cat-0.1.0.tar.gz
- Upload date:
- Size: 51.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.8.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 45b3da186f4ef8917da697d288252424f372f705572c6ed8f979742d562ded9b |
| MD5 | dbe1113688d8ebee38eeb62764ce44b0 |
| BLAKE2b-256 | ccb86d9b2b9814d78ac53b6498adf8f726e807f791e032ba4602d528ec157275 |
File details
Details for the file llm_img_cat-0.1.0-py3-none-any.whl.
File metadata
- Download URL: llm_img_cat-0.1.0-py3-none-any.whl
- Upload date:
- Size: 26.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6bb3116b6b3c0e4e882fd1ee5473041d604d8716dd7806c90b3a3f60d98638a7 |
| MD5 | cf54fd9ab565c091934365938f25a2ec |
| BLAKE2b-256 | c65211c37babcaafc383fbf0324a43ce17e0a58ceb8811fda09c5cca374fd019 |