Philately Collection Management System
Save days or weeks of tedious data entry with tweezers and a magnifying glass.
Overview
This project is a Python-based command-line system for managing a
philatelic (stamp) collection. It uses AI vision and text models to process
entire directories of stamp album images, extract detailed metadata, and
generate a comprehensive, queryable inventory. Because model calls go through
litellm, it supports multiple AI model providers (e.g., Google Gemini, xAI Grok),
so you can choose models for flexibility and cost.
Key features include:
- Multi-Model AI Processing: Analyzes stamp images to extract details like country, year, and condition using a two-pass system with configurable "low-cost" and "high-cost" vision models.
- Data Enrichment: Uses powerful text models to enrich the initial data with estimated values, historical context, and philatelic remarks.
- False Positive Detection: Includes a dedicated phase to re-examine high-value items and automatically flag illustrations or other non-stamp entities.
- Persistent, Auditable Storage: Maintains a master inventory in master_inventory.csv that includes all processed stamps, deacquired items, and verification results.
- Comprehensive Reporting: Generates detailed JSON summaries, high-value reports, and content-ready CSVs for platforms like Substack.
- Modular, Phase-Based Execution: Allows you to run the entire pipeline or specific phases (e.g., analysis, enrichment, reporting) independently.
- Command-Line and GUI Interfaces: Provides both a command-line tool (philately) for automated processing and a Streamlit-based GUI (philately-ui) for interactive use.
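The two-pass analysis in the first feature above can be pictured as a simple escalation rule: run the low-cost model, and re-analyze with the high-cost model only when the reported confidence falls below the threshold. The following is a minimal sketch of that idea, not the package's actual code. The `Analysis` class, the `Analyzer` signature, and the stand-in analyzers are hypothetical, and it assumes the high-cost result simply replaces the low-cost one.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Analysis:
    """Result of one vision-model pass (hypothetical structure)."""
    common_name: str
    confidence: int  # 1-7 scale, as used by --confidence-threshold
    model: str


# Hypothetical signature: takes an image path, returns an Analysis.
Analyzer = Callable[[str], Analysis]


def analyze_two_pass(image_path: str, low_cost: Analyzer, high_cost: Analyzer,
                     confidence_threshold: int = 5) -> Analysis:
    """Run the low-cost pass first; escalate only when confidence is too low."""
    first = low_cost(image_path)
    if first.confidence >= confidence_threshold:
        return first
    # Assumption: the high-cost result replaces the first pass outright.
    return high_cost(image_path)


# Stand-in analyzers so the sketch runs without any API key.
def fake_low(path: str) -> Analysis:
    return Analysis("1973 Manx Cat", confidence=3, model="low-cost")


def fake_high(path: str) -> Analysis:
    return Analysis("1973 Manx Cat", confidence=7, model="high-cost")


result = analyze_two_pass("stamps/Isle of Man/IMG_1172.jpeg", fake_low, fake_high)
print(result.model)  # the low-cost confidence (3) is below 5, so this prints "high-cost"
```

In a real run the two callables would wrap calls to the configured vision models; keeping them as parameters makes the escalation rule easy to test in isolation.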
Prerequisites
- Python: Version 3.13 or higher.
- API Keys: At least one API key for a supported provider (e.g., Google, xAI). These should be set in a .env file.
- System Dependencies:
  - On Ubuntu/Debian: sudo apt-get install libopencv-dev
  - On macOS: brew install opencv
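Before a long run it can help to confirm that the .env file actually contains a usable key. The sketch below parses simple KEY=value lines with the standard library and reports which of the two variable names used in the Installation section (GOOGLE_API_KEY, XAI_API_KEY) are set. The helper names are hypothetical, and the package itself may load the file differently.

```python
import tempfile
from pathlib import Path


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_providers(env: dict[str, str]) -> list[str]:
    """Return the known key names that have a non-empty value."""
    known = ("GOOGLE_API_KEY", "XAI_API_KEY")
    return [name for name in known if env.get(name)]


# Demo against a temporary file rather than a real .env.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(
        "GOOGLE_API_KEY=your-google-api-key\nXAI_API_KEY=\n", encoding="utf-8"
    )
    found = configured_providers(read_env_file(env_path))

print(found)  # ['GOOGLE_API_KEY'], because XAI_API_KEY is present but empty
```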
Installation
1. Clone the Repository:
   git clone <repository-url>
   cd <repository-directory>
2. Create a Virtual Environment:
   python3 -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
3. Install the Package:
   pip install .
4. Set Up Environment Variables:
   Create a .env file in the project root and add your API key(s):
   echo "GOOGLE_API_KEY=your-google-api-key" > .env
   echo "XAI_API_KEY=your-xai-api-key" >> .env
5. Prepare Directory Structure:
   - Place stamp images in a directory (e.g., stamps/), organized into subdirectories for each album (e.g., stamps/Isle of Man/).
   - The output directory will be created automatically to store all generated files.
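To check the album layout described in the last step before spending API calls, a short script can list the images found under each album folder. This is a sketch only: the `images_by_album` helper is hypothetical, and the set of accepted file extensions is an assumption, since the supported image formats are not listed here.

```python
import tempfile
from pathlib import Path

# Assumed extensions; the supported formats are not documented in this README.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def images_by_album(image_dir: Path) -> dict[str, list[str]]:
    """Map each album subdirectory name to its sorted image file names."""
    albums = {}
    for album_dir in sorted(p for p in image_dir.iterdir() if p.is_dir()):
        pages = sorted(
            f.name for f in album_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
        if pages:
            albums[album_dir.name] = pages
    return albums


# Demo on a throwaway directory that mimics stamps/Isle of Man/.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "stamps"
    (root / "Isle of Man").mkdir(parents=True)
    (root / "Isle of Man" / "IMG_1172.jpeg").touch()
    (root / "Isle of Man" / "notes.txt").touch()  # ignored: not an image
    layout = images_by_album(root)

print(layout)  # {'Isle of Man': ['IMG_1172.jpeg']}
```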
Usage
This package provides two primary entry points: a command-line interface (CLI) and a graphical user interface (GUI).
Command-Line Interface (CLI)
The philately command allows you to run the entire pipeline or specific phases using command-line flags.
The available flags and their defaults are listed in the Command-Line Flags table below.
Graphical User Interface (GUI)
The philately-ui command launches a Streamlit-based web interface that allows you to configure and run the processing pipeline interactively.
philately-ui
Example Commands (CLI)
1. Run the full pipeline on all images:
philately --image-dir ./stamps --output-dir ./output
2. Run only the image analysis phase on the first 10 images:
philately --run-analysis --max-images 10
3. Run the false-positive check on the top 3 most valuable stamps with debug logging:
philately --run-false-positive-check --false-positive-check-limit 3 --debug
4. Generate a Substack export with the top 20 most valuable items:
philately --run-substack-export --substack-items 20
5. Re-run only the enrichment and summary phases:
philately --run-enrichment --run-summaries
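When the commands above are driven from a script (for example, a scheduled job), building the argument list in Python avoids shell-quoting mistakes. The sketch below assembles an argument list from phase names and options using only flags documented in this README. The `build_philately_command` helper is hypothetical and is not part of the package.

```python
import shlex


def build_philately_command(*phases: str, **options) -> list[str]:
    """Build an argv list for the philately CLI from phase names and options."""
    argv = ["philately"]
    for phase in phases:
        argv.append(f"--run-{phase}")
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)  # boolean switch such as --debug
        elif value is not None and value is not False:
            argv.extend([flag, str(value)])  # valued option; 0 is kept
    return argv


cmd = build_philately_command(
    "false-positive-check", false_positive_check_limit=3, debug=True
)
print(shlex.join(cmd))
# philately --run-false-positive-check --false-positive-check-limit 3 --debug

# To execute it, pass the list to subprocess.run(cmd, check=True).
```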
Command-Line Flags
| Flag | Default | Description |
|---|---|---|
| --image-dir | stamps | Directory containing stamp images organized in album folders. |
| --output-dir | output | Directory to save all outputs. |
| --confidence-threshold | 5 | Confidence score (1-7) below which to trigger re-analysis with a high-cost model. |
| --max-images | None | Limit the number of images to process for testing. |
| --high-value-threshold | 1000 | USD threshold to consider a stamp as high-value for reporting. |
| --debug | False | Enable debug-level logging for verbose output, including API payloads. |
| --low-cost-model | gemini/gemini-1.5-flash-latest | The vision model for the initial, low-cost pass. |
| --high-cost-model | gemini/gemini-1.5-pro-latest | The vision model for the high-confidence re-analysis pass. |
| --narrative-model | gemini/gemini-1.5-pro-latest | The text model for enrichment and summaries. |
| --collection-summary-model | gemini/gemini-1.5-pro-latest | The high-context model for the final collection-wide summary. |
| --run-analysis | False | Run only the image analysis phase. |
| --run-enrichment | False | Run only the philatelic enrichment phase. |
| --run-summaries | False | Run the full clustering and summary phase. |
| --run-high-value-report | False | Run only the high-value stamp report generation phase. |
| --run-collection-summary-only | False | Run only the final collection-wide summary generation. |
| --run-false-positive-check | False | Run a re-examination of high-value stamps to find false positives. |
| --false-positive-check-limit | 5 | Limit the number of stamps to check in the false-positive phase (0 for all). |
| --run-substack-export | False | Generate a CSV export formatted for Substack posts. |
| --substack-items | 10 | Number of top items to include in the Substack export (0 for all). |
Output Files
All outputs are saved to the directory specified by --output-dir.
- master_inventory.csv: The master database of all stamps, including detailed analysis and verification data.
- stamp_inventory.json: A structured JSON file containing all data, including collection-wide statistics and narrative summaries.
- false_positive_check_report.csv: A summary of high-value items that were checked for authenticity.
- high_value_summary.csv: A CSV listing all stamps identified as high-value.
- substack_export.csv: A CSV formatted for easy import into content platforms like Substack.
- cropped_entities/: Directory of cropped images for each identified stamp.
- thumbnails/: Directory of 100x100px thumbnails for each stamp.
- high_value_reports/: Individual Markdown reports for each high-value stamp.
Example Data Records
1. Master Inventory Record (master_inventory.csv)
A single row contains the complete data for one stamp.
| stamp_id | album | page_filename | common_name | nationality | year | face_value | condition | confidence | estimated_value_high | is_verified_real | verification_reason |
|---|---|---|---|---|---|---|---|---|---|---|---|
| a1b2c3d4-... | Isle of Man | IMG_1172.jpeg | 1973 Manx Cat | Isle of Man | 1973 | 3p | Mint | 7 | 15 | True | This appears to be a genuine, mounted stamp with clear perforations and color. |
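Because the master inventory is a plain CSV, it can be queried with the standard library alone. The sketch below filters rows by estimated_value_high against the default high-value threshold of 1000 USD. It assumes a comma-delimited file with a header row, and that this column is the one the threshold applies to; neither is confirmed here. The sample rows reuse values from the example records in this README, with only a subset of columns.

```python
import csv
import io

# In-memory stand-in for master_inventory.csv (subset of columns).
SAMPLE = """stamp_id,common_name,estimated_value_high,is_verified_real
a1b2c3d4-...,1973 Manx Cat,15,True
e5f6g7h8-...,Penny Black,2500,False
"""


def high_value_rows(csv_file, threshold: float = 1000.0) -> list[dict]:
    """Return rows whose estimated_value_high meets the threshold."""
    reader = csv.DictReader(csv_file)
    return [
        row for row in reader
        if row["estimated_value_high"]  # skip rows with no estimate
        and float(row["estimated_value_high"]) >= threshold
    ]


rows = high_value_rows(io.StringIO(SAMPLE))
names = [row["common_name"] for row in rows]
print(names)  # ['Penny Black']
```

For the real file, replace the `io.StringIO` object with `open("output/master_inventory.csv", newline="", encoding="utf-8")`.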
2. Cluster Summary (stamp_inventory.json)
Summaries provide statistics and a narrative for a specific group of stamps (e.g., an album).
{
  "album_Isle_of_Man": {
    "statistics": {
      "item_count": 58,
      "album_count": 1,
      "countries_represented": 1,
      "year_range": "1973 - 1998",
      "total_value_low": 150,
      "total_value_high": 450,
      "condition_distribution": {
        "Mint": 45,
        "Used": 13
      }
    },
    "narrative_summary": "This cluster from the 'Isle of Man' album represents a strong collection of modern issues, primarily from the 1970s and 1980s. The thematic focus is on local culture, transportation, and fauna, with the 'Manx Cat' and 'TT Races' series being prominent highlights. The overall condition is excellent, with a majority of items in mint condition. A notable gap is the absence of earlier Victorian-era issues."
  }
}
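The statistics block in a cluster summary is easy to post-process. As a small sketch, the snippet below turns condition_distribution into percentages and checks that the counts add up to item_count. It assumes the key names shown in the example above; the `condition_shares` helper is hypothetical.

```python
import json

# Trimmed copy of the example cluster summary above.
SUMMARY_JSON = """
{
  "album_Isle_of_Man": {
    "statistics": {
      "item_count": 58,
      "condition_distribution": {"Mint": 45, "Used": 13}
    }
  }
}
"""


def condition_shares(cluster: dict) -> dict[str, float]:
    """Return each condition's share of the cluster as a percentage."""
    stats = cluster["statistics"]
    total = stats["item_count"]
    return {
        condition: round(100 * count / total, 1)
        for condition, count in stats["condition_distribution"].items()
    }


data = json.loads(SUMMARY_JSON)
cluster = data["album_Isle_of_Man"]
shares = condition_shares(cluster)
counts_match = (
    sum(cluster["statistics"]["condition_distribution"].values())
    == cluster["statistics"]["item_count"]
)
print(shares, counts_match)  # {'Mint': 77.6, 'Used': 22.4} True
```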
3. False Positive Check Report (false_positive_check_report.csv)
This report provides a clear audit trail for the verification process.
| stamp_id | common_name | estimated_value_high | page_filename | is_verified_real | cropped_image_path | verification_reason | action_taken |
|---|---|---|---|---|---|---|---|
| e5f6g7h8-... | Penny Black | 2500 | IMG_1245.JPG | False | cropped_entities/e5f6g7h8-..._cropped.jpg | The image is a black and white printed illustration, lacking color and physical depth. | Marked as deacquired (illustration) |
File details
Details for the file philately_will_get_you_everywhere-0.1.0.tar.gz.
File metadata
- Download URL: philately_will_get_you_everywhere-0.1.0.tar.gz
- Upload date:
- Size: 39.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8df3a029a6a61c5af212d55393d714965bf4a08ad45e6133dfd9e3c17aa45200 |
| MD5 | 830b34a446699fb095aeb4c2ccaa9712 |
| BLAKE2b-256 | 188615b496056169408771cfd0063b8cc7af5ea4095295bce36748111797abe6 |
File details
Details for the file philately_will_get_you_everywhere-0.1.0-py3-none-any.whl.
File metadata
- Download URL: philately_will_get_you_everywhere-0.1.0-py3-none-any.whl
- Upload date:
- Size: 38.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 44ec7bd7c5635ab418eb6ff865b0d062be2c51ee93756ccce2dd5e2e8d58382f |
| MD5 | 3ccbc030976709f296b7a4c3c9292de8 |
| BLAKE2b-256 | c48d719009a0757d1b53a4752cb9a929448fa65f3379b867cb11c826a40e4e2e |