AI-powered file renaming tool with LLM integration
Project description
๐ OnomaTool: AI-Powered File Renamer ๐
โจ What is OnomaTool?
OnomaTool is your AI-powered file renaming assistant! ๐ง โจ
- Rename files in bulk with smart, context-aware suggestions ๐ค
- Supports PDFs, images, markdown, SVG, PPTX, DOCX, TXT, and more! ๐๐ผ๏ธ
- Always preserves file extensions ๐
- CLI with dry-run, interactive, verbose, and debug modes ๐ฅ๏ธ
- Configurable via
.onomarcTOML config file โ๏ธ - Uses Markitdown for unified file processing ๐
- NEW: Advanced UTF-8 encoding detection and conversion for text files ๐ค
- NEW: Configurable word count limits for filenames ๐ค
๐ Features
Core Functionality
- ๐ฆพ AI Suggestions: Get 3 smart file name ideas for every file
- ๐ค Multiple LLM Providers: OpenAI (including local endpoints) and Google Gemini support
- ๐งฉ Conflict Resolution: Never overwrite files - automatic numeric suffix handling
- ๐ Extension Preservation: Original file extensions are always preserved
- ๐ Glob Pattern Support: Process files using flexible glob patterns
File Processing
- ๐ PDF Files: Extract markdown content + generate images for each page
- ๐ผ๏ธ SVG Files: Convert to PNG for AI analysis (enforced PNG-only processing)
- ๐ PPTX Files: Extract content + generate images for each slide using LibreOffice
- ๐ Text Files: UTF-8 encoding detection and conversion + markdown processing
- ๐ผ๏ธ Image Files: Base64 encoding for direct AI image analysis
- ๐ Office Documents: DOCX, XLSX support via Markitdown
- ๐ค Unicode Support: Automatic encoding detection for text files with chardet
CLI Modes
- ๐งช Dry-Run Mode: Preview changes without modifying files (
--dry-run) - ๐ค Interactive Mode: Confirm changes after dry-run preview (
--interactive) - ๐ Debug Mode: Preserve temp files and show processing paths (
--debug) - ๐ข Verbose Mode: Show LLM requests and responses (
--verbose) - โ๏ธ Config Generation: Generate default config file (
--save-config)
Advanced Features
- ๐ฏ Smart Processing: Combined image + text analysis for documents
- ๐๏ธ Modular Architecture: Extensible processor system
- ๐ Local LLM Support: Works with local OpenAI-compatible endpoints
- ๐ Multiple Naming Conventions: snake_case, CamelCase, kebab-case, and more
- ๐ก๏ธ SSL Flexibility: Automatic SSL handling for local/HTTP endpoints
- ๐ค Encoding Intelligence: Automatic detection and UTF-8 conversion for text files
- ๐งช Comprehensive Testing: 13+ test cases for encoding reliability
๐ ๏ธ Installation
Method 1: Install from Source
# Clone the repository
git clone https://github.com/yourusername/onomatool.git
cd onomatool
# Create virtual environment (recommended)
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# For encoding detection (included in requirements.txt)
pip install chardet
# Install the package
pip install -e .
Method 2: Direct Installation
# Install directly from the repository
pip install git+https://github.com/yourusername/onomatool.git
System Dependencies
For full functionality (SVG, PDF, PPTX processing):
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install libreoffice imagemagick libcairo2 libpango-1.0-0 libpangocairo-1.0-0
# macOS (with Homebrew)
brew install libreoffice imagemagick cairo pango
# Windows: Download and install LibreOffice and ImageMagick
# For SVG support: pip install cairosvg (requires Cairo system libraries)
โก Quick Start
Basic Usage
# Rename all PDFs in current directory
onomatool '*.pdf'
# Process files in subdirectories
onomatool 'docs/**/*.md'
# Specify file format explicitly
onomatool '*.unknown' --format pdf
Preview Mode
# See what would be renamed (no changes made)
onomatool '*.jpg' --dry-run
# Interactive confirmation after preview
onomatool '*.pdf' --dry-run --interactive
Debug and Verbose Modes
# Debug mode - preserve temp files
onomatool '*.svg' --debug
# Verbose mode - see LLM interactions
onomatool '*.docx' --verbose
# Combined modes
onomatool '*.pptx' --debug --verbose --dry-run
โ๏ธ Configuration
OnomaTool uses a TOML configuration file at ~/.onomarc or a custom path with --config.
Generate Default Config
onomatool --save-config
Configuration Options
# API Configuration
default_provider = "openai" # or "google"
openai_api_key = "sk-..."
openai_base_url = "https://api.openai.com/v1" # or local endpoint
google_api_key = "your-google-api-key"
# Model and Behavior
llm_model = "gpt-4o" # or "gemini-pro"
naming_convention = "snake_case" # snake_case, CamelCase, kebab-case, etc.
# Custom Prompts (optional - defaults provided)
system_prompt = "You are a file naming assistant."
user_prompt = "Suggest 3 file names for: {content}"
image_prompt = "Suggest 3 file names for this image."
# Markitdown Configuration
[markitdown]
enable_plugins = false
docintel_endpoint = ""
# Word count limits (NEW!)
min_filename_words = 5 # Minimum words required (ensures descriptive names)
max_filename_words = 15 # Maximum words allowed (prevents overly long names)
Supported Naming Conventions
snake_case(default)CamelCasekebab-casePascalCasedot.notationnatural language
๐ Supported File Types
| File Type | Processing Method | Output |
|---|---|---|
| Markitdown + PyMuPDF page images | Combined text + image analysis | |
| PPTX | Markitdown + LibreOffice slide images | Combined text + image analysis |
| SVG | Convert to PNG + Markitdown | Image analysis only |
| Images (JPG, PNG, etc.) | Base64 encoding | Direct image analysis |
| DOCX | Markitdown processing | Text analysis |
| TXT, MD, NOTE | UTF-8 encoding detection + text processing | Text analysis |
| XLSX | Markitdown processing | Content analysis |
| CSV, JSON, XML, HTML | UTF-8 encoding detection + Markitdown | Content analysis |
| Code Files (PY, JS, CSS, YAML) | UTF-8 encoding detection + text processing | Code analysis |
๐งโ๐ป Development
Project Structure
src/onomatool/
โโโ cli.py # Command-line interface
โโโ config.py # Configuration management
โโโ llm_integration.py # OpenAI/Google API integration
โโโ file_dispatcher.py # File routing logic
โโโ processors/ # File processing modules
โ โโโ markitdown_processor.py
โ โโโ text_processor.py
โโโ utils/ # Utility functions
โ โโโ image_utils.py # SVG conversion utilities
โโโ prompts.py # Default prompts
โโโ renamer.py # File renaming logic
โโโ conflict_resolver.py # Filename conflict handling
โโโ file_collector.py # Glob pattern matching
Running Tests
# Install test dependencies
pip install pytest pytest-mock
# Run all tests
pytest
# Run specific test suites
pytest tests/test_usage_enduser.py # End-to-end user tests
pytest tests/test_utf8_encoding.py # UTF-8 encoding tests
# Run with coverage
pytest --cov=onomatool
Code Style
# Format code
ruff format .
# Lint code
ruff check --fix .
# Run all checks
ruff check . && ruff format --check .
๐ง Advanced Usage
Custom Configuration Files
# Use custom config file
onomatool '*.pdf' --config /path/to/custom.toml
Local LLM Endpoints
# In your .onomarc
default_provider = "openai"
openai_base_url = "http://localhost:1234/v1"
openai_api_key = "not-needed-for-local"
Processing Specific Formats
# Force format detection
onomatool 'unknown_files/*' --format pdf
onomatool 'images/*' --format image
๐ก๏ธ Safety Features
- No Overwrites: Built-in conflict resolution with numeric suffixes
- Extension Preservation: Original file extensions always maintained
- Dry-Run Mode: Preview all changes before execution
- Temp File Management: Automatic cleanup (preservable in debug mode)
- Error Handling: Graceful failure with clear error messages
- Encoding Safety: Automatic UTF-8 conversion preserves original files
- Unicode Compatibility: Handles em dashes, smart quotes, accented characters
๐ค Contributing
We welcome contributions! Please:
- Fork the repository ๐ด
- Create a feature branch (
git checkout -b feature/amazing-feature) - Write tests for your changes ๐งช
- Follow PEP8 and run
ruff check --fix .๐ - Update
CHANGELOG.mdandFILETREE.md๐ - Submit a pull request ๐
Development Setup
# Clone and setup development environment
git clone https://github.com/yourusername/onomatool.git
cd onomatool
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -e .
# Install development dependencies
pip install pytest pytest-mock ruff
# Run tests to verify setup
pytest tests/test_utf8_encoding.py
pytest tests/test_usage_enduser.py
๐ License
MIT License - see LICENSE file for details.
๐ FAQ
Q: Does it work on Windows/Mac/Linux? A: Yes! Cross-platform support with Python 3.10+.
Q: Can I use local LLMs?
A: Yes! Set openai_base_url to your local endpoint in .onomarc.
Q: Will it overwrite my files? A: Never! Built-in conflict resolution prevents overwrites.
Q: What if my API key is invalid? A: The tool will show clear error messages and fail gracefully.
Q: Can I customize the AI prompts?
A: Yes! Set system_prompt, user_prompt, and image_prompt in your config.
Q: How does SVG processing work? A: SVGs are converted to PNG images before AI analysis for better results.
Q: Can I see what the AI is thinking?
A: Use --verbose to see full LLM requests and responses.
Q: What about files with special characters or different encodings? A: OnomaTool automatically detects and converts file encodings to UTF-8, handling em dashes, accented characters, and other Unicode symbols seamlessly.
Q: Does it work with files that have encoding issues? A: Yes! The tool uses advanced encoding detection to identify and convert problematic files while preserving the original content.
Q: How do word count limits affect filename generation? A: Word count limits control the minimum and maximum number of words in generated filenames. This helps maintain descriptive and concise naming conventions.
๐ Star this repo if you find it useful! ๐
Made with โค๏ธ, AI, and a lot of
import os.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file onomatool-0.1.2.tar.gz.
File metadata
- Download URL: onomatool-0.1.2.tar.gz
- Upload date:
- Size: 35.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d049510d3d0ae3e0baaf242f6339e8e0762f3eb30de2eab342bab98266750edf
|
|
| MD5 |
e579e4c84bc280f07c683d9d66e066d2
|
|
| BLAKE2b-256 |
bbcd2065aa63ea0ac9270a9d50dac808b13043329a4bfc4351a65019ee71effb
|
File details
Details for the file onomatool-0.1.2-py3-none-any.whl.
File metadata
- Download URL: onomatool-0.1.2-py3-none-any.whl
- Upload date:
- Size: 29.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
7d05607a7480ecf542982bf5da43e56c3a53ff3e58d2e2153654ae086e8ae52b
|
|
| MD5 |
3a6038edf00b6b6b42de9df780a5b6ee
|
|
| BLAKE2b-256 |
6eb41ee59b4ed9377116d85f4bf2d19fb5eef33ca81ed981a517c974163674d0
|