repominify

A Python package that optimizes codebase representations for Large Language Models (LLMs) by generating compact, context-rich summaries that minimize token usage while preserving essential structural information.

Overview

repominify helps you give LLMs detailed context about your codebase without consuming excessive space in their context windows. It processes Repomix output into an optimized representation that keeps critical structural information while significantly reducing token usage, so more useful context fits within a model's token limit.

⚠️ Warning: repominify performs deep code analysis, which can be resource-intensive for large codebases. Start with a small subset of your code to understand the process and its resource requirements.

Features

  • Automatic Dependency Management

    • Checks and installs Node.js and npm dependencies
    • Automatically installs Repomix if not present
    • Handles version compatibility checks
  • Code Analysis

    • Parses and analyzes code structure
    • Extracts imports, classes, and functions
    • Captures function signatures and docstrings
    • Identifies and extracts constants and environment variables
    • Builds dependency graphs
    • Performance-optimized for large codebases
  • Multiple Output Formats

    • GraphML for visualization tools
    • JSON for web-based tools
    • YAML for statistics
    • Text for human-readable analysis
  • Rich Code Context

    • Complete function/method signatures
    • Full docstrings with parameter descriptions
    • Constants and their values
    • Environment variables and configurations
    • Module-level documentation
    • Import relationships
    • Class hierarchies and dependencies
  • Size Optimization

    • Generates minified code structure representation
    • Provides detailed size reduction statistics
    • Shows character and token reduction percentages
    • Maintains semantic meaning while reducing size
  • Security Awareness

    • Detects potentially sensitive patterns
    • Provides security recommendations
    • Flags suspicious file content
    • Helps maintain security best practices
  • Debug Support

    • Comprehensive logging
    • Performance tracking
    • Detailed error messages
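
The extraction steps listed under Code Analysis (imports, classes, function signatures, docstrings) can be illustrated with Python's standard `ast` module. This is a minimal sketch under stated assumptions, not repominify's actual implementation: the `summarize` function, the `SOURCE` sample, and the shape of the returned dictionary are hypothetical names chosen for illustration.

```python
import ast

# Hypothetical sample input; any Python source string works here.
SOURCE = '''
import os
from pathlib import Path

MAX_SIZE = 10

class Loader:
    """Load files."""

    def read(self, path: str) -> str:
        """Return the file contents."""
        return Path(path).read_text()

def main(argv: list) -> int:
    """Entry point."""
    return 0
'''


def summarize(source: str) -> dict:
    """Collect imports, class names, and function signatures with docstrings."""
    tree = ast.parse(source)
    summary = {"imports": [], "classes": [], "functions": []}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import a, b" yields one alias per imported module
            summary["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"
            summary["imports"].append(node.module or "")
        elif isinstance(node, ast.ClassDef):
            summary["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Positional parameter names only; annotations are omitted for brevity
            args = ", ".join(a.arg for a in node.args.args)
            summary["functions"].append({
                "signature": f"{node.name}({args})",
                "docstring": ast.get_docstring(node) or "",
            })
    return summary


summary = summarize(SOURCE)
for function in summary["functions"]:
    print(function["signature"], "-", function["docstring"])
```

Keeping only names, signatures, and docstrings is what makes such a summary much smaller than the source while still telling a reader (or a model) how the pieces fit together.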

Installation

pip install repominify

Requirements

  • Python 3.7 or higher
  • Node.js 12+ (checked at runtime)
  • npm 6+ (checked at runtime)
  • Repomix (will be installed automatically if not present)
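
If you want to verify the Node.js and npm requirements yourself before running repominify, a check along the following lines may help. It is a sketch that does not use repominify's own `ensure_dependencies`; the helper names `parse_version` and `tool_meets_minimum` are hypothetical. It relies only on the standard `--version` flag that both `node` and `npm` support.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract the first dotted version number from text such as 'v18.17.0'."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def tool_meets_minimum(tool: str, minimum_major: int) -> bool:
    """Return True if `tool --version` reports at least the given major version."""
    executable = shutil.which(tool)
    if executable is None:
        return False  # the tool is not on PATH
    try:
        result = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return False  # the tool could not be run or timed out
    version = parse_version(result.stdout)
    return version is not None and version[0] >= minimum_major
```

For the minimums listed above, you would call `tool_meets_minimum("node", 12)` and `tool_meets_minimum("npm", 6)`.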

Usage

Command Line

# Basic usage
repominify path/to/repomix-output.txt

# Specify output directory
repominify path/to/repomix-output.txt -o output_dir

# Enable debug logging
repominify path/to/repomix-output.txt --debug

Python API

from repominify import CodeGraphBuilder, ensure_dependencies, configure_logging

# Enable debug logging (optional)
configure_logging(debug=True)

# Check dependencies
if ensure_dependencies():
    # Create graph builder
    builder = CodeGraphBuilder()
    
    # Parse the Repomix output file
    file_entries = builder.parser.parse_file("repomix-output.txt")
    
    # Build the graph
    graph = builder.build_graph(file_entries)
    
    # Save outputs and get comparison
    text_content, comparison = builder.save_graph(
        "output_directory",
        input_file="repomix-output.txt"
    )
    
    # Print comparison
    print(comparison)

Example Output

Analysis Complete!
📊 File Stats:
────────────────
  Total Files: 29
  Total Chars: 143,887
 Total Tokens: 14,752
       Output: input.txt
     Security: ✔ No suspicious files detected

📊 File Stats:
────────────────
  Total Files: 29
  Total Chars: 26,254
 Total Tokens: 3,254
       Output: code_graph.txt
     Security: ✔ No suspicious files detected

📈 Comparison:
────────────────
 Char Reduction: 81.8%
Token Reduction: 77.9%
Security Notes: ✔ No issues found
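
The reduction percentages in the comparison follow directly from the two sets of file stats. The sketch below reproduces the arithmetic for the example figures; `reduction_percent` is a hypothetical helper written for illustration, not part of the repominify API.

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage by which `after` is smaller than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Figures taken from the example output: input.txt vs. code_graph.txt
char_reduction = reduction_percent(143_887, 26_254)
token_reduction = reduction_percent(14_752, 3_254)

print(f"Char Reduction: {char_reduction:.1f}%")
print(f"Token Reduction: {token_reduction:.1f}%")
```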

Output Files

When you run repominify, it generates several files in your output directory:

  • code_graph.graphml: Graph representation in GraphML format
  • code_graph.json: Graph data in JSON format for web visualization
  • graph_statistics.yaml: Statistical analysis of the codebase
  • code_graph.txt: Human-readable text representation including:
    • Module structure and dependencies
    • Function signatures and docstrings
    • Class definitions and hierarchies
    • Constants and their values
    • Environment variables
    • Import relationships
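
To give a feel for what a JSON graph of import relationships can look like, here is a small standard-library sketch that serializes a dependency mapping in a node-link layout, which many web visualization tools accept. The actual schema of code_graph.json is not documented in this README, so the layout, the `IMPORTS` edges, and the `to_node_link` and `dependents` helpers are all assumptions made for illustration.

```python
import json

# Hypothetical module -> imported-modules mapping (edges are invented for the example)
IMPORTS = {
    "cli": ["graph", "logging"],
    "graph": ["parser", "types"],
    "parser": ["types"],
    "types": [],
    "logging": [],
}


def to_node_link(imports: dict) -> dict:
    """Convert an adjacency mapping into a directed node-link structure."""
    nodes = [{"id": name} for name in sorted(imports)]
    links = [
        {"source": source, "target": target}
        for source in sorted(imports)
        for target in imports[source]
    ]
    return {"directed": True, "nodes": nodes, "links": links}


def dependents(imports: dict, target: str) -> list:
    """List the modules that import `target` directly."""
    return sorted(source for source, deps in imports.items() if target in deps)


graph = to_node_link(IMPORTS)
text = json.dumps(graph, indent=2)
print(text)
print("Modules importing 'types':", dependents(IMPORTS, "types"))
```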

Project Structure

repominify/
├── repominify/         # Source code
│   ├── graph.py        # Graph building and analysis
│   ├── parser.py       # Repomix file parsing
│   ├── types.py        # Core types and data structures
│   ├── exporters.py    # Graph export functionality
│   ├── formatters.py   # Text representation formatting
│   ├── dependencies.py # Dependency management
│   ├── logging.py      # Logging configuration
│   ├── stats.py        # Statistics and comparison
│   ├── constants.py    # Shared constants
│   ├── exceptions.py   # Custom exceptions
│   ├── cli.py          # Command-line interface
│   └── __init__.py     # Package initialization
├── tests/              # Test suite
│   ├── test_end2end.py # End-to-end tests
│   └── data/           # Test data files
├── setup.py            # Package configuration
├── LICENSE             # MIT License
└── README.md           # This file

Code Style

The project follows these coding standards for consistency and maintainability:

  • Comprehensive docstrings with Examples sections for all public APIs
  • Type hints for all functions, methods, and class attributes
  • Custom exceptions for proper error handling and reporting
  • Clear separation of concerns between modules
  • Consistent code formatting and naming conventions
  • Detailed logging with configurable debug support

Development

To set up for development:

# Clone the repository
git clone https://github.com/mikewcasale/repominify.git
cd repominify

# Install in development mode with test dependencies
pip install -e '.[dev]'

# Run tests
pytest tests/

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. By contributing to this project, you agree to abide by its terms.

Please ensure your code follows the project's coding standards, including proper docstrings, type hints, and error handling.

Authors

Mike Casale

License

MIT License - see the LICENSE file for details.

Acknowledgments

This project makes use of or was influenced by several excellent open source projects:

  • Repomix - Our analysis pipeline integrates with this Node.js tool for initial code scanning
  • NetworkX - Core graph algorithms and data structures
  • PyYAML - YAML file handling
  • GraphRAG Accelerator - Graph-based code analysis patterns and implementation concepts

How to Get Help
