
A Python tool that extracts function signatures from large codebases and generates concise summaries for LLM context preparation


TLDR - Function Signature Extractor

TLDR is a Python tool that extracts function signatures from large codebases and generates concise summaries. It is particularly useful for giving Large Language Models (LLMs) context on codebases that are too large to fit in their context window.

Features

  • Multi-language Support: Supports 40+ programming languages via Pygments lexer integration
  • Signature Extraction: Extracts function, class, and method signatures from code files
  • JSON Output: Produces structured JSON output for easy integration with other tools
  • Recursive Processing: Can process entire directory trees recursively
  • Atomic File Writing: Ensures data integrity with atomic file operations
  • (Optional) AI-Powered File Summaries: Generates file summaries using LLM providers (Claude, OpenAI, Grok)

Supported Languages

JavaScript/TypeScript, Python, Java, C/C++, C#, PHP, Ruby, Go, Rust, Swift, Scala, Kotlin, and many more.
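To illustrate how language coverage can follow from Pygments, here is a minimal sketch that picks a lexer from the file name and keeps the lines containing a function or class name token. The function name `extract_signature_lines` and the line-based heuristic are assumptions made for illustration; this is not the project's actual `SignatureExtractor`, and token types vary between lexers, so results for other languages may differ.

```python
from pygments.lexers import get_lexer_for_filename
from pygments.token import Token


def extract_signature_lines(filename: str, source: str) -> list[str]:
    """Return source lines that contain a function or class name token.

    Hypothetical helper for illustration only.
    """
    # Pygments chooses the lexer from the file name (e.g. "*.py").
    lexer = get_lexer_for_filename(filename)
    signatures: list[str] = []
    current_line = ""
    has_definition = False

    for token_type, value in lexer.get_tokens(source):
        # Subtypes such as Name.Function.Magic (__init__) also match here.
        if token_type in Token.Name.Function or token_type in Token.Name.Class:
            has_definition = True
        # A token value can span several lines, so rebuild lines piece by piece.
        for index, piece in enumerate(value.split("\n")):
            if index > 0:  # a newline was crossed: the current line is complete
                if has_definition:
                    signatures.append(current_line.strip().rstrip(":{").rstrip())
                current_line = ""
                has_definition = False
            current_line += piece

    # Flush a final line that did not end with a newline.
    if has_definition and current_line.strip():
        signatures.append(current_line.strip().rstrip(":{").rstrip())
    return signatures


sample = (
    "class MyClass(BaseClass):\n"
    "    def __init__(self, param1, param2):\n"
    "        self.value = param1\n"
    "\n"
    "    def process_data(self, data):\n"
    "        return data\n"
)
print(extract_signature_lines("example.py", sample))
```

This sketch only captures the first line of a signature that spans several lines; a fuller extractor would need to track brackets or use language-specific rules.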

Installation

# Clone the repository
git clone <repository-url>
cd tldr

# Install dependencies
pip install -r requirements.txt

Configuration

The optional AI-generated file summaries use one of the supported LLM providers (Claude, OpenAI, Grok). Set the API key for the provider you plan to use:

# For Claude (Anthropic)
export ANTHROPIC_API_KEY='your-api-key-here'

# For OpenAI
export OPENAI_API_KEY='your-api-key-here'

# For Grok
export GROK_API_KEY='your-api-key-here'
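Before running with `--llm`, it can help to check which keys are visible to the process. The sketch below uses the environment variable names listed above; the mapping constant and the function `configured_providers` are hypothetical names for illustration, not part of TLDR.

```python
import os

# Provider names (as accepted by --llm) mapped to the documented variables.
PROVIDER_ENV_VARS = {
    "claude": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "grok": "GROK_API_KEY",
}


def configured_providers(environ=None) -> list[str]:
    """Return the providers whose API key is set to a non-empty value."""
    env = os.environ if environ is None else environ
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]


print("Providers with an API key set:", configured_providers())
```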

Usage

Basic Usage

# Process a local directory tree to generate a tldr summary file
python tldr_code.py .

# Process a GitHub repository and create a tldr summary file
python tldr_code.py https://github.com/PowerShell/PowerShell

# Process a GitHub repository and store the downloaded files and the tldr summary file in a specific directory
python tldr_code.py https://github.com/PowerShell/PowerShell /Users/csimoes/repos/PowerShell

Command Line Options

  • directory_path: Path to the directory to scan, or a GitHub repository URL as in the Basic Usage examples (required)
  • output_filename: Output filename (optional, defaults to tldr.json)
  • -r, --recursive: Process directories recursively
  • --llm {claude,openai,grok}: LLM provider for generating summaries
  • --skip-file-summary: Skip AI-generated summaries
  • --setup-llm: Show LLM setup instructions
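For readers who want to see how the options above fit together, the following is a sketch of an equivalent `argparse` declaration. It is reconstructed from the option list only and is an assumption about the interface, not the parser that `tldr_code.py` actually uses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="tldr_code.py")
    parser.add_argument("directory_path", help="directory to scan")
    # Optional positional argument with the documented default.
    parser.add_argument("output_filename", nargs="?", default="tldr.json")
    parser.add_argument("-r", "--recursive", action="store_true")
    parser.add_argument("--llm", choices=["claude", "openai", "grok"])
    parser.add_argument("--skip-file-summary", action="store_true")
    parser.add_argument("--setup-llm", action="store_true")
    return parser


args = build_parser().parse_args([".", "-r", "--llm", "claude"])
print(args.directory_path, args.output_filename, args.recursive, args.llm)
```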

Output Format

TLDR generates JSON files with the following structure:

{
  "directory_path": "/path/to/directory",
  "last_updated": "2025-06-16T10:30:00Z",
  "files": [
    {
      "file_path": "/path/to/file.py",
      "last_scanned": "2025-06-16T10:30:00Z",
      "signatures": [
        "class MyClass(BaseClass)",
        "def __init__(self, param1, param2)",
        "def process_data(self, data: List[str]) -> Dict[str, Any]"
      ],
      "summary": "This file implements data processing functionality..."
    }
  ]
}
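As an example of consuming this structure, the sketch below turns the `files` array into a compact text block that could be pasted into an LLM prompt. The function name `build_llm_context` is hypothetical, and the code assumes `summary` may be missing (for instance when summaries are skipped).

```python
import json


def build_llm_context(tldr: dict) -> str:
    """Render file paths, summaries, and signatures as plain text."""
    sections = []
    for entry in tldr.get("files", []):
        lines = [f"# {entry['file_path']}"]
        summary = entry.get("summary")  # assumed optional
        if summary:
            lines.append(summary)
        lines.extend(entry.get("signatures", []))
        sections.append("\n".join(lines))
    return "\n\n".join(sections)


example = json.loads("""
{
  "directory_path": "/path/to/directory",
  "files": [
    {
      "file_path": "/path/to/file.py",
      "signatures": ["class MyClass(BaseClass)", "def __init__(self, param1, param2)"],
      "summary": "This file implements data processing functionality..."
    }
  ]
}
""")
print(build_llm_context(example))
# # /path/to/file.py
# This file implements data processing functionality...
# class MyClass(BaseClass)
# def __init__(self, param1, param2)
```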

For recursive processing, the output includes multiple directories:

{
  "root_directory": "/path/to/project",
  "last_updated": "2025-06-16T10:30:00Z",
  "total_directories_processed": 5,
  "directories": [...]
}
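The contents of `directories` are elided above. Assuming each entry follows the single-directory structure (a `files` array, possibly with nested `directories`), a small generator can walk both output shapes. The function name `iter_file_entries` and this nesting assumption are illustrative, not confirmed by the project.

```python
def iter_file_entries(node: dict):
    """Yield every file entry from single-directory or recursive output."""
    # Files listed directly on this node (single-directory shape).
    yield from node.get("files", [])
    # Assumed shape: each directory entry looks like a single-directory result.
    for child in node.get("directories", []):
        yield from iter_file_entries(child)


recursive_output = {
    "root_directory": "/path/to/project",
    "total_directories_processed": 2,
    "directories": [
        {"directory_path": "/path/to/project", "files": [{"file_path": "a.py"}]},
        {"directory_path": "/path/to/project/pkg", "files": [{"file_path": "pkg/b.py"}]},
    ],
}
print([entry["file_path"] for entry in iter_file_entries(recursive_output)])
```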

Use Cases

  1. LLM Context Preparation: Quickly generate summaries of large codebases for LLM analysis
  2. Code Documentation: Automatically extract API signatures for documentation
  3. Codebase Analysis: Get an overview of code structure and functionality
  4. Code Review Assistance: Understand code changes and their impact

Architecture

  • TLDRFileCreator: Main orchestrator class
  • SignatureExtractor: Extracts signatures using Pygments lexers
  • LLM Providers: Pluggable AI providers for generating summaries
  • Atomic File Operations: Ensures data integrity during file writes
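The atomic file operations mentioned above are commonly implemented by writing to a temporary file in the target directory and then renaming it over the destination. The sketch below shows that general pattern with the standard library; the function name `write_json_atomically` is hypothetical and this is not necessarily how TLDR implements it.

```python
import json
import os
import tempfile


def write_json_atomically(path: str, data: dict) -> None:
    """Write JSON so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target.
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(temp_path, path)  # atomic rename over the destination
    except BaseException:
        # Clean up the partial temporary file, then re-raise the error.
        if os.path.exists(temp_path):
            os.unlink(temp_path)
        raise
```

Calling `write_json_atomically("tldr.json", {"files": []})` replaces any existing `tldr.json` in one step, so an interrupted run does not leave a truncated file behind.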

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

See LICENSE file for details.
