Token-efficient code analysis for LLMs. 5-layer stack: AST, Call Graph, CFG, DFG, PDG. 95% token savings. 17 languages.
TLDR: Code Analysis for AI Agents
Give LLMs exactly the code they need. Nothing more.
# One-liner: Install, index, search
pip install llm-tldr && tldr warm . && tldr semantic "what you're looking for" .
Your codebase is 100K lines. Claude's context window is 200K tokens. Raw code won't fit—and even if it did, the LLM would drown in irrelevant details.
TLDR extracts structure instead of dumping text. The result: 95% fewer tokens while preserving everything needed to understand and edit code correctly.
pip install llm-tldr
tldr warm . # Index your project
tldr context main --project . # Get LLM-ready summary
How It Works
TLDR builds 5 analysis layers, each answering different questions:
┌─────────────────────────────────────────────────────────────┐
│ Layer 5: Program Dependence → "What affects line 42?" │
│ Layer 4: Data Flow → "Where does this value go?" │
│ Layer 3: Control Flow → "How complex is this?" │
│ Layer 2: Call Graph → "Who calls this function?" │
│ Layer 1: AST → "What functions exist?" │
└─────────────────────────────────────────────────────────────┘
Why layers? Different tasks need different depth:
- Browsing code? Layer 1 (structure) is enough
- Refactoring? Layer 2 (call graph) shows what breaks
- Debugging a null value? Layer 5 (program slice) shows only the relevant lines
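As a concrete illustration of what Layer 1 extracts, Python's standard `ast` module can approximate the same structural pass on a single file (TLDR itself uses tree-sitter, so the equivalent works across all supported languages); a minimal sketch:

```python
import ast

source = '''
def login(username, password):
    """Authenticate a user."""
    return check(username, password)

class Session:
    def refresh(self):
        pass
'''

# Layer 1 view: walk the parse tree and list every function/class definition.
tree = ast.parse(source)
names = [node.name for node in ast.walk(tree)
         if isinstance(node, (ast.FunctionDef, ast.ClassDef))]
print(names)  # ['login', 'Session', 'refresh']
```

The deeper layers build on exactly this tree: call edges from the call sites inside each `FunctionDef`, control flow from its branches and loops, and so on.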
The daemon keeps indexes in memory for 100ms queries instead of 30-second CLI spawns.
Architecture
┌──────────────────────────────────────────────────────────────────┐
│ YOUR CODE │
│ src/*.py, lib/*.ts, pkg/*.go │
└───────────────────────────┬──────────────────────────────────────┘
│ tree-sitter
▼
┌──────────────────────────────────────────────────────────────────┐
│ 5-LAYER ANALYSIS │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ AST │→│ Calls │→│ CFG │→│ DFG │→│ PDG │ │
│ │ L1 │ │ L2 │ │ L3 │ │ L4 │ │ L5 │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
└───────────────────────────┬──────────────────────────────────────┘
│ bge-large-en-v1.5
▼
┌──────────────────────────────────────────────────────────────────┐
│ SEMANTIC INDEX │
│ 1024-dim embeddings in FAISS → "find JWT validation" │
└───────────────────────────┬──────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ DAEMON │
│ In-memory indexes • 100ms queries • Auto-lifecycle │
└──────────────────────────────────────────────────────────────────┘
The Semantic Layer: Search by Behavior
The real power comes from combining all 5 layers into searchable embeddings.
Every function gets indexed with:
- Signature + docstring (L1)
- What it calls + who calls it (L2)
- Complexity metrics (L3)
- Data flow patterns (L4)
- Dependencies (L5)
- First ~10 lines of actual code
This gets encoded into 1024-dimensional vectors using bge-large-en-v1.5. The result: search by what code does, not just what it says.
# "validate JWT" finds verify_access_token() even without that exact text
tldr semantic "validate JWT tokens and check expiration" .
Why this works: traditional text search only matches when a word like "authentication" literally appears in names or comments. Semantic search understands that verify_access_token() performs JWT validation because the call graph and data flow reveal its purpose.
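To see why indexing behavior helps, here is a toy sketch with word-count cosine similarity standing in for the real bge-large-en-v1.5 embeddings (the function name and layer facts are invented for illustration): the query shares no words with the bare signature, but it does overlap once the call-graph and data-flow facts are rendered as text alongside it.

```python
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Cosine similarity over word counts (a crude stand-in for embeddings)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

query = "validate jwt tokens and check expiration"

# Indexing the signature alone: zero overlap with the behavioral query.
bare = "def verify_access_token(token)"

# Indexing the signature plus layer facts rendered as text, as TLDR does:
enriched = ("def verify_access_token(token) "
            "calls jwt decode and check expiration "          # L2: call graph
            "returns claims or raises on expired token")      # L4: data flow

print(cosine(query, bare), cosine(query, enriched))  # 0.0 vs a clearly higher score
```

A real embedding model generalizes much further (matching "validate" against "verify", for instance), but the principle is the same: the indexed text describes what the function does, not just what it is called.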
Setting Up Semantic Search
# Build the semantic index (one-time, ~2 min for typical project)
tldr warm /path/to/project
# Search by behavior
tldr semantic "database connection pooling" .
Embedding dependencies (sentence-transformers, faiss-cpu) are included with pip install llm-tldr. The index is cached in .tldr/cache/semantic.faiss.
Keeping the Index Fresh
The daemon tracks dirty files and auto-rebuilds after 20 changes, but you need to notify it when files change:
# Notify daemon of a changed file
tldr daemon notify src/auth.py --project .
Integration options:
- Git hook (post-commit): git diff --name-only HEAD~1 | xargs -I{} tldr daemon notify {} --project .
- Editor hook (on save): tldr daemon notify "$FILE" --project .
- Manual rebuild (when needed): tldr warm . # Full rebuild
The daemon auto-rebuilds semantic embeddings in the background once the dirty threshold (default: 20 files) is reached.
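The threshold behavior is easy to picture. A hypothetical sketch of the bookkeeping (illustrative only, not the daemon's actual code; the class and method names are invented):

```python
class DirtyTracker:
    """Illustrative model of dirty-file bookkeeping with a rebuild threshold."""

    def __init__(self, threshold: int = 20):
        self.threshold = threshold
        self.dirty: set[str] = set()
        self.rebuilds = 0

    def notify(self, path: str) -> bool:
        """Record a changed file; return True when a rebuild was triggered."""
        self.dirty.add(path)  # a set, so repeated saves of one file count once
        if len(self.dirty) >= self.threshold:
            self.rebuild()
            return True
        return False

    def rebuild(self):
        # The real daemon re-embeds the dirty files in the background here.
        self.rebuilds += 1
        self.dirty.clear()

tracker = DirtyTracker(threshold=3)
for f in ["a.py", "b.py", "b.py", "c.py"]:
    tracker.notify(f)
print(tracker.rebuilds)  # 1
```

This is why notifying on every save is cheap: nothing expensive happens until the set of distinct dirty files crosses the threshold.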
The Workflow
Before Reading Code
tldr tree src/ # See file structure
tldr structure src/ --lang python # See functions/classes
Before Editing
tldr extract src/auth.py # Full file analysis
tldr context login --project . # LLM-ready summary (95% savings)
Before Refactoring
tldr impact login . # Who calls this? (reverse call graph)
tldr change-impact # Which tests need to run?
Debugging
tldr slice src/auth.py login 42 # What affects line 42?
tldr dfg src/auth.py login # Trace data flow
Finding Code by Behavior
tldr semantic "validate JWT tokens" . # Natural language search
Quick Setup
1. Install
pip install llm-tldr
2. Index Your Project
tldr warm /path/to/project
This builds all analysis layers and starts the daemon. Takes 30-60 seconds for a typical project, then queries are instant.
3. Start Using
tldr context main --project . # Get context for a function
tldr impact helper_func . # See who calls it
tldr semantic "error handling" # Find by behavior
Real Example: Why This Matters
Scenario: Debug why user is null on line 42.
Without TLDR:
- Read the 150-line function
- Trace every variable manually
- Miss the bug because it's hidden in control flow
With TLDR:
tldr slice src/auth.py login 42
Output: Only 6 lines that affect line 42:
3: user = db.get_user(username)
7: if user is None:
12: raise NotFound
28: token = create_token(user) # ← BUG: skipped null check
35: session.token = token
42: return session
The bug is obvious. Line 28 uses user without going through the null check path.
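Under the hood, a program slice is a backward reachability walk over the dependence graph. A toy sketch with invented edges that roughly mirror the example above (a real PDG also distinguishes data from control dependence):

```python
# Toy dependence graph: line -> lines it depends on. Edges are invented
# to mirror the example; 28 -> 7 models a control dependence on the check.
deps = {
    42: [35],       # return session         <- session.token = token
    35: [28],       # session.token = token  <- token = create_token(user)
    28: [3, 7],     # create_token(user)     <- user = ..., guarded by the check
    12: [7],        # raise NotFound         <- if user is None
    7:  [3],        # if user is None        <- user = db.get_user(username)
    3:  [],         # user = db.get_user(username)
}

def backward_slice(graph, line):
    """Collect every line that can affect the given line (transitive closure)."""
    seen, stack = set(), [line]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(graph.get(n, []))
    return sorted(seen)

print(backward_slice(deps, 42))  # [3, 7, 28, 35, 42]
```

Everything outside that set is irrelevant to the value at line 42, which is exactly why the slice can discard most of a 150-line function.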
Command Reference
Exploration
| Command | What It Does |
|---|---|
| `tldr tree [path]` | File tree |
| `tldr structure [path] --lang <lang>` | Functions, classes, methods |
| `tldr search <pattern> [path]` | Text pattern search |
| `tldr extract <file>` | Full file analysis |
Analysis
| Command | What It Does |
|---|---|
| `tldr context <func> --project <path>` | LLM-ready summary (95% savings) |
| `tldr cfg <file> <function>` | Control flow graph |
| `tldr dfg <file> <function>` | Data flow graph |
| `tldr slice <file> <func> <line>` | Program slice |
Cross-File
| Command | What It Does |
|---|---|
| `tldr calls [path]` | Build call graph |
| `tldr impact <func> [path]` | Find all callers (reverse call graph) |
| `tldr dead [path]` | Find unreachable code |
| `tldr arch [path]` | Detect architecture layers |
| `tldr imports <file>` | Parse imports |
| `tldr importers <module> [path]` | Find files that import a module |
Semantic
| Command | What It Does |
|---|---|
| `tldr warm <path>` | Build all indexes (including embeddings) |
| `tldr semantic <query> [path]` | Natural language code search |
Diagnostics
| Command | What It Does |
|---|---|
| `tldr diagnostics <file>` | Type check + lint |
| `tldr change-impact [files]` | Find tests affected by changes |
| `tldr doctor` | Check/install diagnostic tools |
Daemon
| Command | What It Does |
|---|---|
| `tldr daemon start` | Start background daemon |
| `tldr daemon stop` | Stop daemon |
| `tldr daemon status` | Check status |
Supported Languages
Python, TypeScript, JavaScript, Go, Rust, Java, C, C++, Ruby, PHP, C#, Kotlin, Scala, Swift, Lua, Elixir
Language is auto-detected; override with --lang.
MCP Integration
For AI tools (Claude Desktop, Claude Code):
Claude Desktop - Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"tldr": {
"command": "tldr-mcp",
"args": ["--project", "/path/to/your/project"]
}
}
}
Claude Code - Add to .claude/settings.json:
{
"mcpServers": {
"tldr": {
"command": "tldr-mcp",
"args": ["--project", "."]
}
}
}
Configuration
.tldrignore - Exclude Files
TLDR respects .tldrignore (gitignore syntax) for all commands including tree, structure, search, calls, and semantic indexing:
# Auto-create with sensible defaults
tldr warm . # Creates .tldrignore if missing
Default exclusions:
- Dependency and build directories: `node_modules/`, `.venv/`, `__pycache__/`, `dist/`, `build/`, `*.egg-info/`
- Binary files: `*.so`, `*.dll`, `*.whl`
- Security files: `.env`, `*.pem`, `*.key`
Customize by editing .tldrignore:
# Add your patterns
large_test_fixtures/
vendor/
data/*.csv
CLI Flags:
# Add patterns from command line (can be repeated)
tldr --ignore "packages/old/" --ignore "*.generated.ts" tree .
# Bypass all ignore patterns
tldr --no-ignore tree .
Settings - Daemon Behavior
Create .tldr/config.json for daemon settings:
{
"semantic": {
"enabled": true,
"auto_reindex_threshold": 20
}
}
| Setting | Default | Description |
|---|---|---|
| `enabled` | `true` | Enable semantic search |
| `auto_reindex_threshold` | `20` | Files changed before auto-rebuild |
Monorepo Support
For monorepos, create .claude/workspace.json to scope indexing:
{
"active_packages": ["packages/core", "packages/api"],
"exclude_patterns": ["**/fixtures/**"]
}
Performance
| Metric | Raw Code | TLDR | Improvement |
|---|---|---|---|
| Tokens for function context | 21,000 | 175 | 99% savings |
| Tokens for codebase overview | 104,000 | 12,000 | 89% savings |
| Query latency (daemon) | 30s | 100ms | 300x faster |
Deep Dive
For the full architecture explanation, benchmarks, and advanced workflows:
License
Apache 2.0 - See LICENSE file.