Repository maps for LLMs
RepoScape
RepoScape is a Python library for mapping and analyzing repository structure, with a focus on code dependencies and relative importance. It parses code files, builds a graph of symbols and the references between them, and ranks components with reference-based or PageRank scoring.
Installation
pip install reposcape
Requires Python 3.12 or higher.
Quick Start
from reposcape import RepoMapper, DetailLevel
# Create mapper with default settings
mapper = RepoMapper()
# Generate overview of entire repository
overview = mapper.create_overview(
    repo_path="path/to/repo",
    detail=DetailLevel.SIGNATURES,
    token_limit=2000,  # Optional token limit for output
)
# Generate focused view of specific files
focused = mapper.create_focused_view(
    files=["main.py", "utils.py"],
    repo_path="path/to/repo",
    detail=DetailLevel.DOCSTRINGS,
)
Core Components
RepoMapper
The main entry point for repository analysis. Configurable with custom analyzers, scorers, and serializers.
class RepoMapper:
    def __init__(
        self,
        *,
        analyzers: Sequence[CodeAnalyzer] | None = None,
        scorer: GraphScorer | None = None,
        serializer: CodeSerializer | None = None,
    ): ...
    def create_overview(
        self,
        repo_path: str | PathLike[str],
        *,
        token_limit: int | None = None,
        detail: DetailLevel = DetailLevel.SIGNATURES,
        exclude_patterns: list[str] | None = None,
    ) -> str: ...
    def create_focused_view(
        self,
        files: Sequence[str | PathLike[str]],
        repo_path: str | PathLike[str],
        *,
        token_limit: int | None = None,
        detail: DetailLevel = DetailLevel.SIGNATURES,
        exclude_patterns: list[str] | None = None,
    ) -> str: ...
Detail Levels
Control how much information is included in the output:
class DetailLevel(Enum):
    STRUCTURE   # Just names and hierarchy
    SIGNATURES  # Include function/class signatures
    DOCSTRINGS  # Include signatures + docstrings
    FULL_CODE   # Include complete implementations
Code Analysis
RepoScape includes analyzers for different file types:
PythonAstAnalyzer
Analyzes Python files using AST parsing:
- Extracts classes, functions, methods, variables
- Tracks references between symbols
- Collects docstrings and signatures
analyzer = PythonAstAnalyzer()
nodes = analyzer.analyze_file("main.py")
TextAnalyzer
Basic analyzer for text files:
- Handles .txt, .md, .rst files
- Extracts sections from markdown files
- Preserves file content and first paragraph as docstring
Importance Scoring
RepoScape offers different algorithms for calculating code importance:
ReferenceScorer
Simple reference-based scoring that considers:
- Number of incoming references (highest weight)
- Number of outgoing references (medium weight)
- Being referenced by important files (high boost)
- Distance from important files (decreasing boost)
from reposcape.importance import ReferenceScorer
scorer = ReferenceScorer(
    ref_weight=1.0,
    outref_weight=0.5,
    important_ref_boost=2.0,
    distance_decay=0.5,
)
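To make the four factors concrete, here is one way they could combine, using the same parameter names as the constructor above. The formula is an assumption for illustration: this README does not state how ReferenceScorer weighs distance, so the sketch applies the full boost to nodes referenced directly by an important file and multiplies it by distance_decay for each further hop. The function name reference_scores and the edge-dict input are hypothetical.

```python
from collections import deque


def reference_scores(
    edges: dict[str, set[str]],
    important: tuple[str, ...] = (),
    *,
    ref_weight: float = 1.0,
    outref_weight: float = 0.5,
    important_ref_boost: float = 2.0,
    distance_decay: float = 0.5,
) -> dict[str, float]:
    """Score nodes of a reference graph given as {source: {targets}}."""
    nodes = set(edges) | {t for targets in edges.values() for t in targets}

    # Count incoming references per node.
    incoming = {n: 0 for n in nodes}
    for targets in edges.values():
        for target in targets:
            incoming[target] += 1

    # Base score: weighted incoming plus weighted outgoing references.
    scores = {
        n: ref_weight * incoming[n] + outref_weight * len(edges.get(n, ()))
        for n in nodes
    }

    # Breadth-first search outward from the important nodes to get hop counts.
    dist = {n: 0 for n in important if n in nodes}
    queue = deque(dist)
    while queue:
        current = queue.popleft()
        for target in edges.get(current, ()):
            if target not in dist:
                dist[target] = dist[current] + 1
                queue.append(target)

    # Assumed rule: full boost at distance 1, decayed once per extra hop.
    for node, hops in dist.items():
        if hops > 0:
            scores[node] += important_ref_boost * distance_decay ** (hops - 1)
    return scores
```

With a chain main.py → core.py → utils.py and main.py marked important, core.py receives the full boost and utils.py half of it.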
PageRankScorer
Uses the PageRank algorithm to score nodes based on the graph structure:
- Considers connection patterns
- Handles cycles in dependencies
- Supports personalization for focused analysis
from reposcape.importance import PageRankScorer
scorer = PageRankScorer()
Output Serialization
Multiple serializers are available for different output formats:
MarkdownSerializer
Generates detailed markdown with:
- Hierarchical structure using headers
- Code blocks for signatures/implementations
- Emojis for different node types
- Optional details based on importance scores
CompactSerializer
Produces a compact, indented format:
- Single line per node
- Indentation shows hierarchy
- Abbreviated signatures
- Good for quick overviews
TreeSerializer
ASCII tree-style output:
- Uses box-drawing characters
- Shows clear parent-child relationships
- Similar to the output of the tree command
Example usage:
from reposcape import RepoMapper
from reposcape.serializers import MarkdownSerializer, CompactSerializer, TreeSerializer
# Create mapper with specific serializer
mapper = RepoMapper(serializer=TreeSerializer())
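The tree-style layout can be illustrated with a short recursive function. This is only a sketch of the box-drawing technique, assuming a nested dict as input; render_tree is a hypothetical name and TreeSerializer's real output may differ in detail.

```python
def render_tree(tree: dict[str, dict], prefix: str = "") -> list[str]:
    """Render a nested {name: children} dict as tree-style lines."""
    lines: list[str] = []
    items = list(tree.items())
    for index, (name, children) in enumerate(items):
        last = index == len(items) - 1
        # The last sibling gets a corner, the others a tee.
        lines.append(prefix + ("└── " if last else "├── ") + name)
        if children:
            # Keep a vertical bar running while more siblings follow.
            lines.extend(render_tree(children, prefix + ("    " if last else "│   ")))
    return lines


print("\n".join(render_tree({"src": {"core.py": {}, "utils.py": {}}, "README.md": {}})))
```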
Advanced Usage
Custom Analyzers
Implement CodeAnalyzer for custom file analysis:
from os import PathLike, fspath
class CustomAnalyzer(CodeAnalyzer):
    def can_handle(self, path: str | PathLike[str]) -> bool:
        # fspath() turns both str and PathLike into str, so endswith() is safe
        return fspath(path).endswith(".custom")
    def analyze_file(
        self,
        path: str | PathLike[str],
        content: str | None = None,
    ) -> list[CodeNode]: ...
Focused Analysis
Analyze specific files and their relationships:
mapper = RepoMapper()
# Focus on specific files
focused_view = mapper.create_focused_view(
    files=["src/core.py", "src/utils.py"],
    repo_path=".",
    detail=DetailLevel.DOCSTRINGS,
    exclude_patterns=["**/test_*.py", "**/__pycache__/*"],
)
Token Limits
Control output size for large repositories:
# Limit output to approximately 2000 tokens
overview = mapper.create_overview(
    repo_path=".",
    token_limit=2000,
    detail=DetailLevel.SIGNATURES,
)
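One plausible way to honor a token limit is to keep the most important entries first and stop adding entries once the budget is spent. The sketch below shows that greedy idea only; it is an assumption, not a description of how RepoScape trims its output. The names select_within_budget and estimate_tokens are hypothetical, and the four-characters-per-token estimate is a crude stand-in for a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about four characters per token (assumed)."""
    return max(1, len(text) // 4)


def select_within_budget(
    items: list[tuple[float, str]],
    token_limit: int,
) -> tuple[list[str], int]:
    """Greedily keep the highest-importance texts that fit the budget.

    items holds (importance, text) pairs; returns the kept texts in
    descending importance order and the number of tokens used.
    """
    chosen: list[str] = []
    used = 0
    for _importance, text in sorted(items, key=lambda item: item[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost > token_limit:
            # Too large for the remaining budget; a smaller entry may still fit.
            continue
        chosen.append(text)
        used += cost
    return chosen, used
```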
Models
CodeNode
Immutable representation of code elements:
from __future__ import annotations  # lets CodeNode refer to itself in annotations
from collections.abc import Mapping, Sequence
from dataclasses import dataclass
@dataclass(frozen=True)
class CodeNode:
    name: str
    node_type: NodeType
    path: str
    content: str | None = None
    docstring: str | None = None
    signature: str | None = None
    children: Mapping[str, CodeNode] | None = None
    references_to: Sequence[Reference] | None = None
    referenced_by: Sequence[Reference] | None = None
    importance: float = 0.0
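Because the dataclass is frozen, fields such as importance cannot be assigned after construction; an updated copy has to be created instead. The sketch below shows this with a reduced stand-in class (Node is hypothetical and carries only two of the fields above) and the standard dataclasses.replace function.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Node:
    """Reduced stand-in for CodeNode, for illustration only."""
    name: str
    importance: float = 0.0


node = Node("main.py")

# Direct assignment on a frozen dataclass raises FrozenInstanceError.
try:
    node.importance = 1.0
except FrozenInstanceError:
    mutated = False
else:
    mutated = True

# replace() builds a new instance with the given fields changed.
scored = replace(node, importance=0.75)
```

The original node is left unchanged, which makes it safe to share nodes between different views of the same graph.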
NodeType
Available node types:
- DIRECTORY
- FILE
- CLASS
- FUNCTION
- METHOD
- VARIABLE
Reference
Tracks symbol references:
@dataclass(frozen=True)
class Reference:
    name: str
    path: str
    line: int
    column: int