Code archaeologist - reconstruct function decision history via AST-aware lineage tracking
Archeologist - Semantic Lineage Graph Generator
Post-incident archaeology tool that reconstructs function decision history via deterministic AST-aware lineage tracking.
CLI Usage
# Install
pip install -e .
# Analyze file (auto-detects git repo)
arc analyze path/to/file.py
# Analyze specific function
arc analyze-function path/to/file.py function_name
# With GitHub PR integration
arc analyze-function path/to/file.py function_name --repo owner/repo
# With LLM narrative synthesis
export CLAUDE_API_KEY=sk-ant-xxx
arc analyze-function path/to/file.py function_name
Configuration
Set environment variables:
export GITHUB_TOKEN=ghp_xxx
export CLAUDE_API_KEY=sk-ant-xxx
export GIT_REPO_PATH=/path/to/local/repo
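As an illustration of how these three variables might be consumed, here is a minimal sketch. It is not the tool's actual loader: the `Settings` class and `load_settings` function are hypothetical names, and treating each variable as optional is an assumption based on the CLI usage above, where PR integration and LLM synthesis are add-ons.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three documented variables."""
    github_token: Optional[str]    # GITHUB_TOKEN
    claude_api_key: Optional[str]  # CLAUDE_API_KEY
    git_repo_path: Optional[str]   # GIT_REPO_PATH


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables from a mapping, treating empty strings as unset."""
    def get(name: str) -> Optional[str]:
        value = env.get(name, "").strip()
        return value or None

    return Settings(
        github_token=get("GITHUB_TOKEN"),
        claude_api_key=get("CLAUDE_API_KEY"),
        git_repo_path=get("GIT_REPO_PATH"),
    )


# Passing an explicit mapping keeps the example independent of the real environment.
settings = load_settings({"GIT_REPO_PATH": "/path/to/local/repo"})
```

Accepting a mapping instead of reading `os.environ` directly makes the loader easy to exercise in tests without touching real credentials.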
Architecture
Three-phase pipeline:
- Semantic Lineage Tracking
  - GitWalker traverses history (--no-renames flag)
  - ASTParser extracts function boundaries (Python, JS, TS, Go, Rust, Java, C, C++, Ruby, PHP)
  - LineageTracker links nodes via four-tier hierarchy
- Contextual Slicing
  - PRFetcher pulls associated PRs
  - Geographic filter maps review comments to AST node line ranges
- Narrative Synthesis
  - LiteLLM abstracts LLM calls (Claude, local models)
  - Outputs 5-sentence brief explaining decisions
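To make the first two phases concrete, the sketch below extracts function line ranges from Python source and keeps only the review comments that fall inside one function's range. It is an illustration under stated assumptions, not the project's code: it uses Python's standard `ast` module as a stand-in for ASTParser (which covers ten languages and presumably works differently), and `FunctionSpan`, `function_spans`, `comments_in_span`, and the `{"line": ..., "body": ...}` comment shape are all hypothetical.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionSpan:
    name: str
    start: int  # line of the `def` statement (1-indexed, inclusive)
    end: int    # last line of the function body (1-indexed, inclusive)


def function_spans(source: str) -> list:
    """Return the line range of every function defined in `source`."""
    spans = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            spans.append(FunctionSpan(node.name, node.lineno, node.end_lineno))
    return sorted(spans, key=lambda s: s.start)


def comments_in_span(comments, span):
    """Keep the comments whose line number lies inside the span."""
    return [c for c in comments if span.start <= c["line"] <= span.end]


SOURCE = '''\
def helper():
    return 1


def authenticate(token):
    if not token:
        return None
    return token.strip()
'''

spans = function_spans(SOURCE)
target = next(s for s in spans if s.name == "authenticate")

# Illustrative review comments; only the second one sits inside `authenticate`.
comments = [
    {"line": 1, "body": "rename this?"},
    {"line": 6, "body": "should this raise instead?"},
]
relevant = comments_in_span(comments, target)
```

Filtering by line range is what lets a function-level analysis ignore review discussion about unrelated parts of the same file.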
MCP Server
The tool exposes an MCP-compatible JSON-RPC 2.0 server over stdio:
# Run MCP server
arc-mcp
# Or run directly
python -m src.mcp.server
Available Methods
// List functions in a file
{"jsonrpc": "2.0", "id": 1, "method": "list_functions", "params": {"file_path": "/path/to/file.py"}}
// Analyze a specific function
{"jsonrpc": "2.0", "id": 2, "method": "analyze_function", "params": {"file_path": "/path/to/file.py", "function_name": "foo"}}
// Analyze a file's overall lineage
{"jsonrpc": "2.0", "id": 3, "method": "analyze_file", "params": {"file_path": "/path/to/file.py"}}
Example
# Analyze the `authenticate` function in a FastAPI project
GIT_REPO_PATH=/Users/fuads/fastapi arc analyze-function app/auth.py authenticate --repo fastapi/fastapi
Output:
Analyzing function authenticate in app/auth.py...
Found 12 lineage edges for authenticate
Summary: Found 12 historical versions of this code. Change types: physical: 8, identity: 4
With LLM synthesis:
The authenticate function evolved through 12 commits over 18 months.
Initial implementation used simple token validation, replaced in PR #2341
with OAuth2 Bearer token parsing after security audit. Several performance
optimizations were attempted (PRs #1892, #2103) but reverted due to race
conditions. The current implementation handles both JWT and opaque tokens
with a unified interface, consolidating three previous approaches.
Testing
pytest tests/
Roadmap
- Graph construction (GitWalker, ASTParser, LineageTracker)
- Contextual slicing (PRFetcher, GeographicFilter)
- CLI commands with local git repo auto-detection
- Support for 10 languages (Python, JS, TS, Go, Rust, Java, C, C++, Ruby, PHP)
- Real-world testing on Flask repo
- MCP server (JSON-RPC 2.0 over stdio, works with Python 3.9)
License
MIT