Skip to main content

Give your AI a complete map of any codebase — function signatures, call graphs, and semantic search

Project description

Terrain

English | Chinese / CN

CI PyPI npm License Python Node

Give your AI coding assistant a complete map of any codebase — function signatures, call graphs, and semantic search across every line of code.

The Problem

You drop a 500,000-line codebase in front of Claude Code. It reads what it can see. It guesses what it can't. You get answers that are almost right.

Terrain indexes the entire codebase once, then gives your AI a precise, queryable knowledge graph — so it stops guessing.

What This Looks Like

Ask Claude Code about an unfamiliar codebase:

"How does the authentication token get refreshed?"

Without Terrain, the AI skims files and makes educated guesses — possibly missing the real implementation buried three call levels deep.

With Terrain:

find_api("authentication token refresh")

→ refresh_access_token() in auth/token_manager.c:187
  Signature: int refresh_access_token(TokenCtx *ctx, const char *refresh_token)
  Called by: session_heartbeat() → event_loop_tick() → main()
  Calls:     http_post(), parse_jwt(), update_session_store()

Precise. Complete. Instant.


Full Installation

Install the npm package

The npm package provides the CLI wrapper and MCP server launcher:

npm install -g terrain-ai@latest

Install the Python package (PyPI)

The Python package provides the core indexing engine, graph database, and all language parsers.

Core installation (includes C/C++, Python, JavaScript/TypeScript grammars):

pip install terrain-ai

Quick Start

npx terrain-ai@latest --setup

The setup wizard installs the Python package, configures your LLM and embedding provider, and registers Terrain as a global MCP server for Claude Code. One command.

Then in Claude Code — point it at any repo and ask anything.

Index a Codebase

terrain index /path/to/your/repo

Takes a few minutes the first time. Incremental updates after that:

terrain index -i   # git-diff based, fast

What You Can Ask

You want to know... What to ask
Where does X get initialized? "find where X is initialized"
What calls this function? "find callers of function_name"
How does feature Y work end-to-end? "trace the call chain for Y"
What functions handle Z? "find Z handler"

Supported Languages

C/C++, Python, JavaScript/TypeScript, Rust, Go, Java, Scala, C#, PHP, Lua


Reference

Uninstall

npx terrain-ai@latest --uninstall

Removes: Claude MCP registration, Python package, workspace data.

CLI Tool (terrain)

Workspace

terrain status              # Show active repository, workspace, LLM & embedding info
terrain list                # List all indexed repositories
terrain repo                # Interactively switch active repository
terrain config              # Interactive configuration wizard (LLM, embedding, workspace)
terrain link <path>         # Link a local repo to shared pre-built artifacts
terrain link <path> --db x  # Link to a specific artifact directory

Indexing

terrain index               # Index current directory (graph → api-docs → embeddings)
terrain index /path/to/repo # Index a specific path
terrain index -i            # Incremental update (git-diff based, fast)
terrain index --no-embed    # Skip embedding generation
terrain index --no-wiki     # Skip wiki generation only

Rebuild & Clean

terrain rebuild             # Rebuild all steps for active repository
terrain rebuild --step graph   # Rebuild only the graph
terrain rebuild --step api     # Rebuild only API docs
terrain rebuild --step embed   # Rebuild only embeddings
terrain rebuild --step wiki    # Rebuild only wiki

terrain clean               # Remove indexed data (interactive)
terrain clean repo_name     # Remove specific repository
terrain clean --all         # Remove all indexed repositories

Low-Level Commands

terrain scan /path          # Scan repo and build knowledge graph
  --backend kuzu|memgraph|memory
  --db-path ./graph.db
  --exclude "vendor,build"
  --language "c,python"
  --clean               # Clean DB before scanning
  -o graph.json         # Export graph to JSON

terrain query "MATCH (f:Function) RETURN f.name LIMIT 10"
  --format table|json

terrain export /path -o graph.json
  --build               # Build graph before exporting

terrain stats               # Show graph statistics (nodes, relationships)

Global Flags

terrain --version           # Show version
terrain -v ...              # Verbose/debug output
terrain --help              # Show help

MCP Tools

Core workflow for AI agents: initialize_repositoryfind_apiget_api_doc

Repository Management

Tool Description
initialize_repository Index a repo: graph + API docs + embeddings
get_repository_info Active repo stats (node/relationship counts, service status)
list_repositories All indexed repos with pipeline completion status
switch_repository Switch active repo for queries
link_repository Reuse existing index for a different repo path (no re-indexing)

Code Search & Documentation

Tool Description
find_api Hybrid semantic + keyword search with API doc (primary search tool)
list_api_docs Browse L1 module index or L2 module details
get_api_doc L3 function detail: signature, call tree, usage examples, source
generate_api_docs Generate/update API docs (full / resume / enhance)

Call Graph Analysis

Tool Description
find_callers Find all functions that call a specific function (no LLM required)
trace_call_chain BFS upward call chain trace with entry point discovery

Configuration & Maintenance

Tool Description
get_config Show server configuration and service availability
rebuild_embeddings Build or rebuild vector embeddings

Pipeline

Step What Input Output
1. graph-build Tree-sitter AST parsing Source code Kuzu graph database
2. api-doc-gen Query graph, render docs Graph 3-level Markdown (index / module / function)
2b. desc-gen LLM generates descriptions Functions without docstrings Descriptions in L3 Markdown
3. embed-gen Vectorize function docs L3 Markdown files Vector store (pickle)
initialize_repository  ->  Steps 1-3 (full pipeline)
build_graph            ->  Step 1 only
generate_api_docs      ->  Step 2 + 2b (modes: full / resume / enhance)
rebuild_embeddings     ->  Step 3
generate_wiki          ->  Separate (not in main pipeline)

API Documentation Format

Generated docs are optimized for both AI agent reading and vector retrieval.

L3 Function Detail (embedding unit)

# parse_btype

> Parse base type declaration including struct/union/enum specifiers.

- Signature: `int parse_btype(CType *type, AttributeDef *ad, int ignore_label)`
- Return: `int`
- Visibility: static | Header: tccgen.h
- Location: tccgen.c:139-280
- Module: tinycc.tccgen --C code generator

## Call Tree

parse_btype
|-- expr_const           [static]
|-- parse_btype_qualify   [static]
|-- struct_decl           [static]
|   |-- expect
|   `-- next
`-- parse_attribute       [static]

## Called by (5)

- type_decl (tinycc.tccgen) -> tccgen.c:1200
- post_type (tinycc.tccgen) -> tccgen.c:1350

C/C++ Specific Features

  • Extracts // and /* */ comments above functions as descriptions
  • Struct/union/enum members displayed with types
  • Macro definitions in dedicated section
  • Static/public/extern visibility classification
  • Memory ownership inference from signatures
  • Header/implementation file split
  • Cross-file function call resolution via #include header mapping
  • Function pointer tracking and indirect call resolution
  • GB2312/GBK encoding support for source files

Supported Languages (detail)

Language Functions Classes/Structs Calls Imports Types
C / C++ Yes struct, union, enum, typedef, macro Yes #include Yes
Python Yes Yes Yes Yes -
JavaScript / TypeScript Yes Yes Yes Yes -
Rust Yes struct, enum, trait, impl Yes Yes -
Go Yes struct, interface Yes Yes -
Java Yes class, interface, enum Yes Yes -
Scala Yes class, object Yes Yes -
C# Yes class, namespace Yes - -
PHP Yes class Yes - -
Lua Yes - Yes - -

Graph Schema

Nodes: Project, Package, Module, File, Folder, Class, Function, Method, Type, Enum, Union

Relationships: CONTAINS_*, DEFINES, DEFINES_METHOD, CALLS, INHERITS, IMPLEMENTS, IMPORTS, OVERRIDES

Properties: qualified_name (PK), name, path, start_line, end_line, signature, return_type, visibility, parameters, kind, docstring

Architecture

The project follows a 5-layer harness architecture:

L4  entrypoints/         MCP server, CLI
L3  domains/upper/       apidoc, rag, guidance, calltrace
L2  domains/core/        graph, embedding, search
L1  foundation/          parsers, services, utils
L0  foundation/types/    constants, models, type definitions

Environment Variables

LLM (first match wins)

Variable Purpose Default
LLM_API_KEY Generic LLM key (highest priority) -
LLM_BASE_URL API endpoint https://api.openai.com/v1
LLM_MODEL Model name gpt-4o
OPENAI_API_KEY OpenAI or compatible -
MOONSHOT_API_KEY Moonshot / Kimi (legacy) -

Embedding

Variable Purpose Default
DASHSCOPE_API_KEY DashScope (Qwen3 Embedding) -
DASHSCOPE_BASE_URL DashScope endpoint https://dashscope.aliyuncs.com/api/v1

System

Variable Purpose Default
TERRAIN_WORKSPACE Workspace directory ~/.terrain

Installation Options

Install from PyPI

# Core (includes C/C++, Python, JS/TS grammars)
pip install terrain-ai

# With all language grammars (Rust, Go, Java, Scala, Lua)
pip install "terrain-ai[treesitter-full]"

Install from npm

# Global install (recommended for CLI usage)
npm install -g terrain-ai@latest

# Or run directly with npx (no install needed)
npx terrain-ai@latest --version

Install from local source

git clone https://github.com/JeremyJiao01/CodeGraphWiki.git
cd CodeGraphWiki

# Install with all language grammars
pip install ".[treesitter-full]"

# Or install in editable mode for development
pip install -e ".[treesitter-full]"

Build and install from wheel

git clone https://github.com/JeremyJiao01/CodeGraphWiki.git
cd CodeGraphWiki

python3 -m build
pip install dist/terrain_ai-*.whl

Development

git clone https://github.com/JeremyJiao01/CodeGraphWiki.git
cd CodeGraphWiki
pip install -e ".[treesitter-full]"

python3 -m pytest tests/ -v

# Integration tests (requires tinycc repo at ../tinycc)
python3 -m pytest tests/domains/core/test_graph_build.py -v      # ~3 min
python3 -m pytest tests/domains/upper/test_api_docs.py -v        # ~3 min
python3 -m pytest tests/domains/core/test_step3_embedding.py -v  # ~27 min (API calls)
python3 -m pytest tests/domains/upper/test_api_find_integration.py -v  # ~47 min (full pipeline)

License

Apache License 2.0 — see LICENSE for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

terrain_ai-2.1.11.tar.gz (404.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

terrain_ai-2.1.11-py3-none-any.whl (280.0 kB view details)

Uploaded Python 3

File details

Details for the file terrain_ai-2.1.11.tar.gz.

File metadata

  • Download URL: terrain_ai-2.1.11.tar.gz
  • Upload date:
  • Size: 404.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for terrain_ai-2.1.11.tar.gz
Algorithm Hash digest
SHA256 1f1c2e3d7a9b4378440e956f22be6d4371878439614007f7acd3de1d850471bd
MD5 fc40ee135276fda537f5a78899f01dc4
BLAKE2b-256 0b749f0630cefc5ee070fc3dc348e6a36ef032ec965031ab642d6535b23aa4fc

See more details on using hashes here.

File details

Details for the file terrain_ai-2.1.11-py3-none-any.whl.

File metadata

  • Download URL: terrain_ai-2.1.11-py3-none-any.whl
  • Upload date:
  • Size: 280.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for terrain_ai-2.1.11-py3-none-any.whl
Algorithm Hash digest
SHA256 4f116f0d63b268467abb1f3b84407bb9ceb06bcb06b015810a05a2159aa2fd83
MD5 353d7b9c75b075b42cad4b85062ed7ee
BLAKE2b-256 d2ceecb60a9a511bb5bebc3ebe316cd1f35d62d9ef7ecb52c1990afb3a391921

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page