Local Multi-Agent Repository Intelligence System

Vision

Build a fully local, privacy-first repository intelligence platform that helps developers understand, navigate, document, analyze, and reason about source code.

The goal is not to compete with cloud coding assistants such as Claude Code, Cursor, GitHub Copilot, or OpenAI Codex.

The system will:

Run locally
Use local LLMs
Never require source code to leave the machine
Focus on understanding rather than code generation
Be language-aware through AST parsing
Maintain a continuously updated repository knowledge graph
Support multiple specialized agents

The primary objective is to become a "repository expert" capable of answering questions, generating documentation, explaining architecture, performing impact analysis, and understanding code evolution over time.

Core Principles

1. Retrieval First

The quality of answers depends on retrieval quality.

The system should prioritize:

AST-aware indexing
Symbol-aware retrieval
Dependency-aware retrieval

over generic vector similarity search.

2. Code is a Graph

A repository is not a collection of files.

A repository is a graph of:

Packages
Modules
Classes
Traits
Interfaces
Functions
Methods
Dependencies
Imports
Call relationships

The system should maintain this graph as a first-class entity.

3. Local First

All processing should happen locally:

Parsing
Embedding generation
Retrieval
Reasoning

No external APIs are required.

4. Specialized Agents

Each agent should have a single responsibility.

Avoid creating one large autonomous agent.

Instead, create multiple focused agents that share a common knowledge layer.

High-Level Architecture

Repository

    │

    ▼

Indexing Agent

    │

    ▼

Repository Knowledge Layer

    ├── Symbol Store
    ├── Dependency Graph
    ├── Vector Store
    ├── Commit History
    └── Metadata

    │

    ▼

Agents

    ├── Documentation Agent ✅
    ├── Q&A Agent ✅
    ├── Git Agent ✅
    ├── Impact Analysis Agent ✅
    ├── Git Archaeology Agent (Planned)
    └── Future Agents

Technology Choices

Parsing

Use Tree-sitter.

Reason:

Mature ecosystem
Multi-language support
Incremental parsing
Existing grammars

Supported languages for MVP:

Scala
Java
Python

Future:

Go
Rust
Kotlin
C++
C#
TypeScript

Local LLM Runtime

Use Ollama.

Candidate models:

MVP

Qwen3 8B
Gemma 3 12B

Future

Qwen3 72B
DeepSeek R1 Distill

Embeddings

Candidate models:

nomic-embed-text
bge-large
gte-large

Embeddings should only assist retrieval.

They must not become the primary retrieval mechanism.

Agent Orchestration

Use LangGraph.

Reason:

Explicit workflows
State management
Tool orchestration
Easy future expansion

Avoid autonomous agent loops.

Prefer deterministic workflows.

Storage

Metadata Store

DuckDB

Stores:

symbols
files
relationships
commits
documentation
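
A minimal sketch of how symbols and relationships might be laid out in the metadata store. The table and column names are assumptions made for illustration, not the project's actual schema. The snippet uses Python's built-in sqlite3 module as a stand-in for DuckDB so that it runs without extra dependencies; the SQL is plain enough to port.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE symbols (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,      -- qualified name
    kind TEXT NOT NULL,      -- method, class, function, ...
    file TEXT NOT NULL,
    language TEXT NOT NULL
);
CREATE TABLE relationships (
    source_id INTEGER NOT NULL REFERENCES symbols(id),
    target_id INTEGER NOT NULL REFERENCES symbols(id),
    kind TEXT NOT NULL       -- calls, imports, extends, ...
);
"""

def open_store() -> sqlite3.Connection:
    """Create an in-memory store with the sketch schema."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    return con

def callers_of(con: sqlite3.Connection, name: str) -> list[str]:
    """Return names of symbols that have a 'calls' edge to the named symbol."""
    rows = con.execute(
        """
        SELECT caller.name
        FROM relationships AS r
        JOIN symbols AS caller ON caller.id = r.source_id
        JOIN symbols AS callee ON callee.id = r.target_id
        WHERE r.kind = 'calls' AND callee.name = ?
        ORDER BY caller.name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]

con = open_store()
con.executemany(
    "INSERT INTO symbols (id, name, kind, file, language) VALUES (?, ?, ?, ?, ?)",
    [
        (1, "GraphRunner.retryExecuteNode", "method", "GraphRunner.scala", "scala"),
        (2, "GraphRunner.attemptExecuteNode", "method", "GraphRunner.scala", "scala"),
    ],
)
con.execute("INSERT INTO relationships VALUES (1, 2, 'calls')")
print(callers_of(con, "GraphRunner.attemptExecuteNode"))
```

Keeping relationships in their own table means caller and callee lookups are ordinary joins, which fits the decision not to introduce a graph database during MVP.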

Vector Store

LanceDB

Stores:

embeddings
semantic search index

Alternative:

Qdrant

Future Graph Database

Optional.

Candidates:

KuzuDB
Neo4j

Do not introduce graph databases during MVP.

Repository Knowledge Layer

This is the most important component.

All agents interact through this layer.

Responsibilities:

Symbol lookup
Dependency traversal
Semantic retrieval
Impact analysis support
Commit history lookup

Example interface:

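// Illustrative interface only; the return types are assumptions, not the shipped API.
trait RepositoryKnowledgeService {

  def findSymbol(name: String): Option[Symbol]

  def findCallers(symbol: Symbol): Seq[Symbol]

  def findCallees(symbol: Symbol): Seq[Symbol]

  def retrieveContext(question: String): Seq[Symbol]

  def impactedSymbols(symbol: Symbol): Seq[Symbol]

}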

This layer becomes the foundation of the entire platform.
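
To make the interface concrete, here is a toy in-memory version in Python. It is a sketch under stated assumptions: symbols are plain strings, the only relationship is "calls", and the class name, method names, and the `run` symbol are hypothetical. It treats the impacted symbols of X as every direct or indirect caller of X.

```python
from collections import deque

class InMemoryKnowledge:
    """Toy knowledge layer backed by a caller -> callees mapping (hypothetical)."""

    def __init__(self, calls: dict[str, list[str]]):
        self.calls = calls
        # Invert the call edges once so caller lookups are cheap.
        self.called_by: dict[str, list[str]] = {}
        for caller, callees in calls.items():
            for callee in callees:
                self.called_by.setdefault(callee, []).append(caller)

    def find_callees(self, symbol: str) -> list[str]:
        return list(self.calls.get(symbol, []))

    def find_callers(self, symbol: str) -> list[str]:
        return list(self.called_by.get(symbol, []))

    def impacted_symbols(self, symbol: str) -> list[str]:
        """Breadth-first walk over callers: everything that can reach `symbol`."""
        seen = {symbol}
        order = []
        queue = deque([symbol])
        while queue:
            current = queue.popleft()
            for caller in self.called_by.get(current, []):
                if caller not in seen:
                    seen.add(caller)
                    order.append(caller)
                    queue.append(caller)
        return order

kb = InMemoryKnowledge({
    "run": ["retryExecuteNode"],
    "retryExecuteNode": ["attemptExecuteNode"],
})
print(kb.impacted_symbols("attemptExecuteNode"))
```

The `seen` set keeps the traversal finite when the call graph contains cycles such as mutual recursion.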

MVP

Agent 1: Repository Indexing Agent

Responsibilities

Convert source code into structured knowledge.

Workflow

Repository

↓

Tree-sitter AST

↓

Symbol Extraction

↓

Dependency Extraction

↓

Embedding Generation

↓

Storage

Extracted Metadata

For every symbol:

{
  "symbol": "GraphRunner.retryExecuteNode",
  "type": "method",
  "file": "GraphRunner.scala",
  "language": "scala",
  "calls": [
    "attemptExecuteNode"
  ]
}
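
A rough sketch of how records of this shape could be produced. It uses Python's built-in `ast` module as a stand-in for Tree-sitter so that it runs without extra dependencies, which also means it only handles Python source. The function names and the sample class are hypothetical.

```python
import ast
import json

def call_name(func: ast.expr) -> str:
    """Best-effort callee name: `f(...)` gives "f", `obj.m(...)` gives "m"."""
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return ""

def extract_symbols(source: str, filename: str) -> list[dict]:
    """Return one metadata record per function or method in a Python source string."""
    records = []

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                visit(child, f"{prefix}{child.name}.")
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # Collect the distinct names called anywhere inside this body.
                calls = sorted({
                    call_name(c.func)
                    for c in ast.walk(child)
                    if isinstance(c, ast.Call) and call_name(c.func)
                })
                records.append({
                    "symbol": f"{prefix}{child.name}",
                    "type": "method" if prefix else "function",
                    "file": filename,
                    "language": "python",
                    "calls": calls,
                })

    visit(ast.parse(source), "")
    return records

SAMPLE = '''
class GraphRunner:
    def retry_execute_node(self, node):
        return self.attempt_execute_node(node)

    def attempt_execute_node(self, node):
        return node
'''
print(json.dumps(extract_symbols(SAMPLE, "graph_runner.py")[0], indent=2))
```

Callee names here are unresolved: `self.attempt_execute_node` is recorded as `attempt_execute_node` without knowing which class owns it. Resolving names to qualified symbols is a separate step that this sketch leaves out.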

Incremental Updates

✅ Implemented via Git Agent

The system now includes a Git Agent that:

Detects changes via git diff
Tracks the last indexed commit
Re-indexes only changed files
Supports incremental indexing via CLI: maris index --incremental

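Because only changed files are re-parsed and re-embedded, indexing cost scales with the size of the change rather than the size of the repository, which matters most for large repositories.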

See Git Agent Documentation for details.

Agent 2: Documentation Agent

Responsibilities

Generate repository documentation.

Output

Architecture overview
Component documentation
Module descriptions
Dependency diagrams
Data flow descriptions

Important Rule

Never generate documentation directly from raw files.

Always use indexed symbols and repository graph data.

Agent 3: Repository Q&A Agent

Responsibilities

Answer questions about code.

Examples:

Explain GraphRunner
How does retry work?
Where is reducer used?
What happens when training starts?

Workflow

Question

↓

Retrieve Symbols

↓

Expand Dependencies

↓

Build Context

↓

LLM Reasoning

↓

Answer

Goal

Context should consist of relevant symbols.

Not arbitrary chunks.
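
One way to read this goal: context is assembled from whole symbols in ranked order, and a symbol that does not fit is skipped rather than truncated. The sketch below assumes each retrieved symbol is a dict with `symbol`, `file`, and `source` keys; the `source` field and the character budget are assumptions for illustration.

```python
def build_context(symbols: list[dict], budget_chars: int = 2000) -> str:
    """Concatenate whole symbol bodies, in ranked order, within a character budget."""
    parts = []
    used = 0
    for sym in symbols:
        block = f"# {sym['symbol']} ({sym['file']})\n{sym['source']}\n\n"
        if used + len(block) > budget_chars:
            continue  # skip a symbol that does not fit; never cut one in half
        parts.append(block)
        used += len(block)
    return "".join(parts)

ranked = [
    {"symbol": "A.f", "file": "a.py", "source": "def f(): pass"},
    {"symbol": "A.g", "file": "a.py", "source": "def g(): return f()"},
]
print(build_context(ranked, budget_chars=40))
```

With a 40-character budget only `A.f` is included, because adding `A.g` would exceed the budget. A real implementation would more likely budget in model tokens than in characters.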

Future Roadmap

Agent 4: Git Agent

✅ Implemented (June 2026)

Purpose:

Track repository changes and enable incremental indexing.

Capabilities:

Detect changes since last indexing
Categorize changes (added/modified/deleted/renamed)
Enable efficient incremental re-indexing
Track commit history
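
A minimal sketch of how change detection and categorization could work on top of `git diff --name-status`. The function names and the `last_indexed_commit` parameter are hypothetical; this is not the Git Agent's actual code.

```python
import subprocess

STATUS_NAMES = {"A": "added", "M": "modified", "D": "deleted", "R": "renamed"}

def parse_name_status(output: str) -> dict[str, list[str]]:
    """Group paths from `git diff --name-status` output by change category."""
    changes = {name: [] for name in STATUS_NAMES.values()}
    for line in output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        category = STATUS_NAMES.get(fields[0][:1])
        if category is None:
            continue  # copies, type changes, etc. are ignored in this sketch
        # Renames are reported as "R<score>\told\tnew"; keep the new path.
        changes[category].append(fields[-1])
    return changes

def changes_since(last_indexed_commit: str, repo: str = ".") -> dict[str, list[str]]:
    """Run git and categorize everything changed between the commit and HEAD."""
    result = subprocess.run(
        ["git", "-C", repo, "diff", "-M", "--name-status", f"{last_indexed_commit}..HEAD"],
        capture_output=True, text=True, check=True,
    )
    return parse_name_status(result.stdout)

sample = "M\tsrc/a.py\nA\tsrc/b.py\nD\tsrc/c.py\nR100\tsrc/old.py\tsrc/new.py\n"
print(parse_name_status(sample))
```

An incremental indexer would then re-index the added, modified, and renamed paths and drop stored symbols for deleted paths and for the old side of each rename. This sketch discards the old path, so a real version would need to keep it.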

See Git Agent Documentation for details.

Agent 5: Impact Analysis Agent

✅ Implemented (June 2026)

Purpose:

Analyze the impact of code changes and help developers understand what will be affected by modifications.

Capabilities:

Dependency analysis: Find direct and indirect callers, callees, and affected files
Test discovery: Identify tests covering symbols and suggest missing scenarios
Edge case detection: Detect missing null checks, error handling, and boundary conditions
Breaking change detection: Identify potential breaking changes and affected callers
Recommendations: Generate actionable recommendations based on analysis

Integration:

Auto-routing: The orchestrator automatically routes impact-related questions to the Impact Analysis Agent (keywords: "impact", "affect", "break", "edge case", "test coverage")
Explicit CLI:
- maris impact analyze --symbol "SymbolName"
- maris impact edge-cases --file "path/to/file.py"
- maris impact tests --symbol "SymbolName"
- maris impact breaking-changes --symbol "SymbolName"
Implicit via ask: maris ask "What will be affected if I change X?"
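
Keyword-based auto-routing can be sketched in a few lines. The keyword list comes from the description above; the `route` function and the `"impact"` and `"qa"` labels are hypothetical and only illustrate the idea, not the orchestrator's real logic.

```python
IMPACT_KEYWORDS = ("impact", "affect", "break", "edge case", "test coverage")

def route(question: str) -> str:
    """Return "impact" if the question mentions an impact keyword, else "qa"."""
    text = question.lower()
    if any(keyword in text for keyword in IMPACT_KEYWORDS):
        return "impact"
    return "qa"

print(route("What will be affected if I change GitAgent?"))  # impact
print(route("Explain GraphRunner"))                          # qa
```

Plain substring matching is deterministic and easy to debug, but it also fires on unrelated words that contain a keyword (for example "breakpoint"), which is one reason to keep the explicit CLI commands available.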

Example:

# Auto-routed to Impact Analysis Agent
maris ask "What will be affected if I change GitAgent?"

# Explicit impact analysis
maris impact analyze --symbol "GitAgent.detect_changes"
maris impact edge-cases --file "src/maris/agents/git_agent.py"
maris impact tests --symbol "QAAgent.answer_question"

See Impact Analysis Agent Documentation for details.

Agent 6: Git Archaeology Agent

Purpose:

Understand historical code evolution.

Questions:

When was this bug introduced?
Who changed this logic?
Why was this method added?

Data Sources:

git log
git blame
commit metadata

Capabilities:

commit timeline generation
code evolution summaries
regression identification
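
Since this agent is still planned, the following is only a sketch of how a commit timeline for one file might be built from `git log`. The function names and the record layout are assumptions.

```python
import subprocess

# Tab-separated fields: hash, author name, ISO 8601 author date, subject.
LOG_FORMAT = "%H%x09%an%x09%aI%x09%s"

def parse_log(output: str) -> list[dict]:
    """Turn tab-separated `git log` lines into commit records."""
    commits = []
    for line in output.splitlines():
        parts = line.split("\t", 3)
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        sha, author, date, subject = parts
        commits.append({"commit": sha, "author": author, "date": date, "subject": subject})
    return commits

def file_timeline(path: str, repo: str = ".") -> list[dict]:
    """Return the commits that touched `path`, newest first, following renames."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--follow", f"--pretty=format:{LOG_FORMAT}", "--", path],
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)

sample = "abc123\tAda\t2026-06-24T10:00:00+00:00\tAdd retry logic\n"
print(parse_log(sample))
```

Questions such as "Who changed this logic?" would additionally need `git blame` for line-level attribution, which this sketch does not cover.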

Agent 7: Test Suggestion Agent

Purpose:

Suggest tests based on modifications.

Inputs:

changed symbols
dependency graph
historical bugs

Outputs:

missing tests
edge cases
regression scenarios

Agent 8: Architecture Evolution Agent

Purpose:

Track architecture changes over time.

Capabilities:

detect coupling growth
detect module boundaries
identify hotspots
detect architectural drift
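
As one possible reading of "detect coupling growth", the sketch below counts call edges that cross module boundaries in two snapshots of the call graph and reports the pairs that grew. The functions, the symbol names, and the rule that a module is everything before the last dot are all assumptions.

```python
def module_of(symbol: str) -> str:
    """Treat everything before the last dot as the module (an assumption)."""
    return symbol.rsplit(".", 1)[0]

def coupling(edges: list[tuple[str, str]]) -> dict[tuple[str, str], int]:
    """Count call edges that cross a module boundary, per (from, to) module pair."""
    counts: dict[tuple[str, str], int] = {}
    for caller, callee in edges:
        pair = (module_of(caller), module_of(callee))
        if pair[0] != pair[1]:
            counts[pair] = counts.get(pair, 0) + 1
    return counts

def coupling_growth(before, after) -> dict[tuple[str, str], int]:
    """Return module pairs whose cross-module edge count increased, with the delta."""
    old, new = coupling(before), coupling(after)
    return {
        pair: new[pair] - old.get(pair, 0)
        for pair in new
        if new[pair] > old.get(pair, 0)
    }

before = [("agents.qa.answer", "store.symbols.find")]
after = before + [
    ("agents.qa.answer", "store.symbols.list"),
    ("agents.qa.helper", "agents.qa.answer"),  # same module, not counted
]
print(coupling_growth(before, after))
```

Comparing the snapshots taken at two indexed commits yields a per-pair trend that could feed hotspot or drift reports.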

Retrieval Strategy

Do Not

Generic fixed-size chunking, for example:

1000 token chunks

Fixed-size chunks cut across symbol boundaries, so structure is lost.

Preferred

AST-based symbol chunking.

Example:

Package

  ├── Class

        ├── Method

        ├── Method

        └── Method

Each symbol becomes a retrievable unit.
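
A small sketch of symbol-level chunking, again using Python's built-in `ast` module as a stand-in for Tree-sitter. Each class, function, and method becomes one chunk keyed by its qualified name; the function name and the sample source are hypothetical.

```python
import ast

def symbol_chunks(source: str) -> list[tuple[str, str]]:
    """Split Python source into (qualified_name, source_text) chunks, one per symbol."""
    chunks = []

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                name = f"{prefix}{child.name}"
                # The chunk boundary is the symbol's own extent in the file.
                chunks.append((name, ast.get_source_segment(source, child)))
                if isinstance(child, ast.ClassDef):
                    visit(child, f"{name}.")

    visit(ast.parse(source), "")
    return chunks

SAMPLE = "class A:\n    def f(self):\n        return 1\n\ndef g():\n    return 2\n"
for name, text in symbol_chunks(SAMPLE):
    print(name, len(text))
```

In this sketch a class chunk overlaps with the chunks of its methods. Whether to store both or only the leaves is a design choice the document leaves open.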

Retrieval Pipeline

Question

↓

Vector Search

↓

Symbol Expansion

↓

Dependency Expansion

↓

Context Assembly

↓

Reasoning

This combines semantic search with graph traversal.
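
The pipeline can be sketched with toy data: vector search picks seed symbols, then call-graph traversal adds their dependencies. The two-dimensional vectors, the symbol names, and the `retrieve` function are made up for illustration. Because embeddings are keyed by symbol here, the "Symbol Expansion" step is trivial.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, embeddings, calls, top_k=1, hops=1) -> list[str]:
    """Vector search picks seed symbols; call-graph traversal expands them."""
    ranked = sorted(embeddings, key=lambda s: cosine(query_vec, embeddings[s]), reverse=True)
    context = ranked[:top_k]
    frontier = list(context)
    for _ in range(hops):
        next_frontier = []
        for symbol in frontier:
            for callee in calls.get(symbol, []):
                if callee not in context:
                    context.append(callee)
                    next_frontier.append(callee)
        frontier = next_frontier
    return context

embeddings = {"retry": [1.0, 0.0], "attempt": [0.0, 1.0], "log": [0.7, 0.7]}
calls = {"retry": ["attempt"]}
print(retrieve([0.9, 0.1], embeddings, calls))
```

Here `attempt` scores poorly on similarity alone but is still retrieved because `retry` calls it, which is the case that pure vector search would miss.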

Non-Goals

The system is NOT intended to:

Generate PRs
Automatically modify code
Replace developers
Act autonomously
Execute arbitrary repository changes

The system is designed to help developers understand software.

Success Criteria

MVP is successful when:

✅ Repository indexing works incrementally (Git Agent)
✅ Symbols can be queried accurately
✅ Documentation can be generated automatically
✅ Q&A answers are grounded in repository knowledge
✅ Entire workflow runs locally
✅ No external API dependencies are required

MVP Complete! All success criteria have been met.

Long-Term Goal

Become a local repository intelligence platform capable of understanding large codebases as well as experienced maintainers, while remaining privacy-first, language-aware, and fully developer-controlled.

Release history

0.2.0

Jun 27, 2026

0.1.9

Jun 25, 2026

0.1.8

Jun 24, 2026

0.1.7

Jun 24, 2026

0.1.6

Jun 24, 2026

0.1.5

Jun 24, 2026

0.1.4

Jun 24, 2026

0.1.3

Jun 24, 2026

0.1.2

Jun 24, 2026

0.1.1

Jun 24, 2026

This version

0.1.0

Jun 23, 2026

Download files

Source Distribution

maris-0.1.0.tar.gz (94.3 kB)

Uploaded Jun 23, 2026 Source

Built Distribution

maris-0.1.0-py3-none-any.whl (75.4 kB)

Uploaded Jun 23, 2026 Python 3

File details

Details for the file maris-0.1.0.tar.gz.

File metadata

Download URL: maris-0.1.0.tar.gz
Upload date: Jun 23, 2026
Size: 94.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for maris-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`a956eee0d2927a649b8df07271c18254dc105aa7dc471434f393c26b3aefd762`
MD5	`b7d08660c93ee7204849df87ae56b42c`
BLAKE2b-256	`c70e47a784da8b8e963a82da97ec097dd87eebaf1a78eca3060c793fab468b25`

File details

Details for the file maris-0.1.0-py3-none-any.whl.

File metadata

Download URL: maris-0.1.0-py3-none-any.whl
Upload date: Jun 23, 2026
Size: 75.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for maris-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4b8fd7bb2c3af999d42f2172c8befcb3a7aae032ab4d995875ba8da1c0dfac6d`
MD5	`98b2e024f8fa7ca2fbfe1aa8e516d543`
BLAKE2b-256	`1df0550229f9692e30d139666143cad0e25d266f5a46db77eb2976ec532e2f44`
