
Large Context Orchestrator for GitLab Duo -- chunk oversized MRs and run multi-agent security analysis.


Mr Ninja

Large repo merge request assistant -- chunk oversized Merge Requests and run multi-agent security analysis.

AI agents often cap context at a limited number of tokens per call. A single MR in a monorepo can generate 500k-1M tokens of diff content, causing truncated reviews and missed vulnerabilities. Mr Ninja solves this by decomposing large MRs into priority-sorted chunks, processing each chunk through specialist agents, and posting a unified report.


Install

pip install mr-ninja

Or from source:

git clone https://github.com/namdpran8/mr-ninja.git
cd mr-ninja
pip install -e ".[dev]"

Note: Use the source install with pip install -e ".[dev]" if you plan to run the tests or contribute.

Quick Start

Analyze a GitLab MR

export GITLAB_TOKEN="glpat-xxxxxxxxxxxxxxxxxxxx"

# Analyze by URL
mr-ninja analyze https://gitlab.com/group/project/-/merge_requests/42

# Analyze by project + MR IID
mr-ninja analyze --project group/project --mr 42

# Post results as an MR comment
mr-ninja analyze https://gitlab.com/group/project/-/merge_requests/42 --post-comment

# Save report to file
mr-ninja analyze https://gitlab.com/group/project/-/merge_requests/42 -o report.md
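The two invocation styles above (by URL, or by project + MR IID) imply that a URL can be reduced to a project path and a number. The following is a minimal sketch of that step, not the parser Mr Ninja itself uses; the function name parse_review_url and both regular expressions are assumptions based only on the example URLs in this README. It also handles the GitHub pull request form https://github.com/owner/repo/pull/42.

```python
import re
from urllib.parse import urlparse

# Path shapes inferred from the example URLs in this README (assumption).
_GITLAB_MR = re.compile(r"^/(?P<project>.+?)/-/merge_requests/(?P<number>\d+)/?$")
_GITHUB_PR = re.compile(r"^/(?P<project>[^/]+/[^/]+)/pull/(?P<number>\d+)/?$")


def parse_review_url(url: str) -> tuple[str, str, int]:
    """Return (platform, project_path, number) for an MR or PR URL."""
    path = urlparse(url).path
    for platform, pattern in (("gitlab", _GITLAB_MR), ("github", _GITHUB_PR)):
        match = pattern.match(path)
        if match:
            return platform, match["project"], int(match["number"])
    raise ValueError(f"Not a recognised MR/PR URL: {url}")


print(parse_review_url("https://gitlab.com/group/project/-/merge_requests/42"))
# ('gitlab', 'group/project', 42)
print(parse_review_url("https://github.com/owner/repo/pull/42"))
# ('github', 'owner/repo', 42)
```

Matching on the path rather than the hostname means a self-hosted GitLab instance with the same URL layout would be recognised as well.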

Analyze a GitHub PR

export GITHUB_TOKEN="ghp_xxxxxxxxxxxxxxxxxxxx"

# Analyze by URL
mr-ninja analyze https://github.com/owner/repo/pull/42

# Post results as a PR comment
mr-ninja analyze https://github.com/owner/repo/pull/42 --post-comment

# Analyze by project + PR number
mr-ninja analyze --project owner/repo --mr 42 --github-token ghp_xxx

# REST API
curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{"mr_url": "https://github.com/owner/repo/pull/42",
       "github_token": "ghp_xxx"}'

Run a Demo (No GitLab Required)

# Simulate a 512-file MR analysis
mr-ninja demo

# Custom file count
mr-ninja demo --files 1000

# Save report
mr-ninja demo --files 512 -o report.md

Start the REST API Server

mr-ninja serve
mr-ninja serve --host 127.0.0.1 --port 9000  # custom port example

# Then call the API (default port is 8000)
curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{"mr_url": "https://gitlab.com/group/project/-/merge_requests/42", "gitlab_token": "glpat-xxx"}'

# Run demo via API (no token needed)
curl -X POST http://localhost:8000/demo

Docker

docker build -t mr-ninja .
docker run -p 8000:8000 -e GITLAB_TOKEN=glpat-xxx mr-ninja

How It Works

500+ file MR (~800k tokens)
         |
         v
   +--------------+
   |   Mr Ninja   |
   | Orchestrator |
   +------+-------+
          |
    +-----+-----+
    |     |     |
    v     v     v
 Chunk  Chunk  Chunk     Each fits within Duo's context limit
  1      2      3
  |      |      |
  v      v      v
 Agent  Agent  Agent     Security / Code Review / Dependency
  |      |      |
  +------+------+
         |
    +----v----+
    |Aggregate|
    | & Post  |
    +---------+
         |
         v
   Unified MR Report

  1. Detect -- Estimates token footprint. If >150k tokens, activates chunking.
  2. Plan -- Classifies files by priority (security-critical first, tests last) and bin-packs into ~70k-token chunks.
  3. Process -- Runs each chunk through specialist agents (Security Analyst, Code Reviewer, Dependency Analyzer).
  4. Carry Context -- Generates compact cross-chunk summaries so findings and dependencies are tracked across chunks.
  5. Aggregate -- Deduplicates findings, ranks by severity, calculates risk score.
  6. Report -- Posts a unified Markdown report as an MR comment.
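Steps 1 and 2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's chunking engine: the 4-characters-per-token heuristic, the ChangedFile type, and the greedy sequential packing are all stand-ins chosen for brevity. Only the 150k threshold and the ~70k chunk budget come from the steps above.

```python
from dataclasses import dataclass

CHUNKING_THRESHOLD = 150_000  # tokens, from step 1
CHUNK_BUDGET = 70_000         # tokens, from step 2


@dataclass
class ChangedFile:  # hypothetical type for this sketch
    path: str
    priority: int  # 1 = highest priority
    tokens: int    # estimated tokens in this file's diff


def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def needs_chunking(files: list[ChangedFile]) -> bool:
    """Step 1: chunk only when the whole diff exceeds the threshold."""
    return sum(f.tokens for f in files) > CHUNKING_THRESHOLD


def plan_chunks(files: list[ChangedFile], budget: int = CHUNK_BUDGET) -> list[list[ChangedFile]]:
    """Step 2: sort by priority, then fill chunks one after another.

    A file larger than the budget ends up alone in its own chunk.
    """
    chunks: list[list[ChangedFile]] = []
    current: list[ChangedFile] = []
    used = 0
    for f in sorted(files, key=lambda f: (f.priority, f.path)):
        if current and used + f.tokens > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(f)
        used += f.tokens
    if current:
        chunks.append(current)
    return chunks


files = [
    ChangedFile("tests/test_a.py", 5, 30_000),
    ChangedFile("auth/login.py", 1, 50_000),
    ChangedFile("main.py", 2, 30_000),
    ChangedFile("src/util.py", 3, 30_000),
]
for i, chunk in enumerate(plan_chunks(files), start=1):
    print(i, [f.path for f in chunk])
```

Filling chunks strictly in sorted order keeps high-priority files in the earliest chunks, at the cost of packing less tightly than a first-fit or best-fit strategy would.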

File Priority System

| Priority | Category          | Examples                               | Order   |
|----------|-------------------|----------------------------------------|---------|
| P1       | Security-critical | .env, Dockerfile, *.tf, auth/*, *.pem  | First   |
| P2       | Entry points      | main.*, app.*, routes/*, api/*         | Second  |
| P3       | Changed files     | All other source files                 | Third   |
| P4       | Shared modules    | Imported by multiple changed files     | Fourth  |
| P5       | Test files        | tests/*, *_test.*, *.spec.*            | Last    |
| P6       | Generated         | package-lock.json, *.min.js, dist/*    | Skipped |
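One way to turn the table into code is a short ordered rule list, sketched below. The patterns are copied from the table; everything else is an assumption made for illustration: the function name classify, the order in which rules are checked, and the choice to match directory patterns against any parent directory. P4 is left out because detecting shared modules needs import analysis, which a path-only classifier cannot do.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# (priority, patterns), checked top to bottom; the check order is an assumption.
RULES = [
    (6, ["package-lock.json", "*.min.js", "dist/*"]),
    (1, [".env", "Dockerfile", "*.tf", "auth/*", "*.pem"]),
    (5, ["tests/*", "*_test.*", "*.spec.*"]),
    (2, ["main.*", "app.*", "routes/*", "api/*"]),
]
DEFAULT_PRIORITY = 3  # "All other source files"


def _matches(path: str, pattern: str) -> bool:
    p = PurePosixPath(path)
    if pattern.endswith("/*"):
        # Directory pattern: true if any parent directory has that name.
        return pattern[:-2] in p.parts[:-1]
    # File pattern: compare against the file name only.
    return fnmatchcase(p.name, pattern)


def classify(path: str) -> int:
    """Return the priority number (1-6) for a changed file path."""
    for priority, patterns in RULES:
        if any(_matches(path, pattern) for pattern in patterns):
            return priority
    return DEFAULT_PRIORITY


for path in ["auth/handler.py", "src/main.py", "lib/utils.py",
             "tests/test_api.py", "web/dist/bundle.js"]:
    print(f"P{classify(path)}  {path}")
```

Checking the generated-file rule first keeps bundled output such as dist/main.js from being promoted to an entry point.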

Specialist Agents

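| Agent               | Detects                                                                                              |
|---------------------|------------------------------------------------------------------------------------------------------|
| Security Analyst    | Hardcoded secrets, SQL injection, XSS, eval/exec, shell injection, SSL bypass, pickle, private keys   |
| Code Reviewer       | Bare exceptions, debug prints, TODO/FIXME, global state, long sleeps                                  |
| Dependency Analyzer | Wildcard version pins, deprecated packages, broad version ranges                                      |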
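Several of the checks in the table could be approximated with line-based pattern matching. The sketch below shows that general idea only; it makes no claim about how Mr Ninja's agents are implemented. The Finding type, the RULES list, the severities attached to each rule, and the scan_text function are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:  # hypothetical record type for this sketch
    agent: str
    severity: str
    file: str
    line: int
    message: str


# Illustrative rules only; the patterns and severities are assumptions.
RULES = [
    ("security", "HIGH", re.compile(r"\beval\s*\("), "Use of eval()"),
    ("security", "HIGH", re.compile(r"shell\s*=\s*True"), "subprocess call with shell=True"),
    ("security", "MEDIUM", re.compile(r"verify\s*=\s*False"), "TLS verification disabled"),
    ("code_review", "LOW", re.compile(r"^\s*except\s*:"), "Bare except clause"),
    ("code_review", "LOW", re.compile(r"\b(TODO|FIXME)\b"), "Unresolved TODO/FIXME"),
]


def scan_text(path: str, text: str) -> list[Finding]:
    """Apply every rule to every line and collect the matches."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for agent, severity, pattern, message in RULES:
            if pattern.search(line):
                findings.append(Finding(agent, severity, path, lineno, message))
    return findings


sample = (
    "import subprocess\n"
    "subprocess.run(cmd, shell=True)\n"
    "try:\n"
    "    x = eval(data)\n"
    "except:\n"
    "    pass  # TODO handle\n"
)
for finding in scan_text("users/service.py", sample):
    print(finding.line, finding.severity, finding.message)
```

Regex rules are cheap and easy to audit but produce false positives (for example, a match inside a comment or string), so results like these usually need a review step.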

Example Output

Risk Level: CRITICAL (Score: 85/100)
Files scanned: 512
Chunks processed: 6
Processing time: 2.3s

Critical vulnerabilities: 8
High vulnerabilities: 15
Medium issues: 22

Top Issues:
1. [CRITICAL] auth/handler.py -- Hardcoded API key (sk-live-...)
2. [CRITICAL] payments/.env -- Database password in source
3. [CRITICAL] orders/auth_handler.py -- SQL injection via string concat
4. [HIGH] gateway/src/handler.py -- Unsafe eval() execution
5. [HIGH] users/service.py -- Shell injection (subprocess shell=True)

Recommendation: BLOCK MERGE -- resolve all CRITICAL findings before merging.
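A summary like the one above can be derived from a flat list of findings by deduplicating, sorting by severity, and computing a score. The sketch below is an assumption-heavy illustration: the README does not state the scoring formula, so the WEIGHTS table, the cap at 100, the dedup key, and the aggregate function are hypothetical and will not reproduce the exact score shown.

```python
from collections import Counter

SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]
# Assumed weights; the real scoring formula is not documented here.
WEIGHTS = {"CRITICAL": 25, "HIGH": 10, "MEDIUM": 3, "LOW": 1}


def aggregate(findings: list[dict]) -> dict:
    """Deduplicate findings, rank them by severity, and score the result."""
    seen = set()
    unique = []
    for f in findings:
        key = (f["file"], f["message"])  # same issue reported by two chunks
        if key not in seen:
            seen.add(key)
            unique.append(f)
    unique.sort(key=lambda f: SEVERITY_ORDER.index(f["severity"]))
    counts = Counter(f["severity"] for f in unique)
    score = min(100, sum(WEIGHTS[sev] * n for sev, n in counts.items()))
    level = next((sev for sev in SEVERITY_ORDER if counts[sev]), "NONE")
    return {
        "findings": unique,
        "counts": dict(counts),
        "score": score,
        "level": level,
        "block_merge": counts["CRITICAL"] > 0,
    }


report = aggregate([
    {"severity": "HIGH", "file": "gateway/src/handler.py", "message": "Unsafe eval() execution"},
    {"severity": "CRITICAL", "file": "auth/handler.py", "message": "Hardcoded API key"},
    {"severity": "CRITICAL", "file": "auth/handler.py", "message": "Hardcoded API key"},
    {"severity": "MEDIUM", "file": "users/service.py", "message": "Debug print"},
])
print(report["level"], report["score"], report["block_merge"])
# CRITICAL 38 True
```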

Project Structure

mr-ninja/
├── src/mr_ninja/              # Installable package
│   ├── __init__.py
│   ├── __main__.py
│   ├── cli.py                 # CLI: analyze, demo, serve
│   ├── server.py              # FastAPI REST API
│   ├── agents/
│   │   ├── orchestrator.py    # Central coordinator
│   │   ├── chunk_planner.py   # MR diff → chunk plan
│   │   ├── chunk_processor.py # Specialist agent runner
│   │   ├── summarizer.py      # Cross-chunk context manager
│   │   └── aggregator.py      # Findings dedup & report
│   ├── core/
│   │   ├── models.py          # Pydantic data models
│   │   ├── token_estimator.py # Token count estimation
│   │   └── chunking_engine.py # File classification & bin-packing
│   ├── gitlab/
│   │   └── gitlab_client.py   # GitLab REST API client (stdlib only)
│   ├── demo/
│   │   ├── simulate_large_mr.py
│   │   └── generate_large_repo.py
│   └── flows/
│       └── agent_flow.yaml
├── tests/
│   ├── test_token_estimator.py
│   ├── test_chunking.py
│   ├── test_aggregation.py
│   ├── test_orchestrator.py
│   ├── test_cli.py
│   ├── test_gitlab_client.py
│   ├── test_demo.py
│   └── test_models.py
├── public/                    # Static website (GitHub Pages)
├── docs/                      # Architecture and demo guides
├── scripts/                   # run_demo.sh / run_demo.ps1
├── .github/workflows/ci.yml   # GitHub Actions CI/CD
├── pyproject.toml
├── Dockerfile
├── AGENTS.md
├── CONTRIBUTING.md
├── LICENSE                    # Apache 2.0
└── README.md

API Reference

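| Method | Endpoint | Description                 |
|--------|----------|-----------------------------|
| GET    | /health  | Health check                |
| POST   | /analyze | Analyze a GitLab MR         |
| POST   | /demo    | Run demo (no GitLab needed) |
| GET    | /docs    | Swagger UI                  |
| GET    | /redoc   | ReDoc API docs              |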

POST /analyze

{
  "mr_url": "https://gitlab.com/group/project/-/merge_requests/42",
  "gitlab_token": "glpat-xxxxxxxxxxxxxxxxxxxx",
  "max_chunk_tokens": 70000,
  "post_comment": true
}

Response

{
  "status": "ok",
  "mr_id": "42",
  "chunks_processed": 6,
  "total_findings": 45,
  "critical_findings": 8,
  "overall_risk": "CRITICAL",
  "report_markdown": "# Mr Ninja Analysis Report\n...",
  "processing_time_seconds": 2.3
}
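The same request can be made from Python using only the standard library. This is a minimal sketch: the JSON field names come from the example body above, while the helper names build_analyze_request and send_request, the timeout, and the assumption that the server runs at http://localhost:8000 are illustrative.

```python
import json
import urllib.request


def build_analyze_request(
    base_url: str,
    mr_url: str,
    gitlab_token: str,
    max_chunk_tokens: int = 70000,
    post_comment: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST /analyze request."""
    payload = {
        "mr_url": mr_url,
        "gitlab_token": gitlab_token,
        "max_chunk_tokens": max_chunk_tokens,
        "post_comment": post_comment,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_request(request: urllib.request.Request, timeout: float = 300.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_analyze_request(
    "http://localhost:8000",
    "https://gitlab.com/group/project/-/merge_requests/42",
    "glpat-xxx",
)
print(request.get_method(), request.full_url)
# POST http://localhost:8000/analyze

# With the server running (mr-ninja serve), the call would look like:
# result = send_request(request)
# print(result["overall_risk"], result["total_findings"])
```

Separating request construction from sending makes the payload easy to inspect or test without a live server.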

Python API

from mr_ninja.agents.orchestrator import Orchestrator

orchestrator = Orchestrator(
    gitlab_url="https://gitlab.com",
    gitlab_token="glpat-xxx",
    post_comments=True,
)

# Analyze by project + MR IID
report = orchestrator.analyze_mr("group/project", 42)

# Analyze by URL
report = orchestrator.analyze_mr_from_url(
    "https://gitlab.com/group/project/-/merge_requests/42"
)

print(f"Risk: {report.overall_risk.value}")
print(f"Findings: {len(report.findings)}")

Development

git clone https://github.com/namdpran8/mr-ninja.git
cd mr-ninja
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov=mr_ninja --cov-report=term-missing

# Lint
ruff check src/

# Type check
mypy src/mr_ninja/core/

See CONTRIBUTING.md for packaging, publishing, and release guides.


Tech Stack

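| Component     | Technology                           |
|---------------|--------------------------------------|
| Language      | Python 3.11+                         |
| API Framework | FastAPI                              |
| Data Models   | Pydantic v2                          |
| HTTP Client   | urllib (stdlib -- zero dependencies) |
| CI/CD         | GitHub Actions                       |
| Container     | Docker                               |
| Testing       | pytest + pytest-cov                  |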

License

Copyright 2026 Pranshu Namdeo and Chukwunonso Richard Iwenor

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
