
Scalene-MCP

A FastMCP v2 server providing LLMs with structured access to Scalene's comprehensive CPU, GPU, and memory profiling capabilities.

Installation

Prerequisites

  • Python 3.10+
  • uv (recommended) or pip

From Source

git clone https://github.com/plasma-umass/scalene-mcp.git
cd scalene-mcp
uv venv
uv sync

As a Package

pip install scalene-mcp

Quick Start: Running the Server

Development Mode

# Using uv
uv run python -m scalene_mcp.server

# Using pip
python -m scalene_mcp.server

Production Mode

python -m scalene_mcp.server

🎯 Native Integration with VSCode LLM Editors

Works seamlessly with VSCode-based LLM editors, including GitHub Copilot, Claude Code, and Cursor.

Zero-Friction Setup (3 Steps)

  1. Install

    pip install scalene-mcp
    
  2. Configure - Choose one method:

    Automated (Recommended):

    python scripts/setup_vscode.py
    

    Interactive setup script auto-finds your editor and configures it.

    Manual - GitHub Copilot:

    // .vscode/settings.json
    {
      "github.copilot.chat.mcp.servers": {
        "scalene": {
          "command": "uv",
          "args": ["run", "-m", "scalene_mcp.server"]
        }
      }
    }
    

    Manual - Claude Code / Cursor: See editor-specific setup guides

  3. Restart VSCode/Cursor and start profiling!

Start Profiling Immediately

Open any Python project and ask your LLM:

"Profile main.py and show me the bottlenecks"

The LLM automatically:

  • ๐Ÿ” Detects your project structure
  • ๐Ÿ“„ Finds and profiles your code
  • ๐Ÿ“Š Analyzes CPU, memory, GPU usage
  • ๐Ÿ’ก Suggests optimizations

No fiddling with paths. No manual configuration. Zero friction.

📚 Full docs: SETUP_VSCODE.md | QUICKSTART.md | TOOLS_REFERENCE.md

Available Serving Methods (FastMCP)

Scalene-MCP can be served in multiple ways using FastMCP's built-in serving capabilities:

1. Standard Server (Default)

# Starts an MCP-compatible server on stdio
python -m scalene_mcp.server

2. With Claude Desktop

Configure in your claude_desktop_config.json:

{
  "mcpServers": {
    "scalene": {
      "command": "python",
      "args": ["-m", "scalene_mcp.server"]
    }
  }
}

Then restart Claude Desktop.

3. With HTTP/SSE Endpoint

# FastMCP also supports HTTP/SSE transports; see the FastMCP docs, e.g.:
fastmcp run src/scalene_mcp/server.py --transport sse

4. With Environment Variables

# Configure via environment
export SCALENE_PYTHON_EXECUTABLE=python3.11
export SCALENE_TIMEOUT=30
python -m scalene_mcp.server

5. Programmatically

from scalene_mcp.server import create_scalene_server

# Create and run the server programmatically (stdio transport by default)
server = create_scalene_server()
server.run()

Programmatic Usage

Use Scalene-MCP directly in your Python code:

from scalene_mcp.profiler import ScaleneProfiler
import asyncio

async def main():
    profiler = ScaleneProfiler()
    
    # Profile a script
    result = await profiler.profile(
        type="script",
        script_path="fibonacci.py",
        include_memory=True,
        include_gpu=False
    )
    
    print(f"Profile ID: {result['profile_id']}")
    print(f"Peak memory: {result['summary'].get('total_memory_mb', 'N/A')}MB")
    
asyncio.run(main())

Overview

Scalene-MCP transforms Scalene's powerful profiling output into an LLM-friendly format through a clean, minimal set of well-designed tools. Get detailed performance insights without images or excessive context overhead.

What Scalene-MCP Does

  • ✅ Profile Python scripts with full Scalene feature set
  • ✅ Analyze profiles for hotspots, bottlenecks, memory leaks
  • ✅ Compare profiles to detect regressions
  • ✅ Pass arguments to profiled scripts
  • ✅ Structured output in JSON format for LLMs
  • ✅ Async execution for non-blocking profiling

What Scalene-MCP Doesn't Do

  • โŒ In-process profiling (Scalene.start()/stop()) - uses subprocess instead for isolation
  • โŒ Process attachment (--pid based profiling) - profiles scripts, not running processes
  • โŒ Single-function profiling - designed for complete script analysis

Note: The subprocess-based approach was chosen for reliability and simplicity. LLM workflows typically profile complete scripts, which is a perfect fit. See SCALENE_MODES_ANALYSIS.md for detailed scope analysis.

Key Features

  • Complete CPU profiling: Line-by-line Python/C time, system time, CPU utilization
  • Memory profiling: Peak/average memory per line, leak detection with velocity metrics
  • GPU profiling: NVIDIA and Apple GPU support with per-line attribution
  • Advanced analysis: Stack traces, bottleneck identification, performance recommendations
  • Profile comparison: Track performance changes across runs
  • LLM-optimized: Structured JSON output, summaries before details, context-aware formatting

Available Tools (7 Consolidated Tools)

Scalene-MCP provides a clean, LLM-optimized set of 7 tools:

Discovery (3 tools)

  • get_project_root() - Auto-detect project structure
  • list_project_files(pattern, max_depth) - Find files by glob pattern
  • set_project_context(project_root) - Override auto-detection

Profiling (1 unified tool)

  • profile(type, script_path/code, ...) - Profile scripts or code snippets
    • type="script" for script profiling
    • type="code" for code snippet profiling

Analysis (1 mega tool)

  • analyze(profile_id, metric_type, ...) - 9 analysis modes in one tool:
    • metric_type="all" - Comprehensive analysis
    • metric_type="cpu" - CPU hotspots
    • metric_type="memory" - Memory hotspots
    • metric_type="gpu" - GPU hotspots
    • metric_type="bottlenecks" - Performance bottlenecks
    • metric_type="leaks" - Memory leak detection
    • metric_type="file" - File-level metrics
    • metric_type="functions" - Function-level metrics
    • metric_type="recommendations" - Optimization suggestions

Comparison & Storage (2 tools)

  • compare_profiles(before_id, after_id) - Compare two profiles
  • list_profiles() - View all captured profiles

Full reference: See TOOLS_REFERENCE.md
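Put together, the tools above form a natural profile → analyze → compare loop. The sketch below shows that sequence against a generic `call_tool` coroutine standing in for whatever your MCP client provides (the stub and its return shape are illustrative, not the server's actual responses):

```python
import asyncio

async def profiling_session(call_tool):
    """Sketch of a typical tool sequence; call_tool(name, arguments)
    stands in for the MCP client's tool-invocation method."""
    # 1. Profile the script before and after a change
    before = await call_tool("profile", {"type": "script", "script_path": "main.py"})
    after = await call_tool("profile", {"type": "script", "script_path": "main.py"})

    # 2. Ask for CPU hotspots in the first run
    hotspots = await call_tool("analyze", {
        "profile_id": before["profile_id"],
        "metric_type": "cpu",
    })

    # 3. Compare the two runs for regressions
    diff = await call_tool("compare_profiles", {
        "before_id": before["profile_id"],
        "after_id": after["profile_id"],
    })
    return hotspots, diff

# Stub client that just echoes the request back, so the flow is runnable here
async def _stub(name, arguments):
    return {"profile_id": "p1", "tool": name, **arguments}

hotspots, diff = asyncio.run(profiling_session(_stub))
```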

Configuration

Profiling Options

The unified profile() tool supports these options:

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| type | str | required | "script" or "code" |
| script_path | str | None | Required if type="script" |
| code | str | None | Required if type="code" |
| include_memory | bool | true | Profile memory |
| include_gpu | bool | false | Profile GPU usage |
| cpu_only | bool | false | Skip memory/GPU profiling |
| reduced_profile | bool | false | Only report high-activity lines |
| cpu_percent_threshold | float | 1.0 | Minimum CPU% to report |
| malloc_threshold | int | 100 | Minimum allocation size (bytes) |
| profile_only | str | "" | Profile only paths containing this |
| profile_exclude | str | "" | Exclude paths containing this |
| use_virtual_time | bool | false | Use virtual time instead of wall time |
| script_args | list | [] | Command-line arguments for the script |
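As a concrete illustration, here is one plausible argument set for the unified profile() tool, mirroring the options above (the script name and values are made up for the example):

```python
# Illustrative argument set for the profile() tool; defaults shown explicitly.
profile_args = {
    "type": "script",
    "script_path": "train.py",
    "include_memory": True,
    "include_gpu": False,
    "cpu_only": False,
    "reduced_profile": True,           # only report high-activity lines
    "cpu_percent_threshold": 2.0,      # raise the reporting bar to 2% CPU
    "malloc_threshold": 100,
    "profile_only": "src/",            # restrict profiling to project code
    "profile_exclude": "",
    "use_virtual_time": False,
    "script_args": ["--epochs", "3"],  # forwarded to train.py's argv
}
```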

Environment Variables

  • SCALENE_CPU_PERCENT_THRESHOLD: Override default CPU threshold
  • SCALENE_MALLOC_THRESHOLD: Override default malloc threshold
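A server honoring these overrides would typically fall back to the documented defaults when the variables are unset; a minimal sketch (helper names are illustrative):

```python
import os

def cpu_percent_threshold() -> float:
    # Fall back to the documented default of 1.0% when unset
    return float(os.environ.get("SCALENE_CPU_PERCENT_THRESHOLD", "1.0"))

def malloc_threshold() -> int:
    # Fall back to the documented default of 100 bytes when unset
    return int(os.environ.get("SCALENE_MALLOC_THRESHOLD", "100"))
```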

Architecture

Components

  • ScaleneProfiler: Async wrapper around Scalene CLI
  • ProfileParser: Converts Scalene JSON to structured models
  • ProfileAnalyzer: Extracts insights and hotspots
  • ProfileComparator: Compares profiles for regressions
  • FastMCP Server: Exposes tools via MCP protocol

Data Flow

Python Script
    ↓
ScaleneProfiler (subprocess)
    ↓
Scalene CLI (--json)
    ↓
Temp JSON File
    ↓
ProfileParser
    ↓
Pydantic Models (ProfileResult)
    ↓
Analyzer / Comparator
    ↓
MCP Tools
    ↓
LLM Client

Troubleshooting

GPU Permission Error

If you see PermissionError when profiling with GPU:

# Disable GPU profiling in test environments
result = await profiler.profile(
    type="script",
    script_path="script.py",
    include_gpu=False
)

Profile Not Found

Profiles are stored in memory during the server session. For persistence, implement the storage interface.
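The storage interface is not specified here; one plausible shape, sketched with the stdlib only (the Protocol and class names are illustrative, not the project's actual API):

```python
import json
from pathlib import Path
from typing import Protocol

class ProfileStorage(Protocol):
    """Hypothetical persistence interface for captured profiles."""
    def save(self, profile_id: str, data: dict) -> None: ...
    def load(self, profile_id: str) -> dict: ...

class JsonFileStorage:
    """Persist each profile as <profile_id>.json under a directory."""
    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, profile_id: str, data: dict) -> None:
        (self.root / f"{profile_id}.json").write_text(json.dumps(data))

    def load(self, profile_id: str) -> dict:
        return json.loads((self.root / f"{profile_id}.json").read_text())
```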

Timeout Issues

Adjust the timeout parameter (if using the profiler directly):

result = await profiler.profile(
    type="script",
    script_path="slow_script.py",
    timeout=60  # seconds
)

Development

Running Tests

# All tests with coverage
uv run pytest -v --cov=src/scalene_mcp

# Specific test file
uv run pytest tests/test_profiler.py -v

# With coverage report
uv run pytest --cov=src/scalene_mcp --cov-report=html

Code Quality

# Type checking
uv run mypy src/

# Linting
uv run ruff check src/

# Formatting
uv run ruff format src/

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass and coverage ≥ 85%
  5. Submit a pull request

License

MIT License - see LICENSE file for details.

Citation

If you use Scalene-MCP in research, please cite both this project and Scalene:

@software{scalene_mcp,
  title={Scalene-MCP: LLM-Friendly Profiling Server},
  year={2026}
}

@inproceedings{berger2020scalene,
  title={Scalene: Scripting-Language Aware Profiling for Python},
  author={Berger, Emery},
  year={2020}
}

Support

  • Issues: GitHub Issues for bug reports and feature requests
  • Discussions: GitHub Discussions for questions and ideas
  • Documentation: See docs/ directory

Made with ❤️ for the Python performance community.

Manual Installation

pip install -e .

Development

Prerequisites

  • Python 3.10+
  • uv (recommended) or pip

Setup

# Install dependencies
uv sync

# Run tests
just test

# Run tests with coverage
just test-cov

# Lint and format
just lint
just format

# Type check
just typecheck

# Full build (sync + lint + typecheck + test)
just build

Project Structure

scalene-mcp/
├── src/scalene_mcp/     # Main package
│   ├── server.py        # FastMCP server with tools/resources/prompts
│   ├── models.py        # Pydantic data models
│   ├── profiler.py      # Scalene execution wrapper
│   ├── parser.py        # JSON output parser
│   ├── analyzer.py      # Analysis engine
│   ├── comparator.py    # Profile comparison
│   ├── recommender.py   # Optimization recommendations
│   ├── storage.py       # Profile persistence
│   └── utils.py         # Shared utilities
├── tests/               # Test suite (100% coverage goal)
│   ├── fixtures/        # Test data
│   │   ├── profiles/    # Sample profile outputs
│   │   └── scripts/     # Test Python scripts
│   └── conftest.py      # Shared test fixtures
├── examples/            # Usage examples
├── docs/                # Documentation
├── pyproject.toml       # Project configuration
├── justfile             # Task runner commands
└── README.md            # This file

Usage

Running the Server

# Development mode with auto-reload
fastmcp dev src/scalene_mcp/server.py

# Production mode
fastmcp run src/scalene_mcp/server.py

# Install to MCP config
fastmcp install src/scalene_mcp/server.py

Example: Profile a Script

# Through MCP client
result = await client.call_tool(
    "profile",
    arguments={
        "type": "script",
        "script_path": "my_script.py",
        "include_memory": True,
        "include_gpu": False,
    }
)

Example: Analyze Results

# Get analysis and recommendations
analysis = await client.call_tool(
    "analyze",
    arguments={"profile_id": result["profile_id"], "metric_type": "all"}
)

Testing

The project maintains 100% test coverage with comprehensive test suites:

# Run all tests
uv run pytest

# Run with coverage report
uv run pytest --cov=src --cov-report=html

# Run specific test file
uv run pytest tests/test_server.py

# Run with verbose output
uv run pytest -v

Test fixtures include:

  • Sample profiling scripts (fibonacci, memory-intensive, leaky)
  • Realistic Scalene JSON outputs
  • Edge cases and error conditions

Code Quality

This project follows strict code quality standards:

  • Type Safety: 100% mypy strict mode compliance
  • Linting: ruff with comprehensive rules
  • Testing: 100% coverage requirement
  • Style: Sleek-modern documentation, minimal functional emoji usage
  • Patterns: FastMCP best practices throughout

Development Phases

Current Status: Phase 1.1 - Project Setup ✓

Documentation

Editor Setup Guides:

API & Usage:

Development Roadmap

  1. Phase 1: Project Setup & Infrastructure ✓
  2. Phase 2: Core Data Models (In Progress)
  3. Phase 3: Profiler Integration
  4. Phase 4: Analysis & Insights
  5. Phase 5: Comparison Features
  6. Phase 6: Resources Implementation
  7. Phase 7: Prompts & Workflows
  8. Phase 8: Testing & Quality
  9. Phase 9: Documentation
  10. Phase 10: Polish & Release

See development-plan.md for detailed roadmap.

Contributing

Contributions are welcome! Please ensure:

  • All tests pass (just test)
  • Linting passes (just lint)
  • Type checking passes (just typecheck)
  • Code coverage remains at 100%

License

MIT License - see LICENSE file for details.
