VGGT 3D reconstruction optimized for Apple Silicon with sparse attention

VGGT-MPS: 3D Vision Agent for Apple Silicon

๐ŸŽ VGGT (Visual Geometry Grounded Transformer) optimized for Apple Silicon with Metal Performance Shaders (MPS)

Transform single or multi-view images into rich 3D reconstructions using Facebook Research's VGGT model, now accelerated on M1/M2/M3 Macs.

🎉 Release v2.0.0

Major Update: Complete packaging overhaul with unified CLI, PyPI-ready distribution, and production-grade tooling!

✨ What's New in v2.0.0

🎯 Major Changes

  • Unified CLI: New vggt command with subcommands for all operations
  • Professional Packaging: PyPI-ready with pyproject.toml, proper src layout
  • Web Interface: Gradio UI for interactive 3D reconstruction (vggt web)
  • Enhanced Testing: Comprehensive test suite with MPS and sparse attention tests
  • Modern Tooling: UV support, Makefile automation, GitHub Actions CI/CD

🚀 Core Features

  • MPS Acceleration: Full GPU acceleration on Apple Silicon using Metal Performance Shaders
  • ⚡ Sparse Attention: O(n) memory scaling for city-scale reconstruction (up to 100x savings at 1,000 images)
  • 🎥 Multi-View 3D Reconstruction: Generate depth maps, point clouds, and camera poses from images
  • 🔧 MCP Integration: Model Context Protocol server for Claude Desktop integration
  • 📦 5GB Model: Efficient 1B parameter model that runs smoothly on Apple Silicon
  • 🛠️ Multiple Export Formats: PLY, OBJ, GLB for 3D point clouds
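
PLY is the simplest of the export formats listed above. As a rough illustration of what such a file contains, the sketch below writes an ASCII PLY point cloud using only the standard library. The function name `write_ascii_ply` and the demo points are hypothetical; this is not the project's own exporter.

```python
import os
import tempfile


def write_ascii_ply(path, points):
    """Write (x, y, z, r, g, b) tuples as an ASCII PLY point cloud."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    with open(path, "w", encoding="ascii") as f:
        f.write("\n".join(header) + "\n")
        for x, y, z, r, g, b in points:
            # One vertex per line: position as floats, color as 0-255 integers.
            f.write(f"{x} {y} {z} {int(r)} {int(g)} {int(b)}\n")


# Hypothetical demo: two colored points written to a temporary file.
demo_path = os.path.join(tempfile.gettempdir(), "demo_points.ply")
write_ascii_ply(demo_path, [(0.0, 0.0, 1.0, 255, 0, 0), (0.5, 0.0, 1.2, 0, 255, 0)])
```

Most point-cloud viewers should be able to open a file written this way; binary PLY, OBJ, and GLB need more structure than this sketch covers.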

🎯 What VGGT Does

VGGT reconstructs 3D scenes from images by predicting:

  • Depth Maps: Per-pixel depth estimation
  • Camera Poses: 6DOF camera parameters
  • 3D Point Clouds: Dense 3D reconstruction
  • Confidence Maps: Reliability scores for predictions
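
To make the relationship between these outputs concrete, the sketch below back-projects a depth map into camera-space 3D points with a pinhole model and drops low-confidence pixels. The intrinsics (`fx`, `fy`, `cx`, `cy`), the function names, and the 0.5 threshold are illustrative assumptions, not VGGT's actual code path.

```python
def unproject_pixel(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with a depth value into camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def depth_map_to_points(depth_map, conf_map, fx, fy, cx, cy, min_conf=0.5):
    """Turn a depth map (rows of floats) into a list of 3D points.

    Pixels whose confidence is below min_conf, or whose depth is not
    positive, are skipped.
    """
    points = []
    for v, (depth_row, conf_row) in enumerate(zip(depth_map, conf_map)):
        for u, (d, c) in enumerate(zip(depth_row, conf_row)):
            if c >= min_conf and d > 0:
                points.append(unproject_pixel(u, v, d, fx, fy, cx, cy))
    return points


# Hypothetical 1x2 depth map: the second pixel is filtered out by confidence.
points = depth_map_to_points([[1.0, 2.0]], [[0.9, 0.1]], fx=1.0, fy=1.0, cx=0.0, cy=0.0)
```

Moving these points into a shared world frame would additionally need each image's camera pose, which this sketch leaves out.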

📋 Requirements

  • Apple Silicon Mac (M1/M2/M3)
  • Python 3.10+
  • 8GB+ RAM
  • 6GB disk space for model

🚀 Quick Start

1. Installation Options

Option A: Install from PyPI (Coming Soon)

# Install from PyPI (when published)
pip install vggt-mps

# Download model weights (5GB)
vggt download

Option B: Install from Source with UV (Recommended for Development)

git clone https://github.com/jmanhype/vggt-mps.git
cd vggt-mps

# Install with uv (10-100x faster than pip!)
make install

# Or manually with uv
uv pip install -e .

Option C: Traditional pip install from Source

git clone https://github.com/jmanhype/vggt-mps.git
cd vggt-mps

# Create virtual environment
python -m venv vggt-env
source vggt-env/bin/activate

# Install dependencies
pip install -r requirements.txt

2. Download Model Weights

# Download the 5GB VGGT model
vggt download

# Or if running from source:
python main.py download

Or manually download from Hugging Face

3. Test MPS Support

# Test MPS acceleration
vggt test --suite mps

# Or from source:
python main.py test --suite mps

Expected output:

✅ MPS (Metal Performance Shaders) available!
   Running on Apple Silicon GPU
✅ Model weights loaded to mps
✅ MPS operations working correctly!

4. Setup Environment (Optional)

# Copy environment configuration
cp .env.example .env

# Edit .env with your settings
nano .env

📖 Usage

CLI Commands (v2.0.0)

All functionality is accessible through the unified vggt command:

# Quick demo with sample images
vggt demo

# Demo with kitchen dataset (4 images)
vggt demo --kitchen --images 4

# Process your own images
vggt reconstruct data/*.jpg

# Use sparse attention for large scenes
vggt reconstruct --sparse data/*.jpg

# Export to specific format
vggt reconstruct --export ply data/*.jpg

# Launch interactive web interface
vggt web

# Open on specific port with public link
vggt web --port 8080 --share

# Run comprehensive tests
vggt test --suite all

# Test sparse attention specifically
vggt test --suite sparse

# Benchmark performance
vggt benchmark --compare

# Download model weights
vggt download

From Source (Development)

If running from source without installation:

python main.py demo
python main.py reconstruct data/*.jpg
python main.py web
python main.py test --suite mps
python main.py benchmark --compare

🔧 MCP Server Integration

Add to Claude Desktop

  1. Edit ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "vggt-agent": {
      "command": "uv",
      "args": [
        "run",
        "--python",
        "/path/to/vggt-mps/vggt-env/bin/python",
        "--with",
        "fastmcp",
        "fastmcp",
        "run",
        "/path/to/vggt-mps/src/vggt_mps_mcp.py"
      ]
    }
  }
}
  2. Restart Claude Desktop
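
If the config file already lists other MCP servers, hand-editing risks clobbering them. The sketch below merges the `vggt-agent` entry shown above into an existing config with the standard `json` module. `add_vggt_server` is a hypothetical helper, and the demo writes to a throwaway file rather than the real Claude Desktop config.

```python
import json
import tempfile
from pathlib import Path


def add_vggt_server(config_path, repo_dir):
    """Insert or update the vggt-agent entry, keeping any other servers intact."""
    config_path = Path(config_path)
    # Assumes an existing file holds valid JSON; a missing file starts from {}.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["vggt-agent"] = {
        "command": "uv",
        "args": [
            "run", "--python", f"{repo_dir}/vggt-env/bin/python",
            "--with", "fastmcp", "fastmcp", "run",
            f"{repo_dir}/src/vggt_mps_mcp.py",
        ],
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a temporary file that already contains another server.
demo_config = Path(tempfile.mkdtemp()) / "claude_desktop_config.json"
demo_config.write_text('{"mcpServers": {"other-server": {"command": "echo"}}}')
merged = add_vggt_server(demo_config, "/path/to/vggt-mps")
print(sorted(merged["mcpServers"]))
```

Replace `/path/to/vggt-mps` with the location of your clone before pointing the helper at the real config file.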

Available MCP Tools

  • vggt_quick_start_inference - Quick 3D reconstruction from images
  • vggt_extract_video_frames - Extract frames from video
  • vggt_process_images - Full VGGT pipeline
  • vggt_create_3d_scene - Generate GLB 3D files
  • vggt_reconstruct_3d_scene - Multi-view reconstruction
  • vggt_visualize_reconstruction - Create visualizations

๐Ÿ“ Project Structure

vggt-mps/
├── main.py                      # Single entry point
├── setup.py                     # Package installation
├── requirements.txt             # Dependencies
├── .env.example                 # Environment configuration
│
├── src/                         # Source code
│   ├── config.py                # Centralized configuration
│   ├── vggt_core.py             # Core VGGT processing
│   ├── vggt_sparse_attention.py # Sparse attention (O(n) scaling)
│   ├── visualization.py         # 3D visualization utilities
│   │
│   ├── commands/                # CLI commands
│   │   ├── demo.py              # Demo command
│   │   ├── reconstruct.py       # Reconstruction command
│   │   ├── test_runner.py       # Test runner
│   │   ├── benchmark.py         # Performance benchmarking
│   │   └── web_interface.py     # Gradio web app
│   │
│   └── utils/                   # Utilities
│       ├── model_loader.py      # Model management
│       ├── image_utils.py       # Image processing
│       └── export.py            # Export to PLY/OBJ/GLB
│
├── tests/                       # Organized test suite
│   ├── test_mps.py              # MPS functionality tests
│   ├── test_sparse.py           # Sparse attention tests
│   └── test_integration.py      # End-to-end tests
│
├── data/                        # Input data directory
├── outputs/                     # Output directory
├── models/                      # Model storage
│
├── docs/                        # Documentation
│   ├── API.md                   # API documentation
│   ├── SPARSE_ATTENTION.md      # Technical details
│   └── BENCHMARKS.md            # Performance results
│
└── LICENSE                      # MIT License

🖼️ Usage Examples

Process Images

from src.tools.readme import vggt_quick_start_inference

result = vggt_quick_start_inference(
    image_directory="./tmp/inputs",
    device="mps",  # Use Apple Silicon GPU
    max_images=4,
    save_outputs=True
)

Extract Video Frames

from src.tools.demo_gradio import vggt_extract_video_frames

result = vggt_extract_video_frames(
    video_path="input_video.mp4",
    frame_interval_seconds=1.0
)
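
For intuition about what `frame_interval_seconds` implies, the sketch below converts a sampling interval into frame indices for a clip with a known frame rate. `frame_indices` is a hypothetical helper and is not taken from the tool's implementation, which may sample differently.

```python
def frame_indices(total_frames, fps, interval_seconds):
    """Return the indices of frames sampled every interval_seconds."""
    # Never step by less than one frame, even for very short intervals.
    step = max(1, round(fps * interval_seconds))
    return list(range(0, total_frames, step))


# A 3-second clip at 30 fps sampled once per second keeps three frames.
print(frame_indices(90, 30.0, 1.0))
```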

Create 3D Scene

from src.tools.demo_viser import vggt_reconstruct_3d_scene

result = vggt_reconstruct_3d_scene(
    images_dir="./tmp/inputs",
    device_type="mps",
    confidence_threshold=0.5
)

⚡ Sparse Attention - NEW!

City-scale 3D reconstruction is now possible! We've implemented Gabriele Berton's research idea for O(n) memory scaling.

🎯 Key Benefits

  • 100x memory savings for 1000 images
  • No retraining required - patches existing VGGT at runtime
  • Identical outputs to regular VGGT (0.000000 difference)
  • MegaLoc covisibility detection for smart attention masking
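
As a toy illustration of covisibility-based masking, the sketch below lets each image attend only to itself and its `k` most similar images, ranked by cosine similarity of per-image descriptors. The hand-written descriptors stand in for real retrieval features, and `covisibility_mask` is a hypothetical name; the project's actual masking logic may differ.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def covisibility_mask(descriptors, k):
    """Boolean n x n mask: True where image i may attend to image j."""
    n = len(descriptors)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        mask[i][i] = True  # every image always attends to itself
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(descriptors[i], descriptors[j]),
            reverse=True,
        )
        for j in ranked[:k]:
            mask[i][j] = True
            mask[j][i] = True  # keep the mask symmetric
    return mask


# Two pairs of near-duplicate views: each image is linked only to its partner.
toy_descriptors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_mask = covisibility_mask(toy_descriptors, k=1)
```

With a fixed `k`, the number of allowed pairs grows roughly linearly with the number of images, which is where the O(n) scaling comes from.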

🚀 Usage

from src.vggt_sparse_attention import make_vggt_sparse

# Convert any VGGT to sparse in 1 line
sparse_vggt = make_vggt_sparse(regular_vggt, device="mps")

# Same usage, O(n) memory instead of O(n²)
output = sparse_vggt(images)  # Handles 1000+ images!

📊 Memory Scaling

Images   Regular    Sparse    Savings
100      O(10K)     O(1K)     10x
500      O(250K)    O(5K)     50x
1000     O(1M)      O(10K)    100x

See full results: docs/SPARSE_ATTENTION_RESULTS.md
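
The figures in the table are consistent with dense attention costing n × n image pairs and sparse attention costing about 10 × n. The sketch below reproduces them under that reading; the neighbor count of 10 is inferred from the table's ratios, not taken from the implementation.

```python
def attention_pairs(n_images, neighbors=None):
    """Count image-to-image attention pairs.

    Dense attention compares every image with every image (n * n).
    Sparse attention compares each image with a fixed number of neighbors.
    """
    if neighbors is None:
        return n_images * n_images
    return n_images * min(neighbors, n_images)


for n in (100, 500, 1000):
    dense = attention_pairs(n)
    sparse = attention_pairs(n, neighbors=10)  # assumed neighbor budget
    print(f"{n:>5} images: dense={dense:>9,} sparse={sparse:>7,} savings={dense // sparse}x")
```

Because the sparse cost is linear in n, the savings factor itself grows linearly (n / 10 under this assumption), so the benefit is modest for small scenes and large for city-scale ones.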

🔬 Technical Details

MPS Optimizations

  • Device Detection: Auto-detects MPS availability
  • Dtype Selection: Uses float32 for optimal MPS performance
  • Autocast Handling: CUDA autocast disabled for MPS
  • Memory Management: Efficient tensor operations on Metal

Model Architecture

  • Parameters: 1B (5GB on disk)
  • Input: Multi-view images
  • Output: Depth, camera poses, 3D points
  • Resolution: 518x518 (VGGT processing resolution), up to 1024x1024 (input images)

๐Ÿ› Troubleshooting

MPS Not Available

# Check PyTorch MPS support
python -c "import torch; print(torch.backends.mps.is_available())"

Model Loading Issues

# Verify model file
ls -lh repo/vggt/vggt_model.pt
# Should show ~5GB file

Memory Issues

  • Reduce batch size
  • Lower resolution
  • Use CPU fallback
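
The first two mitigations can be applied before images ever reach the model. The sketch below splits an image list into smaller batches and computes a reduced resolution that preserves aspect ratio. Both helpers are hypothetical, and rounding to a multiple of 14 is an assumption based on the 518-pixel model resolution (518 = 37 × 14).

```python
def chunk(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def fit_resolution(width, height, max_side=518, multiple=14):
    """Shrink (width, height) so the longer side is at most max_side.

    Both sides are rounded down to a multiple of `multiple`; images that
    are already small enough are never upscaled.
    """
    scale = min(1.0, max_side / max(width, height))
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h


print(chunk(["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"], 2))
print(fit_resolution(1024, 768))
```

Lowering `max_side` trades reconstruction detail for memory, so it is worth trying smaller batches first.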

🚀 Release Notes

v2.0.0 (Latest)

  • ✨ Unified CLI with vggt command
  • 📦 Professional Python packaging (PyPI-ready)
  • 🌐 Gradio web interface
  • 🧪 Comprehensive test suite
  • 🛠️ Modern tooling (UV, Makefile, GitHub Actions)
  • 📝 Complete documentation overhaul

See full changelog

๐Ÿค Contributing

We follow a lightweight Git Flow:

  • main holds the latest stable release and is protected.
  • develop is the default integration branch for day-to-day work.

When contributing:

  1. Create your feature branch from develop (git switch develop && git switch -c feature/my-change).
  2. Keep commits focused and include tests or documentation updates when relevant.
  3. Open your pull request against develop; maintainers will promote changes to main during releases.

Please open issues for bugs or feature requests before starting large efforts. Full details, testing expectations, and the release process live in CONTRIBUTING.md.

📄 License

MIT License - See LICENSE file for details

๐Ÿ™ Acknowledgments

  • Facebook Research for VGGT
  • Apple for Metal Performance Shaders
  • PyTorch team for MPS backend

Made with 🍎 for Apple Silicon by the AI community
