VGGT 3D reconstruction optimized for Apple Silicon with sparse attention
VGGT-MPS: 3D Vision Agent for Apple Silicon
VGGT (Visual Geometry Grounded Transformer) optimized for Apple Silicon with Metal Performance Shaders (MPS)
Transform single or multi-view images into rich 3D reconstructions using Facebook Research's VGGT model, now accelerated on M1/M2/M3 Macs.
Release v2.0.0
Major Update: Complete packaging overhaul with unified CLI, PyPI-ready distribution, and production-grade tooling!
What's New in v2.0.0
Major Changes
- Unified CLI: New vggt command with subcommands for all operations
- Professional Packaging: PyPI-ready with pyproject.toml and a proper src layout
- Web Interface: Gradio UI for interactive 3D reconstruction (vggt web)
- Enhanced Testing: Comprehensive test suite with MPS and sparse attention tests
- Modern Tooling: UV support, Makefile automation, GitHub Actions CI/CD
Core Features
- MPS Acceleration: Full GPU acceleration on Apple Silicon using Metal Performance Shaders
- Sparse Attention: O(n) memory scaling for city-scale reconstruction (about 100x savings at 1000 images)
- Multi-View 3D Reconstruction: Generate depth maps, point clouds, and camera poses from images
- MCP Integration: Model Context Protocol server for Claude Desktop integration
- 5GB Model: Efficient 1B parameter model that runs smoothly on Apple Silicon
- Multiple Export Formats: PLY, OBJ, GLB for 3D point clouds
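To make the PLY export format concrete, here is a minimal sketch of an ASCII PLY point cloud writer. It is not the project's exporter: the function name `write_ascii_ply`, the sample points, and the output path are all made up for illustration.

```python
from pathlib import Path
import tempfile


def write_ascii_ply(path, points):
    """Write (x, y, z, r, g, b) tuples as an ASCII PLY point cloud."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    rows = [f"{x:.6f} {y:.6f} {z:.6f} {r} {g} {b}" for x, y, z, r, g, b in points]
    Path(path).write_text("\n".join(header + rows) + "\n")


# Two hypothetical coloured points, written to the system temp directory.
demo_points = [(0.0, 0.0, 1.0, 255, 0, 0), (0.5, -0.25, 2.0, 0, 255, 0)]
out_path = Path(tempfile.gettempdir()) / "vggt_demo_points.ply"
write_ascii_ply(out_path, demo_points)
ply_lines = out_path.read_text().splitlines()
print(ply_lines[2])
```

Most point cloud viewers should be able to open a file in this layout. Binary PLY is more compact for large clouds but needs more careful byte handling.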
What VGGT Does
VGGT reconstructs 3D scenes from images by predicting:
- Depth Maps: Per-pixel depth estimation
- Camera Poses: 6DOF camera parameters
- 3D Point Clouds: Dense 3D reconstruction
- Confidence Maps: Reliability scores for predictions
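The outputs listed above relate to one another through standard camera geometry. The sketch below back-projects a depth map into 3D points with a pinhole camera model and drops low-confidence pixels. It illustrates the idea only: the function name, the toy intrinsics (`fx`, `fy`, `cx`, `cy`), and the assumption that confidence lies in [0, 1] are not taken from the project.

```python
def unproject_depth(depth, conf, fx, fy, cx, cy, min_conf=0.5):
    """Back-project a depth map into camera-frame 3D points (pinhole model)."""
    points = []
    for v, (depth_row, conf_row) in enumerate(zip(depth, conf)):
        for u, (z, c) in enumerate(zip(depth_row, conf_row)):
            if c < min_conf or z <= 0:
                continue  # skip unreliable or invalid pixels
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth and confidence maps (made-up values).
depth = [[2.0, 2.0], [4.0, 4.0]]
conf = [[0.9, 0.1], [0.8, 0.7]]
cloud = unproject_depth(depth, conf, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(cloud)
```

Points produced this way are in the camera's own frame. Moving them into a shared world frame would additionally need the predicted camera pose for each image.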
Requirements
- Apple Silicon Mac (M1/M2/M3)
- Python 3.10+
- 8GB+ RAM
- 6GB disk space for model
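A quick way to check most of these requirements before installing is a small standard-library script. This is a sketch, not part of the package: `check_requirements` is a hypothetical helper, and the RAM check is left out because the standard library has no portable call for it.

```python
import platform
import shutil
import sys


def check_requirements(min_python=(3, 10), min_free_gb=6, path="."):
    """Return a dict mapping requirement name to whether it is satisfied."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return {
        "apple_silicon": platform.system() == "Darwin" and platform.machine() == "arm64",
        "python": sys.version_info >= min_python,
        "disk_space": free_gb >= min_free_gb,
    }


report = check_requirements()
for name, ok in report.items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```

Note that a Python interpreter running under Rosetta reports `x86_64` rather than `arm64`, so a failed `apple_silicon` check on an M-series Mac may point to the interpreter build rather than the hardware.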
Quick Start
1. Installation Options
Option A: Install from PyPI (Coming Soon)
# Install from PyPI (when published)
pip install vggt-mps
# Download model weights (5GB)
vggt download
Option B: Install from Source with UV (Recommended for Development)
git clone https://github.com/jmanhype/vggt-mps.git
cd vggt-mps
# Install with uv (10-100x faster than pip!)
make install
# Or manually with uv
uv pip install -e .
Option C: Traditional pip install from Source
git clone https://github.com/jmanhype/vggt-mps.git
cd vggt-mps
# Create virtual environment
python -m venv vggt-env
source vggt-env/bin/activate
# Install dependencies
pip install -r requirements.txt
2. Download Model Weights
# Download the 5GB VGGT model
vggt download
# Or if running from source:
python main.py download
Or manually download from Hugging Face
3. Test MPS Support
# Test MPS acceleration
vggt test --suite mps
# Or from source:
python main.py test --suite mps
Expected output:
✅ MPS (Metal Performance Shaders) available!
   Running on Apple Silicon GPU
✅ Model weights loaded to mps
✅ MPS operations working correctly!
4. Setup Environment (Optional)
# Copy environment configuration
cp .env.example .env
# Edit .env with your settings
nano .env
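If you want to inspect the resulting .env file from a script, a minimal parser is enough for plain KEY=VALUE lines. This sketch is not how the project loads its configuration, and the keys in the sample are placeholders, not real settings from .env.example.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# Hypothetical keys, for illustration only.
sample = """
# comment lines are skipped
EXAMPLE_DEVICE=mps
EXAMPLE_OUTPUT_DIR="outputs"
"""
settings = parse_env(sample)
print(settings)
```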
Usage
CLI Commands (v2.0.0)
All functionality is accessible through the unified vggt command:
# Quick demo with sample images
vggt demo
# Demo with kitchen dataset (4 images)
vggt demo --kitchen --images 4
# Process your own images
vggt reconstruct data/*.jpg
# Use sparse attention for large scenes
vggt reconstruct --sparse data/*.jpg
# Export to specific format
vggt reconstruct --export ply data/*.jpg
# Launch interactive web interface
vggt web
# Open on specific port with public link
vggt web --port 8080 --share
# Run comprehensive tests
vggt test --suite all
# Test sparse attention specifically
vggt test --suite sparse
# Benchmark performance
vggt benchmark --compare
# Download model weights
vggt download
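For batch jobs, the commands above can be driven from Python. The sketch below only builds and runs the reconstruct command with the flags shown in this section. The helper names are hypothetical, and whether --sparse and --export can be combined in one call is an assumption.

```python
import shutil
import subprocess


def build_reconstruct_cmd(images, sparse=False, export=None):
    """Assemble the argv list for `vggt reconstruct`."""
    cmd = ["vggt", "reconstruct"]
    if sparse:
        cmd.append("--sparse")
    if export:
        cmd += ["--export", export]
    return cmd + list(images)


def run_reconstruct(images, **options):
    """Run the CLI, failing early if it is not installed."""
    if shutil.which("vggt") is None:
        raise RuntimeError("vggt CLI not found on PATH; install the package first")
    return subprocess.run(build_reconstruct_cmd(images, **options), check=True)


cmd = build_reconstruct_cmd(["data/a.jpg", "data/b.jpg"], sparse=True, export="ply")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with image paths that contain spaces.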
From Source (Development)
If running from source without installation:
python main.py demo
python main.py reconstruct data/*.jpg
python main.py web
python main.py test --suite mps
python main.py benchmark --compare
MCP Server Integration
Add to Claude Desktop
- Edit ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "vggt-agent": {
      "command": "uv",
      "args": [
        "run",
        "--python",
        "/path/to/vggt-mps/vggt-env/bin/python",
        "--with",
        "fastmcp",
        "fastmcp",
        "run",
        "/path/to/vggt-mps/src/vggt_mps_mcp.py"
      ]
    }
  }
}
- Restart Claude Desktop
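If the config file already lists other MCP servers, the vggt-agent entry has to be merged in rather than pasted over them. The sketch below does that merge on an in-memory dict; `add_vggt_server` is a hypothetical helper and the existing "other-server" entry is invented for the example.

```python
import json


def add_vggt_server(config, repo_path):
    """Return a copy of a Claude Desktop config with the vggt-agent entry added."""
    servers = dict(config.get("mcpServers", {}))
    servers["vggt-agent"] = {
        "command": "uv",
        "args": [
            "run", "--python", f"{repo_path}/vggt-env/bin/python",
            "--with", "fastmcp", "fastmcp", "run",
            f"{repo_path}/src/vggt_mps_mcp.py",
        ],
    }
    return {**config, "mcpServers": servers}


# A made-up existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_vggt_server(existing, "/path/to/vggt-mps")
print(json.dumps(updated, indent=2))
```

To apply it for real, load the JSON file with `json.load`, pass the result through the helper, and write it back with `json.dump`, ideally after making a backup copy.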
Available MCP Tools
- vggt_quick_start_inference - Quick 3D reconstruction from images
- vggt_extract_video_frames - Extract frames from video
- vggt_process_images - Full VGGT pipeline
- vggt_create_3d_scene - Generate GLB 3D files
- vggt_reconstruct_3d_scene - Multi-view reconstruction
- vggt_visualize_reconstruction - Create visualizations
Project Structure
vggt-mps/
├── main.py                      # Single entry point
├── setup.py                     # Package installation
├── requirements.txt             # Dependencies
├── .env.example                 # Environment configuration
│
├── src/                         # Source code
│   ├── config.py                # Centralized configuration
│   ├── vggt_core.py             # Core VGGT processing
│   ├── vggt_sparse_attention.py # Sparse attention (O(n) scaling)
│   ├── visualization.py         # 3D visualization utilities
│   │
│   ├── commands/                # CLI commands
│   │   ├── demo.py              # Demo command
│   │   ├── reconstruct.py       # Reconstruction command
│   │   ├── test_runner.py       # Test runner
│   │   ├── benchmark.py         # Performance benchmarking
│   │   └── web_interface.py     # Gradio web app
│   │
│   └── utils/                   # Utilities
│       ├── model_loader.py      # Model management
│       ├── image_utils.py       # Image processing
│       └── export.py            # Export to PLY/OBJ/GLB
│
├── tests/                       # Organized test suite
│   ├── test_mps.py              # MPS functionality tests
│   ├── test_sparse.py           # Sparse attention tests
│   └── test_integration.py      # End-to-end tests
│
├── data/                        # Input data directory
├── outputs/                     # Output directory
├── models/                      # Model storage
│
├── docs/                        # Documentation
│   ├── API.md                   # API documentation
│   ├── SPARSE_ATTENTION.md      # Technical details
│   └── BENCHMARKS.md            # Performance results
│
└── LICENSE                      # MIT License
Usage Examples
Process Images
from src.tools.readme import vggt_quick_start_inference
result = vggt_quick_start_inference(
    image_directory="./tmp/inputs",
    device="mps",  # Use Apple Silicon GPU
    max_images=4,
    save_outputs=True,
)
Extract Video Frames
from src.tools.demo_gradio import vggt_extract_video_frames
result = vggt_extract_video_frames(
    video_path="input_video.mp4",
    frame_interval_seconds=1.0,
)
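To see what a frame interval in seconds means in terms of frame numbers, the sketch below converts it to indices for a given frame rate. It illustrates the arithmetic only and is not the tool's internal logic; `frame_indices` and the sample values are made up.

```python
def frame_indices(fps, total_frames, interval_seconds=1.0):
    """Indices of the frames sampled every `interval_seconds` of video."""
    step = max(1, round(fps * interval_seconds))  # never step by less than one frame
    return list(range(0, total_frames, step))


# A 100-frame clip at 30 fps, sampled once per second.
indices = frame_indices(fps=30.0, total_frames=100, interval_seconds=1.0)
print(indices)
```

A shorter interval gives more views and usually more overlap between them, at the cost of more images to process.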
Create 3D Scene
from src.tools.demo_viser import vggt_reconstruct_3d_scene
result = vggt_reconstruct_3d_scene(
    images_dir="./tmp/inputs",
    device_type="mps",
    confidence_threshold=0.5,
)
Sparse Attention - NEW!
City-scale 3D reconstruction is now possible! We've implemented Gabriele Berton's research idea for O(n) memory scaling.
Key Benefits
- 100x memory savings for 1000 images
- No retraining required - patches existing VGGT at runtime
- Identical outputs to regular VGGT (0.000000 difference)
- MegaLoc covisibility detection for smart attention masking
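The idea behind covisibility masking can be sketched in a few lines: each image attends only to itself and to images it likely shares content with. This is an illustration, not the project's implementation. The similarity scores and the threshold below are invented; in the real pipeline the scores would presumably come from MegaLoc.

```python
def covisibility_mask(similarity, threshold=0.5):
    """mask[i][j] is True when image i may attend to image j."""
    n = len(similarity)
    return [
        [i == j or similarity[i][j] >= threshold for j in range(n)]
        for i in range(n)
    ]


# Made-up pairwise similarity for four images taken along a path.
similarity = [
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.6, 0.1],
    [0.1, 0.6, 1.0, 0.7],
    [0.0, 0.1, 0.7, 1.0],
]
mask = covisibility_mask(similarity)
kept = sum(sum(row) for row in mask)
print(f"{kept} of {len(mask) ** 2} image pairs kept")
```

With only four images the saving is small. It grows with scene size because each image typically overlaps with a roughly constant number of neighbours.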
Usage
from src.vggt_sparse_attention import make_vggt_sparse
# Convert any VGGT to sparse in 1 line
sparse_vggt = make_vggt_sparse(regular_vggt, device="mps")
# Same usage, O(n) memory instead of O(n²)
output = sparse_vggt(images) # Handles 1000+ images!
Memory Scaling
| Images | Regular | Sparse | Savings |
|---|---|---|---|
| 100 | O(10K) | O(1K) | 10x |
| 500 | O(250K) | O(5K) | 50x |
| 1000 | O(1M) | O(10K) | 100x |
See full results: docs/SPARSE_ATTENTION_RESULTS.md
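The figures in the table can be reproduced with simple arithmetic if each image attends to a fixed number of neighbours instead of all others. The neighbour count of 10 is inferred from the table (1K / 100 images) and is not stated in this README. The sketch counts image pairs, not bytes.

```python
def attention_pairs(num_images, neighbors=10):
    """Return (dense pairs, sparse pairs, savings factor)."""
    dense = num_images ** 2
    sparse = num_images * min(neighbors, num_images)
    return dense, sparse, dense // sparse


for n in (100, 500, 1000):
    dense, sparse, savings = attention_pairs(n)
    print(f"{n:>5} images: dense={dense:>9,} sparse={sparse:>7,} savings={savings}x")
```

Under this assumption the savings factor is simply the image count divided by the neighbour count, which is why it keeps growing with scene size.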
Technical Details
MPS Optimizations
- Device Detection: Auto-detects MPS availability
- Dtype Selection: Uses float32 for optimal MPS performance
- Autocast Handling: CUDA autocast disabled for MPS
- Memory Management: Efficient tensor operations on Metal
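Device auto-detection along these lines can be sketched as follows. This is an assumption about the general approach, not the project's code: `pick_device` is a hypothetical helper, and it falls back to "cpu" when PyTorch is not installed so the snippet runs anywhere.

```python
def pick_device():
    """Return "mps", "cuda", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch missing: nothing to accelerate
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print(f"Using device: {device}")
```

The returned string can be passed to `model.to(device)`. On MPS, keeping tensors in float32 matches the dtype choice described above.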
Model Architecture
- Parameters: 1B (5GB on disk)
- Input: Multi-view images
- Output: Depth, camera poses, 3D points
- Resolution: 518x518 (VGGT), up to 1024x1024 (input)
Troubleshooting
MPS Not Available
# Check PyTorch MPS support
python -c "import torch; print(torch.backends.mps.is_available())"
Model Loading Issues
# Verify model file
ls -lh repo/vggt/vggt_model.pt
# Should show ~5GB file
Memory Issues
- Reduce batch size
- Lower resolution
- Use CPU fallback
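When lowering resolution, it helps to keep the aspect ratio so the geometry is not distorted. The helper below is a generic sketch: `fit_within` is a hypothetical name, and the 518 limit is taken from the model resolution listed under Model Architecture.

```python
def fit_within(width, height, max_side):
    """Scale (width, height) down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


new_size = fit_within(1024, 512, max_side=518)
print(new_size)
```

Peak memory grows with pixel count, so halving both sides cuts the number of pixels per image to a quarter.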
References
Documentation
- Development Guide - Setting up your dev environment
- Publishing Guide - PyPI release process
- Contributing Guide - How to contribute
- API Documentation - Detailed API reference
- Examples - Code examples and demos
Release Notes
v2.0.0 (Latest)
- Unified CLI with vggt command
- Professional Python packaging (PyPI-ready)
- Gradio web interface
- Comprehensive test suite
- Modern tooling (UV, Makefile, GitHub Actions)
- Complete documentation overhaul
See full changelog
Contributing
We follow a lightweight Git Flow:
- main holds the latest stable release and is protected.
- develop is the default integration branch for day-to-day work.
When contributing:
- Create your feature branch from develop (git switch develop && git switch -c feature/my-change).
- Keep commits focused and include tests or documentation updates when relevant.
- Open your pull request against develop; maintainers will promote changes to main during releases.
Please open issues for bugs or feature requests before starting large efforts. Full details, testing expectations, and the release process live in CONTRIBUTING.md.
License
MIT License - See LICENSE file for details
Acknowledgments
- Facebook Research for VGGT
- Apple for Metal Performance Shaders
- PyTorch team for MPS backend
Made with love for Apple Silicon by the AI community