GPU-accelerated quantum tensor network simulator with adaptive MPS, TDVP, VQE/QAOA, and custom Triton kernels
Project description
ATLAS-Q: GPU-Accelerated Quantum Tensor Network Simulator
Adaptive Tensor Learning And Simulation – Quantum
Version 0.5.0 | October 2025
High-performance quantum simulation using GPU-accelerated tensor networks with custom Triton kernels
⚡ Performance Highlights
- 77K+ ops/sec gate throughput (GPU-optimized)
- 626,000× memory compression vs full statevector (30 qubits)
- 20× speedup on Clifford circuits (Stabilizer backend)
- 1.5-3× speedup on gate operations (custom Triton kernels)
- All 7/7 benchmarks passing
🚀 Quick Start
Option 1: Interactive Notebook (No Install!)
Try ATLAS-Q instantly in Google Colab or Jupyter:
📓 Open ATLAS_Q_Demo.ipynb in Colab
Or download and run locally:
wget https://github.com/followthesapper/ATLAS-Q/raw/ATLAS-Q/ATLAS_Q_Demo.ipynb
jupyter notebook ATLAS_Q_Demo.ipynb
Option 2: Python Package (Recommended)
# Install from PyPI
pip install atlas-quantum
# With GPU support
pip install atlas-quantum[gpu]
# Verify installation
python -c "from atlas_q import get_quantum_sim; print('✅ ATLAS-Q installed!')"
First example:
from atlas_q import get_quantum_sim
QCH, _, _, _ = get_quantum_sim()
sim = QCH()
factors = sim.factor_number(221)
print(f"221 = {factors[0]} × {factors[1]}") # 221 = 13 × 17
Option 3: Docker
GPU version (recommended):
docker pull ghcr.io/followthesapper/atlas-q:cuda
docker run --rm -it --gpus all ghcr.io/followthesapper/atlas-q:cuda python3
CPU version:
docker pull ghcr.io/followthesapper/atlas-q:cpu
docker run --rm -it ghcr.io/followthesapper/atlas-q:cpu python3
Run benchmarks in Docker:
docker run --rm --gpus all ghcr.io/followthesapper/atlas-q:cuda \
python3 /opt/atlas-q/scripts/benchmarks/validate_all_features.py
Option 4: From Source
# Clone repository
git clone https://github.com/followthesapper/ATLAS-Q.git
cd ATLAS-Q
# Install ATLAS-Q
pip install -e .[gpu]
# Setup GPU acceleration (auto-detects your GPU)
./setup_triton.sh
# Run benchmarks
python scripts/benchmarks/validate_all_features.py
GPU Acceleration Setup
The setup_triton.sh script automatically detects your GPU and configures Triton kernels:
- Auto-detects: V100, A100, H100, GB100/GB200, and future architectures
- Configures: TORCH_CUDA_ARCH_LIST and TRITON_PTXAS_PATH
- Persists: Adds settings to ~/.bashrc
Performance gains: 1.5-3× faster gate operations, 100-1000× faster period-finding
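The script is optional; on machines where it cannot be run, the same variables can be set by hand. The sketch below is a hedged illustration of that step (not the script's actual contents): it uses PyTorch's standard torch.cuda.get_device_capability to derive the compute-capability string expected by TORCH_CUDA_ARCH_LIST, and you would still point TRITON_PTXAS_PATH at your CUDA toolkit's ptxas yourself.
# Hedged sketch of the environment configuration that setup_triton.sh automates.
# Assumption: a compute-capability string such as "8.0" is the value your build
# expects in TORCH_CUDA_ARCH_LIST; adjust for your toolchain.
import os
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    os.environ["TORCH_CUDA_ARCH_LIST"] = f"{major}.{minor}"
    print(f"Detected {torch.cuda.get_device_name(0)}; "
          f"TORCH_CUDA_ARCH_LIST={os.environ['TORCH_CUDA_ARCH_LIST']}")
else:
    print("No CUDA device detected; Triton kernels will not be used.")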
💡 Examples
Tensor Network Simulation
from atlas_q.adaptive_mps import AdaptiveMPS
import torch
# Create 10-qubit system with adaptive bond dimensions
mps = AdaptiveMPS(10, bond_dim=8, device='cuda')
# Apply Hadamard gates
H = torch.tensor([[1,1],[1,-1]], dtype=torch.complex64)/torch.sqrt(torch.tensor(2.0))
for q in range(10):
    mps.apply_single_qubit_gate(q, H.to('cuda'))
# Apply CNOT gates
CNOT = torch.tensor([[1,0,0,0],[0,1,0,0],[0,0,0,1],[0,0,1,0]],
                    dtype=torch.complex64).to('cuda')
for q in range(0, 9, 2):
    mps.apply_two_site_gate(q, CNOT)
print(f"Max bond dimension: {mps.stats_summary()['max_chi']}")
print(f"Memory usage: {mps.memory_usage() / (1024**2):.2f} MB")
Period-Finding & Factorization
from atlas_q import get_quantum_sim
# Get the quantum-classical hybrid simulator
QuantumClassicalHybrid, _, _, _ = get_quantum_sim()
qc = QuantumClassicalHybrid()
# Factor semiprimes
factors = qc.factor_number(143) # Returns [11, 13]
print(f"143 = {factors[0]} × {factors[1]}")
# Verified against canonical benchmarks:
# - IBM 2001 (N=15): ✅ Pass
# - Photonic 2012 (N=21): ✅ Pass
# - NMR 2012 (N=143): ✅ Pass
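For context on how a recovered period turns into factors, the classical post-processing step of Shor's algorithm is shown below. This sketch is independent of ATLAS-Q's internals and uses only the standard library; it assumes you already have a valid period r of a mod N.
# Classical post-processing in Shor's algorithm (illustration only):
# given a period r of a^x mod N, the factors are gcd(a^(r/2) ± 1, N)
# whenever r is even and a^(r/2) is not congruent to -1 mod N.
import math

def factors_from_period(a, r, N):
    if r % 2 != 0:
        return None  # odd period: retry with another base a
    half = pow(a, r // 2, N)
    if half == N - 1:
        return None  # trivial root: retry with another base a
    return sorted({math.gcd(half - 1, N), math.gcd(half + 1, N)})

# Example: for N = 15 with base a = 7 the period is 4, giving factors 3 and 5.
print(factors_from_period(7, 4, 15))  # [3, 5]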
📊 Performance vs Competition
| Feature | ATLAS-Q | Qiskit Aer | Cirq | Winner |
|---|---|---|---|---|
| Memory (30q) | 0.03 MB | 16 GB | 16 GB | ATLAS-Q (626k×) |
| GPU Support | ✅ Triton | ✅ cuQuantum | ❌ | ATLAS-Q |
| Stabilizer | 20× speedup | Standard | Standard | ATLAS-Q |
| Tensor Networks | ✅ Native | ❌ | ❌ | ATLAS-Q |
| Ease of Use | Good | Excellent | Excellent | Qiskit/Cirq |
Note: Run python scripts/benchmarks/compare_with_competitors.py for detailed performance comparisons
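The 30-qubit memory numbers above follow from simple counting. Here is a back-of-the-envelope check, assuming a complex128 statevector and complex64 MPS tensors at a uniform bond dimension χ = 8; the exact ratio depends on the per-bond dimensions the adaptive algorithm actually keeps, and boundary tensors are smaller in practice, which pushes the compression higher.
# Back-of-the-envelope memory comparison for n = 30 qubits (illustrative only).
n, chi, d = 30, 8, 2                   # qubits, assumed bond dimension, physical dimension
statevector_bytes = (2 ** n) * 16      # 2^30 complex128 amplitudes ≈ 17.2 GB
mps_bytes = n * chi * d * chi * 8      # n tensors of shape (chi, d, chi), complex64
print(f"Statevector: {statevector_bytes / 1e9:.1f} GB")
print(f"MPS (chi = {chi}): {mps_bytes / 1e6:.3f} MB")
print(f"Compression: {statevector_bytes / mps_bytes:,.0f}x")  # order 10^5-10^6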
🎯 What is ATLAS-Q?
ATLAS-Q is a GPU-accelerated quantum simulator with two complementary capabilities:
Tensor Network Simulation
- Adaptive MPS: Memory-efficient quantum state representation (O(n·χ²) vs O(2ⁿ))
- NISQ Algorithms: VQE, QAOA with noise models
- Time Evolution: TDVP for Hamiltonian dynamics
- Specialized Backends: Stabilizer for Clifford circuits, MPO for observables
- GPU Acceleration: Custom Triton kernels + cuBLAS tensor cores
Period-Finding & Factorization
- Shor's Algorithm: Integer factorization via quantum period-finding
- Compressed States: Periodic states (O(1) memory), product states (O(n) memory)
- Verified Results: Matches canonical benchmarks (N=15, 21, 143)
Key Innovations
- ✅ Custom Triton Kernels: Fused gate operations for 1.5-3× speedup
- ✅ Adaptive Bond Dimensions: Dynamic memory management based on entanglement
- ✅ Hybrid Stabilizer/MPS: 20× faster Clifford circuits with automatic switching
- ✅ GPU-Optimized Einsums: cuBLAS + tensor cores for tensor contractions (see the sketch after this list)
- ✅ Specialized Representations: O(1) memory for periodic states, O(n) for product states
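To make the contraction pattern concrete, here is a plain PyTorch reference version of the operation such kernels fuse and accelerate: absorbing a two-qubit gate into two neighbouring MPS tensors and re-truncating the shared bond with an SVD. This is a textbook sketch, not ATLAS-Q's internal code; AdaptiveMPS's tensor layout and truncation policy may differ.
# Reference (un-fused) two-site gate application on an MPS. Illustrative only.
import torch

def apply_two_site_gate_reference(A, B, gate, chi_max=8):
    # Shapes: A (chi_l, 2, chi_m), B (chi_m, 2, chi_r), gate (2, 2, 2, 2).
    chi_l, chi_r = A.shape[0], B.shape[2]
    # Contract both site tensors with the gate into one four-index block.
    theta = torch.einsum('lim,mjr,ijab->labr', A, B, gate)
    # Split the block back into two tensors via SVD and truncate the bond.
    U, S, Vh = torch.linalg.svd(theta.reshape(chi_l * 2, 2 * chi_r), full_matrices=False)
    chi = min(chi_max, S.shape[0])
    A_new = U[:, :chi].reshape(chi_l, 2, chi)
    B_new = (S[:chi].to(Vh.dtype).unsqueeze(1) * Vh[:chi]).reshape(chi, 2, chi_r)
    return A_new, B_new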
📚 Documentation
Interactive Tutorial
- 📓 Jupyter Notebook - Complete interactive demo (works in Colab!)
Online Documentation
- 📖 Documentation Site - Browse all docs online
Guides & References
- Complete Guide - Installation, tutorials, API reference (start here!)
- Feature Status - What's actually implemented
- Research Paper - Mathematical foundations and algorithms
- Whitepaper - Technical architecture and implementation
- Overview - High-level explanation for all audiences
🏗️ Architecture
Core Components
ATLAS-Q/
├── src/atlas_q/
│ ├── adaptive_mps.py # Adaptive MPS with GPU support
│ ├── quantum_hybrid_system.py # Period-finding & factorization
│ ├── mpo_ops.py # MPO operations (Hamiltonians)
│ ├── tdvp.py # Time evolution (TDVP)
│ ├── vqe_qaoa.py # Variational algorithms
│ ├── stabilizer_backend.py # Fast Clifford simulation
│ ├── noise_models.py # NISQ noise models
│ ├── peps.py # 2D tensor networks
│ └── tools_qih/ # Quantum-inspired ML
├── triton_kernels/
│ ├── mps_complex.py # Custom Triton kernels (1.5-3× faster)
│ ├── mps_ops.py # MPS tensor operations
│ └── modpow.py # Modular exponentiation
├── benchmarks/
│ ├── comprehensive_benchmark.py # 7/7 tensor network benchmarks
│ └── competitive_comparison.py # vs Qiskit/Cirq/ITensor
└── tests/ # 75+ unit tests
Technology Stack
- PyTorch 2.1+ (CUDA backend)
- Triton (custom GPU kernels)
- cuBLAS/CUTLASS (tensor cores)
- NumPy/SciPy (linear algebra)
🎓 Use Cases
✅ BEST FOR:
- Tensor Networks: 20-50 qubits with moderate entanglement
- VQE/QAOA: Optimization on NISQ devices with noise
- Time Evolution: Hamiltonian dynamics via TDVP
- Period-Finding: Shor's algorithm for integer factorization
- Memory-Constrained: 626,000× compression vs statevector
- GPU Workloads: Custom Triton kernels + cuBLAS
⚠️ NOT IDEAL FOR:
- Highly entangled states (use full statevector)
- Arbitrary connectivity (MPS assumes 1D/2D structure)
- CPU-only environments
📈 Benchmark Results
Internal Benchmarks (All Passing)
✅ Benchmark 1: Noise Models - 3/3 passing
✅ Benchmark 2: Stabilizer Backend - 3/3 passing (20× speedup)
✅ Benchmark 3: MPO Operations - 3/3 passing
✅ Benchmark 4: TDVP Time Evolution - 2/2 passing
✅ Benchmark 5: VQE/QAOA - 2/2 passing
✅ Benchmark 6: 2D Circuits - 2/2 passing
✅ Benchmark 7: Integration Tests - 2/2 passing
Key Metrics
| Metric | Value | Notes |
|---|---|---|
| Gate throughput | 77,304 ops/sec | GPU-optimized |
| Stabilizer speedup | 20.4× | vs generic MPS |
| MPO evaluations | 1,372/sec | Hamiltonian expectations |
| VQE time (6q) | 1.68s | 50 iterations |
| Memory (30q) | 0.03 MB | vs 16 GB statevector |
🔬 Example Applications
VQE for Quantum Chemistry
from atlas_q.vqe_qaoa import VQE, VQEConfig
from atlas_q.mpo_ops import MPOBuilder
# Build Heisenberg Hamiltonian
H = MPOBuilder.heisenberg_hamiltonian(n_sites=6, device='cuda')
# Configure VQE
config = VQEConfig(n_layers=3, max_iter=50)
vqe = VQE(H, config)
# Run optimization
energy, params = vqe.run()
print(f"Ground state energy: {energy:.6f}")
TDVP Time Evolution
from atlas_q.tdvp import TDVP1Site, TDVPConfig
from atlas_q.mpo_ops import MPOBuilder
from atlas_q.adaptive_mps import AdaptiveMPS
# Create Hamiltonian and initial state
H = MPOBuilder.ising_hamiltonian(n_sites=10, J=1.0, h=0.5, device='cuda')
mps = AdaptiveMPS(10, bond_dim=8, device='cuda')
# Configure TDVP
config = TDVPConfig(dt=0.01, t_final=1.0, use_gpu_optimized=True)
tdvp = TDVP1Site(H, mps, config)
# Run time evolution
times, energies = tdvp.run()
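Assuming times and energies come back as array-like sequences (an assumption about the return types), a quick plot makes it easy to confirm that the energy stays essentially constant, as expected for TDVP under a time-independent Hamiltonian:
# Quick look at the evolution output (illustrative plotting code).
import matplotlib.pyplot as plt

plt.plot(times, energies, marker='o', markersize=3)
plt.xlabel("Time")
plt.ylabel("Energy")
plt.title("TDVP energy vs time (Ising chain, J=1.0, h=0.5)")
plt.tight_layout()
plt.show()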
🚧 Roadmap
Current Status (v0.5.0)
- ✅ GPU-accelerated tensor networks with custom Triton kernels
- ✅ Adaptive MPS with error tracking
- ✅ Stabilizer backend (20× speedup)
- ✅ TDVP, VQE/QAOA implementations
- ✅ All 7/7 benchmark suites passing
Planned Features
- Multi-GPU distributed MPS
- Enhanced PEPS implementation
- Integration adapters for Qiskit/Cirq circuits
- Extended cuQuantum backend support
- Additional tutorial notebooks
🤝 Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
Development Setup
# Clone with submodules
git clone --recursive https://github.com/followthesapper/ATLAS-Q.git
# Install dev dependencies
pip install -r requirements.txt
pip install pytest pytest-cov black isort
# Run tests
pytest tests/ -v
# Run benchmarks
python scripts/benchmarks/validate_all_features.py
📝 Citation
If you use ATLAS-Q in your research, please cite:
@software{atlasq2025,
  title={ATLAS-Q: Adaptive Tensor Learning And Simulation – Quantum},
  author={ATLAS-Q Development Team},
  year={2025},
  url={https://github.com/followthesapper/ATLAS-Q},
  version={0.5.0}
}
📄 License
MIT License - see LICENSE for details
🙏 Acknowledgments
- PyTorch team for GPU infrastructure
- Triton team for custom kernel framework
- ITensor/TeNPy for tensor network inspiration
- Qiskit/Cirq for quantum computing ecosystem
📞 Contact
- Issues: GitHub Issues
- Discussions: GitHub Discussions
ATLAS-Q: GPU-accelerated tensor network simulator achieving 626,000× memory compression through adaptive MPS, custom Triton kernels, and specialized quantum state representations.
File details
Details for the file atlas_quantum-0.5.0.tar.gz.
File metadata
- Download URL: atlas_quantum-0.5.0.tar.gz
- Upload date:
- Size: 2.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8b5ce606b2876b496b1eb8cb38f3e5aa55c2317133021e770db578ca9001bee2 |
| MD5 | d5a775e41bff668a477a675482bc406a |
| BLAKE2b-256 | 803b625586ecc0ee8df8899789c64d3ea9548c1254d34ce97b44ceece4f6c5e8 |
File details
Details for the file atlas_quantum-0.5.0-py3-none-any.whl.
File metadata
- Download URL: atlas_quantum-0.5.0-py3-none-any.whl
- Upload date:
- Size: 105.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e9f8d4e55b226a91edfbea50700963b8e114f6039b2979da4dd344b254731600 |
| MD5 | c9dfad459a1a786bde5ff7d6b6f89a0e |
| BLAKE2b-256 | 73907b5c8bb4f20d94636f9ea85ee6efd8a47e7c0f4d383d12bcc250506dcc27 |