Skip to main content

Manifold: Geometric Intelligence via Symplectic Geodesic Flows | GFN: Geometric Flow Networks

Project description

GFN | Manifold

Geodesic Flow Networks: Geometric Intelligence via Symplectic Flows

VERSION License Physics Documentation


What's New in v2.5.0 (Riemannian Stability)

  • Riemannian Optimization: RiemannianAdam optimizer ensures parameter updates respect manifold geometry
  • Adaptive Curvature Gating: Learnable valve mechanism enables inertial coasting when optimal
  • Zero-Force Inductive Bias: Architectural enforcement of E(0) = 0 for perfect state preservation
  • Velocity Normalization: Automatic stabilization preserving memory direction while controlling magnitude

Overview

GFN (Geodesic Flow Networks), publicly known as Manifold, reformulates sequence modeling as geodesic flow on a learned Riemannian manifold. Instead of attention matrices (O(N²)) or fixed-state compression, GFN models the latent state as a physical particle governed by symplectic integrators, enabling O(1) memory with infinite horizon stability.

Core Innovation: State transitions follow Einstein's geodesic equation with learned curvature, ensuring information conservation via Hamiltonian dynamics.


Installation

pip install gfn

Or install from source:

git clone https://github.com/Manifold-Laboratory/manifold.git
cd manifold
pip install -e "."

Requirements: Python 3.10+, PyTorch 2.0+, CUDA (optional)


Quick Start

from src.model import Manifold
from src.optim import RiemannianAdam

# Model
model = Manifold(
    vocab_size=50257,
    dim=512,
    depth=12,
    heads=8,
    integrator_type='leapfrog'
).cuda()

# Optimizer (REQUIRED: standard Adam will fail)
optimizer = RiemannianAdam(
    model.parameters(),
    lr=1e-4,
    retraction='normalize',
    max_norm=10.0
)

# Training
for x, y in dataloader:
    optimizer.zero_grad()
    logits, _, _ = model(x)
    loss = criterion(logits.view(-1, vocab_size), y.view(-1))
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 0.05)
    optimizer.step()

Verified Performance

Binary Parity Task (Cumulative XOR)

Challenge: Predict cumulative XOR over arbitrarily long sequences (requires infinite-precision state tracking)

Training Performance

Model Steps to Convergence Final Loss Training Time Final Accuracy
GFN 728 0.00494 47 min (L=20) 99.9%
MicroGPT 4,000 0.0254 1m 27s (L=20) 99.0%

GFN achieves lower loss (0.00494 vs 0.0254) and higher accuracy despite longer training time

Zero-Shot Generalization Results

Trained on L=20 only, tested on sequences up to L=1000 (50× longer):

Parity Generalization

Figure: Left plot shows perfect accuracy generalization. Right plot demonstrates O(1) memory scaling (flat line) vs theoretical O(N) baseline.

Detailed Results:

Test Length GFN Accuracy GFN VRAM MicroGPT Accuracy MicroGPT VRAM
20 (seen) 100.0% 28.3 MB 98.0% 44.7 MB
50 100.0% 28.4 MB 49.5% (collapsed) 73.7 MB
100 100.0% 28.6 MB 50.1% (random) 156.0 MB
200 100.0% 29.0 MB 51.8% (random) 420.9 MB
400 100.0% 29.8 MB 49.9% (random) 1,363 MB (1.3GB)
500 100.0% 30.4 MB 49.1% (random) 2,040 MB (2.0GB)
1000 100.0% 32.1 MB 50.7% (random) 7,488 MB (7.3GB)
10000 100.0% 60.3 MB FAILED (OOM) > 8GB

Key Findings:

  • Perfect Generalization: 100% accuracy on all lengths including L=10,000 (500× training length)
  • O(1) Memory Verified: VRAM growth of only 32 MB (113%) from L=20→10,000
  • Transformer Collapse: MicroGPT accuracy drops to random chance (50%) immediately at L=50
  • Memory Advantage: At L=1000, GFN uses 32MB vs Transformer's 7.5GB (234× less memory)

Full benchmark results and plots available in tests/benchmarks/results/gfn_superiority/


Core Architecture

Geodesic Equation

d²x/dτ² + Γ(v, x) = F_ext(token)
  • x: Position in semantic manifold
  • v: Velocity (momentum/memory)
  • Γ: Christoffel symbols (learned curvature)
  • F: Input token force

Symplectic Integration (Leapfrog)

# Half-step velocity
v_half = v + 0.5 * dt * (F - Γ(v, x))

# Full-step position
x_next = x + dt * v_half

# Half-step velocity finalization
v_next = v_half + 0.5 * dt * (F - Γ(v_half, x_next))

# Stabilization
v_next = v_next / (||v_next|| + ε)

Properties:

  • Time-reversible
  • Volume-preserving (det(J) = 1)
  • Energy-conserving (|ΔH| ≈ O(dt²))

Comparison with Baselines

Architecture Memory (Inference) Compute (per token) Gradient Stability Verified
Transformer O(N) KV cache O(N·d) attention Good
LSTM/GRU O(1) O(d²) gates Poor
Mamba (SSM) O(1) O(d²) state update Medium
GFN O(1) state O(d²·R) Christoffel Excellent

Where N = sequence length, d = hidden dim, R = Christoffel rank (typically 16-32)

Note: GFN's O(d²·R) per-token cost is comparable to LSTMs/Mamba. For training full sequences, all architectures are O(N·...) in sequence length.


Documentation


Use Cases

  • Long-Context Reasoning: Process sequences >10K tokens with constant memory
  • Algorithmic Tasks: Perfect extrapolation on logical reasoning (XOR, sorting, arithmetic)
  • Edge Deployment: Run large models on memory-constrained devices (<4GB RAM)
  • Scientific Computing: Model systems requiring conservation laws (physics simulations)

Repository Structure

/
├── src/                # Core Implementation
│   ├── model.py        # Main Manifold Architecture
│   ├── geometry.py     # Christoffel Symbols & Integrators
│   ├── layers.py       # M-Layer (Manifold Layer)
│   ├── embeddings.py   # Functional Embeddings
│   └── optim.py        # RiemannianAdam Optimizer
├── docs/               # Technical Documentation
│   ├── SCIENTIFIC_PAPER.md
│   ├── API.md
│   ├── TRAINING.md
│   └── BENCHMARKS.md
├── tests/              # Verification Suite
│   └── benchmarks/     # Reproducible Benchmarks
└── LICENSE             # Apache 2.0

Development Status

Version 2.5.0 is production-ready for research and experimentation.

Verified:

  • ✅ O(1) memory scaling (empirically confirmed)
  • ✅ Perfect generalization on Parity task
  • ✅ Stable training with RiemannianAdam
  • ✅ Symplectic gradient flow

In Development:

  • CUDA kernel acceleration (10-50× speedup expected)
  • Mixed precision training (FP16/BF16)
  • Language modeling benchmarks (WikiText)
  • Mixture of Manifolds (MoM) architecture

Citation

If you use GFN in your research, please cite:

@software{gfn2026,
  title={GFN: Geodesic Flow Networks},
  author={Stürtz Joaquín},
  year={2026},
  version={2.5.0},
  url={https://github.com/Manifold-Laboratory/manifold},
  license={Apache-2.0}
}

Contributing

See CONTRIBUTING.md for guidelines.

Quick Links:


License

Apache License 2.0 - See LICENSE for details.


Manifold Laboratory
Geometric intelligence through physical principles.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gfn-2.5.0.tar.gz (53.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

gfn-2.5.0-cp313-cp313-win_amd64.whl (161.4 kB view details)

Uploaded CPython 3.13Windows x86-64

File details

Details for the file gfn-2.5.0.tar.gz.

File metadata

  • Download URL: gfn-2.5.0.tar.gz
  • Upload date:
  • Size: 53.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for gfn-2.5.0.tar.gz
Algorithm Hash digest
SHA256 64edf05e2350fc2a6bddd7aff9904dbe10ad0a6946f85a4da51ff0721e9e9dbd
MD5 fde17dbdfac2e3b72fcd709ad745f8b6
BLAKE2b-256 912834628e45fa0c1c5673844522761b5b045903fbca6a88f291aa7e19d3fd06

See more details on using hashes here.

File details

Details for the file gfn-2.5.0-cp313-cp313-win_amd64.whl.

File metadata

  • Download URL: gfn-2.5.0-cp313-cp313-win_amd64.whl
  • Upload date:
  • Size: 161.4 kB
  • Tags: CPython 3.13, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for gfn-2.5.0-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 b0d5958838c54347dc8bc1311797abefab035fdf2f7455cd58f3bdf8e8540d77
MD5 c583b644c0ac828df4eefce95edd78e1
BLAKE2b-256 f0e8ba7300563fb201a43ccbd6cba85a7805d0406a38b10126844835523bee98

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page