BANKAI: Bond-vector ANalysis of Kinetic Amino acid Initiator - GPU-accelerated sub-picosecond causal cascade detection in GROMACS trajectories

BANKAI (Bond-vector ANalysis of Kinetic Amino acid Initiator)

GPU-accelerated sub-picosecond causal cascade detection in GROMACS molecular dynamics trajectories.


Overview

Conventional MD analysis reduces raw atomic coordinates to summary metrics like RMSD or RMSF — discarding the vast majority of structural information before analysis even begins. BANKAI takes the opposite approach: it operates directly on the full atomic coordinate trajectory from GROMACS, analyzing every atom at every timestep without dimensionality reduction.

At 0.01 ps (10 fs) resolution, thermal noise exceeds the signal by roughly 10:1. BANKAI's 4-layer statistical filtering architecture removes thermal fluctuations layer by layer and reaches a final S/N ratio above 100:1, which makes it possible to detect statistically significant structural events that conventional tools cannot see.

Core equation:

ΔΛC = ρT · σS · |ΛF|

Here ΔΛC is the causal cascade event, ρT is the tension density, σS is the structural synchronization, and |ΛF| is the magnitude of the lambda field vector ΛF.
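As a numerical illustration of the product form, the sketch below evaluates ΔΛC for a single frame. The function name and the input values are hypothetical and are not part of the BANKAI API; how BANKAI derives ρT and σS from coordinates is not covered here.

```python
import math


def cascade_magnitude(rho_t, sigma_s, lambda_f):
    """Evaluate ΔΛC = ρT · σS · |ΛF| for one frame.

    rho_t    -- tension density (scalar)
    sigma_s  -- structural synchronization (scalar)
    lambda_f -- lambda field vector as an (x, y, z) sequence
    """
    norm = math.sqrt(sum(c * c for c in lambda_f))  # |ΛF|, Euclidean norm
    return rho_t * sigma_s * norm


# Hypothetical values chosen so the arithmetic is easy to follow.
delta = cascade_magnitude(rho_t=2.0, sigma_s=0.5, lambda_f=(3.0, 4.0, 0.0))
print(delta)  # 2.0 * 0.5 * 5.0 = 5.0
```

Because the three factors multiply, the result vanishes whenever any one of them is zero, so a large ΛF alone does not produce an event.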

Why BANKAI?

The problem: At 0.01 ps resolution, thermal energy at 300 K (kT ≈ 2.5 kJ/mol) causes atomic vibrations that completely bury biologically meaningful structural changes. Traditional tools either avoid this timescale entirely or collapse the data into aggregate metrics (RMSD, radius of gyration) that erase the very signals you need.

BANKAI's answer: Don't reduce — resolve. Analyze the full atomic coordinate tensor directly, and use physics-informed statistical filtering to separate signal from noise.

Key Features

  • Full-atom analysis — No RMSD reduction. Every atom, every frame, raw coordinates
  • Sub-picosecond resolution — 0.01 ps intervals, 100,001 frames per nanosecond
  • 4-layer noise filtering — Thermal fluctuation removal achieving S/N > 100:1
  • 20+ custom CUDA kernels — Not just CuPy wrappers; hand-optimized GPU kernels with shared memory, coalesced access, and atomic operations for 150–200x speedup over CPU
  • Automatic CPU fallback — Works without GPU (slower but fully functional)
  • Two-stage analysis — Macro-level event detection → Residue-level causal tracing
  • Genesis atom identification — Pinpoints the first atom to trigger a cascade event
  • Phase space dynamics — Lyapunov exponents, attractor characterization, recurrence quantification
  • Causal network mapping — Residue-to-residue causality with confidence scoring
  • Built-in visualization — Publication-ready plots and interactive 3D networks
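Of the features listed above, recurrence quantification is the easiest to illustrate in a few lines. The sketch below computes the recurrence rate of a scalar time series, that is, the fraction of frame pairs whose values lie within a tolerance of each other. The names `recurrence_rate` and `eps` are illustrative assumptions; this is a plain-Python sketch of the general technique, not BANKAI's GPU implementation.

```python
def recurrence_rate(series, eps):
    """Fraction of ordered index pairs (i, j), i != j, with |x_i - x_j| <= eps."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two samples")
    hits = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and abs(series[i] - series[j]) <= eps
    )
    return hits / (n * (n - 1))


periodic = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # revisits the same two states
drifting = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # never returns to an earlier state

print(recurrence_rate(periodic, eps=0.1))  # 12 of 30 pairs recur -> 0.4
print(recurrence_rate(drifting, eps=0.1))  # no recurrences -> 0.0
```

A series that keeps returning to earlier states scores high, while a purely drifting series scores zero.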

4-Layer Analysis Architecture

Input: 0.01 ps GROMACS trajectory (full atomic coordinates, thermal noise dominant)
  │
  ├─ Layer 1: Λ³ Structural Analysis ──── Multi-scale statistical filtering (~80% noise reduction)
  │   • 3 concurrent timescales (σ₁=short, σ₂=mid, σ₃=long)
  │   • Adaptive 3σ–5σ significance thresholds
  │
  ├─ Layer 2: Topological Break Detection ── Structural continuity monitoring (~60% residual removal)
  │   • Q_lambda (topological charge): winding number of ΛF flow
  │   • Irreversible vs reversible change discrimination
  │
  ├─ Layer 3: 3-Axis Anomaly Scoring ──── Physics-based validation
  │   • Spatial: directional/cooperative vs random/isotropic
  │   • Synchronization: correlated (>0.6) vs uncorrelated (<0.3)
  │   • Temporal: Maxwell-Boltzmann deviation (>3σ)
  │
  └─ Layer 4: Phase Space Attractor Analysis ── Deterministic dynamics extraction
      • Lyapunov exponents, correlation dimension
      • Recurrence Quantification Analysis (RQA)
      • Attractor compactness vs diffusive noise

Output: Statistically significant structural events with confidence scores
        (S/N > 100:1, configurable 95%–99.9% confidence)
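The synchronization axis of Layer 3 can be illustrated with a small sketch that applies the thresholds quoted in the diagram (above 0.6 correlated, below 0.3 uncorrelated) to two displacement series. Using the Pearson coefficient, taking its absolute value, and labeling the middle band "ambiguous" are assumptions made for this example; the diagram does not specify which correlation measure BANKAI uses.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant series carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def classify_sync(x, y, high=0.6, low=0.3):
    """Label a pair of series using the Layer 3 thresholds."""
    r = abs(pearson(x, y))  # assumption: anti-correlated motion counts as synchronized
    if r > high:
        return "correlated"
    if r < low:
        return "uncorrelated"
    return "ambiguous"


atom_a = [0.0, 1.0, 2.0, 3.0, 4.0]
atom_b = [0.0, 2.0, 4.0, 6.0, 8.0]    # moves in step with atom_a
atom_c = [1.0, -1.0, 0.0, -1.0, 1.0]  # no linear relation to atom_a

print(classify_sync(atom_a, atom_b))  # correlated
print(classify_sync(atom_a, atom_c))  # uncorrelated
```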

NOTICE

Trajectory output configuration. BANKAI-MD operates exclusively on atomic coordinate data; velocity, force, and energy outputs are neither required nor utilized by the framework. For sub-picosecond analysis at 0.01 ps resolution with a standard 2 fs integration timestep, the recommended GROMACS .mdp settings are:

nstxout            = 0      ; suppress uncompressed coordinate output
nstvout            = 0      ; velocities not required
nstfout            = 0      ; forces not required  
nstxout-compressed = 5      ; XTC output every 5 steps (= 0.01 ps)
compressed-x-precision = 1000  ; high-precision lossy compression

This configuration is critical for practical feasibility. At 0.01 ps resolution, a 10 ns trajectory of a 5,724-atom system generates approximately 1,000,000 frames (10 ns / 0.01 ps). Using compressed XTC format, this produces trajectory files on the order of tens of gigabytes, which is manageable on consumer-grade storage. The equivalent uncompressed TRR output (coordinates, velocities, and forces) would exceed 700 GB for the same trajectory, rendering sub-picosecond analysis impractical for most research groups despite being computationally feasible. Since BANKAI-MD derives all kinematic quantities (ΛF, structural velocity, acceleration) directly from frame-to-frame coordinate differences, the velocity and force arrays stored in TRR files provide no additional information to the analysis pipeline.
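The relationship between the integration timestep, the output stride, and the resulting frame count can be checked with a few lines of arithmetic. The helper names below are hypothetical. The size estimate covers only a raw float32 coordinate array of shape (n_frames, n_atoms, 3) as it would sit in memory, which is an assumption about storage precision; it is not an estimate of XTC or TRR file sizes.

```python
def frame_interval_ps(dt_fs, nstxout_compressed):
    """Time between written frames, in picoseconds."""
    return dt_fs * nstxout_compressed / 1000.0


def frame_count(sim_time_ns, interval_ps):
    """Number of frames written, counting the initial frame at t = 0."""
    return round(sim_time_ns * 1000.0 / interval_ps) + 1


def raw_coordinate_bytes(n_frames, n_atoms, bytes_per_value=4):
    """Size of an (n_frames, n_atoms, 3) array; 4 bytes per value means float32."""
    return n_frames * n_atoms * 3 * bytes_per_value


interval = frame_interval_ps(dt_fs=2.0, nstxout_compressed=5)
n_frames = frame_count(sim_time_ns=1.0, interval_ps=interval)
size_gb = raw_coordinate_bytes(n_frames, n_atoms=5724) / 1e9

print(interval)           # 0.01 ps between frames
print(n_frames)           # 100001 frames for 1 ns
print(f"{size_gb:.1f}")   # 6.9 (GB of raw float32 coordinates per ns)
```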

Installation

From PyPI

pip install bankai-md

From source

git clone https://github.com/miosync-masa/bankai.git
cd bankai
pip install -e .

With GPU support

# CUDA 12.x
pip install -e ".[cuda12]"

# CUDA 11.x
pip install -e ".[cuda11]"

# CUDA 12.5+ (compatibility mode)
pip install -e ".[cuda12-compat]"

# Full (CUDA 12 + visualization + dev tools)
pip install -e ".[full]"

Google Colab

# Step 0: Sample data setup
!pip install gdown -q
import os
import gdown

folder_url = 'https://drive.google.com/drive/folders/1AaS6NA8aCUfIrQArltNERNUotW6Pcayq?usp=drive_link'
folder_id = folder_url.split('/')[-1].split('?')[0]
destination_folder = '/content/'
os.makedirs(destination_folder, exist_ok=True)
gdown.download_folder(
    f'https://drive.google.com/drive/folders/{folder_id}',
    output=destination_folder,
    quiet=False,
    use_cookies=False
)

# Step 1: Install CUDA Toolkit
!apt-get install -y cuda-toolkit-12-2

# Step 2: Configure CUDA environment
os.environ['CUDA_HOME'] = '/usr/local/cuda-12.2'
os.environ['PATH'] = '/usr/local/cuda-12.2/bin:' + os.environ['PATH']
os.environ['LD_LIBRARY_PATH'] = '/usr/local/cuda-12.2/lib64:' + os.environ.get('LD_LIBRARY_PATH', '')

# Step 3: Install GPU backend 
!pip install cupy-cuda12x==12.3.0 --no-cache-dir
!pip install xarray==2023.7.0
!pip install pylibraft-cu12==24.10.0

# Step 4: Install BANKAI
!pip install bankai-md

# Step 5: Run full analysis
import warnings
warnings.filterwarnings('ignore')

from bankai.analysis.run_full_analysis import run_quantum_validation_pipeline

results = run_quantum_validation_pipeline(
    trajectory_path='/content/demo_gromacs/trajectory_stable.npy',
    metadata_path='/content/demo_gromacs/metadata_stable.json',
    protein_indices_path='/content/demo_gromacs/protein_stable.npy',
    topology_path=None,
    enable_two_stage=True,
    enable_third_impact=True,
    enable_visualization=True,
    output_dir='./gromacs_results_v4',
    verbose=True,
    atom_mapping_path='/content/demo_gromacs/residue_atom_mapping.json',
    third_impact_top_n=10
)

⚠️ Troubleshooting: Depending on the Colab runtime version, dependency conflicts may occur. If you encounter errors, try:

!pip install xarray==2023.7.0
!pip install pylibraft-cu12==24.10.0

Requirements

  • Python 3.9+
  • CUDA Compute Capability 7.0+ (V100, A100, H100, RTX series)
  • CuPy 12.0+ (matched to your CUDA version)
  • NumPy < 2.0.0
  • GROMACS trajectory data converted to NumPy arrays (.npy format)

Quick Start

CLI

# Show help
bankai --help

# Check GPU environment
bankai info

# Run analysis with sample data
bankai example

# Run analysis on your data
bankai run trajectory.npy --protein-indices protein.npy --metadata metadata.json

# Full two-stage analysis
bankai full trajectory.npy --protein-indices protein.npy \
    --events 5000:10000:unfolding 20000:25000:aggregation \
    --n-residues 129
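The `--events` arguments use `start:end:label` triples, which mirror the `(start, end, label)` tuples passed to the Python API. The sketch below converts one form into the other. `parse_event` is a hypothetical helper and not a BANKAI function, and the check that `start` is smaller than `end` is an assumption about what a valid event window looks like.

```python
def parse_event(spec):
    """Convert 'start:end:label' into a (start, end, label) tuple."""
    start_text, end_text, label = spec.split(":", 2)
    start, end = int(start_text), int(end_text)
    if start >= end:
        raise ValueError(f"start must be smaller than end in {spec!r}")
    return (start, end, label)


cli_args = ["5000:10000:unfolding", "20000:25000:aggregation"]
events = [parse_event(arg) for arg in cli_args]
print(events)
# [(5000, 10000, 'unfolding'), (20000, 25000, 'aggregation')]
```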

Python API

import bankai
from bankai import MDLambda3DetectorGPU, MDConfig

# Configure
config = MDConfig()
config.use_extended_detection = True
config.use_phase_space = True

# Initialize detector (auto GPU/CPU selection)
detector = MDLambda3DetectorGPU(config)

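# Run analysis
# trajectory: coordinate array of shape (n_frames, n_atoms, 3)
# backbone_indices: array of backbone atom indices into the atom axis
result = detector.analyze(trajectory, backbone_indices)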

# Visualize
from bankai.visualization import Lambda3VisualizerGPU
visualizer = Lambda3VisualizerGPU()
fig = visualizer.visualize_results(result)

Two-Stage Analysis (Residue-Level Causality)

from bankai import TwoStageAnalyzerGPU, perform_two_stage_analysis_gpu

events = [
    (5000, 10000, 'unfolding'),
    (20000, 25000, 'aggregation')
]

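# macro_result: output of the macro-level stage, e.g. the result of detector.analyze(...) above (assumed)
two_stage_result = perform_two_stage_analysis_gpu(
    trajectory, macro_result, events, n_residues=129
)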

# Causal network visualization
from bankai.visualization import CausalityVisualizerGPU
viz = CausalityVisualizerGPU()
fig = viz.visualize_residue_causality(
    two_stage_result.residue_analyses['unfolding'],
    interactive=True
)

Architecture

bankai/
├── __init__.py          # Public API, GPU detection, lazy imports
├── __main__.py          # python -m bankai entrypoint
├── cli.py               # CLI (bankai command)
├── models.py            # Result types & data models
├── core/                # GPU kernels, memory management, utilities
│   ├── gpu_kernels.py       # Low-level CUDA kernel wrappers
│   ├── gpu_memory.py        # GPU memory pool & batch management
│   ├── gpu_utils.py         # Array operations, CPU/GPU dispatch
│   └── gpu_patches.py       # CuPy compatibility patches
├── analysis/            # Main analysis engines
│   ├── md_lambda3_detector_gpu.py  # Core Λ³ detector
│   ├── two_stage_analyzer_gpu.py   # Two-stage (macro→residue) pipeline
│   ├── topology_resolver.py        # Atom name resolver
│   ├── run_full_analysis.py        # End-to-end pipeline orchestrator
│   ├── third_impact_analytics.py   # Advanced cascade analytics
│   └── maximum_report_generator.py # Comprehensive report generation
├── detection/           # Anomaly & event detection
│   ├── anomaly_detection_gpu.py    # Statistical anomaly detection
│   ├── boundary_detection_gpu.py   # Phase boundary identification
│   ├── extended_detection_gpu.py   # Extended event detection
│   ├── phase_space_gpu.py          # Phase space reconstruction & analysis
│   └── topology_breaks_gpu.py      # Topological break detection
├── residue/             # Residue-level analysis
│   ├── causality_analysis_gpu.py   # Inter-residue causal inference
│   ├── confidence_analysis_gpu.py  # Statistical confidence scoring
│   ├── residue_network_gpu.py      # Residue interaction networks
│   └── residue_structures_gpu.py   # Structural feature extraction
├── structures/          # Structural computation
│   ├── lambda_structures_gpu.py    # Λ-structure tensor computation
│   ├── md_features_gpu.py          # MD feature extraction
│   └── tensor_operations_gpu.py    # Core tensor math
├── quantum/             # Quantum-level validation
│   └── quantum_validation_gpu.py   # Quantum effect detection
├── visualization/       # Plotting & interactive viz
│   ├── plot_results_gpu.py         # Static plots (matplotlib)
│   └── causality_viz_gpu.py        # Causal network viz (plotly)
├── benchmark/           # Performance testing
│   └── performance_tests.py
└── data/                # Sample datasets
    └── chignolin/           # Chignolin mini-protein test data

Performance

End-to-End Pipeline

Data Size   | CPU Time | GPU Time | Speedup
1K frames   | ~10s     | ~0.5s    | 20x
10K frames  | ~120s    | ~5s      | 24x
50K frames  | ~800s    | ~25s     | 32x
100K frames | ~2000s   | ~50s     | 40x

Custom CUDA Kernels (100 atoms × 5,000 frames)

Kernel                   | CPU   | CuPy (generic) | BANKAI Kernel | Speedup (vs CPU)
Tension field (ρT)       | 1200s | 120s           | 8s            | 150x
Topological charge (Q_λ) | 800s  | 80s            | 4s            | 200x
Anomaly detection        | 600s  | 60s            | 3s            | 200x
Phase space analysis     | 2400s | 240s           | 15s           | 160x
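The speedup column is the ratio of CPU time to BANKAI kernel time. The short script below recomputes it from the timings in the table and also separates out how much of the gain comes from moving to generic CuPy versus the custom kernels. The dictionary keys and the helper name are illustrative; the timings are copied from the table.

```python
# (CPU, generic CuPy, BANKAI kernel) timings in seconds, from the table above
timings = {
    "Tension field": (1200, 120, 8),
    "Topological charge": (800, 80, 4),
    "Anomaly detection": (600, 60, 3),
    "Phase space analysis": (2400, 240, 15),
}


def speedups(cpu, cupy_generic, custom):
    """Return the three pairwise speedup ratios for one kernel."""
    return {
        "cupy_vs_cpu": cpu / cupy_generic,
        "custom_vs_cupy": cupy_generic / custom,
        "custom_vs_cpu": cpu / custom,
    }


table = {name: speedups(*row) for name, row in timings.items()}

for name, s in table.items():
    print(
        f"{name}: CuPy vs CPU {s['cupy_vs_cpu']:.0f}x, "
        f"custom vs CuPy {s['custom_vs_cupy']:.0f}x, "
        f"custom vs CPU {s['custom_vs_cpu']:.0f}x"
    )
```

In every row generic CuPy accounts for a 10x gain over CPU, and the custom kernels contribute a further 15x to 20x on top of that.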

BANKAI's custom kernels are not CuPy wrappers — they are hand-written CUDA with shared memory tiling, coalesced access patterns, and lock-free atomic reductions. This is what makes 0.01 ps analysis feasible within hours rather than weeks.

Benchmarked on NVIDIA RTX 4070 Ti SUPER

Configuration

Environment Variables

Variable                | Description
BANKAI_GPU_MEMORY_LIMIT | GPU memory limit in GB (e.g., "8.0")
BANKAI_DEBUG            | Enable debug logging ("1" or "true")
BANKAI_NO_BANNER        | Suppress CLI banner
BANKAI_BANNER_STYLE     | CLI banner style (simple, ascii, matrix)

Memory Management

# Set GPU memory limit
import os
os.environ['BANKAI_GPU_MEMORY_LIMIT'] = '8.0'

# Or via detector
detector.memory_manager.set_max_memory(8)
detector.set_batch_size(5000)

# Mixed precision (FP16)
detector.enable_mixed_precision()
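When choosing a batch size, a rough upper bound can be estimated from the memory limit and the size of one coordinate frame. The sketch below is a back-of-the-envelope calculation only: `max_frames_per_batch` and its `safety` headroom factor are hypothetical, float32 storage is assumed, and BANKAI's intermediate arrays are likely to need considerably more memory than the raw coordinates.

```python
import os


def max_frames_per_batch(limit_gb, n_atoms, bytes_per_value=4, safety=0.5):
    """Upper bound on frames per batch if only raw coordinates had to fit.

    safety is a hypothetical headroom factor reserved for intermediate arrays.
    """
    budget_bytes = limit_gb * 1e9 * safety          # decimal GB
    bytes_per_frame = n_atoms * 3 * bytes_per_value  # x, y, z per atom
    return int(budget_bytes // bytes_per_frame)


# Fall back to 8.0 GB when the variable is not set.
limit_gb = float(os.environ.get("BANKAI_GPU_MEMORY_LIMIT", "8.0"))

# 5,724 atoms is the system size quoted in the NOTICE section.
print(max_frames_per_batch(limit_gb, n_atoms=5724))
```

If the estimate comes out far above the batch size in use, memory pressure is more likely to come from the enabled analysis features than from the coordinates themselves.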

Sample Data

BANKAI includes a Chignolin mini-protein dataset for testing:

from bankai.data import load_chignolin, chignolin_available

if chignolin_available():
    data = load_chignolin()
    trajectory = data['trajectory']       # shape (10001, 166, 3): frames × atoms × xyz
    metadata = data['metadata']
    protein_indices = data['protein_indices']

Generate synthetic test data:

from bankai.data import generate_synthetic_chignolin
paths = generate_synthetic_chignolin()

Or via CLI:

bankai example --generate

Troubleshooting

GPU not detected:

from bankai import get_gpu_info
print(get_gpu_info())
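If `get_gpu_info()` reports no GPU, it can help to query CuPy directly and compare each device against the Compute Capability 7.0 requirement. The sketch below is independent of BANKAI; `probe_gpu` is a hypothetical helper, and it returns a plain dictionary so that it also runs on machines without CuPy or without a CUDA driver.

```python
def probe_gpu(min_capability=70):
    """Report visible CUDA devices and whether each meets min_capability.

    min_capability=70 corresponds to Compute Capability 7.0.
    """
    try:
        import cupy  # optional dependency; absent on CPU-only machines

        count = cupy.cuda.runtime.getDeviceCount()
        devices = []
        for idx in range(count):
            # compute_capability is a string such as "89" for capability 8.9
            capability = int(cupy.cuda.Device(idx).compute_capability)
            devices.append(
                {
                    "id": idx,
                    "compute_capability": capability,
                    "supported": capability >= min_capability,
                }
            )
    except Exception as exc:  # missing CuPy, missing driver, no device
        return {
            "available": False,
            "devices": [],
            "reason": f"{type(exc).__name__}: {exc}",
        }
    return {"available": count > 0, "devices": devices, "reason": ""}


print(probe_gpu())
```

An `ImportError` in the `reason` field usually points to a missing or mismatched CuPy wheel, while a CUDA runtime error points to the driver or toolkit.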

Out of memory:

Reduce batch size or disable extended features:

config.gpu_batch_size = 1000
config.use_extended_detection = False
config.use_phase_space = False

NumPy 2.0 compatibility:

BANKAI requires NumPy < 2.0.0. If you see numpy._core errors, downgrade:

pip install "numpy>=1.22.0,<2.0.0"
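A script can check the installed NumPy version against this range before importing BANKAI. The sketch below is a minimal version check using only the major and minor components; `numpy_version_ok` is a hypothetical helper, and the lower bound of 1.22 is taken from the pip command above.

```python
def numpy_version_ok(version):
    """True if version falls in the range >=1.22.0,<2.0.0."""
    parts = version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    return (1, 22) <= (major, minor) < (2, 0)


print(numpy_version_ok("1.26.4"))  # True
print(numpy_version_ok("2.0.0"))   # False
print(numpy_version_ok("1.21.6"))  # False

# To check the installed package:
#   from importlib.metadata import version
#   numpy_version_ok(version("numpy"))
```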

Pharmaceutical Applications

BANKAI enables atomic-level analysis previously inaccessible to conventional MD tools:

  • Drug-protein interactions — Visualize binding processes at atomic resolution, including transient hydrogen bond formation (10–50 fs) and proton transfer events (20–100 fs)
  • Allosteric pathway mapping — Trace how structural perturbations propagate across residue networks with causal directionality
  • Cryptic binding site discovery — Detect transient pocket openings invisible to ensemble-averaged structures
  • Resistance mutation analysis — Identify how point mutations alter cascade propagation pathways
  • QM/MM candidate screening — Efficiently identify statistically anomalous events (>5σ) that warrant quantum-mechanical investigation

Author

Masamichi Iizumi — CEO, Miosync, Inc.

License

MIT License — see LICENSE for details.


Built with 💕 by Masamichi & Tamaki

Download files

Source Distribution

bankai_md-1.1.4.tar.gz (207.4 kB view details)

Uploaded Source

Built Distribution

bankai_md-1.1.4-py3-none-any.whl (222.1 kB view details)

Uploaded Python 3

File details

Details for the file bankai_md-1.1.4.tar.gz.

File metadata

  • Download URL: bankai_md-1.1.4.tar.gz
  • Upload date:
  • Size: 207.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for bankai_md-1.1.4.tar.gz
Algorithm Hash digest
SHA256 8157e0656767c4f7222eb603ccbc5de061eca7ef501b46250f8b5901ec6b039e
MD5 e7d94023c107acdbb835042ae92e978d
BLAKE2b-256 812b092ac1c48f399d39dcd58d867812657f2ea17b49438068e4a0a0c5fbb02d
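A downloaded archive can be checked against the SHA256 digest listed above with the standard library. The sketch below assumes the file was saved locally as `bankai_md-1.1.4.tar.gz`; the helper names are illustrative.

```python
import hashlib

# SHA256 digest listed above for bankai_md-1.1.4.tar.gz
EXPECTED_SDIST_SHA256 = (
    "8157e0656767c4f7222eb603ccbc5de061eca7ef501b46250f8b5901ec6b039e"
)


def sha256_hex(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_hash(path, expected=EXPECTED_SDIST_SHA256):
    """Compare a local file against the listed digest."""
    return sha256_hex(path) == expected.lower()


# Usage, assuming the archive is in the current directory:
#   matches_listed_hash("bankai_md-1.1.4.tar.gz")
```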

Provenance

The following attestation bundles were made for bankai_md-1.1.4.tar.gz:

Publisher: python-publish.yml on miosync-masa/bankai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file bankai_md-1.1.4-py3-none-any.whl.

File metadata

  • Download URL: bankai_md-1.1.4-py3-none-any.whl
  • Upload date:
  • Size: 222.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for bankai_md-1.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 ffe38c40ca7457ec72d5e11b251b7a7fd60d3f2b73155285ce7562e907791ced
MD5 e382cd4d72296718550c0f8c3561bf89
BLAKE2b-256 af7fb2a6b07fd5dbe71156e05aba19f388a059d8e17d4d66e9f078b4385e3b23

Provenance

The following attestation bundles were made for bankai_md-1.1.4-py3-none-any.whl:

Publisher: python-publish.yml on miosync-masa/bankai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
