Adaptive Neural Execution Engine – Dynamic sparse inference for pre-trained Transformers.
ANEE v0.3 — Adaptive Neural Execution Engine
Dynamic Sparse Inference for Pre-Trained Transformers
ANEE is a lightweight framework for token-wise, layer-wise adaptive computation in transformer language models. Instead of running every layer for every token, ANEE learns how to allocate compute dynamically, reducing unnecessary computation while preserving output quality.
ANEE wraps existing HuggingFace models (e.g., GPT-2) without modifying their weights.
🔧 Key Capabilities
• Dynamic Layer Skipping
ANEE evaluates each transformer block at inference time and decides whether to:
- PROCESS — run full attention + MLP
- SKIP — bypass computation for that layer
- EXIT — terminate further processing (supported)
This produces sparse execution patterns that vary across tokens.
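The three decisions above can be sketched as a simple per-layer loop. This is a minimal illustration and not ANEE's actual wrapper code: `Action`, `run_adaptive`, and `toy_policy` are hypothetical names, and the "layers" are toy functions standing in for real transformer blocks.

```python
from enum import Enum
from typing import Callable, List, Tuple


class Action(Enum):
    PROCESS = "process"  # run the full block
    SKIP = "skip"        # pass the hidden state through unchanged
    EXIT = "exit"        # stop running any further layers


# Hypothetical types: a layer maps a hidden state to a new hidden state,
# and a policy picks an action from the layer index and current state.
Layer = Callable[[List[float]], List[float]]
Policy = Callable[[int, List[float]], Action]


def run_adaptive(
    hidden: List[float], layers: List[Layer], policy: Policy
) -> Tuple[List[float], List[Action]]:
    """Run layers in order, letting the policy decide what to do at each one."""
    trace: List[Action] = []
    for idx, layer in enumerate(layers):
        action = policy(idx, hidden)
        trace.append(action)
        if action is Action.EXIT:
            break
        if action is Action.PROCESS:
            hidden = layer(hidden)
        # On SKIP the hidden state is carried forward as-is.
    return hidden, trace


# Toy stand-ins: each "layer" adds 1 to every element.
layers = [lambda h: [x + 1.0 for x in h] for _ in range(4)]


def toy_policy(idx: int, hidden: List[float]) -> Action:
    if idx == 1:
        return Action.SKIP
    if idx == 3:
        return Action.EXIT
    return Action.PROCESS


out, trace = run_adaptive([0.0, 0.0], layers, toy_policy)
print(out, [a.name for a in trace])
```

In a real model the policy would be called once per token, which is what makes the resulting execution pattern differ from token to token.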
• RL-Trained Controller
A small neural controller receives a per-layer state vector containing:
- entropy of logits
- hidden-state norms
- delta-norms
- variance
- layer position
- remaining budget
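To make the state vector concrete, here is a minimal pure-Python sketch that computes the six listed features for one layer. The function name `layer_state`, the feature order, and the normalization of layer position to [0, 1] are assumptions for illustration; the actual profiler may compute or scale these differently.

```python
import math
from typing import List, Sequence


def softmax_entropy(logits: Sequence[float]) -> float:
    """Shannon entropy (in nats) of softmax(logits), computed stably."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def l2_norm(v: Sequence[float]) -> float:
    return math.sqrt(sum(x * x for x in v))


def variance(v: Sequence[float]) -> float:
    mean = sum(v) / len(v)
    return sum((x - mean) ** 2 for x in v) / len(v)


def layer_state(
    logits: Sequence[float],
    hidden: Sequence[float],
    prev_hidden: Sequence[float],
    layer_idx: int,
    num_layers: int,
    remaining_budget: float,
) -> List[float]:
    """Assemble a per-layer feature vector (hypothetical layout)."""
    delta = [a - b for a, b in zip(hidden, prev_hidden)]
    return [
        softmax_entropy(logits),             # entropy of logits
        l2_norm(hidden),                     # hidden-state norm
        l2_norm(delta),                      # delta-norm vs. previous layer
        variance(hidden),                    # variance of the hidden state
        layer_idx / max(num_layers - 1, 1),  # layer position in [0, 1]
        remaining_budget,                    # remaining budget
    ]


state = layer_state(
    logits=[0.0, 0.0],
    hidden=[3.0, 4.0],
    prev_hidden=[0.0, 0.0],
    layer_idx=0,
    num_layers=12,
    remaining_budget=1.0,
)
print(state)
```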
It learns policies via:
- Supervised warm-start (from heuristic traces)
- Reinforcement learning with a reward balancing:
  - similarity to full model (KL divergence)
  - compute savings
  - budget adherence
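One way such a reward could combine the three terms is sketched below. The weights (`w_quality`, `w_savings`, `w_budget`), the use of layer fraction as the compute measure, and the one-sided overshoot penalty are all assumptions made for illustration; the actual formula in reward.py may differ.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0.0
    )


def reward(
    full_probs: Sequence[float],
    sparse_probs: Sequence[float],
    layers_run: int,
    num_layers: int,
    energy_budget: float,
    w_quality: float = 1.0,  # hypothetical weight
    w_savings: float = 0.5,  # hypothetical weight
    w_budget: float = 1.0,   # hypothetical weight
) -> float:
    """Higher is better: stay close to the full model, save compute, respect budget."""
    quality_penalty = kl_divergence(full_probs, sparse_probs)
    used = layers_run / num_layers             # fraction of layers executed
    savings = 1.0 - used                       # fraction of layers saved
    overshoot = max(0.0, used - energy_budget) # only penalize exceeding the budget
    return -w_quality * quality_penalty + w_savings * savings - w_budget * overshoot


same = [0.5, 0.5]
within_budget = reward(same, same, layers_run=6, num_layers=12, energy_budget=0.5)
over_budget = reward(same, same, layers_run=9, num_layers=12, energy_budget=0.5)
print(within_budget, over_budget)
```

With identical output distributions the KL term vanishes, so the two calls differ only in the savings and budget terms, and the run that exceeds its budget scores lower.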
• Budget-Aware Inference
Users provide an energy_budget in [0,1].
The controller adjusts its behavior per token to meet the budget target while maintaining model output quality.
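As a rough illustration of budget tracking, the sketch below treats `energy_budget` as the fraction of layers a token may execute and charges `1 / num_layers` for each processed layer. The per-layer importance scores, the threshold, and the function name `budgeted_decisions` are hypothetical; a learned controller would replace this simple rule.

```python
from typing import List, Sequence, Tuple


def budgeted_decisions(
    importance: Sequence[float],
    energy_budget: float,
    threshold: float = 0.5,  # hypothetical cutoff on the importance score
) -> Tuple[List[str], float]:
    """Greedy rule: process a layer only if it looks important and budget remains."""
    num_layers = len(importance)
    cost = 1.0 / num_layers  # assumed cost of running one layer
    remaining = energy_budget
    decisions: List[str] = []
    for score in importance:
        can_afford = remaining >= cost - 1e-9  # tolerance for float rounding
        if can_afford and score >= threshold:
            decisions.append("PROCESS")
            remaining -= cost
        else:
            decisions.append("SKIP")
    return decisions, remaining


scores = [0.9, 0.2, 0.8, 0.1]
half, left_half = budgeted_decisions(scores, energy_budget=0.5)
quarter, left_quarter = budgeted_decisions(scores, energy_budget=0.25)
print(half, left_half)
print(quarter, left_quarter)
```

Halving the budget forces the rule to skip a layer it would otherwise have processed, which is the trade-off the controller has to manage per token.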
• Visual Execution Maps
ANEE includes tooling to visualize:
- token-by-layer skip/process patterns
- per-token compute usage
- overall savings
- effective depth profiles
These “execution heatmaps” help interpret which layers the model relies on.
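A text-only version of such a map can be sketched in a few lines. This is not the output of visualize_heatmap.py; `render_heatmap` and `overall_savings` are hypothetical helpers, and the input is assumed to be a token-by-layer grid of booleans where True means the layer was processed.

```python
from typing import List, Sequence


def render_heatmap(tokens: Sequence[str], executed: Sequence[Sequence[bool]]) -> str:
    """One row per token: '#' = layer processed, '.' = layer skipped."""
    lines: List[str] = []
    for tok, row in zip(tokens, executed):
        cells = "".join("#" if ran else "." for ran in row)
        depth = sum(row)  # effective depth: number of layers actually run
        lines.append(f"{tok:>8} {cells} {depth}/{len(row)}")
    return "\n".join(lines)


def overall_savings(executed: Sequence[Sequence[bool]]) -> float:
    """Fraction of (token, layer) cells that were not executed."""
    total = sum(len(row) for row in executed)
    ran = sum(sum(row) for row in executed)
    return 1.0 - ran / total


tokens = ["The", "cat"]
executed = [
    [True, False, True, True],
    [True, True, False, False],
]
heatmap = render_heatmap(tokens, executed)
savings = overall_savings(executed)
print(heatmap)
print(f"overall savings: {savings:.1%}")
```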
• Model-Agnostic Design
The wrapper manually unrolls transformer layers and is structured for easy adaptation to other decoder-only architectures beyond GPT-2.
📦 Repository Structure
anee/
│
├── wrapper.py # Core dynamic execution engine
├── controller.py # Heuristic + learned controllers
├── profiler.py # Layer-level state feature extractor
├── reward.py # RL reward (quality + efficiency)
├── utils.py # FLOPs proxy utilities
├── config.py # ANEE configuration
│
├── experiments/
│ ├── train_controller.py
│ ├── train_controller_rl.py
│ ├── collect_traces.py
│ ├── 01_sanity_check.py
│ ├── visualize_heatmap.py
🚀 Getting Started
Install
pip install -e .
Warm-start Controller
python experiments/train_controller.py
RL Fine-Tuning
python experiments/train_controller_rl.py
Quick Test
python experiments/01_sanity_check.py
Generate Heatmap Visualization
python experiments/visualize_heatmap.py
📈 Performance Snapshot (GPT-2 Small)
At moderate budgets, ANEE typically:
- executes ~6–9 of 12 layers per token
- achieves ~20–30% effective compute reduction
- maintains coherent generation
- shows consistent “sparse middle, dense edges” execution profiles
Lower budgets save more compute at the cost of output quality.
🔬 Intended Use & Applications
ANEE provides a clean, transparent platform for research in:
- dynamic depth / adaptive inference
- efficient transformer execution
- compute-aware LLM routing
- per-token sparsity patterns
- RL-driven execution policies
It is well-suited for experimentation, teaching, and further development.
📄 License
Apache License 2.0
📚 Citation
@software{ANEE,
author = {Ahmed Bin Khalid},
title = {ANEE: Adaptive Neural Execution Engine},
year = {2025},
note = {Dynamic compute allocation for transformer inference},
}