
AI-HEXAGON

An objective way to evaluate neural network architectures

⚠️ Early Development: This project is currently in its early development phase and not accepting external architecture submissions yet. Star/watch the repository to be notified when we open for contributions.

📊 View Live Leaderboard & Results

AI-HEXAGON is an objective benchmarking framework designed to evaluate neural network architectures independently of natural language processing tasks. By isolating architectural capabilities from training techniques and datasets, it enables meaningful and efficient comparisons between different neural network designs.

🎯 Motivation

Traditional neural network benchmarking often conflates architectural performance with training techniques and dataset biases. This makes it challenging to:

  • Isolate true architectural capabilities
  • Iterate quickly on design changes
  • Compare models fairly

AI-HEXAGON solves these challenges by:

  • 🔍 Pure Architecture Focus: Tests evaluate only the architecture, removing confounding factors like tokenization and dataset-specific optimizations
  • ⚡ Rapid Iteration: Enables quick testing of architectural changes without large-scale training
  • 🛠️ Flexible Testing: Supports both standard benchmarking and custom test suites

🌟 Key Features

  • 📊 Pure Architecture Evaluation: Tests fundamental capabilities independently
  • ⚖️ Controlled Environment: Fixed parameter budget and raw numerical inputs
  • 📐 Clear Metrics: Six independently measured fundamental capabilities
  • 🔍 Transparent Implementation: Clean, framework-agnostic code
  • 🤖 Automated Testing: GitHub Actions for fair, manipulation-proof evaluation
  • 📈 Live Results: Real-time benchmarking results at ai-hexagon.dev

๐Ÿ“ Metrics (The Hexagon)

Each architecture is evaluated on six fundamental capabilities:

  • 🧠 Memory Capacity: Store and recall information from training data
  • 🔄 State Management: Maintain and manipulate internal hidden states
  • 🎯 Pattern Recognition: Recognize and extrapolate sequences
  • 📍 Position Processing: Handle positional information within sequences
  • 🔗 Long-Range Dependency: Manage dependencies over long sequences
  • 📏 Length Generalization: Process sequences longer than training examples

๐Ÿ“ Project Structure

ai-hexagon/
├── ai_hexagon/
│   └── modules/          # Common neural network modules
└── results/              # Model implementations and results
    ├── suite.json        # Default test suite configuration
    └── transformer/
        ├── model.py      # Transformer implementation
        └── modules/      # Custom modules (if needed)

โš™๏ธ Parameter Budget

The default suite enforces a 4MB parameter limit for fair comparisons:

  • Complex64 (8 bytes/param): 0.5M params
  • Float32 (4 bytes/param): 1M params
  • Float16 (2 bytes/param): 2M params
  • Int8 (1 byte/param): 4M params
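
These limits follow directly from dividing the fixed 4 MB budget by the storage size of each parameter. A quick sanity check (assuming binary megabytes, i.e. 4 × 1024 × 1024 bytes):

```python
# Bytes of storage per parameter for each supported precision
BYTES_PER_PARAM = {"complex64": 8, "float32": 4, "float16": 2, "int8": 1}

BUDGET_BYTES = 4 * 1024 * 1024  # 4 MB parameter budget

# Maximum parameter count allowed under the budget for each precision
limits = {prec: BUDGET_BYTES // size for prec, size in BYTES_PER_PARAM.items()}
```

So a float32 model gets 1,048,576 parameters (~1M), while an int8 model gets four times as many under the same budget.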

๐Ÿค Contributing

Contributions will be welcome once the project opens for external submissions. To contribute:

  1. Fork: Create your own fork of the project
  2. Install: Run poetry install (optionally with --with dev,cuda12) to get the ai-hex command
  3. Implement: Add your model in results/your_model_name/
  4. Document: Include comprehensive docstrings and references
  5. Submit: Create a pull request following our guidelines
  6. Wait: CI will automatically evaluate your model and update the leaderboard

Use ai-hex tests list to see available tests, ai-hex tests show test_name to view test schema, and ai-hex suite run ./path/to/model.py to run your model against the suite.

🔧 Technical Stack: JAX and Flax

We chose JAX and Flax for their:

  • 🧩 Functional Design: Clear architecture definitions with immutable state
  • ⚡ Custom Operations: Comprehensive support through jax.numpy
  • 🎯 Reproducibility: First-class random number handling
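
For example, JAX threads randomness through explicit PRNG keys, which makes every draw reproducible by construction (a minimal sketch):

```python
import jax

key = jax.random.PRNGKey(0)        # explicit, immutable seed
k1, k2 = jax.random.split(key)     # derive independent subkeys

a = jax.random.normal(k1, (3,))
b = jax.random.normal(k1, (3,))    # same key: identical draw
c = jax.random.normal(k2, (3,))    # different key: independent draw
```

Because keys are ordinary values rather than hidden global state, two runs with the same key always produce identical samples.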

๐Ÿ“ Code Style: Using einops

We mandate einops for complex tensor operations to enhance readability. Compare:

# Traditional approach - hard to understand the transformation
x = x.reshape(batch, x.shape[1], x.shape[-2] * 2, x.shape[-1] // 2)
x = x.transpose(0, 2, 1, 3)

# Using einops - the same transformation with crystal clear intent
x = rearrange(x, 'b t m (s c) -> b (m s) t c', s=2)

📖 Example Model Implementation

import flax.linen as nn


class Transformer(nn.Module):
    """
    Transformer decoder stack architecture from 'Attention Is All You Need'.
    Reference: https://arxiv.org/abs/1706.03762
    """

    hidden_dim: int = 256
    num_layers: int = 4
    num_heads: int = 4

    @nn.compact
    def __call__(self, x):
        x = nn.Dense(self.hidden_dim)(x)
        for _ in range(self.num_layers):
            # Pre-norm self-attention block with residual connection
            x = x + nn.SelfAttention(num_heads=self.num_heads)(nn.LayerNorm()(x))
            # Position-wise feed-forward block with residual connection
            h = nn.gelu(nn.Dense(4 * self.hidden_dim)(nn.LayerNorm()(x)))
            x = x + nn.Dense(self.hidden_dim)(h)
        return x

๐Ÿ” Test Suite Configuration

Test suites use a JSON configuration format:

{
    "name": "General 1M",
    "description": "General architecture performance evaluation",
    "metrics": [
        {
            "name": "Memory Capacity",
            "description": "Information storage and recall capability",
            "tests": [
                {
                    "weight": 1.0,
                    "test": {
                        "name": "hash_map",
                        "seed": 0,
                        "key_length": 8,
                        "value_length": 64,
                        "num_pairs_range": [32, 65536],
                        "vocab_size": 1024
                    }
                }
            ]
        }
    ]
}
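
For illustration, a per-metric score under this format could be computed as a weighted average of test results. The second test name ('assoc_recall') and the scoring rule below are hypothetical, not part of ai-hexagon:

```python
import json

# A suite fragment in the format shown above, plus a second, hypothetical test
suite = json.loads("""
{
    "name": "General 1M",
    "metrics": [
        {
            "name": "Memory Capacity",
            "tests": [
                {"weight": 1.0, "test": {"name": "hash_map"}},
                {"weight": 3.0, "test": {"name": "assoc_recall"}}
            ]
        }
    ]
}
""")


def metric_score(metric, results):
    """Weighted average of per-test scores in [0, 1] (hypothetical rule)."""
    total = sum(t["weight"] for t in metric["tests"])
    return sum(t["weight"] * results[t["test"]["name"]] for t in metric["tests"]) / total


score = metric_score(suite["metrics"][0], {"hash_map": 0.5, "assoc_recall": 0.9})
```

Weights let a single hexagon axis aggregate several tests of differing importance.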

📈 Results are automatically generated via GitHub Actions to ensure fairness. The leaderboard is updated in real-time at ai-hexagon.dev.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

  • Source Distribution: ai_hexagon-0.1.0.tar.gz (11.4 kB)
  • Built Distribution: ai_hexagon-0.1.0-py3-none-any.whl (12.1 kB)
File details

ai_hexagon-0.1.0.tar.gz (Source, 11.4 kB, uploaded via poetry/1.8.4 CPython/3.12.7 Linux/6.11.6, Trusted Publishing: No)

  • SHA256: 929037d4a76131f2a726310f10b29b92dc469cd4e7141dab74412f96a0a24608
  • MD5: b4fd732f8b96afe75143ce3913b1f4dd
  • BLAKE2b-256: 46d68f6e396b5fadec514ff9f547f19f506d7ec8e2b47c5378d97ffe0ede268f

ai_hexagon-0.1.0-py3-none-any.whl (Python 3, 12.1 kB, uploaded via poetry/1.8.4 CPython/3.12.7 Linux/6.11.6, Trusted Publishing: No)

  • SHA256: 2b4034297565e7bd4afa527bccd9100f9ab95e8b4a4cbaf9a84332d4f2a64fc2
  • MD5: af92e6e08789a5ef12c5f2af7cdbbe1f
  • BLAKE2b-256: 33ae98e44ad386dc17cb32280cccf4eef507e09838cbb82c52d83db75b97a52e
