
HAT: Hierarchical Attention Tree

A novel index structure for AI memory systems that achieves 100% recall at 70x faster build times than HNSW.

Also: A new database paradigm for any domain with known hierarchy + semantic similarity.



Architecture

[Figure: HAT Architecture]

HAT exploits the known hierarchy in AI conversations: sessions contain documents, documents contain chunks. This structural prior enables O(log n) queries with 100% recall.


Key Results

[Figure: Summary Results]

| Metric | HAT | HNSW | Improvement |
|---|---|---|---|
| Recall@10 | 100% | 70% | +30 pts |
| Build Time | 30ms | 2.1s | 70x faster |
| Query Latency | 3.1ms | - | Production-ready |

Benchmarked on hierarchically-structured AI conversation data


Recall Comparison

[Figure: HAT vs HNSW Recall]

HAT achieves 100% recall where HNSW achieves only ~70% on hierarchically-structured data.


Build Time

[Figure: Build Time Comparison]

HAT builds indexes 70x faster than HNSW - critical for real-time applications.


The Problem

Large language models have finite context windows. A 10K context model can only "see" the most recent 10K tokens, losing access to earlier conversation history.

Current solutions fall short:

  • Longer context models: Expensive to train and run
  • Summarization: Lossy compression that discards detail
  • RAG retrieval: Re-embeds and recomputes attention on every query

The HAT Solution

[Figure: HAT vs RAG]

HAT exploits known structure in AI workloads. Unlike general vector databases that treat data as unstructured point clouds, AI conversations have inherent hierarchy:

Session (conversation boundary)
  └── Document (topic or turn)
       └── Chunk (individual message)
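This hierarchy can be pictured as a tiny tree of plain Python objects, with each parent summarizing its children by a centroid used for routing. A minimal sketch (illustrative names only, not the arms_hat internals):

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:               # individual message
    id: int
    embedding: list

@dataclass
class Document:            # topic or turn
    chunks: list = field(default_factory=list)
    centroid: list = None  # routing summary of the chunks below

@dataclass
class Session:             # conversation boundary
    documents: list = field(default_factory=list)
    centroid: list = None

def mean_vector(vectors):
    """Element-wise mean: a parent's routing summary of its children."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

doc = Document(chunks=[Chunk(0, [1.0, 0.0]), Chunk(1, [0.0, 1.0])])
doc.centroid = mean_vector([c.embedding for c in doc.chunks])
print(doc.centroid)  # → [0.5, 0.5]
```

Because parents carry these summaries, a query never has to scan every chunk; it only descends the branches whose summaries match.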

The Hippocampus Analogy

[Figure: Hippocampus Analogy]

HAT mirrors human memory architecture - functioning as an artificial hippocampus for AI systems.


How It Works

Beam Search Query

[Figure: Beam Search]

HAT uses beam search through the hierarchy:

1. Start at root
2. At each level, score children by cosine similarity to query
3. Keep top-b candidates (beam width)
4. Return top-k from leaf level

Complexity: O(b · d · c), where b is the beam width, d the tree depth, and c the per-node branching factor. For a balanced tree d = O(log n), so queries run in O(log n).
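The four steps above can be sketched in a few lines of self-contained Python, using a toy tree of dicts (an illustration of the idea, not the library's implementation):

```python
import math

def cos(a, b):
    """Cosine similarity between two vectors."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def beam_search(root, query, b=2, k=2):
    """Descend level by level, keeping only the b best-scoring nodes."""
    frontier = [root]                                  # 1. start at root
    while any(n["children"] for n in frontier):
        children = [c for n in frontier for c in n["children"]]
        children.sort(key=lambda c: cos(c["vec"], query), reverse=True)
        frontier = children[:b]                        # 2–3. score, keep top-b
    frontier.sort(key=lambda n: cos(n["vec"], query), reverse=True)
    return [n["id"] for n in frontier[:k]]             # 4. top-k leaves

leaf = lambda i, v: {"id": i, "vec": v, "children": []}
node = lambda v, ch: {"id": None, "vec": v, "children": ch}

# Two documents, two chunks each; root vector is never scored
root = node([0.5, 0.5], [
    node([1.0, 0.0], [leaf(0, [1.0, 0.1]), leaf(1, [0.9, 0.2])]),
    node([0.0, 1.0], [leaf(2, [0.1, 1.0]), leaf(3, [0.2, 0.9])]),
])
print(beam_search(root, [1.0, 0.0], b=2, k=2))  # → [0, 1]
```

With a query pointing along the first document's centroid, the beam prunes the second document's subtree and returns only its own chunks.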

Consolidation Phases

[Figure: Consolidation Phases]

Inspired by sleep-staged memory consolidation, HAT maintains index quality through incremental consolidation.
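What a consolidation pass might do is easiest to see on a toy tree. Purely as an illustration (the sketch below is hypothetical, not HAT's actual maintenance algorithm), one incremental step is refreshing parent centroids bottom-up after inserts, so that beam-search routing stays accurate:

```python
def consolidate(node):
    """Recompute a subtree's centroid from its children, bottom-up.
    Hypothetical sketch of one consolidation step, not the library's code."""
    if not node["children"]:
        return node["vec"]
    vecs = [consolidate(c) for c in node["children"]]
    dim = len(vecs[0])
    node["vec"] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return node["vec"]

# A document whose summary is stale after two chunk inserts
doc = {"vec": None, "children": [
    {"vec": [1.0, 0.0], "children": []},
    {"vec": [0.0, 1.0], "children": []},
]}
consolidate(doc)
print(doc["vec"])  # → [0.5, 0.5]
```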


Scale Performance

[Figure: Scale Performance]

HAT maintains 100% recall at every tested scale, while HNSW's recall stays low and inconsistent (44.5-67.5%).

| Scale | HAT Build | HNSW Build | HAT R@10 | HNSW R@10 |
|---|---|---|---|---|
| 500 | 16ms | 1.0s | 100% | 55% |
| 1000 | 25ms | 2.0s | 100% | 44.5% |
| 2000 | 50ms | 4.3s | 100% | 67.5% |
| 5000 | 127ms | 11.9s | 100% | 55% |

End-to-End Pipeline

[Figure: Integration Pipeline]

Core Claim

A 10K context model with HAT achieves 100% recall on 60K+ tokens with 3.1ms latency.

| Messages | Tokens | Context % | Recall | Latency | Memory |
|---|---|---|---|---|---|
| 1000 | 30K | 33% | 100% | 1.7ms | 1.6MB |
| 2000 | 60K | 17% | 100% | 3.1ms | 3.3MB |
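The Context % column is simply the fraction of the conversation that still fits in the model's 10K-token window:

```python
window = 10_000  # model's context size in tokens
for tokens in (30_000, 60_000):
    print(f"{tokens:,} tokens: {window / tokens:.0%} fits in the window")
# → 30,000 tokens: 33% fits in the window
# → 60,000 tokens: 17% fits in the window
```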

Quick Start

Python

from arms_hat import HatIndex

# Create index (1536 dimensions for OpenAI embeddings)
index = HatIndex.cosine(1536)

# Add messages with automatic hierarchy
index.add(embedding)  # Returns ID

# Session/document management
index.new_session()   # Start new conversation
index.new_document()  # Start new topic

# Query
results = index.near(query_embedding, k=10)
for result in results:
    print(f"ID: {result.id}, Score: {result.score:.4f}")

# Persistence
index.save("memory.hat")
loaded = HatIndex.load("memory.hat")

Rust

use hat::{HatIndex, HatConfig};

// Create index
let config = HatConfig::default();
let mut index = HatIndex::new(config, 1536);

// Add points
let id = index.add(&embedding);

// Query
let results = index.search(&query, 10);

Installation

Python

pip install arms-hat

From Source (Rust)

git clone https://github.com/automate-capture/hat.git
cd hat
cargo build --release

Python Development

cd python
pip install maturin
maturin develop

Project Structure

hat/
├── src/                  # Rust implementation
│   ├── lib.rs           # Library entry point
│   ├── index.rs         # HatIndex implementation
│   ├── container.rs     # Tree node types
│   ├── consolidation.rs # Background maintenance
│   └── persistence.rs   # Save/load functionality
├── python/              # Python bindings (PyO3)
│   └── arms_hat/        # Python package
├── benchmarks/          # Performance comparisons
├── examples/            # Usage examples
├── paper/               # Research paper (PDF)
├── images/              # Figures and diagrams
└── tests/               # Test suite

Reproducing Results

# Run HAT vs HNSW benchmark
cargo test --test phase31_hat_vs_hnsw -- --nocapture

# Run real embedding dimension tests
cargo test --test phase32_real_embeddings -- --nocapture

# Run persistence tests
cargo test --test phase33_persistence -- --nocapture

# Run end-to-end LLM demo
python examples/demo_hat_memory.py

When to Use HAT

HAT is ideal for:

  • AI conversation memory (chatbots, agents)
  • Session-based retrieval systems
  • Any hierarchically-structured vector data
  • Systems requiring deterministic behavior
  • Cold-start scenarios (no training needed)

Use HNSW instead for:

  • Unstructured point clouds (random embeddings)
  • Static knowledge bases (handbooks, catalogs)
  • When approximate recall is acceptable

Beyond AI Memory: A New Database Paradigm

HAT represents a fundamentally new approach to indexing: exploiting known structure rather than learning it.

| Database Type | Structure | Semantics |
|---|---|---|
| Relational | Explicit (foreign keys) | None |
| Document | Implicit (nesting) | None |
| Vector (HNSW) | Learned from data | Yes |
| HAT | Explicit + exploited | Yes |

Traditional vector databases treat embeddings as unstructured point clouds, spending compute to discover topology. HAT inverts this: known hierarchy is free information - use it.

General Applications

Any domain with hierarchical structure + semantic similarity benefits from HAT:

  • Legal/Medical Documents: Case → Filing → Paragraph → Sentence
  • Code Search: Repository → Module → Function → Line
  • IoT/Sensor Networks: Facility → Zone → Device → Reading
  • E-commerce: Catalog → Category → Product → Variant
  • Research Corpora: Journal → Paper → Section → Citation
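Mapping one of these domains onto HAT's three levels is mechanical. The stand-in class below (not arms_hat, just a recorder with the same surface as the Quick Start API) shows the traversal shape for code search, with Repository → session, Module → document, Function → chunk:

```python
class ToyIndex:
    """Stand-in mirroring the new_session/new_document/add surface."""
    def __init__(self):
        self.structure = []
    def new_session(self):
        self.structure.append([])            # new repository
    def new_document(self):
        self.structure[-1].append([])        # new module
    def add(self, item):
        self.structure[-1][-1].append(item)  # new function "embedding"
        return item

repo = {"parser": ["tokenize", "parse"], "eval": ["run"]}
index = ToyIndex()
index.new_session()
for module, functions in repo.items():
    index.new_document()
    for fn in functions:
        index.add(fn)

print(index.structure)  # → [[['tokenize', 'parse'], ['run']]]
```

The nesting of the source data becomes the nesting of the index, with no foreign keys or learned topology in between.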

The Core Insight

"Position IS relationship. No foreign keys needed - proximity defines connection."

HAT combines the structural guarantees of document databases with the semantic power of vector search, without the computational overhead of learning topology from scratch.


Citation

@article{hat2026,
  title={Hierarchical Attention Tree: Extending LLM Context Through Structural Memory},
  author={Young, Lucas and Automate Capture Research},
  year={2026},
  url={https://research.automate-capture.com/hat}
}

Paper

📄 Read the Full Paper (PDF)


License

MIT License - see LICENSE for details.



