Unified Memory SDK for LLM applications with multi-tier storage, policy-driven lifecycle, and intelligent summarization

Axon Memory

🧠 Unified Memory SDK for LLM Applications

📚 Documentation · 🚀 Quick Start · 💡 Examples · 📖 API Reference · 📋 Changelog


🎯 What is Axon?

Axon is a production-ready memory management system for Large Language Model (LLM) applications. It provides intelligent multi-tier storage, policy-driven lifecycle management, and semantic recall with automatic compaction and summarization.

Think of it as a smart caching layer for your LLM's memory—automatically organizing memories by importance, managing token budgets, and ensuring compliance with privacy regulations.

🌟 Key Benefits

  • 💰 Cost Reduction: Intelligent tier routing reduces expensive vector DB operations by 60%
  • ⚡ Performance: Multi-tier caching with sub-millisecond ephemeral access
  • 🔒 Compliance: Built-in PII detection and audit trails for GDPR/HIPAA
  • 🧩 Pluggable: Works with any vector database or embedding provider
  • 🔄 Framework Ready: First-class LangChain and LlamaIndex integration

✨ Features

Core Capabilities

  • 🏗️ Multi-Tier Architecture - Automatic routing across ephemeral, session, and persistent tiers
  • 📜 Policy-Driven Lifecycle - Configure TTL, capacity limits, promotion/demotion thresholds
  • 🔍 Semantic Search - Vector-based similarity search with metadata filtering
  • 📦 Automatic Compaction - Summarize and compress memories to manage token budgets
  • 📊 Audit Logging - Complete audit trails for compliance (GDPR, HIPAA)
  • 🔐 PII Detection - Automatic detection and classification of sensitive information
  • 🔄 Transaction Support - Two-phase commit (2PC) for atomic multi-tier operations
  • 📝 Structured Logging - Production-grade JSON logging with correlation IDs
  • 🧩 Framework Integration - First-class support for LangChain and LlamaIndex

Storage Adapters

| Adapter | Use Case | Status |
|---|---|---|
| 💾 In-Memory | Development & Testing | ✅ Complete |
| 🔴 Redis | Ephemeral Caching | ✅ Complete |
| 🎨 ChromaDB | Local Vector Storage | ✅ Complete |
| 🔷 Qdrant | Production Vector DB | ✅ Complete |
| 🌲 Pinecone | Managed Vector DB | ✅ Complete |
| 💿 SQLite | File-based Storage | 🚧 Planned |

Embedding Providers

  • OpenAI - text-embedding-3-small/large
  • Voyage AI - voyage-2, voyage-code-2
  • Sentence Transformers - Local open-source models
  • HuggingFace - Any HuggingFace model
  • Custom - Bring your own embedder
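The Custom option means any component that turns text into a fixed-length vector can back Axon's semantic search. As an illustration only (this is not Axon's embedder interface, just a stand-in with the same "text in, vector out" shape), a deterministic offline toy embedder might look like:

```python
import hashlib
import math

def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Hypothetical toy embedder: hashes character trigrams into buckets of a
    small vector, then unit-normalizes it. Deterministic and fully offline,
    so it is useful for tests; a real deployment would use one of the
    providers listed above instead."""
    vec = [0.0] * dim
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        trigram = padded[i : i + 3]
        bucket = int(hashlib.md5(trigram.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

vector = toy_embed("user prefers dark mode")
```

Because it is pure Python, the same text always maps to the same vector, which makes recall behavior reproducible in unit tests.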

🚀 Quick Start

Installation

pip install axon-memory

Basic Usage

import asyncio
from axon import MemorySystem

async def main():
    # Initialize with balanced configuration
    memory = MemorySystem()
    
    # Store a memory (automatically routed to appropriate tier)
    entry_id = await memory.store(
        "User prefers dark mode and compact layout",
        metadata={"user_id": "user123", "category": "preferences"}
    )
    
    # Semantic search across all tiers
    results = await memory.search("user interface preferences", k=5)
    
    for entry in results:
        print(f"💡 {entry.content}")
        print(f"   Tier: {entry.tier}, Score: {entry.metadata.get('score', 0):.2f}\n")
    
    # Retrieve specific memory
    entry = await memory.get(entry_id)
    print(f"Retrieved: {entry.content}")
    
    # Delete when no longer needed
    await memory.forget(entry_id)

asyncio.run(main())

Architecture

graph TB
    A[Your LLM Application] --> B[MemorySystem API]
    B --> C{Router}
    C -->|importance < 0.3| D[Ephemeral Tier]
    C -->|importance 0.3-0.7| E[Session Tier]
    C -->|importance > 0.7| F[Persistent Tier]

    D --> G[In-Memory / Redis]
    E --> H[Redis / ChromaDB]
    F --> I[Qdrant / Pinecone / ChromaDB]

    J[PolicyEngine] -.->|promotion| C
    J -.->|demotion| C

    style B fill:#4051B5,color:#fff
    style C fill:#5C6BC0,color:#fff

💡 Why Axon?

| Challenge | Traditional Approach | Axon Solution |
|---|---|---|
| Token Limits | Manual pruning | ✅ Automatic compaction & summarization |
| High Costs | All data in vector DB | ✅ 60% cost reduction via intelligent routing |
| Session Management | Custom implementation | ✅ Built-in TTL & lifecycle policies |
| PII & Privacy | Manual scrubbing | ✅ Automatic PII detection (emails, SSN, cards) |
| Compliance | Manual audit logs | ✅ GDPR/HIPAA-ready audit trails |
| Complexity | Multiple SDKs | ✅ Unified API for all operations |

🎨 Use Cases

1. 💬 Chatbot with Long-Term Memory

from axon.integrations.langchain import AxonChatMemory
from langchain_openai import ChatOpenAI

memory = AxonChatMemory(system=MemorySystem())
llm = ChatOpenAI(model="gpt-4")

# Conversations persist across sessions
# Automatic promotion of important context
response = await llm.ainvoke("What did we discuss about the project timeline?")

2. 📚 RAG with Multi-Tier Storage

from axon.integrations.llamaindex import AxonVectorStore
from llama_index.core import VectorStoreIndex, Document

# Create vector store backed by Axon
vector_store = AxonVectorStore(system=MemorySystem())

# Build index from documents
documents = [Document(text="Quantum computing explanation...")]
index = VectorStoreIndex.from_documents(documents, vector_store=vector_store)

# Query with automatic tier optimization
query_engine = index.as_query_engine()
response = await query_engine.aquery("Explain quantum entanglement")

3. 🔍 Semantic Search with Filters

from datetime import datetime

from axon.models import MemoryFilter, MemoryTier

# Store with metadata
await memory.store(
    "Q4 revenue exceeded projections by 23%",
    metadata={"department": "finance", "year": 2024, "quarter": "Q4"}
)

# Filtered semantic search (renamed to avoid shadowing the built-in filter)
memory_filter = MemoryFilter(
    tier=MemoryTier.PERSISTENT,
    metadata={"department": "finance"},
    created_after=datetime(2024, 10, 1)
)

results = await memory.search("financial performance", k=10, filter=memory_filter)

4. 🔒 Compliance-Ready Memory

from datetime import datetime

from axon.core import AuditLogger
from axon.models import PrivacyLevel

# Enable audit logging
audit_logger = AuditLogger(max_events=10000)
memory = MemorySystem(audit_logger=audit_logger)

# Automatic PII detection
await memory.store(
    "Customer email: john@example.com, Phone: 555-1234",
    privacy_level=PrivacyLevel.INTERNAL
)

# Export audit trail for compliance
events = await memory.export_audit_log(
    operation="store",
    start_date=datetime(2024, 1, 1)
)

Core Concepts

Memory Tiers

  • Ephemeral (importance < 0.3): Short-lived, high-volume data
  • Session (0.3 ≤ importance < 0.7): Session-scoped context
  • Persistent (importance ≥ 0.7): Long-term semantic storage
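The thresholds above amount to a simple routing rule. A minimal sketch of that rule (just the documented boundaries, not Axon's actual router implementation):

```python
def route_tier(importance: float) -> str:
    """Map an importance score to a tier name using the documented thresholds:
    < 0.3 -> ephemeral, 0.3 to < 0.7 -> session, >= 0.7 -> persistent."""
    if importance < 0.3:
        return "ephemeral"
    if importance < 0.7:
        return "session"
    return "persistent"
```

So a throwaway tool-call result (importance 0.1) lands in the ephemeral tier, while a stated user preference scored at 0.8 goes straight to persistent storage.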

Policies

Define lifecycle rules for each tier:

from axon.core.policies import SessionPolicy

policy = SessionPolicy(
    ttl_minutes=60,           # Session expires after 1 hour
    max_items=100,            # Limit to 100 memories
    summarize_after=50,       # Summarize when reaching 50 items
    promote_threshold=0.8,    # Promote high-importance memories
)

Routing

Automatic tier selection based on:

  1. Importance scores
  2. Access patterns (recency, frequency)
  3. Capacity constraints
  4. Explicit tier hints
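As a sketch of how factors 1 and 2 could combine into one score, consider blending the base importance with an exponential recency decay and a logarithmic frequency boost. The weights, half-life, and clamping below are illustrative assumptions, not Axon's actual scoring formula:

```python
import math

def effective_importance(
    base_importance: float,
    hours_since_access: float,
    access_count: int,
    half_life_hours: float = 24.0,
) -> float:
    """Illustrative blend (hypothetical weights): exponential recency decay
    plus a log-scaled frequency boost, clamped to [0, 1]."""
    # Recency halves every half_life_hours since the last access.
    recency = math.exp(-math.log(2) * hours_since_access / half_life_hours)
    # Frequency saturates around ~100 accesses.
    frequency = min(math.log1p(access_count) / math.log1p(100), 1.0)
    score = 0.6 * base_importance + 0.25 * recency + 0.15 * frequency
    return max(0.0, min(1.0, score))
```

A score computed this way could then be compared against the tier thresholds, so frequently revisited memories drift upward toward promotion while stale ones decay toward demotion.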

Advanced Features

Compaction Strategies

# Count-based compaction
await system.compact(tier="session", strategy="count", threshold=50)

# Semantic similarity compaction
await system.compact(tier="session", strategy="semantic", threshold=0.9)

# Hybrid strategy (combines multiple approaches)
await system.compact(tier="session", strategy="hybrid")

Privacy & PII Detection

# Automatic PII detection enabled by default
entry_id = await system.store("Contact: john@example.com, 555-1234")

# Check detected PII
tier, entry = await system._get_entry_by_id(entry_id)
print(entry.metadata.pii_detection.detected_types)
# Output: {'email', 'phone'}

print(entry.metadata.privacy_level)
# Output: PrivacyLevel.INTERNAL

Transactions (2PC)

from axon.core.transaction import TransactionManager, IsolationLevel

tx_manager = TransactionManager(registry, isolation_level=IsolationLevel.SERIALIZABLE)

async with tx_manager.transaction() as tx:
    await tx.store_in_tier("ephemeral", entry1)
    await tx.store_in_tier("persistent", entry2)
    # Atomic commit across both tiers

📚 Documentation

| Section | Description | Link |
|---|---|---|
| 🚀 Getting Started | Installation, quickstart, configuration | View |
| 💡 Core Concepts | Tiers, policies, routing, lifecycle | View |
| 🔧 Storage Adapters | Redis, Qdrant, Pinecone, ChromaDB | View |
| Advanced Features | Audit, privacy, transactions, compaction | View |
| 🧩 Integrations | LangChain, LlamaIndex | View |
| 📖 API Reference | Complete API documentation | View |
| 🚢 Deployment | Production setup, monitoring, security | View |
| 💻 Examples | 25+ working code examples | View |

Development

Prerequisites

  • Python 3.9+
  • Virtual environment (recommended)

Setup

# Clone repository
git clone https://github.com/saranmahadev/Axon.git
cd Axon

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: .\venv\Scripts\activate

# Install with dev dependencies
pip install -e ".[dev]"

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=axon --cov-report=html

# Run specific test markers
pytest -m unit              # Unit tests only
pytest -m integration       # Integration tests

Code Quality

# Format code
black src/ tests/

# Lint
ruff check src/ tests/

# Type check
mypy src/axon

📁 Repository Structure

AxonMemoryCore/
├── 📂 src/axon/           # Source code
│   ├── core/              # Core memory system
│   ├── adapters/          # Storage adapters (Redis, Qdrant, etc.)
│   ├── embedders/         # Embedding providers
│   ├── models/            # Data models
│   ├── integrations/      # LangChain, LlamaIndex
│   └── utils/             # Utilities
├── 📂 tests/              # Test suite (97.8% passing)
│   ├── unit/              # Unit tests
│   └── integration/       # Integration tests
├── 📂 docs/               # Documentation source
├── 📂 examples/           # 25+ working examples
│   ├── 01-basics/         # Hello world, CRUD operations
│   ├── 02-intermediate/   # Adapters, compaction, filters
│   ├── 03-advanced/       # Transactions, audit, privacy
│   ├── 04-integrations/   # LangChain, LlamaIndex
│   └── 05-real-world/     # Production examples
└── 📄 pyproject.toml      # Project configuration

📊 Project Status

| Metric | Status |
|---|---|
| Version | 1.0.0-beta2 (Nov 2025) |
| Test Coverage | 97.8% passing (634/646 tests) |
| Production Ready | ⚠️ Beta - 70% complete |
| License | MIT |

✅ What's Working

  • ✅ Core memory operations (store, recall, forget, compact)
  • ✅ Multi-tier routing with automatic promotion/demotion
  • ✅ 5 production storage adapters
  • ✅ LangChain and LlamaIndex integrations
  • ✅ Audit logging and PII detection
  • ✅ Transaction support (2PC)
  • ✅ Advanced compaction strategies
  • ✅ Comprehensive documentation

🚧 In Progress

  • 🚧 Performance optimization (caching, connection pooling)
  • 🚧 Security audit
  • 🚧 SQLite adapter

📅 Upcoming (v1.0 Stable - Q1 2025)

  • CLI tools for backup/restore
  • Performance benchmarks
  • Extended monitoring
  • Production hardening

Roadmap

See ROADMAP.md for detailed sprint planning.

v1.0 (Current - Beta):

  • ✅ Core memory system
  • ✅ Multi-tier routing
  • ✅ Storage adapters (5/6 complete)
  • ✅ LangChain/LlamaIndex integrations
  • 🚧 Documentation
  • 🚧 Performance optimization

v1.1 (Planned):

  • SQLite adapter
  • CLI tools for backup/restore
  • Performance benchmarks
  • Extended monitoring

v2.0 (Future):

  • GraphQL API
  • Real-time sync
  • Multi-tenancy support
  • Advanced security features

Contributing

We welcome contributions! See CONTRIBUTING.md for development guidelines.

Development Workflow

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run tests and linters
  5. Commit (git commit -m 'Add amazing feature')
  6. Push to branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

License

Axon is released under the MIT License.

🌟 Star us on GitHub if you find Axon useful!

Made with ❤️ by Saran Mahadev

Axon Memory • Intelligent Memory Management for LLM Applications

Download files

Download the file for your platform.

Source Distribution

axon_sdk-1.0.0b2.tar.gz (362.9 kB)

Uploaded Source

Built Distribution

axon_sdk-1.0.0b2-py3-none-any.whl (125.9 kB)

Uploaded Python 3

File details

Details for the file axon_sdk-1.0.0b2.tar.gz.

File metadata

  • Download URL: axon_sdk-1.0.0b2.tar.gz
  • Upload date:
  • Size: 362.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for axon_sdk-1.0.0b2.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | b36162d2f00a8ec0a99f02370f9c72802ca59a7c2f4eb3ceb0734df5576f0115 |
| MD5 | b659d3cec73d067dfc6f191964485fcf |
| BLAKE2b-256 | f7ed8957a95ea394c6a6d2f70457fec015ea30d429cd377f4bb1e46b5aafeb83 |

File details

Details for the file axon_sdk-1.0.0b2-py3-none-any.whl.

File metadata

  • Download URL: axon_sdk-1.0.0b2-py3-none-any.whl
  • Upload date:
  • Size: 125.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for axon_sdk-1.0.0b2-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 5b1d597091a16a312dc1aed9bc88d21bb53197285510602e2cc746f77ac12148 |
| MD5 | 7894c75b553fb745acd2b340596ad96d |
| BLAKE2b-256 | debe7cec50c76d4d28cce7ebc40091a376cbbf40be9a2bbe45b5660535242b6c |
