The Universal Context Compiler for AI Agent Memory

Aura: The Universal Context Compiler

Compile any document into AI-ready knowledge bases with built-in agent memory.


Context is the new Compute.

Aura compiles messy, real-world files (PDFs, DOCX, HTML, code, spreadsheets — 60+ formats) into a single optimized binary (.aura) ready for RAG retrieval and AI agent memory.

One command. No JSONL scripting. No parsing pipelines.

pip install auralith-aura
aura compile ./my_data/ --output knowledge.aura

⚡ Quick Start

1. Install

pip install auralith-aura

# For full document support (PDFs, DOCX, etc.)
pip install 'auralith-aura[all]'

2. Compile

# Basic compilation
aura compile ./company_data/ --output knowledge.aura

# With PII masking (emails, phones, SSNs automatically redacted)
aura compile ./data/ --output knowledge.aura --pii-mask

# Filter low-quality content
aura compile ./data/ --output knowledge.aura --min-quality 0.3

3. Use

For RAG (Knowledge Retrieval):

from aura.rag import AuraRAGLoader

loader = AuraRAGLoader("knowledge.aura")
text = loader.get_text_by_id("doc_001")

# Framework wrappers
langchain_docs = loader.to_langchain_documents()
llama_docs = loader.to_llama_index_documents()

For Agent Memory:

from aura.memory import AuraMemoryOS

memory = AuraMemoryOS()

# Write to memory tiers
memory.write("fact", "User prefers dark mode", source="agent")
memory.write("episodic", "Discussed deployment strategy")
memory.write("pad", "TODO: check auth module")

# Search memory
results = memory.query("user preferences")

# End session (flushes to durable shards)
memory.end_session()

🧠 Agent Memory

Aura includes a 3-Tier Memory OS — a persistent memory architecture for AI agents:

Tier        Purpose                                      Lifecycle
/pad        Working notes, scratch space                 Transient
/episodic   Session transcripts, conversation history    Auto-archived
/fact       Verified facts, user preferences             Persistent

The Memory OS is included free when you install from PyPI (pip install auralith-aura).

# CLI memory management
aura memory list       # View all memory shards
aura memory usage      # Storage usage by tier
aura memory prune --before 2026-01-01  # Clean up old memories

v2.1 Performance Enhancements

Memory OS v2.1 (auralith-aura>=0.2.2) adds six performance enhancements. None of them needs an embedding model, a vector database, or a background service, so RAM overhead stays close to zero:

Enhancement       What It Does
Temporal Decay    Recent memories rank higher (14-day half-life recency boost)
Noise Filtering   Blocks meta-questions and agent denials from storage and search
Entry Dedup       SHA-256 + SimHash near-duplicate detection prevents redundant writes
Bloom Filters     ~1 KB per shard; skips irrelevant shards entirely during query
SimHash           64-bit locality-sensitive hashing for fuzzy text matching without embeddings
Tiered Scoring    Facts rank above episodic, episodic above pad in search results
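
The Temporal Decay and Tiered Scoring rows can be illustrated with a small scoring function. This is a minimal sketch, not Aura's actual implementation: only the 14-day half-life and the fact > episodic > pad ordering come from the table above. The exponential form, the TIER_WEIGHT values, and the function names are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = 14.0  # half-life stated in the table above

# Hypothetical weights: only the ordering fact > episodic > pad is documented.
TIER_WEIGHT = {"fact": 3.0, "episodic": 2.0, "pad": 1.0}


def recency_boost(written_at: datetime, now: datetime) -> float:
    """Exponential decay: 1.0 for a brand-new entry, 0.5 after one half-life."""
    age_days = max((now - written_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def score(match: float, tier: str, written_at: datetime, now: datetime) -> float:
    """Combine a keyword-match score with tier weight and recency (illustrative)."""
    return match * TIER_WEIGHT[tier] * recency_boost(written_at, now)


now = datetime(2026, 3, 1, tzinfo=timezone.utc)
fresh = recency_boost(now, now)
two_weeks = recency_boost(now - timedelta(days=14), now)
four_weeks = recency_boost(now - timedelta(days=28), now)
```

Under these assumptions a memory loses half of its recency boost every 14 days, and at equal age and match quality a fact outranks an episodic entry, which in turn outranks a pad note.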

Upgrade: pip install --upgrade auralith-aura
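
To make the Entry Dedup and SimHash rows above more concrete, here is a rough sketch of 64-bit SimHash using only the standard library. It illustrates the general technique and is not Aura's code: the tokenizer, the use of SHA-256 as the per-token hash, and the max_distance threshold are all assumptions.

```python
import hashlib
import re


def exact_id(text: str) -> str:
    """SHA-256 of the text: catches byte-for-byte duplicates."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def simhash64(text: str) -> int:
    """64-bit SimHash over lowercase word tokens."""
    counts = [0] * 64
    for token in re.findall(r"\w+", text.lower()):
        # First 8 bytes of SHA-256 serve as a stable 64-bit token hash.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for bit in range(64):
            counts[bit] += 1 if (h >> bit) & 1 else -1
    fingerprint = 0
    for bit in range(64):
        if counts[bit] > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: str, b: str, max_distance: int = 3) -> bool:
    """Hypothetical threshold: treat small Hamming distances as duplicates."""
    return hamming(simhash64(a), simhash64(b)) <= max_distance
```

Because the fingerprint depends only on the set of word tokens, differences in case, punctuation, or word order do not change it, while the SHA-256 ID still distinguishes the exact strings.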

Data Provenance & Trust

Every memory entry stores explicit metadata — you always know what's in memory and where it came from:

Field        What It Tells You
source       Who wrote it: agent, user, or system
namespace    Which tier: pad, episodic, or fact
timestamp    Exact ISO 8601 time of the write
session_id   Which session created it
entry_id     Unique content hash for traceability

Nothing is inferred or synthesized. Memory contains only what was explicitly written via write(). No hidden embeddings, no derived data, no background processing.
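
As a way to picture the fields above, the sketch below models one memory entry as a plain record. It is a hypothetical illustration, not Aura's internal schema: the MemoryEntry class, the make_entry helper, and the choice of SHA-256 over the content for entry_id are assumptions. Only the field names and their meanings come from the table.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

NAMESPACES = {"pad", "episodic", "fact"}
SOURCES = {"agent", "user", "system"}


@dataclass(frozen=True)
class MemoryEntry:
    content: str
    source: str      # who wrote it
    namespace: str   # which tier
    timestamp: str   # ISO 8601 time of the write
    session_id: str  # which session created it
    entry_id: str    # content hash (SHA-256 is an assumption)


def make_entry(content: str, namespace: str, source: str, session_id: str,
               now: Optional[datetime] = None) -> MemoryEntry:
    """Build an entry with explicit provenance; nothing is inferred."""
    if namespace not in NAMESPACES:
        raise ValueError(f"unknown namespace: {namespace!r}")
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    written_at = now if now is not None else datetime.now(timezone.utc)
    return MemoryEntry(
        content=content,
        source=source,
        namespace=namespace,
        timestamp=written_at.isoformat(),
        session_id=session_id,
        entry_id=hashlib.sha256(content.encode("utf-8")).hexdigest(),
    )


entry = make_entry("User prefers dark mode", "fact", "agent", "session-1",
                   now=datetime(2026, 1, 1, tzinfo=timezone.utc))
```

Deriving entry_id from the content alone means that writing the same text twice yields the same ID, which is one plausible way to get the traceability the table describes.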

Full user control over memory:

memory.show_usage()                              # Inspect what's stored per tier
memory.query("topic")                            # See exactly what's in memory
memory.prune_shards(before_date="2026-01-01")    # Prune by date
memory.prune_shards(shard_ids=["specific_id"])   # Delete specific shards
# Or delete ~/.aura/memory/ to wipe everything

🤖 Agent Integrations

Aura works natively with the major AI agent platforms:

Platform       Repo               Use Case
OpenClaw       aura-openclaw      Persistent RAG + memory for always-on agents
Claude Code    aura-claude-code   Context-aware coding with /aura commands
OpenAI Codex   aura-codex         Knowledge-backed Codex agents
Gemini CLI     aura-gemini-cli    Gemini CLI extension for RAG

How It Works (Agent RAG Flow)

You: "Learn everything in my /docs/ folder"
  → Agent runs: aura compile ./docs/ --output knowledge.aura
  → Agent loads: AuraRAGLoader("knowledge.aura")
  → You: "What does the auth module do?"
  → Agent queries the .aura file and responds with cited answers

🌟 Key Features

Feature               Description
Universal Ingestion   Parses 60+ formats: PDF, DOCX, HTML, MD, CSV, code, and more
Agent Memory OS       3-tier memory (pad/episodic/fact) with instant writes
PII Masking           Automatically redacts emails, phones, SSNs before compilation
Instant RAG           Query any document by keyword or ID. LangChain + LlamaIndex wrappers
Quality Filtering     Skip low-quality content with configurable thresholds
Cross-Platform        macOS, Windows, and Linux
Secure by Design      No pickle. No arbitrary code execution. Safe to share.
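
To show what the PII Masking feature does conceptually, the sketch below redacts emails, US-style phone numbers, and SSNs with regular expressions. The patterns, the placeholder tokens, and the mask_pii name are assumptions for this example. Aura's own detection rules are not documented here and are likely more thorough.

```python
import re

# Illustrative patterns only; real-world PII detection needs broader coverage.
EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE = re.compile(
    r"(?<!\d)(?:\+?1[-. ]?)?\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}(?!\d)"
)


def mask_pii(text: str) -> str:
    """Replace emails, SSNs, and phone numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    text = SSN.sub("[SSN]", text)
    text = PHONE.sub("[PHONE]", text)
    return text


masked = mask_pii("Contact jane.doe@example.com or 555-123-4567; SSN 123-45-6789.")
```

Masking before compilation, as the --pii-mask flag is described to do, means the sensitive values never reach the archive in the first place.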

📁 Supported File Formats

Documents - PDF, DOCX, HTML, and more
  • .pdf, .docx, .doc, .rtf, .odt, .epub, .txt, .pages, .wpd
  • .html, .htm, .xml
  • .eml, .msg (emails)
  • .pptx, .ppt (presentations)
Data - Spreadsheets and structured data
  • .csv, .tsv
  • .xlsx, .xls
  • .parquet
  • .json, .jsonl
  • .yaml, .yml, .toml
Code - All major programming languages
  • Python: .py, .pyi, .ipynb
  • Web: .js, .ts, .jsx, .tsx, .css
  • Systems: .c, .cpp, .h, .hpp, .rs, .go, .java, .kt, .swift
  • Scripts: .sh, .bash, .zsh, .ps1, .bat
  • Backend: .sql, .php, .rb, .cs, .scala
  • Config: .ini, .cfg, .conf, .env, .dockerfile
Markup - Documentation formats
  • .md (Markdown)
  • .rst (reStructuredText)
  • .tex, .latex

🔧 CLI Reference

aura compile <input_directory> --output <file.aura> [options]

Options:
  --pii-mask           Mask PII (emails, phones, SSNs)
  --min-quality SCORE  Filter low-quality content (0.0-1.0)
  --domain DOMAIN      Tag with domain context
  --no-recursive       Don't search subdirectories
  --verbose, -v        Verbose output

Memory Management

aura memory list                        # List all memory shards
aura memory usage                       # Show storage by tier
aura memory prune --before 2026-01-01   # Remove old shards
aura memory prune --id <shard_id>       # Remove specific shard

Inspect an Archive

aura info knowledge.aura

📦 Aura Archive: knowledge.aura
   Datapoints: 1,234
   
   Sample datapoint:
     Tensors: ['raw_text']
     Source:  legal/contract_001.pdf

🔌 RAG Support

from aura.rag import AuraRAGLoader

loader = AuraRAGLoader("knowledge.aura")

# Text retrieval
text = loader.get_text_by_id("doc_001")

# Filter documents
pdf_docs = loader.filter_by_extension(".pdf")
legal_docs = loader.filter_by_source("legal/")

# Framework wrappers
langchain_docs = loader.to_langchain_documents()  # LangChain
llama_docs = loader.to_llama_index_documents()     # LlamaIndex
dict_list = loader.to_dict_list()                  # Universal

# Statistics
stats = loader.get_stats()

📐 File Format Specification

The .aura format is a secure, indexed binary archive:

[Datapoint 1][Datapoint 2]...[Datapoint N][Index][Footer]

Each Datapoint:
  [meta_length: 4 bytes, uint32]
  [tensor_length: 4 bytes, uint32]
  [metadata: msgpack bytes]
  [tensors: safetensors bytes]

Footer:
  [index_offset: 8 bytes, uint64]
  [magic: 4 bytes, 'AURA']

Security: Uses safetensors (not pickle) — safe to load untrusted files.
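
The layout above can be exercised with a short standard-library sketch that writes and reads the framing only. Several details are assumptions, since the specification above does not state them: little-endian byte order, the helper names, and an opaque placeholder for the index. The metadata and tensor payloads are treated as raw bytes rather than real msgpack or safetensors data.

```python
import struct

MAGIC = b"AURA"
HEADER = struct.Struct("<II")   # meta_length, tensor_length (uint32 each)
FOOTER = struct.Struct("<Q4s")  # index_offset (uint64), magic
# "<" (little-endian) is an assumption; the spec above gives sizes only.


def pack_datapoint(metadata: bytes, tensors: bytes) -> bytes:
    """Frame one datapoint: two length fields, then the two payloads."""
    return HEADER.pack(len(metadata), len(tensors)) + metadata + tensors


def read_footer(blob: bytes) -> int:
    """Validate the trailing magic and return index_offset."""
    if len(blob) < FOOTER.size:
        raise ValueError("too short to be an .aura archive")
    index_offset, magic = FOOTER.unpack(blob[-FOOTER.size:])
    if magic != MAGIC:
        raise ValueError("missing AURA magic")
    if index_offset > len(blob) - FOOTER.size:
        raise ValueError("index offset out of range")
    return index_offset


def read_datapoint(blob: bytes, offset: int):
    """Return (metadata, tensors, next_offset) for the datapoint at offset."""
    meta_len, tensor_len = HEADER.unpack_from(blob, offset)
    start = offset + HEADER.size
    meta = blob[start:start + meta_len]
    tensors = blob[start + meta_len:start + meta_len + tensor_len]
    return meta, tensors, start + meta_len + tensor_len


# Build a two-datapoint archive in memory with placeholder payloads.
body = pack_datapoint(b"meta-1", b"tensor-1") + pack_datapoint(b"m2", b"t2")
index = b"<index>"  # index encoding is not described above; opaque here
archive = body + index + FOOTER.pack(len(body), MAGIC)

index_offset = read_footer(archive)
meta1, tensors1, next_offset = read_datapoint(archive, 0)
meta2, tensors2, end_offset = read_datapoint(archive, next_offset)
```

Reading the footer first gives the index position without scanning the datapoints, which is presumably why the index and footer sit at the end of the file.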


💻 Runs Locally

Aura compiles entirely on your local machine — no cloud uploads, no external APIs, no telemetry.

  • Runs on your own hardware: any modern laptop or desktop
  • Fully offline — zero internet required after install
  • Cross-platform — macOS, Windows, Linux
  • Python 3.8+

Your documents never leave your hardware.


🚀 Scale Up with OMNI

Aura handles local compilation. For enterprise-scale training pipelines, model fine-tuning, and production-grade agent infrastructure — there's OMNI.

  • Cloud-scale data compilation & training pipelines
  • Supervised model fine-tuning with emphasis weighting
  • Production agent memory infrastructure
  • Team collaboration & enterprise compliance

Explore OMNI →


Made with ❤️ by Rta Labs
