The Universal Context Compiler for AI Agent Memory

Aura: The Universal Context Compiler

Compile any document into AI-ready knowledge bases with built-in agent memory.

Context is the new Compute.

Aura compiles messy, real-world files (PDFs, DOCX, HTML, code, spreadsheets — 60+ formats) into a single optimized binary (.aura) ready for RAG retrieval and AI agent memory.

One command. No JSONL scripting. No parsing pipelines.

pip install auralith-aura
aura compile ./my_data/ --output knowledge.aura

⚡ Quick Start

1. Install

pip install auralith-aura

# For full document support (PDFs, DOCX, etc.)
pip install 'auralith-aura[all]'

2. Compile

# Basic compilation
aura compile ./company_data/ --output knowledge.aura

# With PII masking (emails, phones, SSNs automatically redacted)
aura compile ./data/ --output knowledge.aura --pii-mask

# Filter low-quality content
aura compile ./data/ --output knowledge.aura --min-quality 0.3
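
To give a feel for what the --pii-mask step above does, here is a minimal regex sketch of redacting emails, phone numbers, and SSNs from text. It is an illustration only: the patterns, the mask_pii name, and the [EMAIL]/[PHONE]/[SSN] placeholders are assumptions, not Aura's actual rules or output format, and the phone pattern covers only simple US-style numbers.

```python
import re

# Hypothetical patterns for illustration; Aura's real masking rules are not documented here.
PII_PATTERNS = [
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[PHONE]", re.compile(r"\b\d{3}[ .-]\d{3}[ .-]\d{4}\b")),
    ("[EMAIL]", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
]


def mask_pii(text: str) -> str:
    """Replace each match of every pattern with its placeholder token."""
    for placeholder, pattern in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


if __name__ == "__main__":
    sample = "Mail jane.doe@example.com or call 555-123-4567; SSN 123-45-6789."
    print(mask_pii(sample))
```

Real-world PII detection usually needs more than three regexes (international phone formats, names, addresses), so treat this as a picture of the idea rather than a substitute for the flag.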

3. Use

For RAG (Knowledge Retrieval):

from aura.rag import AuraRAGLoader

loader = AuraRAGLoader("knowledge.aura")
text = loader.get_text_by_id("doc_001")

# Framework wrappers
langchain_docs = loader.to_langchain_documents()
llama_docs = loader.to_llama_index_documents()

For Agent Memory:

from aura.memory import AuraMemoryOS

memory = AuraMemoryOS()

# Write to memory tiers
memory.write("fact", "User prefers dark mode", source="agent")
memory.write("episodic", "Discussed deployment strategy")
memory.write("pad", "TODO: check auth module")

# Search memory
results = memory.query("user preferences")

# End session (flushes to durable shards)
memory.end_session()

🧠 Agent Memory

Aura includes a 3-Tier Memory OS — a persistent memory architecture for AI agents:

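| Tier      | Purpose                                    | Lifecycle     |
|-----------|--------------------------------------------|---------------|
| /pad      | Working notes, scratch space               | Transient     |
| /episodic | Session transcripts, conversation history  | Auto-archived |
| /fact     | Verified facts, user preferences           | Persistent    |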
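
The three lifecycles in the table can be pictured with a toy in-memory model. This is a sketch for intuition only and is not how AuraMemoryOS is implemented: the ToyMemory class, its attributes, and the substring search are all assumptions made for the example.

```python
from dataclasses import dataclass, field

TIERS = ("pad", "episodic", "fact")


@dataclass
class ToyMemory:
    """Illustrative stand-in for a 3-tier memory; not Aura's implementation."""

    live: dict = field(default_factory=lambda: {tier: [] for tier in TIERS})
    archive: list = field(default_factory=list)

    def write(self, tier: str, text: str) -> None:
        if tier not in self.live:
            raise ValueError(f"unknown tier: {tier!r}")
        self.live[tier].append(text)

    def query(self, term: str) -> list:
        """Case-insensitive substring search across all live tiers."""
        term = term.lower()
        return [
            (tier, text)
            for tier in TIERS
            for text in self.live[tier]
            if term in text.lower()
        ]

    def end_session(self) -> None:
        self.live["pad"].clear()                     # transient: dropped
        self.archive.extend(self.live["episodic"])   # auto-archived: moved out of the live set
        self.live["episodic"].clear()
        # "fact" entries are persistent, so they stay in place


if __name__ == "__main__":
    memory = ToyMemory()
    memory.write("fact", "User prefers dark mode")
    memory.write("pad", "TODO: check auth module")
    memory.end_session()
    print(memory.live)
```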

The Memory OS is included free when you install from PyPI (pip install auralith-aura).

# CLI memory management
aura memory list       # View all memory shards
aura memory usage      # Storage usage by tier
aura memory prune --before 2026-01-01  # Clean up old memories

🤖 Agent Integrations

Aura integrates with the following AI agent platforms:

| Platform     | Repo             | Use Case                                     |
|--------------|------------------|----------------------------------------------|
| OpenClaw     | aura-openclaw    | Persistent RAG + memory for always-on agents |
| Claude Code  | aura-claude-code | Context-aware coding with /aura commands     |
| OpenAI Codex | aura-codex       | Knowledge-backed Codex agents                |
| Gemini CLI   | aura-gemini-cli  | Gemini CLI extension for RAG                 |

How It Works (Agent RAG Flow)

You: "Learn everything in my /docs/ folder"
  → Agent runs: aura compile ./docs/ --output knowledge.aura
  → Agent loads: AuraRAGLoader("knowledge.aura")
  → You: "What does the auth module do?"
  → Agent queries the .aura file and responds with cited answers
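
The final step of the flow above (query, then answer with citations) might look roughly like the sketch below. It uses a plain list of dicts in place of loaded documents; the "text" and "source" keys, the word-overlap scoring, and the retrieve and cite helpers are assumptions for illustration, not the integration's actual retrieval logic.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(question: str, docs: list, top_k: int = 2) -> list:
    """Rank docs by how many question words they share; drop docs with no overlap."""
    question_words = tokenize(question)
    scored = []
    for doc in docs:
        overlap = len(question_words & tokenize(doc["text"]))
        if overlap:
            scored.append((overlap, doc))
    # Stable sort: ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def cite(docs: list) -> str:
    """Format retrieved docs as numbered citations an agent can quote."""
    return "\n".join(
        f'[{i}] {doc["source"]}: {doc["text"]}' for i, doc in enumerate(docs, 1)
    )


if __name__ == "__main__":
    docs = [
        {"source": "docs/auth.md", "text": "The auth module validates session tokens."},
        {"source": "docs/billing.md", "text": "Billing runs nightly."},
    ]
    print(cite(retrieve("What does the auth module do?", docs)))
```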

🌟 Key Features

| Feature             | Description                                                                  |
|---------------------|------------------------------------------------------------------------------|
| Universal Ingestion | Parses 60+ formats: PDF, DOCX, HTML, MD, CSV, code, and more                 |
| Agent Memory OS     | 3-tier memory (pad/episodic/fact) with instant writes                        |
| PII Masking         | Automatically redacts emails, phones, SSNs before compilation                |
| Instant RAG         | Query any document by keyword or ID. LangChain + LlamaIndex wrappers         |
| Quality Filtering   | Skip low-quality content with configurable thresholds                        |
| Cross-Platform      | macOS, Windows, and Linux                                                    |
| Secure by Design    | No pickle. No arbitrary code execution. Safe to share.                       |

📁 Supported File Formats

Documents - PDF, DOCX, HTML, and more
  • .pdf, .docx, .doc, .rtf, .odt, .epub, .txt, .pages, .wpd
  • .html, .htm, .xml
  • .eml, .msg (emails)
  • .pptx, .ppt (presentations)
Data - Spreadsheets and structured data
  • .csv, .tsv
  • .xlsx, .xls
  • .parquet
  • .json, .jsonl
  • .yaml, .yml, .toml
Code - All major programming languages
  • Python: .py, .pyi, .ipynb
  • Web: .js, .ts, .jsx, .tsx, .css
  • Systems: .c, .cpp, .h, .hpp, .rs, .go, .java, .kt, .swift
  • Scripts: .sh, .bash, .zsh, .ps1, .bat
  • Backend: .sql, .php, .rb, .cs, .scala
  • Config: .ini, .cfg, .conf, .env, .dockerfile
Markup - Documentation formats
  • .md (Markdown)
  • .rst (reStructuredText)
  • .tex, .latex

🔧 CLI Reference

aura compile <input_directory> --output <file.aura> [options]

Options:
  --pii-mask           Mask PII (emails, phones, SSNs)
  --min-quality SCORE  Filter low-quality content (0.0-1.0)
  --domain DOMAIN      Tag with domain context
  --no-recursive       Don't search subdirectories
  --verbose, -v        Verbose output
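
When driving the CLI from a script, it can help to assemble the argument list in one place. The sketch below builds an argv list using only the flags listed above; the build_compile_command helper and its parameter names are hypothetical and are not part of the Aura package.

```python
def build_compile_command(
    input_dir,
    output,
    pii_mask=False,
    min_quality=None,
    domain=None,
    recursive=True,
    verbose=False,
):
    """Return an argv list for `aura compile` using the documented options."""
    cmd = ["aura", "compile", str(input_dir), "--output", str(output)]
    if pii_mask:
        cmd.append("--pii-mask")
    if min_quality is not None:
        # The option is documented as a score in the range 0.0-1.0.
        if not 0.0 <= min_quality <= 1.0:
            raise ValueError("min_quality must be between 0.0 and 1.0")
        cmd += ["--min-quality", str(min_quality)]
    if domain:
        cmd += ["--domain", domain]
    if not recursive:
        cmd.append("--no-recursive")
    if verbose:
        cmd.append("--verbose")
    return cmd


if __name__ == "__main__":
    print(build_compile_command("./data/", "knowledge.aura", pii_mask=True))
```

The returned list can be passed to subprocess.run(cmd, check=True); passing a list rather than a shell string avoids quoting problems with paths that contain spaces.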

Memory Management

aura memory list                        # List all memory shards
aura memory usage                       # Show storage by tier
aura memory prune --before 2026-01-01   # Remove old shards
aura memory prune --id <shard_id>       # Remove specific shard

Inspect an Archive

aura info knowledge.aura

📦 Aura Archive: knowledge.aura
   Datapoints: 1,234
   
   Sample datapoint:
     Tensors: ['raw_text']
     Source:  legal/contract_001.pdf

🔌 RAG Support

from aura.rag import AuraRAGLoader

loader = AuraRAGLoader("knowledge.aura")

# Text retrieval
text = loader.get_text_by_id("doc_001")

# Filter documents
pdf_docs = loader.filter_by_extension(".pdf")
legal_docs = loader.filter_by_source("legal/")

# Framework wrappers
langchain_docs = loader.to_langchain_documents()  # LangChain
llama_docs = loader.to_llama_index_documents()     # LlamaIndex
dict_list = loader.to_dict_list()                  # Universal

# Statistics
stats = loader.get_stats()

📐 File Format Specification

The .aura format is a secure, indexed binary archive:

[Datapoint 1][Datapoint 2]...[Datapoint N][Index][Footer]

Each Datapoint:
  [meta_length: 4 bytes, uint32]
  [tensor_length: 4 bytes, uint32]
  [metadata: msgpack bytes]
  [tensors: safetensors bytes]

Footer:
  [index_offset: 8 bytes, uint64]
  [magic: 4 bytes, 'AURA']

Security: Uses safetensors (not pickle) — safe to load untrusted files.
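
The layout above can be walked with Python's struct module. The sketch below is a reading of the spec under several assumptions that the spec does not state: integers are little-endian, index_offset is an absolute byte offset from the start of the file, and datapoints begin at offset 0. The index is treated as opaque bytes because its encoding is not described here, and metadata and tensors are returned undecoded. The build_archive helper exists only to produce a synthetic blob for the demonstration.

```python
import struct

MAGIC = b"AURA"
FOOTER = struct.Struct("<Q4s")  # index_offset (uint64) + magic; little-endian is assumed
HEADER = struct.Struct("<II")   # meta_length, tensor_length (uint32 each)


def read_footer(blob: bytes) -> int:
    """Check the trailing magic bytes and return index_offset."""
    if len(blob) < FOOTER.size:
        raise ValueError("too short to contain a footer")
    index_offset, magic = FOOTER.unpack_from(blob, len(blob) - FOOTER.size)
    if magic != MAGIC:
        raise ValueError("missing 'AURA' magic")
    if index_offset > len(blob) - FOOTER.size:
        raise ValueError("index offset points past the footer")
    return index_offset


def iter_datapoints(blob: bytes):
    """Yield (metadata_bytes, tensor_bytes), reading sequentially up to the index."""
    end = read_footer(blob)
    pos = 0
    while pos < end:
        if pos + HEADER.size > end:
            raise ValueError("truncated datapoint header")
        meta_len, tensor_len = HEADER.unpack_from(blob, pos)
        pos += HEADER.size
        if pos + meta_len + tensor_len > end:
            raise ValueError("datapoint overruns the index")
        meta = blob[pos:pos + meta_len]
        pos += meta_len
        tensors = blob[pos:pos + tensor_len]
        pos += tensor_len
        yield meta, tensors


def build_archive(datapoints, index: bytes = b"") -> bytes:
    """Demo helper: pack (metadata, tensors) byte pairs into the assumed layout."""
    body = b"".join(HEADER.pack(len(m), len(t)) + m + t for m, t in datapoints)
    return body + index + FOOTER.pack(len(body), MAGIC)


if __name__ == "__main__":
    archive = build_archive([(b"meta-1", b"tensor-1")], index=b"IDX")
    print(list(iter_datapoints(archive)))
```

In a real reader the metadata bytes would then be decoded with a msgpack library and the tensor bytes with safetensors, as the spec describes.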


💻 Runs Locally

Aura compiles entirely on your local machine — no cloud uploads, no external APIs, no telemetry.

  • Runs on your local hardware — any modern laptop or desktop, your setup, your choice
  • Fully offline — zero internet required after install
  • Cross-platform — macOS, Windows, Linux
  • Python 3.8+

Your documents never leave your hardware.


🚀 Scale Up with OMNI

Aura handles local compilation. For enterprise-scale training pipelines, model fine-tuning, and production-grade agent infrastructure — there's OMNI.

  • Cloud-scale data compilation & training pipelines
  • Supervised model fine-tuning with emphasis weighting
  • Production agent memory infrastructure
  • Team collaboration & enterprise compliance

Explore OMNI →


📜 License


🔗 Links


Made with ❤️ by Auralith Inc.
