The Universal Context Compiler for AI Agent Memory


Aura: The Universal Context Compiler

Compile any document into AI-ready knowledge bases with built-in agent memory.


Quick Start | Agent Memory | Integrations | RAG Support | Website


Context is the new Compute.

Aura compiles messy, real-world files (PDFs, DOCX, HTML, code, spreadsheets — 60+ formats) into a single optimized binary (.aura) ready for RAG retrieval and AI agent memory.

One command. No JSONL scripting. No parsing pipelines.

pip install auralith-aura
aura compile ./my_data/ --output knowledge.aura

⚡ Quick Start

1. Install

pip install auralith-aura

# For full document support (PDFs, DOCX, etc.)
pip install 'auralith-aura[all]'

2. Compile

# Basic compilation
aura compile ./company_data/ --output knowledge.aura

# With PII masking (emails, phones, SSNs automatically redacted)
aura compile ./data/ --output knowledge.aura --pii-mask

# Filter low-quality content
aura compile ./data/ --output knowledge.aura --min-quality 0.3

3. Use

For RAG (Knowledge Retrieval):

from aura.rag import AuraRAGLoader

loader = AuraRAGLoader("knowledge.aura")
text = loader.get_text_by_id("doc_001")

# Framework wrappers
langchain_docs = loader.to_langchain_documents()
llama_docs = loader.to_llama_index_documents()

For Agent Memory:

from aura.memory import AuraMemoryOS

memory = AuraMemoryOS()

# Write to memory tiers
memory.write("fact", "User prefers dark mode", source="agent")
memory.write("episodic", "Discussed deployment strategy")
memory.write("pad", "TODO: check auth module")

# Search memory
results = memory.query("user preferences")

# End session (flushes to durable shards)
memory.end_session()

🧠 Agent Memory

Aura includes a 3-Tier Memory OS with a Two-Speed Write-Ahead Log:

| Tier | Purpose | Lifecycle |
|------|---------|-----------|
| /pad | Working notes, scratch space | Transient |
| /episodic | Session transcripts, conversation history | Auto-archived |
| /fact | Verified facts, user preferences | Persistent |

Two-Speed WAL:

  • Speed 1 (~0.001s): Instant JSONL append — agents are never blocked
  • Speed 2 (background): Compiles to durable .aura shards at session end

# CLI memory management
aura memory list       # View all memory shards
aura memory usage      # Storage usage by tier
aura memory prune --before 2026-01-01  # Clean up old memories
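
To make the two-speed idea concrete, here is a minimal sketch in plain Python. It is not Aura's implementation: the `TwoSpeedLog` class, the file names, and the JSON shard layout are hypothetical, and it assumes that "transient" means /pad entries are dropped when the session ends.

```python
import json
import os
import tempfile
import time


class TwoSpeedLog:
    """Hypothetical two-speed write-ahead log (illustration only, not Aura's code)."""

    def __init__(self, directory):
        self.wal_path = os.path.join(directory, "session.wal.jsonl")
        self.shard_path = os.path.join(directory, "session.shard.json")

    def write(self, tier, text):
        # Speed 1: append a single JSON line and return immediately.
        record = {"tier": tier, "text": text, "ts": time.time()}
        with open(self.wal_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def end_session(self):
        # Speed 2: fold the append-only log into one durable shard, grouped by tier.
        shard = {}
        if os.path.exists(self.wal_path):
            with open(self.wal_path, encoding="utf-8") as f:
                for line in f:
                    record = json.loads(line)
                    if record["tier"] == "pad":
                        continue  # assumption: transient notes are not persisted
                    shard.setdefault(record["tier"], []).append(record["text"])
        # Write to a temp file, then rename, so readers never see a partial shard.
        tmp_path = self.shard_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(shard, f)
        os.replace(tmp_path, self.shard_path)
        # Only discard the log once the shard is safely on disk.
        if os.path.exists(self.wal_path):
            os.remove(self.wal_path)
        return shard


with tempfile.TemporaryDirectory() as workdir:
    log = TwoSpeedLog(workdir)
    log.write("fact", "User prefers dark mode")
    log.write("pad", "TODO: check auth module")
    shard = log.end_session()
    wal_exists = os.path.exists(log.wal_path)
```

The append path does no parsing or indexing, which is what keeps it fast. All the expensive work is deferred to `end_session()`.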

🤖 Agent Integrations

Aura integrates with the following AI agent platforms:

| Platform | Repo | Use Case |
|----------|------|----------|
| OpenClaw | aura-openclaw | Persistent RAG + memory for always-on agents |
| Claude Code | aura-claude-code | Context-aware coding with /aura commands |
| OpenAI Codex | aura-codex | Knowledge-backed Codex agents |
| Gemini CLI | aura-gemini-cli | Gemini CLI extension for RAG |

How It Works (Agent RAG Flow)

You: "Learn everything in my /docs/ folder"
  → Agent runs: aura compile ./docs/ --output knowledge.aura
  → Agent loads: AuraRAGLoader("knowledge.aura")
  → You: "What does the auth module do?"
  → Agent queries the .aura file and responds with cited answers

🌟 Key Features

| Feature | Description |
|---------|-------------|
| Universal Ingestion | Parses 60+ formats: PDF, DOCX, HTML, MD, CSV, code, and more |
| Agent Memory OS | 3-tier memory (pad/episodic/fact) with instant writes |
| PII Masking | Automatically redacts emails, phones, SSNs before compilation |
| Instant RAG | Query any document by keyword or ID. LangChain + LlamaIndex wrappers |
| Quality Filtering | Skip low-quality content with configurable thresholds |
| Cross-Platform | macOS, Windows, and Linux |
| Secure by Design | No pickle. No arbitrary code execution. Safe to share. |
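
For a rough sense of what masking emails, phones, and SSNs involves, the sketch below uses simple regular expressions. It is illustrative only and does not reflect how Aura detects PII. The patterns, the placeholder tokens, and the `mask_pii` helper are assumptions, and the phone and SSN patterns cover US-style formats only.

```python
import re

# Hypothetical patterns for illustration; real PII detection needs broader coverage.
# SSNs are masked before phone numbers so the two digit patterns cannot interfere.
PII_PATTERNS = [
    ("[EMAIL]", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[PHONE]", re.compile(
        r"(?<!\d)(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)"
    )),
]


def mask_pii(text):
    """Replace each matched span with its placeholder token."""
    for placeholder, pattern in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


masked = mask_pii("Reach Ana at ana@example.com or 555-123-4567. SSN: 123-45-6789.")
print(masked)  # Reach Ana at [EMAIL] or [PHONE]. SSN: [SSN].
```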

📁 Supported File Formats

Documents - PDF, DOCX, HTML, and more
  • .pdf, .docx, .doc, .rtf, .odt, .epub, .txt, .pages, .wpd
  • .html, .htm, .xml
  • .eml, .msg (emails)
  • .pptx, .ppt (presentations)
Data - Spreadsheets and structured data
  • .csv, .tsv
  • .xlsx, .xls
  • .parquet
  • .json, .jsonl
  • .yaml, .yml, .toml
Code - All major programming languages
  • Python: .py, .pyi, .ipynb
  • Web: .js, .ts, .jsx, .tsx, .css
  • Systems: .c, .cpp, .h, .hpp, .rs, .go, .java, .kt, .swift
  • Scripts: .sh, .bash, .zsh, .ps1, .bat
  • Backend: .sql, .php, .rb, .cs, .scala
  • Config: .ini, .cfg, .conf, .env, .dockerfile
Markup - Documentation formats
  • .md (Markdown)
  • .rst (reStructuredText)
  • .tex, .latex

🔧 CLI Reference

aura compile <input_directory> --output <file.aura> [options]

Options:
  --pii-mask           Mask PII (emails, phones, SSNs)
  --min-quality SCORE  Filter low-quality content (0.0-1.0)
  --domain DOMAIN      Tag with domain context
  --no-recursive       Don't search subdirectories
  --verbose, -v        Verbose output
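
When an agent or script drives the CLI, it can help to assemble the argument list in one place. The helper below is a hypothetical convenience wrapper, not part of Aura. It uses only the flags listed above and assumes the `aura` executable is on the PATH.

```python
import shlex


def build_compile_command(input_dir, output, pii_mask=False, min_quality=None,
                          domain=None, recursive=True, verbose=False):
    """Build an argument list for `aura compile` (hypothetical helper)."""
    cmd = ["aura", "compile", input_dir, "--output", output]
    if pii_mask:
        cmd.append("--pii-mask")
    if min_quality is not None:
        # The CLI reference documents the score range as 0.0-1.0.
        if not 0.0 <= min_quality <= 1.0:
            raise ValueError("min_quality must be between 0.0 and 1.0")
        cmd += ["--min-quality", str(min_quality)]
    if domain:
        cmd += ["--domain", domain]
    if not recursive:
        cmd.append("--no-recursive")
    if verbose:
        cmd.append("--verbose")
    return cmd


cmd = build_compile_command("./data/", "knowledge.aura", pii_mask=True, min_quality=0.3)
print(shlex.join(cmd))
# aura compile ./data/ --output knowledge.aura --pii-mask --min-quality 0.3
```

Passing the list to `subprocess.run(cmd, check=True)` runs the compile without going through a shell, so paths with spaces need no extra quoting.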

Memory Management

aura memory list                        # List all memory shards
aura memory usage                       # Show storage by tier
aura memory prune --before 2026-01-01   # Remove old shards
aura memory prune --id <shard_id>       # Remove specific shard

Inspect an Archive

aura info knowledge.aura

📦 Aura Archive: knowledge.aura
   Datapoints: 1,234
   
   Sample datapoint:
     Tensors: ['raw_text']
     Source:  legal/contract_001.pdf

🔌 RAG Support

from aura.rag import AuraRAGLoader

loader = AuraRAGLoader("knowledge.aura")

# Text retrieval
text = loader.get_text_by_id("doc_001")

# Filter documents
pdf_docs = loader.filter_by_extension(".pdf")
legal_docs = loader.filter_by_source("legal/")

# Framework wrappers
langchain_docs = loader.to_langchain_documents()  # LangChain
llama_docs = loader.to_llama_index_documents()     # LlamaIndex
dict_list = loader.to_dict_list()                  # Universal

# Statistics
stats = loader.get_stats()
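
If you export records with `to_dict_list()`, a simple keyword ranking can be layered on top. The sketch below is not Aura's retrieval logic. The `keyword_search` helper is hypothetical, and the `"id"` and `"text"` keys in the sample records are assumptions, so check the actual keys in your own export before relying on them.

```python
import re


def keyword_search(records, query, text_key="text", limit=5):
    """Rank records by how often the query terms occur in their text."""
    terms = re.findall(r"\w+", query.lower())
    scored = []
    for record in records:
        words = re.findall(r"\w+", record.get(text_key, "").lower())
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scored.append((score, record))
    # The sort is stable, so records with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:limit]]


# Stand-in for loader.to_dict_list(); the key names are assumed.
records = [
    {"id": "doc_001", "text": "The auth module issues and validates session tokens."},
    {"id": "doc_002", "text": "Deployment notes for the staging cluster."},
    {"id": "doc_003", "text": "Auth tokens expire after one hour; the auth service refreshes them."},
]
hits = keyword_search(records, "auth tokens")
print([record["id"] for record in hits])  # ['doc_003', 'doc_001']
```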

📐 File Format Specification

The .aura format is a secure, indexed binary archive:

[Datapoint 1][Datapoint 2]...[Datapoint N][Index][Footer]

Each Datapoint:
  [meta_length: 4 bytes, uint32]
  [tensor_length: 4 bytes, uint32]
  [metadata: msgpack bytes]
  [tensors: safetensors bytes]

Footer:
  [index_offset: 8 bytes, uint64]
  [magic: 4 bytes, 'AURA']

Security: Uses safetensors (not pickle) — safe to load untrusted files.
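
The layout above can be walked with Python's `struct` module. This is a sketch under stated assumptions, not a reference reader. Little-endian byte order is assumed because the spec above does not state it. The metadata and tensor payloads are placeholder bytes rather than real msgpack or safetensors data. The index is left empty because its encoding is not described here.

```python
import struct

MAGIC = b"AURA"
HEADER = struct.Struct("<II")   # meta_length, tensor_length (byte order assumed)
FOOTER = struct.Struct("<Q4s")  # index_offset, magic (byte order assumed)


def pack_datapoint(metadata, tensors):
    """Serialize one datapoint: two uint32 lengths followed by the payloads."""
    return HEADER.pack(len(metadata), len(tensors)) + metadata + tensors


def read_footer(blob):
    """Return index_offset after checking the trailing magic bytes."""
    index_offset, magic = FOOTER.unpack(blob[-FOOTER.size:])
    if magic != MAGIC:
        raise ValueError("not an .aura archive: bad magic")
    return index_offset


def read_datapoint(blob, offset):
    """Return (metadata, tensors, next_offset) for the datapoint at offset."""
    meta_len, tensor_len = HEADER.unpack_from(blob, offset)
    start = offset + HEADER.size
    metadata = blob[start:start + meta_len]
    tensors = blob[start + meta_len:start + meta_len + tensor_len]
    return metadata, tensors, start + meta_len + tensor_len


# Build a tiny synthetic archive in memory with placeholder payloads.
body = pack_datapoint(b"meta-1", b"tensor-1") + pack_datapoint(b"meta-2", b"t2")
index = b""  # index encoding is not specified above, so it is left empty here
blob = body + index + FOOTER.pack(len(body), MAGIC)

# Walk the datapoints sequentially until the index begins.
index_offset = read_footer(blob)
points, offset = [], 0
while offset < index_offset:
    metadata, tensors, offset = read_datapoint(blob, offset)
    points.append((metadata, tensors))
```

Because the footer sits at a fixed distance from the end of the file, a reader can find the index with a single seek and never has to scan the datapoints.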


💻 System Requirements

| Use Case | Files | RAM | Time |
|----------|-------|-----|------|
| Personal docs | 50–500 | ~2 GB | < 1 min |
| Team knowledge base | 500–5,000 | ~4 GB | 5–15 min |
| Enterprise corpus | 5,000–50,000 | ~8 GB | 30–60 min |

Platforms: macOS, Windows, Linux
Python: 3.8+


📜 License

Apache License 2.0



Made with ❤️ by Auralith Inc.
