LLM-powered research knowledge base — compile raw documents into a living wiki

Aura Research

Turn raw research into a living wiki your LLM agents can read, query, and enhance.


Aura Research is an LLM-powered research knowledge base that compiles raw documents into a structured markdown wiki. Drop your papers, articles, data, and notes into a folder — the LLM reads everything, builds a navigable wiki with summaries and concept articles, and then answers your research questions using that compiled knowledge.

Built on Aura Core for document compilation (60+ formats) and the three-tier Memory OS for persistent agent memory across sessions.

Quick Start

# Install
pip install 'aura-research[openai]'

# Set your API key
export OPENAI_API_KEY=sk-...

# Initialize a project
research init my-project
cd my-project

# Drop your documents in raw/
cp ~/papers/*.pdf raw/

# Ingest and compile
research ingest raw/
research compile

# Ask questions
research query "what are the key findings across all papers?"

# Search the wiki
research search "attention mechanism"

# Check wiki health
research lint

Agent-Native Mode (no API key)

If you're already using an AI coding agent (Claude Code, Codex, Gemini CLI, Cursor, etc.), you don't need an API key. The agent IS the LLM:

# In your AI agent's terminal:
research init my-project
# Copy documents to raw/
research ingest raw/

# The agent reads the docs and writes wiki articles directly
# (it's an LLM — it doesn't need to call another one)

research build          # compile wiki/ → wiki.aura
research search "topic" # search the wiki
research memory show    # see what the agent remembers

The API mode (research compile, research query) exists for headless/batch use when no agent is at the keyboard.

How It Works

Raw Documents  ──→  Aura Core (.aura)  ──→  LLM Compiler  ──→  Markdown Wiki
  papers/              compiled &              generates          wiki/
  articles/            indexed                 summaries,         ├── _index.md
  data/                (60+ formats)           concepts,          ├── concepts/
  code/                                        backlinks          ├── sources/
                                                                  └── queries/
                              ↕
                        Memory OS
                   /pad  /episodic  /fact
                   (persistent agent memory)
  1. Ingest — Compile your raw documents into a searchable .aura archive using Aura Core
  2. Compile — LLM reads all sources and generates a structured wiki: per-source summaries, cross-cutting concept articles, master index, and executive summary
  3. Query — Ask questions against the wiki. The LLM uses wiki context + Memory OS facts + optional web search to give you thorough, cited answers
  4. Remember — Memory OS automatically stores key findings (/fact), session logs (/episodic), and working notes (/pad) — so the agent never starts cold

Commands

| Command | Description |
| --- | --- |
| research init [dir] | Initialize a new project |
| research ingest <dir> | Ingest raw documents into .aura archive |
| research ingest <dir> --watch | Watch directory and auto-re-ingest on changes |
| research compile | Compile wiki using LLM API (needs API key) |
| research compile --full | Full recompile (ignore cache) |
| research build | Build wiki.aura from wiki/ markdown (no LLM needed) |
| research query "..." | Ask a research question (needs API key) |
| research query "..." --save | Ask and save the answer to wiki/queries/ |
| research query "..." --no-web | Ask without web search |
| research search "..." | Keyword search across wiki articles |
| research lint | Run wiki health checks |
| research lint --ai | Health checks + AI-powered analysis |
| research status | Show knowledge base statistics |
| research memory show | Full overview of all 3 memory tiers |
| research memory show --tier fact | Overview filtered to one tier |
| research memory usage | Show Memory OS storage |
| research memory query "..." | Search agent memory |
| research memory write <tier> "..." | Manually write to memory (pad/episodic/fact) |
| research memory list | List memory shards |
| research memory prune --before DATE | Prune old memory entries |

LLM Providers

Supports OpenAI, Anthropic, and Google Gemini. Install the one you prefer:

pip install 'aura-research[openai]'       # OpenAI models (default)
pip install 'aura-research[anthropic]'    # Anthropic models
pip install 'aura-research[gemini]'       # Google Gemini models
pip install 'aura-research[all]'          # Everything

Configure via environment variables or research.yaml:

llm:
  provider: openai          # openai, anthropic, or gemini
  model: gpt-5.4-instant    # override default model
  temperature: 0.3

memory:
  enabled: true
  auto_write: true           # agent writes to memory automatically

web_search:
  enabled: true
  max_results: 5

watch:
  enabled: false
  interval: 5               # seconds between checks
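On the environment-variable side, API keys are read from your shell rather than from research.yaml. The Quick Start confirms OPENAI_API_KEY; the variable names for the other providers below are the providers' conventional names, assumed here rather than documented:

```shell
# OpenAI (shown in Quick Start)
export OPENAI_API_KEY=sk-...

# Conventional key variables for the other providers (assumed names)
export ANTHROPIC_API_KEY=...
export GOOGLE_API_KEY=...
```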

Memory OS

Aura's three-tier Memory OS v2.1 gives the agent persistent memory across sessions — so it never starts cold:

| Tier | What's Stored | Persistence |
| --- | --- | --- |
| /pad | Working notes, draft observations | Transient — scratch space |
| /episodic | Session logs, what was compiled/queried | Auto-archived |
| /fact | Key findings, verified observations | Persistent — survives indefinitely |

How It Operates

Memory OS works both autonomously and manually:

  • Autonomous: During research compile, the LLM auto-extracts key facts and writes them to /fact. After every compile or query session, an episodic log is written to /episodic. Controlled by auto_write: true in config.
  • Manual: Write directly with research memory write <tier> "content" whenever you (or the agent) want to persist something.

v2.1 Features

| Feature | What It Does |
| --- | --- |
| Entry deduplication | Prevents writing the same fact twice (SimHash fuzzy matching) |
| Temporal decay | Recent memories score higher in queries — older context naturally fades |
| Bloom filters | Skip irrelevant shards during search — fast even with thousands of entries |
| Append-only | Old entries are never overwritten — new ones are added alongside them |
| Tiered priority | Facts > episodic > pad when returning query results |
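Memory OS internals aren't published here, but the first two features are standard techniques that can be sketched briefly: SimHash reduces a text to a 64-bit fingerprint whose Hamming distance tracks textual similarity, and temporal decay multiplies a relevance score by an exponential of the entry's age. A minimal illustration, assuming nothing about Aura's actual API (all names below are mine):

```python
import hashlib
import time


def simhash(text: str, bits: int = 64) -> int:
    """64-bit SimHash over whitespace tokens: each token votes on each bit."""
    votes = [0] * bits
    for token in text.lower().split():
        h = int.from_bytes(hashlib.md5(token.encode()).digest()[:8], "big")
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if votes[i] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def is_duplicate(new_text: str, existing_hashes: list[int], threshold: int = 3) -> bool:
    """Fuzzy dedup: near-identical texts land within a few bits of each other."""
    h = simhash(new_text)
    return any(hamming(h, e) <= threshold for e in existing_hashes)


def decayed_score(base_score: float, written_at: float, half_life_days: float = 30.0) -> float:
    """Temporal decay: halve an entry's score every half_life_days."""
    age_days = (time.time() - written_at) / 86400
    return base_score * 0.5 ** (age_days / half_life_days)
```

With this scheme a re-worded fact still collides with the original (small Hamming distance), while a month-old entry contributes half the weight of one written today.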

Examples

# Write a verified fact
research memory write fact "The model achieves 94.2% accuracy on the test set"

# Log what you did this session
research memory write episodic "Analyzed training curves, found overfitting at epoch 12"

# Jot a working note
research memory write pad "TODO: re-run experiment with lower learning rate"

# Search memory by keyword
research memory query "accuracy"

# Full overview — see everything across all 3 tiers
research memory show

# Filter to just facts
research memory show --tier fact

# Storage usage
research memory usage

Web Search

During research query, the agent can search the web to supplement wiki answers with current information. This is enabled by default and uses DuckDuckGo (no API key required).

pip install 'aura-research[search]'

# Query with web search (default)
research query "latest advances in attention mechanisms"

# Query without web search
research query "what does our data show" --no-web

Watch Mode

Auto-detect new files and re-ingest:

pip install 'aura-research[watch]'

# Watch for changes (uses watchdog if installed, falls back to polling)
research ingest ./papers --watch

Wiki Output

The compiled wiki lives in two places:

wiki.aura — The primary artifact. An .aura archive containing all wiki articles, optimized for agent RAG retrieval. Token-efficient — agents read only what's relevant.

wiki/ — Markdown export for human browsing. Open in Obsidian, VS Code, GitHub, or any markdown viewer:

.research/
├── knowledge.aura      ← Raw ingested documents
└── wiki.aura           ← Compiled wiki (agent reads from here)

wiki/
├── _index.md           ← Master index with links to all articles
├── _summary.md         ← Executive summary of the knowledge base
├── concepts/           ← Cross-cutting concept articles
│   ├── attention.md
│   ├── tokenization.md
│   └── ...
├── sources/            ← Per-source summary articles
│   ├── vaswani2017.md
│   ├── devlin2019.md
│   └── ...
└── queries/            ← Saved Q&A responses
    └── ...

After editing wiki articles (or having the agent write them), run research build to recompile wiki.aura.

License

Apache License 2.0 — see LICENSE.

Download files

Source distribution: aura_research-0.1.1.tar.gz (28.6 kB)
Built distribution: aura_research-0.1.1-py3-none-any.whl (33.2 kB)

Both files were uploaded via twine/6.2.0 (CPython 3.10.0), without Trusted Publishing.

Hashes for aura_research-0.1.1.tar.gz:
  SHA256: 05ed8ff3e7f42ee9529662a95ba68b626850d1b0a4ef18f998cb41d525eea214
  MD5: 2e0af61a9946e7626082c10675f86081
  BLAKE2b-256: fb647ea4afdab09addc5f65ce283cf36cbea3fccd866de32b2c86a867e1810f3

Hashes for aura_research-0.1.1-py3-none-any.whl:
  SHA256: cbb224f02963d99817b6a51883b28aca1d10d4efe903d4a938bfa3c0b5503f83
  MD5: 2122b2874ce477c6eab9c2f36a877e06
  BLAKE2b-256: f50a3a0d95120cf61ab6301a81b994f3ba509afef6e0f3fab945564fff70b1e2
