Local-first, privacy-first agent memory — Lite edition (Apache-2.0)
AIngram (Lite)
Local-first agent memory in one SQLite file: hybrid vector + FTS5 search, optional knowledge graph (entities and relationships), Ed25519-signed entries, and an optional MCP server. Apache-2.0. No cloud dependency; embeddings run via ONNX (Nomic) on your machine.
Install
pip install aingram
Optional extras:
| Extra | Purpose |
|---|---|
| aingram[extraction] | GLiNER entity extraction (background linking) |
| aingram[llm] | HTTP client for Ollama / local LLM (e.g. consolidation) |
| aingram[mcp] | MCP server (FastMCP) |
| aingram[api] | Anthropic API (Sonnet extractor) |
| aingram[all] | mcp, extraction, llm, and cli-related deps |
| aingram[gpu] | CUDA 12 pip wheels (cuFFT, cuBLAS, cuDNN, runtime) ONNX Runtime needs on Windows/Linux |
The aingram CLI is available once the package is installed (Typer is a core dependency).
CUDA / GPU (embeddings): The default package uses the CPU build of ONNX Runtime. Do not install onnxruntime and onnxruntime-gpu together. Typical GPU setup:
- pip uninstall -y onnxruntime onnxruntime-gpu, then pip install onnxruntime-gpu
- pip install "aingram[gpu]", which includes nvidia-cufft-cu12 (ORT's CUDA EP often fails with cufft64_11.dll / cuFFT missing if you only installed cuBLAS/cuDNN/runtime)
Set AINGRAM_ONNX_PROVIDER=cuda (or [onnx_provider] in config.toml) if you want to force CUDA instead of auto. See ONNX Runtime GPU install for full CUDA/cuDNN notes.
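The warning above about not installing onnxruntime and onnxruntime-gpu together can be checked before debugging a provider problem. The following is a minimal sketch using only the standard library; the helper names installed_version and onnxruntime_conflict are hypothetical and not part of aingram.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def onnxruntime_conflict():
    """True when both the CPU and GPU ONNX Runtime wheels are installed."""
    cpu = installed_version("onnxruntime")
    gpu = installed_version("onnxruntime-gpu")
    return cpu is not None and gpu is not None


if __name__ == "__main__":
    if onnxruntime_conflict():
        print("Both onnxruntime and onnxruntime-gpu are installed; "
              "uninstall both, then reinstall only one.")
    else:
        print("No onnxruntime / onnxruntime-gpu conflict detected.")
```

This only inspects installed package metadata; it does not verify that the CUDA libraries themselves load.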
Quick start (Python)
from aingram import MemoryStore
with MemoryStore('./agent_memory.db') as mem:
    # Store one memory, then query it back.
    mem.remember('User prefers dark mode and concise answers')
    # Each result carries a relevance score and the matched entry.
    for r in mem.recall('What does the user prefer?', limit=5):
        print(r.score, r.entry.content)
CLI
aingram --db ./agent_memory.db status
aingram --db ./agent_memory.db add "User likes Python"
aingram --db ./agent_memory.db search "Python"
aingram --db ./agent_memory.db entities
aingram --db ./agent_memory.db graph "Alice"
aingram --db ./agent_memory.db consolidate
aingram --db ./agent_memory.db compact --yes --target-dim 256
aingram --db ./agent_memory.db export ./backup.json
aingram --db ./agent_memory.db import ./backup.json
compact is one-way (e.g. 768 → 256 embedding width). Use --yes to confirm.
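To illustrate why a width reduction such as 768 → 256 cannot be undone, here is a small sketch that truncates a vector and renormalizes it. This assumes a truncate-and-renormalize scheme, which the documentation above does not confirm; truncate_embedding is a hypothetical helper, not the compact implementation.

```python
import math


def truncate_embedding(vec, target_dim):
    """Keep the first target_dim components and rescale to unit length.

    Assumption: illustrative only; the real `compact` command may work differently.
    """
    if not 0 < target_dim <= len(vec):
        raise ValueError("target_dim must be between 1 and len(vec)")
    head = list(vec[:target_dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


full = [0.5] * 768                      # stand-in for a 768-wide embedding
small = truncate_embedding(full, 256)   # the dropped 512 components are gone
print(len(full), len(small))
```

Whatever the actual method, the discarded components are not stored anywhere, so keep an export of the original database if you may need the full width again.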
Configuration
Precedence (highest first): constructor kwargs → environment variables → ~/.aingram/config.toml → defaults.
| Env var | Meaning |
|---|---|
| AINGRAM_MODELS_DIR | Model cache directory |
| AINGRAM_EMBEDDING_DIM | Width for new DBs (must match an existing DB after creation) |
| AINGRAM_LLM_URL | Ollama base URL |
| AINGRAM_LLM_MODEL | Default LLM name |
| AINGRAM_LOG_LEVEL | aingram logger level |
| AINGRAM_WORKER_ENABLED | true / false |
| AINGRAM_EXTRACTOR_MODE | none, local, or sonnet |
| AINGRAM_EXTRACTOR_MODEL | Extractor model id |
| AINGRAM_ONNX_PROVIDER | cpu, cuda, npu, or omit for auto |
| AINGRAM_TELEMETRY_ENABLED | true / false (anonymous CLI usage telemetry) |
Example ~/.aingram/config.toml:
embedding_dim = 768
worker_enabled = true
models_dir = "C:/Users/me/.aingram/models"
llm_url = "http://localhost:11434"
llm_model = "mistral"
extractor_mode = "none"
telemetry_enabled = true
Use AIngramConfig and load_merged_config() for the same rules in application code.
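The precedence order described in this section (constructor kwargs, then environment variables, then config.toml, then defaults) can be sketched in plain Python. This is an illustration of the rule only, not the behavior of load_merged_config(), whose signature is not shown here; resolve_setting, the None-means-unset convention, and the sample default values are all assumptions.

```python
def resolve_setting(key, kwargs, env, file_values, defaults):
    """Return the value for `key`: kwargs, then env, then file, then defaults."""
    # Assumption: a kwarg of None is treated as "not provided".
    if kwargs.get(key) is not None:
        return kwargs[key]
    # Env names in the table follow the AINGRAM_<UPPERCASE_KEY> pattern.
    env_name = "AINGRAM_" + key.upper()
    if env_name in env:
        return env[env_name]  # raw string; a real loader would coerce types
    if key in file_values:
        return file_values[key]
    return defaults[key]


# Illustrative values only; these are not the package's documented defaults.
defaults = {"llm_model": "example-default", "embedding_dim": 768}
file_values = {"llm_model": "mistral"}     # as if parsed from config.toml
env = {"AINGRAM_LLM_MODEL": "from-env"}    # stand-in for os.environ

print(resolve_setting("llm_model", {}, env, file_values, defaults))
print(resolve_setting("llm_model", {"llm_model": "from-kwarg"}, env, file_values, defaults))
print(resolve_setting("embedding_dim", {}, env, file_values, defaults))
```

In application code, file_values could come from tomllib.load() on ~/.aingram/config.toml and env from os.environ.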
Privacy and anonymous telemetry
CLI only: the aingram command-line tool may send events. Using the Python API or MCP does not use this channel.
The CLI may send anonymous usage events by default: a random install id (~/.aingram/telemetry_id), the name of the top-level command you ran (e.g. add, status), and the package version. No memory text, queries, paths, or secrets are included. Events are sent over HTTPS to https://api.aingram.dev/v1/telemetry.
Opt out (any one):
- Add --no-telemetry to a single invocation, e.g. aingram --no-telemetry --db ./agent_memory.db status.
- Set telemetry_enabled = false in ~/.aingram/config.toml.
- Set AINGRAM_TELEMETRY_ENABLED=false in the environment.
This is separate from any future opt-in “contribute training examples” feature, which would involve content you deliberately choose to share.
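When the CLI is driven from a script, the opt-out options listed above can be applied on every call. The sketch below builds the argument list and environment for such a call; cli_invocation is a hypothetical helper, and it applies both the --no-telemetry flag and the AINGRAM_TELEMETRY_ENABLED variable even though either one is enough.

```python
import os


def cli_invocation(db_path, command, *args, telemetry=False, base_env=None):
    """Build (argv, env) for an aingram CLI call, opting out of telemetry by default."""
    argv = ["aingram"]
    if not telemetry:
        argv.append("--no-telemetry")
    argv += ["--db", str(db_path), command, *args]
    # Copy the environment so the caller's mapping is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    if not telemetry:
        env["AINGRAM_TELEMETRY_ENABLED"] = "false"
    return argv, env


argv, env = cli_invocation("./agent_memory.db", "status")
print(" ".join(argv))
```

The pair can then be passed to subprocess.run(argv, env=env, check=True) on a machine where aingram is installed.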
Consolidation and LLM
Pass an LLM instance into MemoryStore.consolidate(llm=...) when you want richer merge/synthesis behavior (optional). Install aingram[llm] and use your stack’s Ollama or other client as appropriate.
Export / import
MemoryStore.export_json(path) writes a Lite JSON backup (sessions, chains, entries, graph, vectors). import_json(path) expects an empty database by default; pass merge=True to import into a non-empty one and skip entries that already exist. The embedding_dim of the export must match the embedding_dim of the target database.
For programmatic verification of a session chain, use MemoryStore.verify().
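A backup-and-restore round trip with the two methods named above might look like the following sketch. It is duck-typed so it works with any objects exposing export_json(path) and import_json(path, merge=...); copy_memory is a hypothetical helper, and passing merge as a keyword argument is an assumption based on the description in this section.

```python
def copy_memory(src_store, dst_store, backup_path, merge=False):
    """Export src_store to a JSON backup, then import that backup into dst_store."""
    src_store.export_json(backup_path)
    if merge:
        # Assumption: merge is accepted as a keyword argument.
        dst_store.import_json(backup_path, merge=True)
    else:
        # Default import; the target database is expected to be empty.
        dst_store.import_json(backup_path)
    return backup_path
```

With aingram installed, the two stores would be MemoryStore instances opened on different database files that share the same embedding_dim.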
MCP
With aingram[mcp] installed, see aingram.mcp_server.create_server for tools such as remember, recall, reference, verify, and get_experiment_context, with optional bearer-token middleware.
Examples
Runnable scripts using MemoryStore.remember / MemoryStore.recall live in examples/. See examples/README.md.
Benchmarks (local)
From a clone of this repo, synthetic DBs and timing scripts live under scripts/:
- python scripts/seed_bench_db.py: creates benchmarks/bench_*.db
- python scripts/bench.py: runs recall and embed vs vector-search breakdowns on those files
When the embedding model is first used, huggingface_hub may print a warning that you are sending unauthenticated requests to the Hugging Face Hub. That is expected and not a functional problem: downloads still work. Optionally set HF_TOKEN in the environment for higher rate limits if you hit throttling.
Development
pip install -e ".[dev,all]"
pytest
ruff check aingram/ && ruff format --check aingram/
Guidelines and pre-push hygiene (including secret scanning) are in CONTRIBUTING.md.
Python 3.11+.
Social
Join the official AIngram Discord: https://discord.gg/zSJCFZnXxf
License
Apache-2.0.