Conversation memory as a knowledge graph: pipeline, retrieval, and generation
MemOrai
Build knowledge graphs from conversations and answer questions with retrieval-augmented generation over those graphs.
Installation
From PyPI
pip install memorai
With FastAPI backend extras
pip install "memorai[backend]"
From source (editable install)
git clone https://github.com/memorai/memorai.git
cd memorai
pip install -e .
# or with backend extras:
pip install -e ".[backend]"
Configuration
Copy .env.example to .env (or set environment variables directly) before running:
cp .env.example .env
Then edit .env with your real credentials.
Example:
# LLM Configuration (OpenAI-compatible endpoint)
LLM_API_KEY=your-api-key-here
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_MODEL=google/gemini-2.0-flash-001
# Embedding Model (HuggingFace model name)
EMBEDDING_MODEL=BAAI/bge-m3
# Database Configuration (Neo4j)
# Use Aura DB or a local Neo4j instance
NEO4J_URI=neo4j+s://your-id.databases.neo4j.io
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password
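The settings above can be checked before a run starts. The following is a minimal sketch that assumes all seven variables from the example are required, which this README does not state; `load_settings` is a hypothetical helper, not part of the MemOrai API.

```python
import os

# Variable names taken from the example .env above.
REQUIRED_VARS = (
    "LLM_API_KEY",
    "LLM_BASE_URL",
    "LLM_MODEL",
    "EMBEDDING_MODEL",
    "NEO4J_URI",
    "NEO4J_USER",
    "NEO4J_PASSWORD",
)


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing.

    Hypothetical helper for illustration; not part of the MemOrai API.
    """
    env = os.environ if env is None else env
    # Treat unset and empty values the same way.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` with no argument reads from `os.environ`; passing a plain dict makes the check easy to test.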
Quick Start
MemOrai uses Neo4j as a unified storage backend, serving as both a document store (for pipeline state) and the knowledge graph. Once a conversation is indexed, retrieval needs no local files.
Python API
import memorai

# Optional: configure at runtime instead of relying on os.getenv
memorai.configure(
    llm_provider="groq",
    llm_api_key="<your-llm-key>",
    llm_model="llama-3.3-70b-versatile",
    embedding_provider="cloudflare",
    cloudflare_api_token="<your-cf-token>",
    cloudflare_account_id="<your-cf-account-id>",
    neo4j_uri="neo4j+s://<your-instance>.databases.neo4j.io",
    neo4j_user="neo4j",
    neo4j_password="<your-neo4j-password>",
    max_workers=4,
    rpm_limit=60,
    timeout=120,
)

# 1. Initialize a conversation scope
memorai.create_conversation(
    conversation_id="alice-bot",
    name="Alice - Support Bot",
)

# 2. Index conversation history (builds graph in Neo4j)
history = [
    {"role": "user", "content": "My name is Alice and I live in Hanoi."},
    {"role": "assistant", "content": "Nice to meet you, Alice!"},
]
memorai.index(
    history=history,
    conversation_id="alice-bot",
    session_id="alice-session-001",
    update=True,
    fast_mode=True,
)

# 3. Retrieve using Graph Vector Search
result = memorai.retrieve(
    query="Where does Alice live?",
    conversation_id="alice-bot",
)
print(result["top_turn_contents"])
Notes:
- conversation_id isolates tenant data in Neo4j.
- session_id lets you append incremental chat batches inside one conversation scope.
- fast_mode=True runs low-latency indexing (skips heavy post-processing).
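Appending incremental batches under one conversation scope could look like the sketch below. `batch_history` is a hypothetical helper, not part of MemOrai, and the zero-padded session naming is an assumption modeled on the `alice-session-001` example.

```python
def batch_history(history, batch_size, session_prefix):
    """Yield (session_id, batch) pairs covering the history in order.

    Hypothetical helper for illustration; not part of the MemOrai API.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    starts = range(0, len(history), batch_size)
    for number, start in enumerate(starts, start=1):
        # Session IDs follow the "<prefix>-001" pattern (an assumption).
        yield f"{session_prefix}-{number:03d}", history[start:start + batch_size]


# Each pair could then be passed to the indexing call shown earlier, e.g.:
#   memorai.index(history=batch, conversation_id="alice-bot",
#                 session_id=session_id, update=True)
```

Keeping `conversation_id` fixed while `session_id` changes is what keeps every batch inside the same tenant scope.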
CLI: Full pipeline
# Run full pipeline from a JSON file
memorai pipeline \
--input_json data/conversations.json \
--output_dir output \
--save_embeddings \
--cleanup
# Answer a single question
memorai qa \
--data_path output/graph_db/session-001 \
--query "Where does Alice live?"
# Batch QA
memorai qa-batch \
--questions_file questions.json \
--data_path output/graph_db/session-001 \
--output answers.json
CLI Commands
Pipeline commands
| Command | Description |
|---|---|
| memorai segment | Segment conversations into turns |
| memorai filter | Filter important messages |
| memorai triplets | Extract knowledge triplets |
| memorai entities | Generate entity descriptions |
| memorai summarize | Summarize segments |
| memorai graph | Build knowledge graph |
| memorai pipeline | Run full pipeline end-to-end |
Post-processing commands
| Command | Description |
|---|---|
| memorai segment-chunk-map | Export segment-to-chunk mapping |
| memorai consolidate-turns | Deduplicate turn IDs |
| memorai rebuild-graph | Rebuild graph after consolidation |
| memorai embed-turns | Add turn embeddings |
| memorai embed-entities | Add entity embeddings |
| memorai embed-triplets | Add triplet embeddings |
| memorai embed-summaries | Add summary embeddings |
QA commands
| Command | Description |
|---|---|
| memorai retrieve | Retrieve relevant nodes from KG |
| memorai qa | Answer a single question |
| memorai qa-batch | Answer a batch of questions |
Architecture
Conversation History
       │
       ▼
┌─────────────┐
│  Segmenter  │  Split into semantic turns
└──────┬──────┘
       │
┌──────▼──────┐
│   Filter    │  Remove low-signal messages
└──────┬──────┘
       │
┌──────▼──────────┐
│ TripletExtractor│  Extract (entity, relation, entity)
└──────┬──────────┘
       │
┌──────▼──────────┐
│ EntityDescriptor│  Describe entities in context
└──────┬──────────┘
       │
┌──────▼──────┐
│ GraphBuilder│  Build Knowledge Graph in Neo4j
└──────┬──────┘
       │
┌──────▼────────┐
│ Neo4jRetriever│  Vector Search + Cypher Traversal
└──────┬────────┘
       │
┌──────▼─────────┐
│ AnswerGenerator│  RAG over Neo4j context
└────────────────┘
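To make the later stages concrete, here is a toy, in-memory sketch of the idea behind them: triplets become graph edges, and retrieval ranks entities by vector similarity before expanding one hop. It is an illustration only, not MemOrai's implementation; the function names, the relation labels, and the two-dimensional vectors are all made up, and the real system stores the graph in Neo4j and uses a real embedding model.

```python
import math
from collections import defaultdict


def build_graph(triplets):
    """Map each head entity to its outgoing (relation, tail) edges."""
    graph = defaultdict(list)
    for head, relation, tail in triplets:
        graph[head].append((relation, tail))
    return dict(graph)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_facts(query_vec, entity_vecs, graph, top_k=1):
    """Rank entities by similarity to the query, then expand each hit by one hop."""
    ranked = sorted(
        entity_vecs,
        key=lambda entity: cosine(query_vec, entity_vecs[entity]),
        reverse=True,
    )
    facts = []
    for entity in ranked[:top_k]:
        for relation, tail in graph.get(entity, []):
            facts.append((entity, relation, tail))
    return facts


# Illustrative triplets of the kind an extractor might emit.
triplets = [("Alice", "lives_in", "Hanoi"), ("Alice", "talks_to", "Support Bot")]
graph = build_graph(triplets)

# Made-up 2-d vectors standing in for real embeddings.
entity_vecs = {"Alice": [1.0, 0.0], "Hanoi": [0.6, 0.8], "Support Bot": [0.0, 1.0]}
facts = retrieve_facts([0.9, 0.1], entity_vecs, graph)
print(facts)
```

The retrieved facts are what a generator would receive as context. In a Neo4j-backed system, the one-hop expansion would be a Cypher traversal rather than a dictionary lookup.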
Development
# Install with dev extras
pip install -e ".[dev]"
# Run tests
pytest
# Build distribution
make build
# Check package
twine check dist/*
# Publish to PyPI
make publish
Publish Guide (DIY)
Use this section when you want to publish manually.
1. Prepare account + API tokens
- Create an account on PyPI and TestPyPI.
- Create API token on TestPyPI (for dry-run upload).
- Create API token on PyPI (real release).
- Keep tokens in env vars (recommended):
export TWINE_USERNAME=__token__
export TWINE_PASSWORD=pypi-<your-token>
2. Build package artifacts
python -m pip install --upgrade build twine
rm -rf dist build *.egg-info
python -m build
python -m twine check dist/*
Expected artifacts:
- dist/memorai-<version>.tar.gz
- dist/memorai-<version>-py3-none-any.whl
3. Upload to TestPyPI first
export TWINE_USERNAME=__token__
export TWINE_PASSWORD=pypi-<your-testpypi-token>
python -m twine upload --repository testpypi dist/*
Install test package:
python -m pip install \
--index-url https://test.pypi.org/simple/ \
--extra-index-url https://pypi.org/simple \
memorai
4. Publish to real PyPI
export TWINE_USERNAME=__token__
export TWINE_PASSWORD=pypi-<your-pypi-token>
python -m twine upload dist/*
5. Verify release
python -m pip install --upgrade memorai
python -c "import memorai; print(memorai.__version__)"
Common issues
- File already exists: bump version in pyproject.toml and memorai/__init__.py, then rebuild.
- 403 invalid token: ensure token scope matches target index (PyPI vs TestPyPI).
- Long README render errors: run python -m twine check dist/* before upload.
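Because the version lives in two files, a pre-build check can catch a mismatch early. The sketch below assumes the version is a static `version = "..."` line in pyproject.toml and a `__version__ = "..."` assignment in memorai/__init__.py; neither layout is confirmed by this README, and the function names are hypothetical.

```python
import re


def pyproject_version(text):
    """Return the first static `version = "..."` value, or None."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, re.MULTILINE)
    return match.group(1) if match else None


def init_version(text):
    """Return the `__version__ = "..."` value, or None."""
    match = re.search(r'^__version__\s*=\s*["\']([^"\']+)["\']', text, re.MULTILINE)
    return match.group(1) if match else None


def versions_match(pyproject_text, init_text):
    """True only when both files declare the same version string."""
    declared = pyproject_version(pyproject_text)
    return declared is not None and declared == init_version(init_text)


# Possible usage from the repository root (paths taken from the note above):
#   from pathlib import Path
#   ok = versions_match(Path("pyproject.toml").read_text(),
#                       Path("memorai/__init__.py").read_text())
```

A regex keeps the sketch dependency-free. A project that declares a dynamic version would need a different check.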
License
MIT; see LICENSE.