Conversation memory as a knowledge graph: pipeline, retrieval, and generation

MemOrai

PyPI version | Python 3.10+ | License: MIT

Build knowledge graphs from conversations and answer questions with retrieval-augmented generation (RAG) over those graphs.


📦 Installation

From PyPI

pip install memorai

With FastAPI backend extras

pip install "memorai[backend]"

From source (editable install)

git clone https://github.com/memorai/memorai.git
cd memorai
pip install -e .
# or with backend extras:
pip install -e ".[backend]"

⚙️ Configuration

Create a .env file (or set environment variables) before running:

# LLM Configuration (OpenAI-compatible endpoint)
LLM_API_KEY=your-api-key-here
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_MODEL=google/gemini-2.0-flash-001

# Embedding Model (HuggingFace model name)
EMBEDDING_MODEL=BAAI/bge-m3

# Database Configuration (Neo4j)
# Use Aura DB or a local Neo4j instance
NEO4J_URI=neo4j+s://your-id.databases.neo4j.io
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password
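
A minimal sketch of how an application might check these settings at startup. The helper name `load_settings` and the decision to treat every variable as required are assumptions made for illustration and are not part of MemOrai's API; only the variable names come from the example above.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the .env example above.
# Treating all of them as required is an assumption of this sketch.
REQUIRED_VARS = (
    "LLM_API_KEY",
    "LLM_BASE_URL",
    "LLM_MODEL",
    "EMBEDDING_MODEL",
    "NEO4J_URI",
    "NEO4J_USER",
    "NEO4J_PASSWORD",
)


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the settings as a dict, or raise if any are missing or empty."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError("Missing configuration: " + ", ".join(missing))
    return {name: source[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` with no argument reads from the process environment, so it fails early with the list of missing names instead of failing later inside an indexing or retrieval call.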

🚀 Quick Start

MemOrai uses Neo4j as its single storage backend: it serves both as a document store (for pipeline state) and as the knowledge graph. Once a conversation has been indexed, retrieval needs no local files.

Python API

import memorai

# 1. Initialize a conversation scope
memorai.create_conversation(
    conversation_id="alice-bot",
    name="Alice - Support Bot",
)

# 2. Index conversation history (builds graph in Neo4j)
history = [
    {"role": "user", "content": "My name is Alice and I live in Hanoi."},
    {"role": "assistant", "content": "Nice to meet you, Alice!"},
]

memorai.index(
    history=history,
    conversation_id="alice-bot",
    session_id="alice-session-001",
    update=True,
    fast_mode=True,
)

# 3. Retrieve using Graph Vector Search
result = memorai.retrieve(
    query="Where does Alice live?",
    conversation_id="alice-bot",
)
print(result["top_turn_contents"])

Notes:

  • conversation_id isolates tenant data in Neo4j.
  • session_id lets you append incremental chat batches inside one conversation scope.
  • fast_mode=True runs low-latency indexing (skips heavy post-processing).
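
The incremental pattern described in the notes can be sketched as follows. The helper `session_batches`, the batch size, and the session-naming scheme are hypothetical; the sketch also assumes each batch is indexed under its own session_id, which the notes do not spell out. Only the keyword names are taken from the `memorai.index` call above.

```python
from typing import Iterator


def session_batches(
    history: list[dict],
    conversation_id: str,
    batch_size: int = 20,
) -> Iterator[dict]:
    """Yield keyword arguments for one memorai.index() call per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for number, start in enumerate(range(0, len(history), batch_size), start=1):
        yield {
            "history": history[start:start + batch_size],
            "conversation_id": conversation_id,
            # Hypothetical naming scheme: one session ID per batch.
            "session_id": f"{conversation_id}-session-{number:03d}",
            "update": True,
            "fast_mode": True,
        }


# Usage (needs a configured MemOrai install, so it is left commented out):
# import memorai
# for kwargs in session_batches(history, "alice-bot"):
#     memorai.index(**kwargs)
```

Keeping conversation_id fixed while varying session_id keeps all batches inside the same tenant scope.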

CLI: Full pipeline

# Run full pipeline from a JSON file
memorai pipeline \
    --input_json data/conversations.json \
    --output_dir output \
    --save_embeddings \
    --cleanup

# Answer a single question
memorai qa \
    --data_path output/graph_db/session-001 \
    --query "Where does Alice live?"

# Batch QA
memorai qa-batch \
    --questions_file questions.json \
    --data_path output/graph_db/session-001 \
    --output answers.json
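
If the CLI needs to be driven from Python, one option is a thin `subprocess` wrapper. This is a sketch only: the function names are hypothetical, it uses just the `--data_path` and `--query` flags shown above, and it makes no assumption about what the command prints.

```python
import subprocess


def qa_command(data_path: str, query: str) -> list[str]:
    """Build the argument list for `memorai qa` using the flags shown above."""
    return ["memorai", "qa", "--data_path", data_path, "--query", query]


def run_qa(data_path: str, query: str) -> str:
    """Run the CLI and return its standard output as text.

    The output format is not specified here, so it is returned unparsed.
    """
    completed = subprocess.run(
        qa_command(data_path, query),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

Passing the arguments as a list avoids shell quoting problems when the query contains spaces or punctuation.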

📋 CLI Commands

Pipeline commands

Command             Description
memorai segment     Segment conversations into turns
memorai filter      Filter important messages
memorai triplets    Extract knowledge triplets
memorai entities    Generate entity descriptions
memorai summarize   Summarize segments
memorai graph       Build knowledge graph
memorai pipeline    Run full pipeline end-to-end

Post-processing commands

Command                      Description
memorai segment-chunk-map    Export segment → chunk mapping
memorai consolidate-turns    Deduplicate turn IDs
memorai rebuild-graph        Rebuild graph after consolidation
memorai embed-turns          Add turn embeddings
memorai embed-entities       Add entity embeddings
memorai embed-triplets       Add triplet embeddings
memorai embed-summaries      Add summary embeddings

QA commands

Command             Description
memorai retrieve    Retrieve relevant nodes from KG
memorai qa          Answer a single question
memorai qa-batch    Answer a batch of questions

🏗️ Architecture

Conversation History
         │
         ▼
  ┌─────────────┐
  │  Segmenter  │  Split into semantic turns
  └──────┬──────┘
         │
  ┌──────▼──────┐
  │   Filter    │  Remove low-signal messages
  └──────┬──────┘
         │
  ┌──────▼──────────┐
  │ TripletExtractor│  Extract (entity, relation, entity)
  └──────┬──────────┘
         │
  ┌──────▼──────────┐
  │ EntityDescriptor│  Describe entities in context
  └──────┬──────────┘
         │
  ┌──────▼──────┐
  │ GraphBuilder│  Build Knowledge Graph in Neo4j
  └──────┬──────┘
         │
  ┌──────▼────────┐
  │ Neo4jRetriever│  Vector Search + Cypher Traversal
  └──────┬────────┘
         │
  ┌──────▼─────────┐
  │ AnswerGenerator│  RAG over Neo4j context
  └────────────────┘
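
To make the data flow concrete, here is a toy in-memory stand-in for the triplet, graph, and retrieval stages. It is not MemOrai's implementation: there is no Neo4j, no LLM, and no embedding model, and plain word matching replaces vector search. All names in the block (`Triplet`, `build_graph`, `retrieve_facts`) are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Triplet(NamedTuple):
    subject: str
    relation: str
    obj: str


def build_graph(triplets: list[Triplet]) -> dict[str, list[Triplet]]:
    """Index every triplet under both of its entities (lower-cased)."""
    graph: dict[str, list[Triplet]] = defaultdict(list)
    for triplet in triplets:
        graph[triplet.subject.lower()].append(triplet)
        graph[triplet.obj.lower()].append(triplet)
    return dict(graph)


def retrieve_facts(graph: dict[str, list[Triplet]], query: str) -> list[str]:
    """Return facts touching any entity named in the query.

    Plain word matching stands in for vector search in this sketch.
    """
    words = {word.strip("?.,!").lower() for word in query.split()}
    facts: list[str] = []
    for entity in sorted(words.intersection(graph)):
        for triplet in graph[entity]:
            fact = f"{triplet.subject} {triplet.relation} {triplet.obj}"
            if fact not in facts:
                facts.append(fact)
    return facts


graph = build_graph([Triplet("Alice", "lives in", "Hanoi")])
print(retrieve_facts(graph, "Where does Alice live?"))
```

In a RAG setup, the retrieved facts would then be placed in the prompt given to the answer-generation step.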

🔧 Development

# Install with dev extras
pip install -e ".[dev]"

# Run tests
pytest

# Build distribution
make build

# Check package
twine check dist/*

# Publish to PyPI
make publish

📤 Publish Guide (DIY)

Use this section when you want to publish manually.

1. Prepare account + API tokens

  1. Create an account on both PyPI and TestPyPI.
  2. Create an API token on TestPyPI (for the dry-run upload).
  3. Create an API token on PyPI (for the real release).
  4. Keep the token for the index you are uploading to in environment variables (recommended):
export TWINE_USERNAME=__token__
export TWINE_PASSWORD=pypi-<your-token>

2. Build package artifacts

python -m pip install --upgrade build twine
rm -rf dist build *.egg-info
python -m build
python -m twine check dist/*

Expected artifacts:

  • dist/memorai-<version>.tar.gz
  • dist/memorai-<version>-py3-none-any.whl
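
A small check that both artifacts exist before uploading can be sketched like this. The function names are hypothetical, and the patterns simply mirror the two file names listed above, matching the version loosely as any run of characters without a hyphen.

```python
import re
from pathlib import Path

# Patterns follow the two artifact names listed above; the version is
# matched loosely as "anything without a hyphen".
SDIST_RE = re.compile(r"^memorai-[^-]+\.tar\.gz$")
WHEEL_RE = re.compile(r"^memorai-[^-]+-py3-none-any\.whl$")


def missing_artifacts(filenames: list[str]) -> list[str]:
    """Return which of the expected artifact kinds are absent."""
    missing = []
    if not any(SDIST_RE.match(name) for name in filenames):
        missing.append("sdist")
    if not any(WHEEL_RE.match(name) for name in filenames):
        missing.append("wheel")
    return missing


def check_dist(dist_dir: str = "dist") -> list[str]:
    """Apply the check to the files in a local dist/ directory."""
    return missing_artifacts([path.name for path in Path(dist_dir).iterdir()])
```

An empty list from `check_dist()` means both the source distribution and the wheel are present.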

3. Upload to TestPyPI first

export TWINE_USERNAME=__token__
export TWINE_PASSWORD=pypi-<your-testpypi-token>
python -m twine upload --repository testpypi dist/*

Install the test package:

python -m pip install \
    --index-url https://test.pypi.org/simple/ \
    --extra-index-url https://pypi.org/simple \
    memorai

4. Publish to real PyPI

export TWINE_USERNAME=__token__
export TWINE_PASSWORD=pypi-<your-pypi-token>
python -m twine upload dist/*

5. Verify release

python -m pip install --upgrade memorai
python -c "import memorai; print(memorai.__version__)"
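
The same check can be done from Python without importing the package, using the standard library's `importlib.metadata`. The wrapper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "memorai") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None
```

This reads the version recorded in the installed distribution's metadata, which is useful for confirming that the upgrade picked up the release you just published.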

Common issues

  • File already exists: bump the version in both pyproject.toml and memorai/__init__.py, then rebuild.
  • 403 invalid token: ensure the token's scope matches the target index (PyPI vs TestPyPI).
  • README (long description) render errors: run python -m twine check dist/* before uploading.
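
Because the version lives in two files, a quick consistency check before rebuilding can catch a half-finished bump. This sketch assumes pyproject.toml declares a static `version = "..."` line and that memorai/__init__.py assigns `__version__` as a string literal; the function names are hypothetical, and the regular expressions are deliberately simple rather than a full TOML or Python parser.

```python
import re
from typing import Optional

# Assumed layouts: `version = "x.y.z"` in pyproject.toml and
# `__version__ = "x.y.z"` in memorai/__init__.py.
PYPROJECT_RE = re.compile(r'^version\s*=\s*"([^"]+)"', re.MULTILINE)
INIT_RE = re.compile(r'^__version__\s*=\s*["\']([^"\']+)["\']', re.MULTILINE)


def find_version(pattern: re.Pattern, text: str) -> Optional[str]:
    """Return the first version string the pattern finds, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


def versions_match(pyproject_text: str, init_text: str) -> bool:
    """True only when both files declare a version and the two agree."""
    declared = find_version(PYPROJECT_RE, pyproject_text)
    exported = find_version(INIT_RE, init_text)
    return declared is not None and declared == exported
```

Reading the two files with `pathlib.Path(...).read_text()` and passing the contents to `versions_match` is enough to use it in a pre-release script.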

📄 License

MIT. See LICENSE.
