Mem-LLM
Mem-LLM is a Python framework for building privacy-first, memory-enabled AI assistants that run 100% locally. It combines persistent multi-user conversation history with optional knowledge bases, multiple storage backends, vector search, and response-quality metrics, and it integrates tightly with Ollama and LM Studio, so you can prototype locally and deploy production-ready workflows with quality monitoring and semantic understanding while staying completely private and offline.
What's New in v2.4.0
Maintenance & Packaging
- Release v2.4.0 - Bumped package metadata and published to PyPI.
- Author & Metadata Fixes - Corrected the author name and updated packaging metadata and descriptions.
- Docs Refresh - Updated the README for clarity and current features.
Other Improvements
- Compatibility - Ensured support for Python 3.8+ and improved packaging workflows.
- Cleanup - Removed outdated local build artifacts from `dist/` before publishing.
What's New in v2.3.0 - "Neural Nexus"
Agent Workflow Engine (NEW)
- Structured Agents - Define multi-step workflows like "Deep Research" or "Content Creation".
- Streaming UI - Real-time visualization of workflow steps as they execute.
- Context Sharing - Data flows automatically between steps in a workflow (see the sketch after this list).
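Conceptually, a workflow here is an ordered list of named steps that read from and write to a shared context. The sketch below is a from-scratch illustration of that pattern, not the actual mem_llm workflow API; the `Workflow` class and step functions are hypothetical:

```python
# Conceptual sketch of a step-based workflow with a shared context dict.
# Illustrative only - not the real mem_llm Agent Workflow Engine API.
from typing import Any, Callable, Dict, List, Tuple

class Workflow:
    def __init__(self, name: str):
        self.name = name
        self.steps: List[Tuple[str, Callable[[Dict[str, Any]], Dict[str, Any]]]] = []

    def add_step(self, name, fn):
        self.steps.append((name, fn))
        return self  # enable chaining

    def run(self, context: Dict[str, Any]) -> Dict[str, Any]:
        for name, fn in self.steps:
            print(f"[step] {name}")      # streaming-style progress line
            context.update(fn(context))  # each step's output flows to later steps
        return context

# Toy "Deep Research" workflow: search -> summarize -> write
wf = (Workflow("deep_research")
      .add_step("search", lambda ctx: {"hits": [f"result for {ctx['query']}"]})
      .add_step("summarize", lambda ctx: {"summary": "; ".join(ctx["hits"])})
      .add_step("write", lambda ctx: {"report": f"Report: {ctx['summary']}"}))

print(wf.run({"query": "local LLMs"})["report"])
```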
Knowledge Graph Memory (NEW)
- Graph Extraction - Automatically extracts entities and relationships from conversations.
- Interactive Visualization - View your agent's knowledge graph in the new Web UI tab.
- NetworkX Integration - Powerful graph operations and persistence (a sketch follows this list).
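NetworkX makes the "extract triples, store as a graph" idea concrete. A standalone sketch, assuming the extractor yields (entity, relation, entity) triples; mem_llm's actual graph schema may differ:

```python
# Illustrative only: storing and querying extracted triples with NetworkX.
import networkx as nx

g = nx.DiGraph()
# Triples an extractor might pull from "Alice loves Python and lives in Berlin"
for subj, rel, obj in [("Alice", "loves", "Python"),
                       ("Alice", "lives_in", "Berlin")]:
    g.add_edge(subj, obj, relation=rel)

# Query: everything we know about Alice
for _, obj, data in g.out_edges("Alice", data=True):
    print(f"Alice -[{data['relation']}]-> {obj}")

# Persistence: NetworkX graphs serialize cleanly, e.g. to GraphML
nx.write_graphml(g, "knowledge_graph.graphml")
```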
Premium Web UI (Redesigned)
- Modern Aesthetics - Dark mode, glassmorphism, and responsive design.
- New Features - File uploads and a Workflow Management tab.
- LM Studio Integration - Auto-configuration for local models such as `gemma-3-4b`.
What's New in v2.2.3
Hierarchical Memory System (NEW - Major Feature)
- 4-Layer Cognitive Architecture - Episode, Trace, Category, and Domain layers (sketched below)
- Auto-Categorization - Intelligent topic detection and classification
- Context Injection - Smarter, more relevant context for LLMs
- Backward Compatible - Works seamlessly with existing memory systems
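One way to read the four layers: raw turns (episodes) are distilled into facts (traces), grouped by topic (categories), and rolled up into broad domains. A conceptual sketch; the field names are assumptions, not mem_llm's schema:

```python
# Conceptual sketch of the 4-layer hierarchy; field names are assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Episode:            # a raw conversational turn
    user_msg: str
    agent_msg: str

@dataclass
class Trace:              # a distilled fact from one or more episodes
    fact: str
    episodes: List[Episode] = field(default_factory=list)

@dataclass
class Category:           # an auto-detected topic grouping traces
    topic: str
    traces: List[Trace] = field(default_factory=list)

@dataclass
class Domain:             # a broad area grouping categories
    name: str
    categories: List[Category] = field(default_factory=list)

ep = Episode("I love Python!", "Noted - Python fan.")
domain = Domain("preferences",
                [Category("programming", [Trace("User likes Python", [ep])])])
print(domain.categories[0].traces[0].fact)
```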
What's New in v2.2.0
Multi-Agent Systems (NEW - Major Feature)
- Collaborative AI Agents - Multiple specialized agents working together
- BaseAgent - Role-based agents (Researcher, Analyst, Writer, Validator, Coordinator)
- AgentRegistry - Centralized agent management and health monitoring
- CommunicationHub - Thread-safe inter-agent messaging and broadcast channels (a minimal sketch follows this list)
- 29 New Tests - Comprehensive test coverage (84-98%)
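A thread-safe hub of this kind can be assembled from standard-library pieces: one queue per agent plus a lock around registration and broadcast. A minimal sketch, not mem_llm's actual CommunicationHub API:

```python
# Minimal thread-safe message hub sketch using the standard library.
import queue
import threading

class Hub:
    def __init__(self):
        self._boxes = {}               # agent name -> message queue
        self._lock = threading.Lock()  # protects the registry

    def register(self, name: str):
        with self._lock:
            self._boxes[name] = queue.Queue()

    def send(self, to: str, msg: str):
        self._boxes[to].put(msg)       # queue.Queue is already thread-safe

    def broadcast(self, msg: str):
        with self._lock:
            for box in self._boxes.values():
                box.put(msg)

    def receive(self, name: str, timeout: float = 1.0) -> str:
        return self._boxes[name].get(timeout=timeout)

hub = Hub()
for agent in ("researcher", "writer"):
    hub.register(agent)
hub.send("writer", "draft the summary")
hub.broadcast("status: round 1 complete")
print(hub.receive("writer"))  # -> "draft the summary"
```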
What's New in v2.1.4
Conversation Analytics (NEW)
- Deep Insights - Analyze user engagement, topics, and activity patterns
- Visual Reports - Export analytics to JSON, CSV, or Markdown
- Engagement Tracking - Monitor active days, session length, and interaction frequency (a toy example follows this list)
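Metrics such as active days and messages per active day fall out of simple aggregation over timestamped messages. A toy example, assuming a list-of-dicts message format rather than mem_llm's stored layout:

```python
# Sketch of engagement analytics over timestamped messages; the input
# shape here is an assumption, not mem_llm's stored format.
from datetime import datetime
import json

messages = [
    {"user": "alice", "ts": "2025-01-01T09:00:00"},
    {"user": "alice", "ts": "2025-01-01T09:05:00"},
    {"user": "alice", "ts": "2025-01-03T14:00:00"},
]

days = {datetime.fromisoformat(m["ts"]).date() for m in messages}
report = {
    "user": "alice",
    "total_messages": len(messages),
    "active_days": len(days),
    "msgs_per_active_day": round(len(messages) / len(days), 2),
}
print(json.dumps(report, indent=2))  # JSON export; CSV/Markdown are analogous
```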
Config Presets (NEW)
- Instant Setup - Initialize specialized agents with one line of code
- 8 Built-in Presets - `chatbot`, `code_assistant`, `creative_writer`, `tutor`, `analyst`, `translator`, `summarizer`, `researcher`
- Custom Presets - Save and reuse your own agent configurations (a minimal sketch follows this list)
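A preset is essentially a named bundle of constructor settings that custom presets can override. A minimal sketch; the keys shown are hypothetical, not mem_llm's preset schema:

```python
# Sketch: a preset as a named bundle of agent settings (keys are assumptions).
PRESETS = {
    "code_assistant": {
        "system_prompt": "You are a precise coding assistant.",
        "temperature": 0.2,
    },
    "creative_writer": {
        "system_prompt": "You are an imaginative storyteller.",
        "temperature": 0.9,
    },
}

def make_agent_config(preset: str, **overrides) -> dict:
    cfg = dict(PRESETS[preset])   # copy so the built-in preset stays intact
    cfg.update(overrides)         # a custom preset is a saved override dict
    return cfg

print(make_agent_config("code_assistant", temperature=0.1))
```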
What's New in v2.1.3
Enhanced Tool Execution
- Smart Parser - Understands natural-language tool calls (a toy routing sketch follows this list)
- Better Prompts - Clear DO/DON'T examples for the LLM
- More Reliable - Tools execute even when the LLM doesn't follow the exact format
- Function Calling (v2.0.0) - LLMs can call external Python functions
- Memory-Aware Tools (v2.0.0) - Agents search their own conversation history
- 18+ Built-in Tools (v2.0.0) - Math, text, file, utility, memory, and async tools
- Custom Tools (v2.0.0) - Easy `@tool` decorator for your functions
- Tool Chaining (v2.0.0) - Automatic multi-tool workflows
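The "smart parser" idea, routing a natural-language request to a tool without requiring a rigid call format, can be approximated with pattern matching. A toy sketch, not mem_llm's actual parser:

```python
# Toy natural-language tool-call routing with regex; illustrative only.
import re

def calculator(expr: str) -> str:
    # Restrict input to arithmetic characters before eval, for safety.
    if not re.fullmatch(r"[\d\s+\-*/().]+", expr):
        raise ValueError("not a pure arithmetic expression")
    return str(eval(expr))

ROUTES = [
    (re.compile(r"calculate\s+(.+)", re.IGNORECASE), calculator),
]

def dispatch(text: str) -> str:
    for pattern, tool_fn in ROUTES:
        m = pattern.search(text)
        if m:
            return tool_fn(m.group(1))
    return "(no tool matched)"

print(dispatch("Please calculate (25 * 4) + 10"))  # -> 110
```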
Core Features
- 100% Local & Private (v1.3.6) - No cloud dependencies; all processing happens on your machine.
- Streaming Responses (v1.3.3+) - Real-time, ChatGPT-style streaming for Ollama and LM Studio.
- REST API Server (v1.3.3+) - FastAPI-based server with WebSocket and SSE streaming support.
- Web UI (v1.3.3+) - Modern 3-page interface (Chat, Memory Management, Metrics Dashboard).
- Persistent Memory - Store and recall conversation history across sessions for each user.
- Multi-Backend Support (v1.3.0+) - Choose between Ollama and LM Studio with a unified API.
- Auto-Detection (v1.3.0+) - Automatically find and use an available local LLM service.
- Response Metrics (v1.3.1+) - Track confidence, latency, KB usage, and quality analytics.
- Vector Search (v1.3.2+) - Semantic search with ChromaDB, including cross-lingual support (see the sketch after this list).
- Flexible Storage - Choose between lightweight JSON files or a SQLite database for production scenarios.
- Knowledge Bases - Load categorized Q&A content to augment model responses with authoritative answers.
- Dynamic Prompting - Automatically adapts prompts based on the features you enable, reducing hallucinations.
- CLI & Tools - Includes a command-line interface plus utilities for searching, exporting, and auditing stored memories.
- Security Features (v1.1.0+) - Prompt-injection detection with risk-level assessment (opt-in).
- High Performance (v1.1.0+) - Thread-safe operations with 16K+ msg/s throughput and <1 ms search latency.
- Conversation Summarization (v1.2.0+) - Automatic token compression (~40-60% reduction).
- Multi-Database Support (v1.2.0+) - Export/import to PostgreSQL, MongoDB, JSON, CSV, and SQLite.
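To see what the vector-search feature builds on, here is a standalone ChromaDB example (install with `pip install chromadb`; the first query downloads a small default embedding model). This uses ChromaDB directly rather than going through mem_llm:

```python
# Standalone semantic-search sketch with ChromaDB - the same idea
# mem_llm's vector search builds on.
import chromadb

client = chromadb.Client()                  # in-memory instance
col = client.create_collection("memories")

col.add(
    ids=["m1", "m2"],
    documents=["Alice loves Python programming",
               "Bob prefers hiking on weekends"],
)

# Semantic query: matches by meaning, not by keyword overlap
result = col.query(query_texts=["favorite programming language"], n_results=1)
print(result["documents"][0])  # -> ['Alice loves Python programming']
```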
Repository Layout
- `Memory LLM/` - Core Python package (`mem_llm`), configuration examples, packaging metadata, and detailed module-level documentation.
- `examples/` - Sample scripts that demonstrate common usage patterns.
- `LICENSE` - MIT license for the project.
Looking for API docs or more detailed examples? Start with `Memory LLM/README.md`, which includes extensive usage guides, configuration options, and advanced workflows.
Quick Start
1. Installation
```bash
pip install mem-llm

# Or with optional features
pip install mem-llm[databases]   # PostgreSQL + MongoDB
pip install mem-llm[postgresql]  # PostgreSQL only
pip install mem-llm[mongodb]     # MongoDB only

# Vector search support (v1.3.2+)
pip install chromadb sentence-transformers
```
2. Choose Your Backend
Option A: Ollama (Local, Free)
```bash
# Install Ollama from https://ollama.ai
ollama pull granite4:3b
ollama serve
```
Option B: LM Studio (Local, GUI)
```bash
# Download from https://lmstudio.ai
# Load a model and start the local server
```
3. Create and Chat
```python
from mem_llm import MemAgent

# Option A: Ollama
agent = MemAgent(backend='ollama', model="granite4:3b")

# Option B: LM Studio
agent = MemAgent(backend='lmstudio', model="local-model")

# Option C: Auto-detect
agent = MemAgent(auto_detect_backend=True)

# Use it!
agent.set_user("alice")
print(agent.chat("My name is Alice and I love Python!"))
print(agent.chat("What do I love?"))  # Agent remembers!

# Streaming response (v1.3.3+)
for chunk in agent.chat_stream("Tell me a story"):
    print(chunk, end="", flush=True)

# NEW in v2.0.0: Function calling with tools
agent = MemAgent(enable_tools=True)
agent.set_user("alice")
agent.chat("Calculate (25 * 4) + 10")        # Uses built-in calculator
agent.chat("Search my memory for 'Python'")  # Uses memory tool

# NEW in v2.1.0: Async tools & validation
from mem_llm import tool

@tool(
    name="send_email",
    pattern={"email": r'^[\w\.-]+@[\w\.-]+\.\w+$'}  # Email validation
)
def send_email(email: str) -> str:
    return f"Email sent to {email}"
```
4. Web UI & REST API (v1.3.3+)
```bash
# Install with API support
pip install mem-llm[api]

# Start the API server (serves the Web UI automatically)
python -m mem_llm.api_server

# Or use the dedicated launcher
mem-llm-web

# Access the Web UI at:
# http://localhost:8000          - Chat interface
# http://localhost:8000/memory   - Memory management
# http://localhost:8000/metrics  - Metrics dashboard
# http://localhost:8000/docs     - API documentation
```
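To script against the server instead of using the Web UI, check http://localhost:8000/docs for the actual endpoint paths and payload shapes; the `/chat` path and JSON body below are assumptions for illustration:

```python
# Hypothetical client sketch - verify the real endpoints at /docs first.
import requests

resp = requests.post(
    "http://localhost:8000/chat",  # assumed endpoint path
    json={"user_id": "alice", "message": "What do I love?"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```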
Multi-Backend Examples (v1.3.0+)
```python
from mem_llm import MemAgent

# LM Studio - fast local inference with a GUI
agent = MemAgent(
    backend='lmstudio',
    model='local-model',
    base_url='http://localhost:1234'
)

# Auto-detect - use any available local backend
agent = MemAgent(auto_detect_backend=True)

# Advanced features still work!
agent = MemAgent(
    backend='ollama',        # NEW in v1.3.0
    model="granite4:3b",
    use_sql=True,            # Thread-safe SQLite storage
    enable_security=True     # Prompt injection protection
)
```
For advanced configuration (SQL storage, knowledge base support, business mode, etc.), copy `config.yaml.example` from the package directory and adjust it for your environment.
Test Coverage (v2.1.1)
- 20+ examples demonstrating all features
- Function Calling (3 examples: basic, memory tools, async + validation)
- Ollama and LM Studio backends (14 tests)
- Conversation Summarization (5 tests)
- Data Export/Import (11 tests: JSON, CSV, SQLite, PostgreSQL, MongoDB)
- Core MemAgent functionality (5 tests)
- Factory pattern and auto-detection (4 tests)
Performance
- Write Throughput: 16,666+ records/sec (a rough benchmark sketch follows this list)
- Search Latency: <1 ms for 500+ conversations
- Token Compression: 40-60% reduction with summarization (v1.2.0+)
- Thread-Safe: Full RLock protection on all SQLite operations
- Multi-Database: Seamless export/import across 5 formats (v1.2.0+)
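Throughput figures like these depend heavily on hardware and settings. A rough way to sanity-check raw SQLite write speed on your own machine (a generic micro-benchmark, not the project's benchmark script):

```python
# Generic SQLite write-throughput micro-benchmark; numbers vary by machine
# and are not a reproduction of the project's benchmark.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE msgs (user TEXT, content TEXT)")

n = 50_000
start = time.perf_counter()
conn.executemany(
    "INSERT INTO msgs VALUES (?, ?)",
    (("alice", f"message {i}") for i in range(n)),
)
conn.commit()
elapsed = time.perf_counter() - start
print(f"{n / elapsed:,.0f} records/sec")
```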
Contributing
Contributions, bug reports, and feature requests are welcome! Please open an issue or submit a pull request describing your changes. Make sure to include test coverage and follow the formatting guidelines enforced by the existing codebase.
Links
- PyPI: https://pypi.org/project/mem-llm/
- Documentation: `Memory LLM/README.md`
- Changelog: `Memory LLM/CHANGELOG.md`
- Issues: https://github.com/emredeveloper/Mem-LLM/issues
License
Mem-LLM is released under the MIT License.
Download files
File details
Details for the file mem_llm-2.4.0.tar.gz.
File metadata
- Download URL: mem_llm-2.4.0.tar.gz
- Upload date:
- Size: 130.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 32595423bfa75a6c20c9c5cae3142664154c62f56aadc4016f2b196ed9490086 |
| MD5 | c6b689a104a632e47185dfcbd8387784 |
| BLAKE2b-256 | 833bcecf1816b1b6827765555f3ba6a6ebb43e747e3b515eb63f16cf1b515260 |
File details
Details for the file mem_llm-2.4.0-py3-none-any.whl.
File metadata
- Download URL: mem_llm-2.4.0-py3-none-any.whl
- Upload date:
- Size: 133.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 303380d174372870e5ef07f166c54fa4fc94f4c3e5f28a70293e1ea1ede759c2 |
| MD5 | 0b0283a2a97709c5e0fe195f1fe5730e |
| BLAKE2b-256 | 21e8dbac5342425944ea1a9108dfbe3e0293f04e2b1083ed2c23a14b5acaed0c |