A library for storing and retrieving embeddings with PGVector (with Chunking)
Vector Memory
A Python library for vector-based memory storage on PostgreSQL with pgvector. It is aimed at LLM applications that need to keep conversation context and retrieve relevant information.
Features
- Vector similarity search using pgvector
- Easy PostgreSQL integration
- Conversation-based memory management
- Metadata support for rich memory context
- Bulk operations support
- Automatic cleanup of old conversations
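As a rough illustration of what the vector similarity search in the list above means, the sketch below ranks a few toy vectors by cosine similarity in plain Python. It is not the library's implementation (pgvector does this work inside PostgreSQL), and the names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=5):
    """Return (index, score) pairs for the k vectors most similar to query."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings"; real embeddings have far more dimensions.
stored = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
best = top_k([1.0, 0.0, 0.0], stored, k=2)
print(best)
```

The first result is the identical vector and the second is the nearly parallel one; the orthogonal vector is left out because only two results were requested.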
Installation
pip install vector-memory
Prerequisites
- PostgreSQL 11 or later
- pgvector extension
To enable pgvector in your database (the extension must already be installed on the PostgreSQL server):
CREATE EXTENSION IF NOT EXISTS vector;
Quick Start
from vector_memory import VectorMemory
# Initialize connection
memory = VectorMemory.initialize(
    host="localhost",
    port=5432,
    database="your_db",
    user="your_user",
    password="your_password",
    openai_api_key="your-openai-key"
)
# Add a memory
memory.add_memory(
    conversation_id="user_123",
    content="The user prefers dark mode in applications."
)
# Retrieve relevant memories
relevant_memories = memory.get_relevant_memories(
    conversation_id="user_123",
    query="What are the user's UI preferences?",
    limit=5
)
# Print retrieved memories
for item in relevant_memories:  # a separate name, so the `memory` client is not overwritten
    print(item['content'])
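Since the library targets LLM applications, a typical next step is to fold the retrieved memories into a prompt. The sketch below shows one way to do that, assuming each result is a dict with a `content` key as in the loop above. The helper `build_context` and its `max_chars` budget are hypothetical and not part of the library.

```python
def build_context(memories, max_chars=500):
    """Join memory contents into a bulleted block, staying within a character budget."""
    lines = []
    used = 0
    for item in memories:
        line = f"- {item['content']}"
        if used + len(line) > max_chars:
            break  # stop before exceeding the budget
        lines.append(line)
        used += len(line) + 1  # +1 accounts for the joining newline
    return "\n".join(lines)


# Stand-in for the list returned by get_relevant_memories (shape assumed).
sample = [
    {"content": "The user prefers dark mode in applications."},
    {"content": "User selected blue theme"},
]
context = build_context(sample)
prompt = (
    "Known facts about the user:\n"
    f"{context}\n\n"
    "Question: What are the user's UI preferences?"
)
print(prompt)
```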
Advanced Usage
With Metadata
# Add memory with metadata
memory.add_memory(
    conversation_id="user_123",
    content="User selected blue theme",
    metadata={
        "category": "preferences",
        "timestamp": "2024-01-18T10:30:00",
        "source": "settings_page"
    }
)
# Search with metadata filters
memories = memory.get_relevant_memories(
    conversation_id="user_123",
    query="What theme was selected?",
    filter_metadata={"category": "preferences"}
)
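To make the filtering idea concrete, here is a minimal in-memory sketch of exact-match metadata filtering. The README does not say how `filter_metadata` is evaluated, so the semantics below (every filter key must be present with an equal value) are an assumption, and `matches_filter` is a hypothetical helper; the library presumably applies its filter in SQL.

```python
def matches_filter(metadata, filter_metadata):
    """True when every filter key is present in metadata with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in filter_metadata.items()
    )


# Hypothetical stored rows, shaped like the add_memory calls above.
rows = [
    {"content": "User selected blue theme",
     "metadata": {"category": "preferences", "source": "settings_page"}},
    {"content": "User opened dashboard",
     "metadata": {"action": "view", "page": "dashboard"}},
]

selected = [r for r in rows if matches_filter(r["metadata"], {"category": "preferences"})]
print([r["content"] for r in selected])
```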
Bulk Operations
# Add multiple memories at once
contents = [
    "User opened dashboard",
    "User viewed reports",
    "User downloaded PDF"
]
metadata_list = [
    {"action": "view", "page": "dashboard"},
    {"action": "view", "page": "reports"},
    {"action": "download", "type": "pdf"}
]
memory.add_memories(
    conversation_id="user_123",
    contents=contents,
    metadata_list=metadata_list
)
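Because `contents` and `metadata_list` are parallel lists, a length mismatch is an easy mistake to make. The sketch below checks the lengths before a bulk call. Whether the library validates this itself is not stated here, and `pair_contents_with_metadata` is a hypothetical helper.

```python
def pair_contents_with_metadata(contents, metadata_list=None):
    """Pair each content string with its metadata dict, checking the lengths match."""
    if metadata_list is None:
        metadata_list = [{} for _ in contents]  # default: empty metadata per item
    if len(contents) != len(metadata_list):
        raise ValueError(
            f"got {len(contents)} contents but {len(metadata_list)} metadata entries"
        )
    return list(zip(contents, metadata_list))


pairs = pair_contents_with_metadata(
    ["User opened dashboard", "User viewed reports"],
    [{"action": "view", "page": "dashboard"}, {"action": "view", "page": "reports"}],
)
for content, metadata in pairs:
    print(content, metadata)
```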
Memory Management
# Delete a conversation
memory.delete_conversation("user_123")
# Clean up old conversations
memory.cleanup_old_conversations(days=30)
# Get conversation statistics
stats = memory.get_conversation_stats("user_123")
print(f"Total memories: {stats['total_memories']}")
print(f"Oldest memory: {stats['oldest_memory']}")
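The `days=30` argument implies a cutoff date: anything older than the cutoff is removed. The sketch below shows that arithmetic in isolation. Whether the library measures age from creation time or from last activity is not stated in this README, so the `last_activity` parameter is an assumption, and both helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def cleanup_cutoff(days, now=None):
    """Return the timestamp before which a conversation counts as old."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - timedelta(days=days)


def is_stale(last_activity, days, now=None):
    """True when last_activity falls strictly before the cutoff."""
    return last_activity < cleanup_cutoff(days, now)


# Fixed reference time so the example is reproducible.
reference = datetime(2024, 1, 31, tzinfo=timezone.utc)
print(cleanup_cutoff(30, reference))
print(is_stale(datetime(2023, 12, 31, tzinfo=timezone.utc), 30, reference))
```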
Connection Management
Basic Connection
# Using connection parameters
memory = VectorMemory.initialize(
    host="localhost",
    port=5432,
    database="your_db",
    user="your_user",
    password="your_password",
    openai_api_key="your-openai-key"
)
# Using connection string
memory = VectorMemory.initialize_from_string(
    "postgresql://user:password@localhost/dbname",
    openai_api_key="your-openai-key"
)
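The two forms above carry the same information. As a sketch of how a connection string maps onto the individual parameters, the snippet below parses one with the standard library. The helper `parse_postgres_url` is hypothetical, and the fallback values for a missing host or port (`localhost`, `5432`) are assumptions rather than documented library behaviour.

```python
from urllib.parse import unquote, urlparse


def parse_postgres_url(url):
    """Split a postgresql:// URL into host, port, database, user and password."""
    parsed = urlparse(url)
    if parsed.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    return {
        "host": parsed.hostname or "localhost",
        "port": parsed.port or 5432,
        "database": parsed.path.lstrip("/"),
        # Credentials may be percent-encoded inside a URL.
        "user": unquote(parsed.username) if parsed.username else None,
        "password": unquote(parsed.password) if parsed.password else None,
    }


params = parse_postgres_url("postgresql://user:password@localhost/dbname")
print(params)
```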
Connection Pool
# Initialize with connection pool
memory = VectorMemory.initialize(
    host="localhost",
    port=5432,
    database="your_db",
    user="your_user",
    password="your_password",
    openai_api_key="your-openai-key",
    pool_size=5,
    max_overflow=10
)
Configuration Options
memory = VectorMemory.initialize(
    host="localhost",
    port=5432,
    database="your_db",
    user="your_user",
    password="your_password",
    openai_api_key="your-openai-key",
    # PostgreSQL options
    pool_size=5,
    max_overflow=10,
    pool_timeout=30,
    # Memory options
    max_tokens=8191,  # Maximum tokens per content
    embedding_model="text-embedding-ada-002",  # OpenAI embedding model
)
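The `max_tokens` limit is where chunking comes in: content longer than the limit has to be split before it can be embedded. The sketch below illustrates the idea only. It approximates tokens by whitespace-separated words, which is an assumption (a real tokenizer counts differently), and `chunk_text` and its `overlap` parameter are hypothetical rather than the library's chunking routine.

```python
def chunk_text(text, max_tokens=8191, overlap=0):
    """Split text into chunks of at most max_tokens words, optionally overlapping."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in the range [0, max_tokens)")
    words = text.split()
    step = max_tokens - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # this window already reached the end of the text
    return chunks


print(chunk_text("a b c d e", max_tokens=2))
print(chunk_text("a b c d e", max_tokens=3, overlap=1))
```

A small overlap keeps some shared context between neighbouring chunks, at the cost of embedding a few words twice.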
Error Handling
from vector_memory.exceptions import StorageError
try:
    memory.add_memory(
        conversation_id="user_123",
        content="Some content"
    )
except StorageError as e:
    print(f"Failed to store memory: {e}")
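If storage failures are sometimes transient (for example a dropped database connection), wrapping the call in a retry with exponential backoff is one option. This README does not say whether `StorageError` is safe to retry, so treat that as an assumption. The sketch defines a local stand-in exception so that it runs on its own, and `with_retries` is a hypothetical helper.

```python
import time


class StorageError(Exception):
    """Local stand-in for vector_memory.exceptions.StorageError (keeps the sketch self-contained)."""


def with_retries(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on StorageError, wait and retry with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except StorageError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** (attempt - 1))


# Demo with a fake operation that fails twice, then succeeds.
calls = {"count": 0}


def flaky_store():
    calls["count"] += 1
    if calls["count"] < 3:
        raise StorageError("temporary failure")
    return "stored"


delays = []
result = with_retries(flaky_store, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

In real use, `operation` could be a `lambda` wrapping `memory.add_memory(...)`, with the stand-in class replaced by the library's own import.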
Best Practices
- Use descriptive conversation IDs:
conversation_id = f"user_{user_id}_session_{session_id}"
- Include relevant metadata:
from datetime import datetime
metadata = {
    "source": "chat",
    "timestamp": datetime.now().isoformat(),
    "user_role": "admin",
    "context": "support_ticket_123"
}
- Implement regular cleanup:
# Run periodically (e.g., daily)
memory.cleanup_old_conversations(days=30)
- Use appropriate similarity thresholds:
memories = memory.get_relevant_memories(
    conversation_id="user_123",
    query="search query",
    threshold=0.7  # Adjust based on your needs (0-1)
)
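To show what a similarity threshold does to a result set, the sketch below filters scored results in memory. It assumes scores lie in the 0-1 range with higher meaning more similar, and that the threshold is an inclusive lower bound; the README does not spell out either point, and `filter_by_threshold` is a hypothetical helper.

```python
def filter_by_threshold(scored, threshold=0.7):
    """Keep (content, score) pairs scoring at least threshold, best first."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical scored results for one query.
scored = [
    ("User viewed reports", 0.41),
    ("The user prefers dark mode in applications.", 0.88),
    ("User selected blue theme", 0.73),
]
print(filter_by_threshold(scored, threshold=0.7))
```

Raising the threshold trades recall for precision: fewer memories come back, but those that do are closer to the query.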
Performance Tips
- Use bulk operations for multiple memories:
memory.add_memories(conversation_id, contents, metadata_list)
- Index important metadata fields:
CREATE INDEX idx_memory_category ON vector_entries USING btree ((metadata->>'category'));
- Use connection pooling for high-traffic applications:
memory = VectorMemory.initialize(
    # ... connection details ...
    pool_size=10,
    max_overflow=20
)
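For very large imports, one further option is to split the input into fixed-size batches and issue one bulk call per batch, which keeps each request to the embedding API and the database bounded. The sketch below shows only the batching step. `in_batches` is a hypothetical helper, and a suitable batch size is something to measure rather than a documented recommendation.

```python
def in_batches(items, batch_size):
    """Yield consecutive slices of items, each with at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


contents = [f"event {i}" for i in range(5)]
for batch in in_batches(contents, batch_size=2):
    # In real use: memory.add_memories(conversation_id, batch, metadata_for_batch)
    print(batch)
```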
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.