
A library for storing and retrieving embeddings with PGVector (with Chunking)


Vector Memory

A Python library for vector-based memory storage using PostgreSQL and pgvector. It is aimed at LLM applications that need to maintain conversation context and retrieve relevant information by similarity.

Features

  • Vector similarity search using pgvector
  • Easy PostgreSQL integration
  • Conversation-based memory management
  • Metadata support for rich memory context
  • Bulk operations support
  • Automatic cleanup of old conversations

Installation

pip install vector-memory

Prerequisites

  1. PostgreSQL 11 or later
  2. pgvector extension

To enable pgvector in your database (the extension itself must already be installed on the PostgreSQL server):

CREATE EXTENSION IF NOT EXISTS vector;

Quick Start

from vector_memory import VectorMemory

# Initialize connection
memory = VectorMemory.initialize(
    host="localhost",
    port=5432,
    database="your_db",
    user="your_user",
    password="your_password",
    openai_api_key="your-openai-key"
)

# Add a memory
memory.add_memory(
    conversation_id="user_123",
    content="The user prefers dark mode in applications."
)

# Retrieve relevant memories
relevant_memories = memory.get_relevant_memories(
    conversation_id="user_123",
    query="What are the user's UI preferences?",
    limit=5
)

# Print retrieved memories (use a distinct loop variable so `memory` is not overwritten)
for item in relevant_memories:
    print(item['content'])
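
The README does not describe how relevance is computed. As a rough illustration of the general idea behind vector similarity search, the sketch below ranks stored texts by cosine similarity to a query vector in pure Python. The function names and the toy 3-dimensional vectors are assumptions made for this example; they are not part of the library, which delegates this work to pgvector and real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, entries, limit=5):
    """Return up to `limit` (content, score) pairs, most similar first."""
    scored = [(content, cosine_similarity(query_vec, vec)) for content, vec in entries]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy vectors stand in for real embeddings (hypothetical data)
entries = [
    ("prefers dark mode", [0.9, 0.1, 0.0]),
    ("ordered a pizza", [0.0, 0.2, 0.9]),
]
top = rank_by_similarity([1.0, 0.0, 0.0], entries, limit=1)
print(top[0][0])  # prefers dark mode
```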

Advanced Usage

With Metadata

# Add memory with metadata
memory.add_memory(
    conversation_id="user_123",
    content="User selected blue theme",
    metadata={
        "category": "preferences",
        "timestamp": "2024-01-18T10:30:00",
        "source": "settings_page"
    }
)

# Search with metadata filters
memories = memory.get_relevant_memories(
    conversation_id="user_123",
    query="What theme was selected?",
    filter_metadata={"category": "preferences"}
)
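
The exact matching rules for filter_metadata are not documented here. One plausible reading is that a memory matches when every key in the filter is present in its metadata with an equal value, similar in spirit to JSONB containment in PostgreSQL. The sketch below shows that rule in plain Python; `matches_filter` is a hypothetical helper, not a library function.

```python
def matches_filter(metadata, filter_metadata):
    """Return True when every filter key exists in metadata with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in filter_metadata.items()
    )


stored = {"category": "preferences", "source": "settings_page"}
print(matches_filter(stored, {"category": "preferences"}))  # True
print(matches_filter(stored, {"category": "billing"}))      # False
```

Under this reading, an empty filter matches every memory, and extra keys on the stored metadata do not prevent a match.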

Bulk Operations

# Add multiple memories at once
contents = [
    "User opened dashboard",
    "User viewed reports",
    "User downloaded PDF"
]

metadata_list = [
    {"action": "view", "page": "dashboard"},
    {"action": "view", "page": "reports"},
    {"action": "download", "type": "pdf"}
]

memory.add_memories(
    conversation_id="user_123",
    contents=contents,
    metadata_list=metadata_list
)
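
Because contents and metadata_list are parallel lists, a length mismatch is an easy mistake to make. The helper below is a small, hypothetical pre-flight check (not part of the library) that pairs each content string with its metadata and fails early if the lists do not line up. How add_memories itself handles a mismatch is not stated in this README.

```python
def pair_contents_with_metadata(contents, metadata_list=None):
    """Pair each content string with its metadata dict, validating lengths."""
    if metadata_list is None:
        # Assumption for this sketch: missing metadata means an empty dict per item
        metadata_list = [{} for _ in contents]
    if len(contents) != len(metadata_list):
        raise ValueError(
            f"got {len(contents)} contents but {len(metadata_list)} metadata entries"
        )
    return list(zip(contents, metadata_list))


pairs = pair_contents_with_metadata(
    ["User opened dashboard", "User viewed reports"],
    [{"page": "dashboard"}, {"page": "reports"}],
)
print(len(pairs))  # 2
```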

Memory Management

# Delete a conversation
memory.delete_conversation("user_123")

# Clean up old conversations
memory.cleanup_old_conversations(days=30)

# Get conversation statistics
stats = memory.get_conversation_stats("user_123")
print(f"Total memories: {stats['total_memories']}")
print(f"Oldest memory: {stats['oldest_memory']}")
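
To make the days=30 argument concrete, the sketch below shows one way an age cutoff can be computed and applied. It is an illustration only: `expired_conversations` is a hypothetical helper, and whether the library measures age from the oldest or the most recent memory in a conversation is not stated here, so the example simply takes a last-activity timestamp per conversation.

```python
from datetime import datetime, timedelta, timezone


def expired_conversations(last_activity_by_id, days, now=None):
    """Return the sorted IDs whose last activity is older than `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return sorted(cid for cid, ts in last_activity_by_id.items() if ts < cutoff)


# Fixed "now" so the example is reproducible (hypothetical data)
now = datetime(2024, 3, 1, tzinfo=timezone.utc)
activity = {
    "user_123": datetime(2024, 1, 18, tzinfo=timezone.utc),  # 43 days old
    "user_456": datetime(2024, 2, 20, tzinfo=timezone.utc),  # 10 days old
}
print(expired_conversations(activity, days=30, now=now))  # ['user_123']
```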

Connection Management

Basic Connection

# Using connection parameters
memory = VectorMemory.initialize(
    host="localhost",
    port=5432,
    database="your_db",
    user="your_user",
    password="your_password",
    openai_api_key="your-openai-key"
)

# Using connection string
memory = VectorMemory.initialize_from_string(
    "postgresql://user:password@localhost/dbname",
    openai_api_key="your-openai-key"
)
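
The two forms above carry the same information. As a sketch of how a connection string maps onto the individual parameters, the standard library can split the URL into its parts. `parse_postgres_url` is a hypothetical helper for illustration, and the fallback to port 5432 and host localhost is an assumption of this example, not documented library behavior.

```python
from urllib.parse import unquote, urlparse


def parse_postgres_url(url):
    """Split a postgresql:// URL into host, port, database, user and password."""
    parsed = urlparse(url)
    if parsed.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    return {
        "host": parsed.hostname or "localhost",
        "port": parsed.port or 5432,  # assumed default when the URL omits a port
        "database": parsed.path.lstrip("/"),
        # Credentials may be percent-encoded in a URL, so decode them
        "user": unquote(parsed.username) if parsed.username else None,
        "password": unquote(parsed.password) if parsed.password else None,
    }


params = parse_postgres_url("postgresql://user:password@localhost/dbname")
print(params["host"], params["port"], params["database"])  # localhost 5432 dbname
```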

Connection Pool

# Initialize with connection pool
memory = VectorMemory.initialize(
    host="localhost",
    port=5432,
    database="your_db",
    user="your_user",
    password="your_password",
    openai_api_key="your-openai-key",
    pool_size=5,
    max_overflow=10
)
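
The parameter names pool_size and max_overflow match those used by SQLAlchemy's connection pool, where pool_size is the number of connections kept open and max_overflow is the number of extra connections allowed under load. Assuming this library follows the same semantics (the README does not confirm it), the worst-case number of database connections can be estimated as below. `max_pool_connections` is a hypothetical helper.

```python
def max_pool_connections(pool_size, max_overflow, worker_processes=1):
    """Estimate the worst-case connection count across all worker processes."""
    if pool_size < 0 or max_overflow < 0 or worker_processes < 1:
        raise ValueError("sizes must be non-negative and workers at least 1")
    # Each process holds its own pool, so the totals add up across workers
    return (pool_size + max_overflow) * worker_processes


print(max_pool_connections(5, 10))                      # 15
print(max_pool_connections(5, 10, worker_processes=4))  # 60
```

Comparing this estimate with the server's max_connections setting helps avoid exhausting the database when several application processes each hold a pool.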

Configuration Options

memory = VectorMemory.initialize(
    host="localhost",
    port=5432,
    database="your_db",
    user="your_user",
    password="your_password",
    openai_api_key="your-openai-key",
    
    # PostgreSQL options
    pool_size=5,
    max_overflow=10,
    pool_timeout=30,
    
    # Memory options
    max_tokens=8191,         # Maximum tokens per content
    embedding_model="text-embedding-ada-002",  # OpenAI embedding model
)
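
The max_tokens option implies that longer content has to be split before it is embedded. The library's actual chunking strategy is not described here, so the following is only a sketch of the general technique. It approximates tokens with whitespace-separated words, which is a simplification: a real implementation would count tokens with the embedding model's tokenizer. `chunk_words` is a hypothetical helper.

```python
def chunk_words(text, max_words):
    """Split text into consecutive chunks of at most `max_words` words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


print(chunk_words("the user prefers dark mode in applications", 3))
# ['the user prefers', 'dark mode in', 'applications']
```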

Error Handling

from vector_memory.exceptions import StorageError

try:
    memory.add_memory(
        conversation_id="user_123",
        content="Some content"
    )
except StorageError as e:
    print(f"Failed to store memory: {e}")
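
Storage failures are sometimes transient, for example a dropped database connection. One common pattern is to retry with exponential backoff before giving up. The sketch below is not part of the library: `with_retries` is a hypothetical helper, and the local StorageError class is a stand-in so the example runs on its own. In real code you would import the exception from vector_memory.exceptions instead.

```python
import time


class StorageError(Exception):
    """Stand-in for vector_memory.exceptions.StorageError (for this sketch only)."""


def with_retries(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `operation`, retrying on StorageError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except StorageError:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
```

A call such as with_retries(lambda: memory.add_memory(conversation_id="user_123", content="Some content")) would wrap the example above. Retrying only makes sense for operations that are safe to repeat.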

Best Practices

  1. Use descriptive conversation IDs:
conversation_id = f"user_{user_id}_session_{session_id}"
  2. Include relevant metadata:
from datetime import datetime

metadata = {
    "source": "chat",
    "timestamp": datetime.now().isoformat(),
    "user_role": "admin",
    "context": "support_ticket_123"
}
  3. Implement regular cleanup:
# Run periodically (e.g., daily)
memory.cleanup_old_conversations(days=30)
  4. Use appropriate similarity thresholds:
memories = memory.get_relevant_memories(
    conversation_id="user_123",
    query="search query",
    threshold=0.7  # Adjust based on your needs (0-1)
)
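
How the threshold value is applied internally is not documented here. For orientation: pgvector's cosine distance operator returns a distance, and cosine similarity is 1 minus that distance. Assuming the threshold is a minimum cosine similarity, the filtering step could look like the sketch below. `filter_by_threshold` and the sample rows are hypothetical.

```python
def filter_by_threshold(rows, threshold):
    """Keep (content, similarity) pairs whose similarity meets the threshold.

    `rows` is an iterable of (content, cosine_distance) pairs.
    """
    results = []
    for content, distance in rows:
        similarity = 1.0 - distance  # cosine similarity = 1 - cosine distance
        if similarity >= threshold:
            results.append((content, similarity))
    return results


rows = [("User selected blue theme", 0.1), ("User downloaded PDF", 0.5)]
kept = filter_by_threshold(rows, threshold=0.7)
print([content for content, _ in kept])  # ['User selected blue theme']
```

A higher threshold returns fewer, closer matches; a lower one trades precision for recall.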

Performance Tips

  1. Use bulk operations for multiple memories:
memory.add_memories(conversation_id, contents, metadata_list)
  2. Index important metadata fields:
CREATE INDEX idx_memory_category ON vector_entries USING btree ((metadata->>'category'));
  3. Use connection pooling for high-traffic applications:
memory = VectorMemory.initialize(
    # ... connection details ...
    pool_size=10,
    max_overflow=20
)
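
When the list of memories is very large, it can help to send it in fixed-size batches rather than in one call, so that a single failure does not discard all the work. This README does not state any batch-size limit for add_memories, so the sketch below is a general pattern only. `iter_batches` is a hypothetical helper.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


contents = ["event 1", "event 2", "event 3", "event 4", "event 5"]
for batch in iter_batches(contents, batch_size=2):
    print(batch)
# ['event 1', 'event 2']
# ['event 3', 'event 4']
# ['event 5']
```

Each batch could then be passed to add_memories together with the matching slice of metadata_list.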

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

