Advanced Knowledge Graph Engine with semantic search and temporal tracking

Knowledge Graph Engine v2

Modern Neo4j-based knowledge graph engine with semantic search capabilities, intelligent relationship management, and performance optimizations.

🎯 Overview

A production-ready knowledge graph system built entirely on Neo4j, which provides both persistent graph storage and vector search. It combines graph database operations with semantic vector search for information storage, retrieval, and reasoning.

✨ Key Features

🏗️ Neo4j-Native Architecture: Complete Neo4j integration for both graph and vector operations
🔍 Enhanced Semantic Search: Improved vector search with dynamic thresholds and contextual boosting
🤖 LLM Integration: OpenAI/Ollama support for entity extraction and query processing
⚔️ Conflict Resolution: Intelligent handling of contradicting information with temporal tracking
⏰ Temporal Tracking: Complete relationship history with date ranges and conflict resolution
🎯 Smart Query Understanding: Context-aware search with semantic category matching
📊 Optimized Performance: 50-74% faster queries with smart caching and lazy loading
🚀 Production Ready: ACID compliance, comprehensive error handling, modern architecture

🆕 New in v2.1.0

⚡ Performance Optimizations: GraphQueryOptimizer and Neo4jOptimizer for 50-74% faster queries
💾 Smart Caching: Query result caching with 5-minute TTL for near-instant repeated queries
🔧 Refactored GraphEdge: Lazy loading with safe accessors, 18% smaller codebase
🛠️ Dynamic Relationships: WORKS_AT, LIVES_IN instead of generic RELATES_TO
🐛 Bug Fixes: Fixed "Relationship not populated" errors, enhanced source filtering
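
As an illustration of the dynamic relationship naming listed above, a free-text predicate can be normalized into a Neo4j-style relationship type. This is a minimal sketch and not the engine's actual code: `to_relationship_type` is a hypothetical helper, and falling back to RELATES_TO when nothing usable remains is an assumption.

```python
import re


def to_relationship_type(predicate: str, fallback: str = "RELATES_TO") -> str:
    """Turn a free-text predicate into an UPPER_SNAKE_CASE relationship type."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", predicate).strip("_").upper()
    # Use the generic type when the result is empty or starts with a digit.
    if not cleaned or cleaned[0].isdigit():
        return fallback
    return cleaned


print(to_relationship_type("works at"))   # WORKS_AT
print(to_relationship_type("lives-in"))   # LIVES_IN
print(to_relationship_type("???"))        # RELATES_TO
```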

📁 Project Structure

src/                                  # Main source directory
├── kg_engine/                        # Knowledge Graph Engine
│   ├── core/                         # Core engine
│   │   └── engine.py                 # Main KG Engine
│   ├── models/                       # Data models
│   │   └── models.py                 # Graph data structures
│   ├── storage/                      # Storage components
│   │   ├── graph_db.py               # Neo4j graph operations
│   │   ├── neo4j_vector_store.py     # Vector storage
│   │   ├── vector_store.py           # Vector store interface
│   │   └── ...                       # Other storage components
│   ├── llm/                          # LLM integration
│   │   └── llm_interface.py          # OpenAI/Ollama interface
│   ├── config/                       # Configuration
│   │   ├── neo4j_config.py           # Neo4j settings
│   │   └── neo4j_schema.py           # Schema management
│   └── utils/                        # Utilities
│       ├── date_parser.py            # Date parsing
│       ├── graph_query_optimizer.py  # Query optimization
│       ├── neo4j_optimizer.py        # Neo4j optimizations
│       └── ...                       # Other utilities
├── examples/                         # Usage examples
│   ├── examples.py                   # Basic examples
│   ├── bio_example.py                # Biographical demo
│   └── simple_bio_demo.py            # Simple demo
└── test_neo4j_integration.py         # Test suite

docs/                                 # Comprehensive documentation
├── architecture/                     # System design
├── user-guide/                       # Getting started
├── api/                              # API reference
└── development/                      # Development guides

🚀 Quick Start

Prerequisites

# Start Neo4j in Docker (required); 7474 is the HTTP/browser port, 7687 is Bolt
docker run --name neo4j -p 7474:7474 -p 7687:7687 -d \
    -e NEO4J_AUTH=neo4j/password \
    neo4j:latest
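
Before initializing the engine, it can help to confirm that the Bolt port published by the container is reachable. The sketch below uses only the Python standard library; `is_port_open` is a hypothetical helper, and it checks TCP reachability only, not credentials or database readiness.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # 7687 is the Bolt port published by the docker command above.
    status = "reachable" if is_port_open("localhost", 7687) else "not reachable"
    print(f"Neo4j Bolt port is {status}")
```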

Installation

# From the repository root (editable install)
pip install -e .

Basic Usage

from src.kg_engine import KnowledgeGraphEngineV2, InputItem
from src.kg_engine.config import Neo4jConfig

# Initialize with Neo4j
engine = KnowledgeGraphEngineV2(
    api_key="your-openai-key",  # or "ollama" for local LLM
    neo4j_config=Neo4jConfig()
)

# Add knowledge
result = engine.process_input([
    InputItem(description="Alice works as a software engineer at Google"),
    InputItem(description="Bob lives in San Francisco")
])

# Search with natural language
response = engine.search("Who works at Google?")
print(response.answer)  # e.g. "Alice works as a software engineer at Google."

🤖 LLM Setup Options

Option 1: OpenAI (Recommended for Production)

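# Shell: export the key once
export OPENAI_API_KEY="your-api-key"

# Python: read the key from the environment instead of hard-coding it
import os

engine = KnowledgeGraphEngineV2(
    api_key=os.environ["OPENAI_API_KEY"],
    model="gpt-4.1-nano"  # Fast and cost-effective
)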

Option 2: Local Ollama (Privacy & Cost-Free)

# Shell: install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Shell: start the server (leave it running, e.g. in a separate terminal)
ollama serve

# Shell: pull a model
ollama pull llama3.2:3b  # Recommended: good balance of size/performance

# Python: point the engine at Ollama's OpenAI-compatible endpoint
engine = KnowledgeGraphEngineV2(
    api_key="ollama",
    base_url="http://localhost:11434/v1",
    model="llama3.2:3b"
)

🏗️ Optimized Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   LLM Interface │    │   Graph Database │    │  Vector Store   │
│                 │    │                  │    │                 │
│ • Entity Extract│    │ • Neo4j Native   │    │ • Neo4j Vectors │
│ • Query Parse   │    │ • Query Cache    │    │ • Semantic      │
│ • Answer Gen.   │    │ • Optimizations  │    │ • Search        │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                    ┌─────────────────────┐
                    │ KG Engine v2        │
                    │  (Optimized)        │
                    │                     │
                    │ • Process Input     │
                    │ • Smart Updates     │
                    │ • Hybrid Search     │
                    │ • Query Caching     │
                    │ • Safe Accessors    │
                    └─────────────────────┘

📊 Advanced Features

Intelligent Conflict Resolution

# Initial information
engine.process_input([InputItem(description="Alice lives in Boston")])

# Update with conflicting information (automatically resolves)
engine.process_input([InputItem(description="Alice moved to Seattle in 2024")])

# System automatically:
# 1. Marks old relationship as obsolete
# 2. Adds new relationship as active
# 3. Maintains complete history
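
The three steps above can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the engine's implementation: `Edge`, `add_fact`, and the `status`/`to_date` fields are hypothetical names, a conflict is simplified to "same subject and relationship, different object", and "in 2024" is represented as January 1, 2024.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Edge:
    subject: str
    relationship: str
    obj: str
    from_date: Optional[date] = None
    to_date: Optional[date] = None
    status: str = "active"


def add_fact(history: List[Edge], new: Edge) -> None:
    """Mark conflicting active edges obsolete, then append the new edge."""
    for edge in history:
        same_slot = (edge.subject, edge.relationship) == (new.subject, new.relationship)
        if same_slot and edge.status == "active" and edge.obj != new.obj:
            edge.status = "obsolete"      # 1. mark the old relationship obsolete
            edge.to_date = new.from_date  #    and close its date range
    history.append(new)                   # 2. add the new relationship as active
    # 3. nothing is deleted, so the full history remains available


history: List[Edge] = []
add_fact(history, Edge("Alice", "LIVES_IN", "Boston"))
add_fact(history, Edge("Alice", "LIVES_IN", "Seattle", from_date=date(2024, 1, 1)))
for e in history:
    print(e.obj, e.status, e.to_date)
# Boston obsolete 2024-01-01
# Seattle active None
```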

Optimized Search Performance

# Fast cached queries (< 1ms for repeated searches)
response = engine.search("Who works in technology?")  # First call: ~100ms
response = engine.search("Who works in technology?")  # Cached: < 1ms

# Enhanced semantic understanding with contextual boosting
response = engine.search("Who was born in Europe?")
# ✅ Returns all European births: Berlin, Lyon, Barcelona, Paris

# Safe relationship access (no more "Relationship not populated" errors)
for result in response.results:
    edge = result.triplet.edge
    subject = edge.get_subject_safe()  # Safe accessor
    relationship = edge.get_relationship_safe()  # Safe accessor
    obj = edge.get_object_safe()  # Safe accessor
    print(subject, relationship, obj)
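
To show how lazy loading and safe accessors can fit together, here is a minimal sketch. `LazyEdge` and its `loader` callable are hypothetical and do not reflect the real GraphEdge class; only the accessor names mirror the snippet above. The assumption is that a safe accessor returns None instead of raising when the edge data cannot be loaded.

```python
from typing import Callable, Optional, Tuple

Triple = Tuple[str, str, str]


class LazyEdge:
    """Hypothetical edge that loads (subject, relationship, object) on first use."""

    def __init__(self, edge_id: str, loader: Callable[[str], Optional[Triple]]):
        self.edge_id = edge_id
        self._loader = loader
        self._data: Optional[Triple] = None
        self._loaded = False

    def _ensure_loaded(self) -> None:
        # Call the loader at most once; a failing lookup leaves the data as None.
        if not self._loaded:
            try:
                self._data = self._loader(self.edge_id)
            except Exception:
                self._data = None
            self._loaded = True

    def get_subject_safe(self) -> Optional[str]:
        self._ensure_loaded()
        return self._data[0] if self._data else None

    def get_relationship_safe(self) -> Optional[str]:
        self._ensure_loaded()
        return self._data[1] if self._data else None

    def get_object_safe(self) -> Optional[str]:
        self._ensure_loaded()
        return self._data[2] if self._data else None


store = {"e1": ("Alice", "WORKS_AT", "Google")}
known = LazyEdge("e1", store.get)
missing = LazyEdge("e2", store.get)
print(known.get_subject_safe())                 # Alice
print(missing.get_subject_safe() or "Unknown")  # Unknown
```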

Temporal Relationship Tracking

# Natural language dates
engine.process_input([
    InputItem(description="Project started", from_date="2 months ago"),
    InputItem(description="Alice joined", from_date="last week")
])
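
Relative expressions such as "2 months ago" and "last week" have to be resolved against a reference date. The sketch below shows one way to do that with the standard library only; `parse_relative_date` is a hypothetical function, months and years are approximated as 30 and 365 days, and the project's own date parser may behave differently.

```python
import re
from datetime import date, timedelta
from typing import Optional

_UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}  # rough approximations


def parse_relative_date(text: str, today: Optional[date] = None) -> Optional[date]:
    """Resolve phrases such as '2 months ago' or 'last week' to a date."""
    today = today or date.today()
    text = text.strip().lower()
    # Pattern: "<n> <unit>(s) ago"
    match = re.fullmatch(r"(\d+)\s+(day|week|month|year)s?\s+ago", text)
    if match:
        days = int(match.group(1)) * _UNIT_DAYS[match.group(2)]
        return today - timedelta(days=days)
    # Pattern: "last <unit>", treated as one unit back
    match = re.fullmatch(r"last\s+(week|month|year)", text)
    if match:
        return today - timedelta(days=_UNIT_DAYS[match.group(1)])
    return None  # unrecognized phrases are left for another parser


ref = date(2025, 7, 24)
print(parse_relative_date("2 months ago", ref))  # 2025-05-25
print(parse_relative_date("last week", ref))     # 2025-07-17
```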

📚 Documentation

📖 Quick Start: Get running in 5 minutes
🏗️ Architecture: System design and components
📊 Workflows: Process flows with diagrams
🔧 API Reference: Complete API documentation
👩‍💻 Development: Development setup and guidelines

🚦 Running Examples

# Run basic examples
python src/examples/examples.py

# Run biographical knowledge graph demo  
python src/examples/simple_bio_demo.py

# Verify project structure
python verify_structure.py

Expected output:

✅ Neo4j connection verified
🚀 Knowledge Graph Engine v2 initialized
   - Vector store: kg_v2 (neo4j)
   - Graph database: Neo4j (persistent)
   
=== Example: Semantic Relationship Handling ===
1. Adding: John Smith teaches at MIT
   Result: 1 new edge(s) created
...

🔍 Search Capabilities

The Knowledge Graph Engine v2 features advanced semantic search with:

Performance Optimizations: Query caching, lazy loading, and optimized Cypher queries
Dynamic Similarity Thresholds: Base threshold of 0.3 with context-specific adjustments
Semantic Category Matching: Understands relationships between concepts (e.g., "technology" → "software engineer")
Query-Specific Boosting: Different query types get tailored relevance scoring
Geographic Intelligence: Recognizes European cities and other geographic relationships
Safe Data Access: Robust error handling with safe accessor methods
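
The interaction between a base threshold and a contextual boost can be sketched as follows. Only the base threshold of 0.3 comes from the list above; the boost of 0.1, the category labels, the toy 2-dimensional vectors, and the `cosine` and `rank` functions are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

BASE_THRESHOLD = 0.3  # base value quoted in the list above

Candidate = Tuple[str, Sequence[float], str]  # (label, vector, category)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec: Sequence[float], query_category: str,
         candidates: List[Candidate], boost: float = 0.1,
         threshold: float = BASE_THRESHOLD) -> List[Tuple[str, float]]:
    """Score candidates, boost same-category matches, drop those below threshold."""
    scored = []
    for label, vec, category in candidates:
        score = cosine(query_vec, vec)
        if category == query_category:
            score = min(1.0, score + boost)  # contextual boost, capped at 1.0
        if score >= threshold:
            scored.append((label, round(score, 3)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


candidates = [
    ("software engineer", [0.8, 0.6], "technology"),
    ("data analyst", [0.6, 0.8], "technology"),
    ("gardener", [0.28, 0.96], "outdoors"),  # similarity 0.28, below the threshold
]
print(rank([1.0, 0.0], "technology", candidates))
```

In this toy data the "gardener" entry is dropped because its unboosted similarity is below 0.3, while the two technology entries are kept and ranked by their boosted scores.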

Example Queries

# Technology and profession queries
"Who works in technology?" → Finds software engineers, developers, tech professionals
"Tell me about engineers" → Returns all engineering-related professions

# Geographic queries  
"Who was born in Europe?" → Finds Berlin, Lyon, Barcelona, Paris births
"Who lives in Paris?" → Returns all Paris residents

# Activity and interest queries
"What do people do for hobbies?" → Returns all "enjoys" relationships
"Tell me about photographers" → Finds people who enjoy or specialize in photography

# Entity-specific queries
"Tell me about Emma Johnson" → Returns all relationships for Emma

🧪 Testing

Run the comprehensive test suite:

# Core integration tests (listed under src/ in the project structure above)
python src/test_neo4j_integration.py

# Performance optimization tests
python test_optimizations.py

# Relationship fix validation
python test_relationship_fix.py

# Quick validation
python test_quick_relationship_fix.py

📈 Performance Benchmarks

Operation	Before Optimization	After Optimization	Improvement
Entity Exploration	20-50ms	8-15ms	~60% faster
Vector Search	100-200ms	40-80ms	~50% faster
Conflict Detection	150-300ms	50-100ms	~67% faster
Path Finding	80-160ms	25-50ms	~70% faster
Cached Queries	N/A	< 1ms	Near-instant

🔧 Development

For development setup and contributing guidelines, see docs/development/README.md.

Key Implementation Details

# Safe edge property access
for result in response.results:
    edge = result.triplet.edge
    if edge.has_graph_data():
        # Graph data is already loaded: unpack it directly
        subject, relationship, obj = edge.get_graph_data()
    else:
        # Otherwise fall back to the safe accessors
        subject = edge.get_subject_safe() or "Unknown"
        relationship = edge.get_relationship_safe() or "Unknown"
        obj = edge.get_object_safe() or "Unknown"

# Optimized queries with caching (excerpt from inside an engine method,
# hence the use of self and return)
cache_key = f"entity_exploration_{entity_name}"
if cached_result := self.graph_db._get_cache(cache_key):
    return cached_result  # cache hit: skip the database query

result = self.graph_db.get_entity_relationships_optimized(entity_name)
self.graph_db._set_cache(cache_key, result)
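
The get-or-compute pattern in the excerpt above, combined with the 5-minute TTL mentioned under "New in v2.1.0", can be sketched with a small standalone cache. `TTLCache` and `cached_call` are hypothetical names, not the engine's internals; the clock is injectable so that expiry can be exercised without waiting.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Minimal time-to-live cache; entries expire ttl_seconds after being set."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[key]  # expired: drop the entry and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)


def cached_call(cache: TTLCache, key: str, compute: Callable[[], Any]) -> Any:
    """Return the cached value for key, computing and storing it on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = compute()
    cache.set(key, value)
    return value


calls = []


def slow_lookup():
    calls.append(1)  # count how often the expensive path runs
    return ["Alice WORKS_AT Google"]


cache = TTLCache(ttl_seconds=300)
cached_call(cache, "entity_exploration_Alice", slow_lookup)
cached_call(cache, "entity_exploration_Alice", slow_lookup)
print(len(calls))  # 1: the second call was served from the cache
```

One design difference from the excerpt: this sketch tests `is not None` rather than truthiness, so an empty result list is also served from the cache. A value of None itself cannot be cached this way.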

License

MIT License

Release history

Version	Release date
2.3.1	Jul 26, 2025
2.3.0	Jul 26, 2025
2.2.2 (this version)	Jul 24, 2025
2.2.1	Jul 24, 2025
2.2.0	Jul 24, 2025
2.1.4	Jul 23, 2025
2.1.3	Jul 22, 2025
2.1.2	Jul 22, 2025
2.1.1	Jul 22, 2025
2.1.0	Jul 22, 2025

Download files

Source Distribution

kg_engine_v2-2.2.2.tar.gz (108.1 kB)

Uploaded Jul 24, 2025 Source

Built Distribution

kg_engine_v2-2.2.2-py3-none-any.whl (72.3 kB)

Uploaded Jul 24, 2025 Python 3

File details

Details for the file kg_engine_v2-2.2.2.tar.gz.

File metadata

Download URL: kg_engine_v2-2.2.2.tar.gz
Upload date: Jul 24, 2025
Size: 108.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for kg_engine_v2-2.2.2.tar.gz
Algorithm	Hash digest
SHA256	`f76539231ddd0a9a38862c3053dbd449a8388346720619acebbd16892577a32c`
MD5	`cf96f11e962def4c7d8ebdbc72198838`
BLAKE2b-256	`49ee931cdd0f84a0a9d646afbb47e6746299d8720f46f1bab7b67e7364866ecb`

File details

Details for the file kg_engine_v2-2.2.2-py3-none-any.whl.

File metadata

Download URL: kg_engine_v2-2.2.2-py3-none-any.whl
Upload date: Jul 24, 2025
Size: 72.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for kg_engine_v2-2.2.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`33898782502a33d89f8d5d4443a061948dedc1590b48674f0300dff473dd4f61`
MD5	`8fb697bd9a0cacef1cc7448d536aae6c`
BLAKE2b-256	`66a89269b12d01324f31327e0bb6d09b26a0e9d9f584b0467e45667a70b99b4b`
