An intelligent, hierarchical context compression framework for LLM memory systems.

These details have not been verified by PyPI

Project description

CompactPy 🧠⚡

An intelligent, multi-evolutionary hierarchical memory and context compression framework designed to optimize LLM prompt footprints and eliminate token bloat in RAG pipelines.

🚀 The Core Problem

Large Language Models have finite, expensive context windows. Storing raw, repetitive conversational history, system clutter, and loose narrative prose directly in the prompt window leads to massive API billing inflation, elevated system latency, and model confusion due to key context dilution.

CompactPy solves this. By mimicking cognitive memory tiers, vector math similarities, and directed knowledge graphs, it drops prompt footprints by 40%+ while perfectly preserving deep engineering states and concept dependencies.

🛠️ Multi-Evolutionary Architecture

CompactPy processes raw runtime context streams across six specialized optimization phases:

1. Token Analytics Core (`compactpy.core`)

Uses high-speed BPE tokenization via tiktoken to run precision boundaries, calculating exact text lengths and tracking compression metrics down to individual bits.

2. Algorithmic Compression Engines (`compactpy.compressors`)

Exact Deduplication Engine: Automatically strips out repetitive context loops and chronological logs while keeping structural stream order intact.
Semantic Compressor: Embeds data blocks via SentenceTransformer, executing vector Cosine Similarity arrays to eliminate overlapping thoughts (e.g., keeping only one variation of a phrase if similarity crosses a 0.75 threshold).

3. Hierarchical Memory Repository (`compactpy.memory`)

Isolates text strings into explicit cognitive abstraction layers based on real-world utility:

raw_memory: The volatile, incoming execution log dump.
working_memory: Active short-term operational buffers available for immediate context retrieval.
long_term_memory: High-value project parameters and user rules that never decay.

4. Memory Scoring Engine (`compactpy.memory.scoring`)

Memories are evaluated dynamically using a custom, long-horizon linear performance formula:

Score = 0.4 × Importance + 0.3 × Utility + 0.2 × Frequency + 0.1 × Recency

High-scoring nodes are promoted straight to Long-Term Memory, medium nodes stay in Working storage, and low-scoring noise is automatically evicted to prevent token bloat.

5. Relational Graph Memory System (`compactpy.graph_memory`)

Converts raw long-term strings into dense, indexed, bidirectional Knowledge Graphs using NetworkX. Instead of raw prose, it stores knowledge as structured triplets:

Source Entity --(Relation)--> Target Entity

Example:

FastAPI --(backend_of)--> Mediscan AI

This retains complex causal relationships without wasting prompt space.

6. Attention-Aware Compressor (`compactpy.compressors.attention`)

Acts as a dynamic "Importance Predictor." When a user passes a live query, it calculates the attention weight of your history pool relative to that query, dynamically filling a targeted prompt token budget with the highest-relevance vectors.

💾 Installation

Install the production framework directly from PyPI:

pip install compactpy

💻 Quickstart: End-to-End Pipeline

Here is how to run the complete automated ingestion, scoring, and query-aware compaction loop:

from compactpy.memory import HierarchicalMemory
from compactpy.memory.scoring import MemoryScoringEngine
from compactpy.graph_memory import GraphMemorySystem
from compactpy.compressors.attention import AttentionAwareCompressor

# 1. Initialize our modular cognitive layers
memory_vault = HierarchicalMemory()
scoring_engine = MemoryScoringEngine()
graph_db = GraphMemorySystem()
attention_compressor = AttentionAwareCompressor()

# 2. Ingest raw conversational logs
raw_logs = [
    "We are designing a medicine detection module called Mediscan AI.",
    "Mediscan AI uses FastAPI for the backend framework architecture.",
    "Today the weather in Delhi is cloudy and rainy."
]

for log in raw_logs:
    importance = 0.85 if "FastAPI" in log or "Mediscan" in log else 0.3
    memory_vault.add_memory(log, importance=importance, utility=0.7)

# 3. Simulate usage hits and run lifecycle scoring
memory_vault.increment_frequency(raw_logs[1])
scoring_engine.process_lifecycle_cycle(memory_vault)

# 4. Map persistent facts into the knowledge graph
graph_db.add_relation("FastAPI", "backend_of", "Mediscan AI")
graph_facts = graph_db.get_relationships_as_text()

# 5. Build a query-aware compact context
user_query = "What backend options did we settle on for Mediscan AI?"
combined_context = [m["text"] for m in memory_vault.working_memory] + graph_facts

optimized_payload, metrics = attention_compressor.compress_context_for_query(
    query=user_query,
    context_pool=combined_context,
    token_budget=45
)

print(f"Optimized Prompt Context: {optimized_payload}")
print(f"Token Reduction: {metrics['reduction_percentage']}%")

🧪 Running Validation Demos

The project repository keeps runnable verification scripts under bin/. Run them to watch the math and optimization phases execute live in your terminal:

# Test token utilities and basic compressors
python bin/demo_phase1_foundations.py
python bin/demo_step2.py

# Test hierarchical lifecycle scoring loops
python bin/demo_step3.py

# Test graph relationship mapping
python bin/demo_step5.py

# Test dynamic attention query budgeting
python bin/demo_step6.py

# Run the complete end-to-end processing pipeline
python bin/run_compactpy_pipeline.py

📊 Performance Benchmarks

CompactPy scales robustly with dense context footprints. Below is the empirical efficiency evaluation demonstrating token reduction scaling against processing latency:

CompactPy Performance Curve

Token Optimization: Reaches up to 95%+ token space reduction under dense context scales by aggressively pruning semantic redundancies and noise.
Latency Footprint: Post-initialization, context filtration operates dynamically in under 50ms, ensuring real-world suitability for high-throughput LLM pipelines.

📄 License

Distributed under the MIT License. See LICENSE for more information.

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

1.0.1

Jun 28, 2026

1.0.0

Jun 20, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

compactpy-1.0.1.tar.gz (21.7 kB view details)

Uploaded Jun 28, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

compactpy-1.0.1-py3-none-any.whl (22.2 kB view details)

Uploaded Jun 28, 2026 Python 3

File details

Details for the file compactpy-1.0.1.tar.gz.

File metadata

Download URL: compactpy-1.0.1.tar.gz
Upload date: Jun 28, 2026
Size: 21.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for compactpy-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`d6ab082e067446452c46b4648a5728cadc0cf8eb8652f5e14e9ffe1915d193a6`
MD5	`c046adb604e7724efa700c18d34139ed`
BLAKE2b-256	`59e84fd5c9e23a238b2da746d8240eb8f83f90cd6b337078881ccedd15164609`

See more details on using hashes here.

File details

Details for the file compactpy-1.0.1-py3-none-any.whl.

File metadata

Download URL: compactpy-1.0.1-py3-none-any.whl
Upload date: Jun 28, 2026
Size: 22.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for compactpy-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4e23548528d04b49973f75d1bc2a8b79f22ba0126ca9f1fff37c88c5964a83a3`
MD5	`fd68a208320303fadfba05a0b6e3956e`
BLAKE2b-256	`0dec083d0b2052e4578dfcecbca12504928f23a237f1960af4fb746c264892b8`

See more details on using hashes here.

compactpy 1.0.1

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

CompactPy 🧠⚡

🚀 The Core Problem

🛠️ Multi-Evolutionary Architecture

1. Token Analytics Core (`compactpy.core`)

2. Algorithmic Compression Engines (`compactpy.compressors`)

3. Hierarchical Memory Repository (`compactpy.memory`)

4. Memory Scoring Engine (`compactpy.memory.scoring`)

5. Relational Graph Memory System (`compactpy.graph_memory`)

6. Attention-Aware Compressor (`compactpy.compressors.attention`)

💾 Installation

💻 Quickstart: End-to-End Pipeline

🧪 Running Validation Demos

📊 Performance Benchmarks

📄 License

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

compactpy 1.0.1

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

CompactPy 🧠⚡

🚀 The Core Problem

🛠️ Multi-Evolutionary Architecture

1. Token Analytics Core (compactpy.core)

2. Algorithmic Compression Engines (compactpy.compressors)

3. Hierarchical Memory Repository (compactpy.memory)

4. Memory Scoring Engine (compactpy.memory.scoring)

5. Relational Graph Memory System (compactpy.graph_memory)

6. Attention-Aware Compressor (compactpy.compressors.attention)

💾 Installation

💻 Quickstart: End-to-End Pipeline

🧪 Running Validation Demos

📊 Performance Benchmarks

📄 License

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

1. Token Analytics Core (`compactpy.core`)

2. Algorithmic Compression Engines (`compactpy.compressors`)

3. Hierarchical Memory Repository (`compactpy.memory`)

4. Memory Scoring Engine (`compactpy.memory.scoring`)

5. Relational Graph Memory System (`compactpy.graph_memory`)

6. Attention-Aware Compressor (`compactpy.compressors.attention`)