A declarative, fine-grained automated context engineering framework for LLMs.

These details have not been verified by PyPI

Intended Audience
- Developers
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3
Topic
- Software Development :: Libraries :: Python Modules

Project description

ContextForge 🛠️

A declarative, fine-grained automated context engineering framework designed for production AI systems.

ContextForge brings a React-style component model (isolated components with lifecycle hooks, rendered deterministically from state) to the LangChain ecosystem, where it plugs into chains as a first-class orchestration middleware layer.

Strategic Value Proposition

In production, context engineering fails when it operates as an unmonitored string-concatenation black box:

Static prompts lead to context overflow
"Lost-in-the-Middle" document placement causes LLM attention drops
Runaway token expenses accumulate from uncontrolled memory growth

ContextForge solves this by shifting prompt building from fragile string formatting to a dynamic, token-aware Directed Acyclic Graph (DAG) architecture:

       [ 1. DEVELOPER DECLARATIVE INTENT ]
           └─ LCEL Pipe Operators (Runnable)
              High-Level Configuration Primitives
                            │
                            ▼
       [ 2. TOPOLOGICAL RECOMPILER ]
           └─ Priority-Based Element Scheduling
              Deterministic Dependency Tracking
                            │
                            ▼
       [ 3. FINE-GRAINED BUDGET ALLOCATOR ]
           └─ Real-Time Token Tracking (tiktoken)
              "Middle-Out" Alternating Array Distribution
              Word-Level Fallback Linguistic Compression
                            │
                            ▼
       [ 4. TELEMETRY AND LOG EXPORTER ]
           └─ Token Allocation Lineage Auditing
              Component Cost Tracking Analytics

Core Architecture Layers

Layer 1: Declarative Component Interface (Like React)

Every prompt segment is built as an isolated, self-contained component object derived from BaseContextComponent:

from contextforge.base import StaticContextComponent, AdaptiveContextPool

# System invariants with guaranteed token allocation
system_block = StaticContextComponent(
    name="system_instructions",
    template="You are an expert assistant. Use context to answer precisely.",
    priority=0  # Highest priority
)

# Dynamic context pool that shrinks/expands with token budget
context_pool = AdaptiveContextPool(
    name="knowledge_base",
    priority=50,
    input_key="fused_contexts"
)

Layer 2: Priority Scheduling Matrix

Components are evaluated sequentially in ascending priority order (0 is the highest priority):

Priority	Component Type	Token Guarantee	Behavior
0	System Invariants	Full allocation	Non-negotiable structural elements
10	User Query Layer	Full allocation	Direct user questions and context
50+	Elastic Context Pools	Remaining budget	Expand, compress, or drop entirely
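
The scheduling table can be sketched in a few lines. This is a minimal illustration rather than ContextForge's actual scheduler: `Segment`, `schedule`, and the `elastic` flag are hypothetical names introduced here, and token costs are passed in as plain integers instead of being measured.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """Hypothetical stand-in for a prompt component (not a ContextForge class)."""
    name: str
    priority: int          # lower number = evaluated earlier
    token_cost: int        # tokens needed to include the segment in full
    elastic: bool = False  # elastic segments may shrink to the remaining budget


def schedule(segments: List[Segment], max_tokens: int) -> Dict[str, int]:
    """Grant tokens in ascending priority order; return name -> tokens granted."""
    remaining = max_tokens
    grants: Dict[str, int] = {}
    for seg in sorted(segments, key=lambda s: s.priority):
        if seg.elastic:
            # Elastic pools take whatever is left, up to their full cost.
            granted = min(seg.token_cost, remaining)
        else:
            # Guaranteed segments must fit in full or the budget is invalid.
            if seg.token_cost > remaining:
                raise ValueError(f"budget too small for segment {seg.name!r}")
            granted = seg.token_cost
        grants[seg.name] = granted
        remaining -= granted
    return grants


grants = schedule(
    [
        Segment("knowledge_base", priority=50, token_cost=5000, elastic=True),
        Segment("system_instructions", priority=0, token_cost=300),
        Segment("user_query", priority=10, token_cost=150),
    ],
    max_tokens=2000,
)
print(grants)
```

Raising an error when a guaranteed segment does not fit is one possible policy; a real scheduler might instead log and truncate.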

Layer 3: Deep LangChain Integration (First-Class Runnable)

The compilation core inherits directly from LangChain's RunnableSerializable class:

from contextforge.engine import AutomatedContextEngine

engine = AutomatedContextEngine(
    max_tokens=4000,
    recent_window_size=10
)

# Use directly in LCEL pipe operators.
# `retriever` and `llm_model` are your own LangChain retriever and model objects.
chain = retriever | engine | llm_model

Detailed Component Orchestration Lifecycle

When an input payload hits the context engine during execution:

[Incoming Application Payload]
           │
           ▼
[1. MEMORY PARTITIONING STAGE]
   ├─ Slice history array into 'recent_window_size' buffer
   └─ Linearly aggregate older messages into Archive Trace Summary
           │
           ▼
[2. HYBRID RETRIEVAL FUSION STAGE]
   ├─ Deduplicate dense semantic vectors and sparse BM25 hits
   └─ Map relevance scores to normalize data into unified structures
           │
           ▼
[3. LOST-IN-THE-MIDDLE ALTERNATION STAGE]
   └─ Re-order rows into an alternating marginal placement array
           │
           ▼
[4. FINE-GRAINED ALLOCATION COMPILER STAGE]
   ├─ Evaluate High-Priority components (System/Query)
   ├─ Subtract token costs from total max_tokens budget bounds
   └─ Process Elastic Pools: Apply fractional text compression if budget breaches
           │
           ▼
[Final LangChain StringPromptValue Delivery Envelopes]

Four Core Automation Mechanisms

1. Automated Sliding Memory Partitioning

The framework automatically manages chat windows by splitting the conversation array:

Active Window: The latest N messages are preserved exactly (default N=10)
Archive Summary: Older messages are condensed into a single background context trace

engine = AutomatedContextEngine(recent_window_size=5)
# Last 5 messages: preserved in full
# All earlier messages: aggregated automatically into the archive summary
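
The split itself can be sketched as follows. This is a simplified illustration, not the engine's internal code: `partition_history` is a hypothetical helper, and the "summary" here is only a newline-joined transcript of the older messages, whereas the real archive trace may be condensed differently.

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]


def partition_history(
    history: List[Message], recent_window_size: int = 10
) -> Tuple[str, List[Message]]:
    """Split a chat history into (archive_summary, recent_messages)."""
    if recent_window_size <= 0:
        # Guard: history[-0:] would return everything, so handle zero explicitly.
        older, recent = list(history), []
    else:
        older = history[:-recent_window_size]
        recent = history[-recent_window_size:]
    # Naive aggregation: one role-tagged line per archived message.
    archive_summary = "\n".join(f"{m['role']}: {m['content']}" for m in older)
    return archive_summary, recent


history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"message {i + 1}"}
    for i in range(7)
]
summary, recent = partition_history(history, recent_window_size=5)
print(summary)
print([m["content"] for m in recent])
```

With seven messages and a window of five, the first two land in the archive summary and the last five are kept verbatim.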

2. Hybrid Search Fusion & Re-ranking

Merges documents from disparate sources (vector embeddings + BM25 keyword indices):

Deduplicates based on page content hash
Normalizes importance scores across sources
Creates unified, ranked document pool

# Engine automatically calls _auto_hybrid_fuse()
# Vector docs (score: 0.95) + BM25 docs (score: 0.70) → merged & deduped
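
The fusion steps listed above can be sketched with plain tuples. Several details are assumptions made for illustration: min-max scaling as the normalization method, SHA-256 as the content hash, and keeping the higher normalized score when the same content appears in both sources. The source does not specify what `_auto_hybrid_fuse()` does internally.

```python
import hashlib
from typing import Dict, List, Tuple

ScoredDoc = Tuple[str, float]  # (page_content, score)


def normalize(docs: List[ScoredDoc]) -> List[ScoredDoc]:
    """Min-max scale scores to [0, 1] within a single source."""
    if not docs:
        return []
    scores = [score for _, score in docs]
    low, high = min(scores), max(scores)
    if high == low:
        return [(text, 1.0) for text, _ in docs]
    return [(text, (score - low) / (high - low)) for text, score in docs]


def hybrid_fuse(
    vector_docs: List[ScoredDoc], bm25_docs: List[ScoredDoc]
) -> List[ScoredDoc]:
    """Normalize each source, deduplicate by content hash, rank by score."""
    best: Dict[str, ScoredDoc] = {}
    for text, score in normalize(vector_docs) + normalize(bm25_docs):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        # Keep the higher-scoring copy of duplicated content.
        if key not in best or score > best[key][1]:
            best[key] = (text, score)
    return sorted(best.values(), key=lambda doc: doc[1], reverse=True)


fused = hybrid_fuse(
    vector_docs=[("Replication strategies", 0.95), ("Raft consensus", 0.60)],
    bm25_docs=[("Raft consensus", 12.0), ("Gossip protocols", 4.0)],
)
print(fused)
```

Normalizing per source before merging matters because BM25 scores are unbounded while cosine similarities are not, so the raw values are not directly comparable.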

3. The Alternating Marginal Layout ("Middle-Out")

Solves the "Lost-in-the-Middle" problem where LLMs lose focus on center-placed data:

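For documents [D1, D2, D3, D4, D5] sorted by importance (D1 highest, D5 lowest), odd ranks fill the back group from the end inward and even ranks fill the front group from the start inward:

D1 (highest) → back group, last position (high attention boundary)
D2 (second highest) → front group, first position (high attention boundary)
D3 → back group, one step in from the end
D4 → front group, one step in from the start
D5 (lowest) → back group, innermost position (low attention zone)

Result: [D2, D4] + [D5, D3, D1] = [D2, D4, D5, D3, D1], with the most important documents at the margins ✓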
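
A small sketch that reproduces the layout [D2, D4] + [D5, D3, D1] from a best-first ranking. The function name `middle_out` is hypothetical and the engine's own implementation may differ; the sketch only shows one way to obtain this ordering.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def middle_out(ranked: Sequence[T]) -> List[T]:
    """Arrange best-first items so the strongest sit at both ends."""
    front: List[T] = []  # grows from the first position toward the middle
    back: List[T] = []   # grows from the last position toward the middle
    for index, item in enumerate(ranked):
        if index % 2 == 0:
            back.append(item)   # ranks 1, 3, 5, ... go toward the end
        else:
            front.append(item)  # ranks 2, 4, ... go toward the start
    # Reverse the back group so rank 1 ends up in the final position.
    return front + back[::-1]


layout = middle_out(["D1", "D2", "D3", "D4", "D5"])
print(layout)  # ['D2', 'D4', 'D5', 'D3', 'D1']
```

The lowest-ranked documents end up adjacent in the center, which is where a model is least likely to attend to them.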

4. Fine-Grained Token Allocation & Fallback Compression

Token-aware budget allocation across components:

High-priority sections are evaluated first (guaranteed space)
Elastic context pools are processed with the remaining budget
Documents are dropped once the budget is exhausted
Word-level compression is applied when an essential chunk slightly exceeds the remaining budget

engine.max_tokens = 2000
# System: 300 tokens → Remaining: 1700
# Query: 150 tokens → Remaining: 1550
# Context: Auto-compress & allocate remaining 1550
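
The drop-or-compress behaviour for elastic pools can be sketched as below. This is an illustration under stated assumptions: `count_tokens` is a whitespace word counter standing in for a real tokenizer such as tiktoken, and `fit_chunks` and `max_overflow_ratio` (the threshold for what counts as a slight breach) are hypothetical names, not ContextForge parameters.

```python
from typing import Callable, List, Sequence, Tuple


def count_tokens(text: str) -> int:
    """Stand-in token counter: one token per whitespace-separated word."""
    return len(text.split())


def fit_chunks(
    chunks: Sequence[str],
    budget: int,
    max_overflow_ratio: float = 0.25,
    counter: Callable[[str], int] = count_tokens,
) -> Tuple[List[str], int]:
    """Keep chunks that fit, trim slight overflows word by word, drop the rest."""
    kept: List[str] = []
    remaining = budget
    for chunk in chunks:
        cost = counter(chunk)
        if cost <= remaining:
            kept.append(chunk)
            remaining -= cost
        elif remaining > 0 and cost <= remaining * (1 + max_overflow_ratio):
            # Slight breach: remove trailing words until the chunk fits.
            words = chunk.split()
            while words and counter(" ".join(words)) > remaining:
                words.pop()
            if words:
                trimmed = " ".join(words)
                kept.append(trimmed)
                remaining -= counter(trimmed)
        # Anything larger than the overflow threshold is dropped entirely.
    return kept, remaining


kept, left = fit_chunks(
    ["alpha beta gamma delta", "one two three four five six", "x " * 50],
    budget=9,
)
print(kept, left)
```

In the example, the first chunk fits in full, the second exceeds the remaining five tokens by one word and is trimmed, and the third is dropped.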

Installation

Prerequisites

Python 3.10+
LangChain Core 0.1.0+
tiktoken 0.5.0+

Setup

# Clone the repository
git clone https://github.com/yourusername/contextforge.git
cd contextforge

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\Activate.ps1

# Install dependencies
pip install --upgrade pip setuptools wheel
pip install -e ".[dev]"

Usage Example

Basic Integration with LangChain RAG

from langchain_core.documents import Document
from contextforge.engine import AutomatedContextEngine

# Initialize the context engine
engine = AutomatedContextEngine(
    max_tokens=4000,
    recent_window_size=10
)

# Prepare input payload
payload = {
    "query": "How do distributed systems handle node failures?",
    "chat_history": [
        {"role": "user", "content": "What is fault tolerance?"},
        {"role": "assistant", "content": "Fault tolerance is..."},
        # ... more messages
    ],
    "vector_docs": [
        Document(
            page_content="Replication strategies for fault tolerance...",
            metadata={"id": "doc_1", "score": 0.95}
        ),
        # ... more vector results
    ],
    "bm25_docs": [
        Document(
            page_content="Consensus algorithms like Raft and Paxos...",
            metadata={"id": "doc_2", "score": 0.82}
        ),
        # ... more BM25 results
    ]
}

# Invoke the engine (produces structured PromptValue)
prompt_value = engine.invoke(payload)
structured_prompt = prompt_value.to_string()

# Use in LLM chain (llm_model is your own LangChain model object)
result = llm_model.invoke(structured_prompt)

Component-Based Custom Workflows

from contextforge.base import StaticContextComponent, AdaptiveContextPool

# Define custom components
system = StaticContextComponent(
    name="system_layer",
    template="You are a {role} assistant specializing in {domain}.",
    priority=0
)

context = AdaptiveContextPool(
    name="retrieval_context",
    priority=20,
    input_key="retrieved_docs"
)

# Components are rendered in priority order
# System (0) → Query (10) → Context (20)

API Reference

AutomatedContextEngine

class AutomatedContextEngine(RunnableSerializable[Dict[str, Any], PromptValue]):
    """
    Main orchestrator for context compilation.
    
    Attributes:
        max_tokens: Total token budget (default: 4000)
        recent_window_size: Active conversation window size (default: 10)
        encoder_name: Tiktoken encoding name (default: "cl100k_base")
    """
    
    def invoke(
        self, 
        input: Dict[str, Any], 
        config: Optional[RunnableConfig] = None
    ) -> PromptValue:
        """Compile context and return LangChain PromptValue."""
        pass

Component Classes

BaseContextComponent

@abc.abstractmethod
def render(
    self, 
    state: Dict[str, Any], 
    token_budget: int
) -> Tuple[str, int]:
    """Render component within token budget."""
    pass
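
To show how the `render()` contract might be satisfied, here is a hedged sketch of a custom component. The `BaseContextComponent` class below is a local stand-in that mirrors the documented signature so the example runs on its own; its constructor, the `QueryComponent` class, and the reading of the return value as (rendered text, tokens used) are assumptions. Word counts stand in for real token counts.

```python
import abc
from typing import Any, Dict, Tuple


class BaseContextComponent(abc.ABC):
    """Local stand-in mirroring the documented render() signature."""

    def __init__(self, name: str, priority: int) -> None:
        self.name = name
        self.priority = priority

    @abc.abstractmethod
    def render(self, state: Dict[str, Any], token_budget: int) -> Tuple[str, int]:
        """Render component within token budget."""


class QueryComponent(BaseContextComponent):
    """Hypothetical component that renders the user query from state."""

    def render(self, state: Dict[str, Any], token_budget: int) -> Tuple[str, int]:
        words = f"Question: {state.get('query', '')}".split()
        # Truncate to the budget; a negative budget yields an empty render.
        words = words[: max(token_budget, 0)]
        rendered = " ".join(words)
        return rendered, len(words)


component = QueryComponent(name="user_query", priority=10)
full = component.render({"query": "How do nodes recover?"}, token_budget=50)
cut = component.render({"query": "How do nodes recover?"}, token_budget=3)
print(full)
print(cut)
```

Returning the token count alongside the text lets the caller subtract the cost from the running budget without re-tokenizing.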

StaticContextComponent

StaticContextComponent(
    name: str,           # Component identifier
    template: str,       # Format string with {placeholders}
    priority: int = 0    # Execution priority
)

AdaptiveContextPool

AdaptiveContextPool(
    name: str,                           # Component identifier
    priority: int = 50,                  # Execution priority
    input_key: str = "fused_contexts"    # State dictionary key
)

Testing

Run the comprehensive test suite:

# Install test dependencies
pip install -e ".[dev]"

# Run all tests
pytest -v

# Run with coverage
pytest --cov=contextforge tests/

# Run specific test class
pytest tests/test_engine.py::TestAutomatedContextEngine -v

Performance Benchmarks

Scenario	Input size	Output size	Time
Small context (1 doc)	~200 tokens	~300 tokens	<10ms
Medium context (5 docs)	~1000 tokens	~1200 tokens	~50ms
Large context (20 docs)	~3500 tokens	~3900 tokens	~150ms
Memory partitioning (100 msgs)	~2000 tokens	~400 tokens	~30ms

Production Deployment Patterns

Pattern 1: Stateless RAG Pipeline

retriever | engine | llm_model

Pattern 2: Stateful Conversation Loop

# Accumulate messages in session storage
messages = retrieve_from_db(session_id)
result = engine.invoke({
    "query": user_input,
    "chat_history": messages,
    "vector_docs": vector_search(user_input),
    "bm25_docs": bm25_search(user_input)
})

Pattern 3: Multi-Document Routing

# Route different query types to specialized context pools
if is_code_query(query):
    pool = code_context_pool
elif is_documentation_query(query):
    pool = doc_context_pool
else:
    pool = general_context_pool

Roadmap

v0.2.0: Streaming support with async_invoke()
v0.3.0: Dynamic priority reweighting based on query type
v0.4.0: Multi-modal document support (images, code, tables)
v0.5.0: Telemetry export (token costs, performance metrics)
v1.0.0: Production-grade caching and optimization layer

Contributing

We welcome contributions from the community! See CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see LICENSE for details.

Made with ❤️ for the open-source AI community.

For questions, issues, or feature requests, please open a GitHub issue or reach out to the maintainers.

Release history

This version

0.1.0

May 17, 2026

Download files

Source Distribution

contextmg-0.1.0.tar.gz (53.7 kB)

Uploaded May 17, 2026 Source

Built Distribution

contextmg-0.1.0-py3-none-any.whl (16.4 kB)

Uploaded May 17, 2026 Python 3

File details

Details for the file contextmg-0.1.0.tar.gz.

File metadata

Download URL: contextmg-0.1.0.tar.gz
Upload date: May 17, 2026
Size: 53.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for contextmg-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`50213cd226f395c6d545f0f72dc5933ef7c7c5a7eb6373f17e3cc775fce0d54a`
MD5	`81486bd5dfb92873ceb3c35961ea1ba1`
BLAKE2b-256	`73cfb392a8b3742cd45e1b7b3ce923cdbfc9f371ccb4cb8dfadb336165318f71`

File details

Details for the file contextmg-0.1.0-py3-none-any.whl.

File metadata

Download URL: contextmg-0.1.0-py3-none-any.whl
Upload date: May 17, 2026
Size: 16.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for contextmg-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6e1165ec8260ff0d2979bb9b371fc745d93f66c11a6c6fa1455c7e1dc1f780b3`
MD5	`a0ad726346d9f59f991223d1fafbcaa6`
BLAKE2b-256	`79ed57bfca4144da4ca3d28c2792495331cfb406c7161234607078d8d844d3ab`
