Context engineering toolkit for ranking, packing, and risk-scanning RAG context. Python port of @mukundakatta/context-forge.
Project description
context-forge-py
Context engineering toolkit for RAG and agent prompts. Chunk documents, score relevance (BM25), apply diversity-aware packing (MMR), flag prompt-injection risk, and emit citation-ready context blocks that fit a token budget. Zero runtime dependencies.
Python port of @mukundakatta/context-forge.
Install
pip install context-forge-py
Quick start
from context_forge import forge
result = forge(
chunks=[
{"id": "doc1#0", "text": "Pluto was reclassified as a dwarf planet in 2006.", "source": "wiki/pluto"},
{"id": "doc2#0", "text": "Mars is the fourth planet from the Sun.", "source": "wiki/mars"},
],
query="What happened to Pluto?",
budget=200,
)
result.blocks # list of kept chunks (ranked, diversified, packed)
result.used_tokens # int -- tokens used by kept blocks
result.dropped # list of {id, reason}
result.risks # injection findings (e.g. ignore_instructions, exfil_curl)
result.citations # {block_id: {source, span: [start, end]}}
You can also use the lower-level pipeline and the document-level entry:
from context_forge import (
pack_context, # full pipeline: chunk -> score -> diversify -> risk-scan -> pack
chunk_document, # split a document into token-bounded chunks
score_chunks, # BM25 score against a query
diversify, # MMR re-rank for diversity
pack_to_budget, # greedy budget packer
scan_injection, # prompt-injection risk scan
estimate_tokens, # ceil(len/4) heuristic
)
API
forge(chunks, query, budget=1200, *, lambda_=0.7, per_chunk_min=20)
Compose ranking + diversification + injection-scanning + budget-packing on a list of pre-chunked context. Returns a ForgedContext dataclass with blocks, used_tokens, dropped, risks, and citations.
pack_context(query, documents, budget_tokens=1200, max_tokens=200, overlap_tokens=20, lambda_=0.7, per_chunk_min=20)
Full pipeline starting from raw documents ({"id": ..., "text": ..., "source": ...}). Chunks each document, then runs the same score/diversify/scan/pack steps as forge.
Risk classes
scan_injection(text) returns findings with kind, severity, snippet, and index. Detected kinds include ignore_instructions, system_prefix, you_are_now, role_tag, exfil_curl, exfil_wget, exfil_base64, zero_width_char, and suspicious_url.
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file context_forge_py-0.1.0.tar.gz.
File metadata
- Download URL: context_forge_py-0.1.0.tar.gz
- Upload date:
- Size: 11.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2b981ea71148b2818d198b9c16e07616e7ab22dfc5de8d6398dd6f4929d37451
|
|
| MD5 |
8be0562df802e474c323b3a329d225af
|
|
| BLAKE2b-256 |
b9b6e840d367832190fe876bdec24d59d9c4338e2493390021fb23988f1fe41b
|
File details
Details for the file context_forge_py-0.1.0-py3-none-any.whl.
File metadata
- Download URL: context_forge_py-0.1.0-py3-none-any.whl
- Upload date:
- Size: 12.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
dad20dbb348e7ec97a130d48730de27e49b917d4ae85cd61e1155b2729378330
|
|
| MD5 |
26ed630a8f3e51ca951967ee0c64d38d
|
|
| BLAKE2b-256 |
46c994e93a614b9a7374ab78a34cc40c7ca85599675231ab1cd46fe7c2440c4d
|