Skip to main content

Context engineering toolkit for ranking, packing, and risk-scanning RAG context. Python port of @mukundakatta/context-forge.

Project description

context-forge-py

PyPI Python License: MIT

Context engineering toolkit for RAG and agent prompts. Chunk documents, score relevance (BM25), apply diversity-aware packing (MMR), flag prompt-injection risk, and emit citation-ready context blocks that fit a token budget. Zero runtime dependencies.

Python port of @mukundakatta/context-forge.

Install

pip install context-forge-py

Quick start

from context_forge import forge

result = forge(
    chunks=[
        {"id": "doc1#0", "text": "Pluto was reclassified as a dwarf planet in 2006.", "source": "wiki/pluto"},
        {"id": "doc2#0", "text": "Mars is the fourth planet from the Sun.", "source": "wiki/mars"},
    ],
    query="What happened to Pluto?",
    budget=200,
)

result.blocks      # list of kept chunks (ranked, diversified, packed)
result.used_tokens # int -- tokens used by kept blocks
result.dropped     # list of {id, reason}
result.risks       # injection findings (e.g. ignore_instructions, exfil_curl)
result.citations   # {block_id: {source, span: [start, end]}}

You can also use the lower-level pipeline and the document-level entry:

from context_forge import (
    pack_context,    # full pipeline: chunk -> score -> diversify -> risk-scan -> pack
    chunk_document,  # split a document into token-bounded chunks
    score_chunks,    # BM25 score against a query
    diversify,       # MMR re-rank for diversity
    pack_to_budget,  # greedy budget packer
    scan_injection,  # prompt-injection risk scan
    estimate_tokens, # ceil(len/4) heuristic
)

API

forge(chunks, query, budget=1200, *, lambda_=0.7, per_chunk_min=20)

Compose ranking + diversification + injection-scanning + budget-packing on a list of pre-chunked context. Returns a ForgedContext dataclass with blocks, used_tokens, dropped, risks, and citations.

pack_context(query, documents, budget_tokens=1200, max_tokens=200, overlap_tokens=20, lambda_=0.7, per_chunk_min=20)

Full pipeline starting from raw documents ({"id": ..., "text": ..., "source": ...}). Chunks each document, then runs the same score/diversify/scan/pack steps as forge.

Risk classes

scan_injection(text) returns findings with kind, severity, snippet, and index. Detected kinds include ignore_instructions, system_prefix, you_are_now, role_tag, exfil_curl, exfil_wget, exfil_base64, zero_width_char, and suspicious_url.

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

context_forge_py-0.1.0.tar.gz (11.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

context_forge_py-0.1.0-py3-none-any.whl (12.8 kB view details)

Uploaded Python 3

File details

Details for the file context_forge_py-0.1.0.tar.gz.

File metadata

  • Download URL: context_forge_py-0.1.0.tar.gz
  • Upload date:
  • Size: 11.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for context_forge_py-0.1.0.tar.gz
Algorithm Hash digest
SHA256 2b981ea71148b2818d198b9c16e07616e7ab22dfc5de8d6398dd6f4929d37451
MD5 8be0562df802e474c323b3a329d225af
BLAKE2b-256 b9b6e840d367832190fe876bdec24d59d9c4338e2493390021fb23988f1fe41b

See more details on using hashes here.

File details

Details for the file context_forge_py-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for context_forge_py-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 dad20dbb348e7ec97a130d48730de27e49b917d4ae85cd61e1155b2729378330
MD5 26ed630a8f3e51ca951967ee0c64d38d
BLAKE2b-256 46c994e93a614b9a7374ab78a34cc40c7ca85599675231ab1cd46fe7c2440c4d

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page