Build perfect LLM context from your Python codebase — automatically.

ctxeng — Python Context Engineering Library

Stop copy-pasting files into ChatGPT.
Build the perfect LLM context from your codebase, automatically.

Context engineering is the new prompt engineering. The quality of your LLM's output depends almost entirely on what you put in the context window — not how you phrase the question.

ctxeng solves this automatically:

Scans your codebase and scores every file for relevance to your query
Ranks by signal: keyword overlap, AST symbols, path relevance, git recency
Fits the budget: smart truncation keeps the best parts within any model's token limit
Ships ready to paste: XML, Markdown, or plain text output that works with Claude, GPT-4o, Gemini, and any other model

Zero required dependencies. Works with any LLM.

Installation

pip install ctxeng

For accurate token counting (strongly recommended):

pip install "ctxeng[tiktoken]"

For one-line LLM calls:

pip install "ctxeng[anthropic]"    # Claude
pip install "ctxeng[openai]"       # GPT-4o
pip install "ctxeng[all]"          # everything
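The tiktoken extra matters because, without a real tokenizer, token counts can only be estimated. As a rough sketch of that trade-off (not ctxeng's actual code; the `estimate_tokens` name, the `cl100k_base` encoding choice, and the four-characters-per-token fallback are all assumptions made for illustration):

```python
def estimate_tokens(text: str) -> int:
    """Count tokens with tiktoken when it is usable, else estimate from length."""
    try:
        import tiktoken  # optional dependency, may not be installed

        encoding = tiktoken.get_encoding("cl100k_base")
        return len(encoding.encode(text))
    except Exception:
        # Assumed fallback: roughly four characters per token, rounded up.
        return (len(text) + 3) // 4


if __name__ == "__main__":
    print(estimate_tokens("def login(user, password): ..."))
```

The length-based estimate can be off by a wide margin for dense code or non-English text, which is why an exact tokenizer is worth installing when the budget is tight.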

Quickstart

Python API

from ctxeng import ContextEngine

engine = ContextEngine(root=".", model="claude-sonnet-4")
ctx = engine.build("Fix the authentication bug in the login flow")

print(ctx.summary())
# Context summary (12,340 tokens / 197,440 budget):
#   Included : 8 files
#   Skipped  : 23 files (over budget)
#   [████████  ] 0.84  src/auth/login.py
#   [███████   ] 0.71  src/auth/middleware.py
#   [█████     ] 0.53  src/models/user.py
#   [████      ] 0.41  tests/test_auth.py
#   ...

# Paste directly into your LLM
print(ctx.to_string())

Fluent Builder API

from ctxeng import ContextBuilder

ctx = (
    ContextBuilder(root=".")
    .for_model("gpt-4o")
    .only("**/*.py")
    .exclude("tests/**", "migrations/**")
    .from_git_diff()                        # only changed files
    .with_system("You are a senior Python engineer. Be concise.")
    .build("Refactor the payment module to use async/await")
)

print(ctx.to_string("markdown"))

One-line LLM call

from ctxeng import ContextEngine
from ctxeng.integrations import ask_claude

engine = ContextEngine(".", model="claude-sonnet-4")
ctx = engine.build("Why is the test_login test failing?")

response = ask_claude(ctx)
print(response)

CLI

# Build context for a query and print to stdout
ctxeng build "Fix the auth bug"

# Focus on git-changed files only
ctxeng build "Review my changes" --git-diff

# Target a specific model with markdown output
ctxeng build "Refactor this" --model gpt-4o --fmt markdown

# Save to file
ctxeng build "Explain the payment flow" --output context.md

# Project stats
ctxeng info

How It Works

Your codebase                    ctxeng                      Your LLM
─────────────                ────────────────            ────────────────
src/auth/login.py  ─┐
src/models/user.py ─┤  1. Score files         2. Fit budget     <context>
src/api/routes.py  ─┼─► vs query + git  ─►   smart truncate ─► <file>...</file>
tests/test_auth.py ─┤     recency + AST        token-aware       <file>...</file>
...500 more files  ─┘                                           </context>
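The `<context>` / `<file>` wrapper at the right of the diagram can be sketched as follows. This is only an illustration of the shape: the `render_xml` helper, the `path` attribute, and the decision to escape file content are assumptions, and ctxeng's real XML output may differ.

```python
from xml.sax.saxutils import escape, quoteattr


def render_xml(files: list[tuple[str, str]]) -> str:
    """Wrap (path, content) pairs in <context> and <file> tags."""
    parts = ["<context>"]
    for path, content in files:
        # quoteattr adds the surrounding quotes and escapes the value.
        parts.append(f"<file path={quoteattr(path)}>")
        # Escape &, < and > so the source text cannot break the markup.
        parts.append(escape(content))
        parts.append("</file>")
    parts.append("</context>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_xml([("src/auth/login.py", "if attempts < 3:\n    retry()")]))
```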

Scoring signals

Each file gets a relevance score from 0 to 1, combining:

Signal	What it measures
Keyword overlap	How many query terms appear in the file content
AST symbols	Class/function/import names that match the query (Python)
Path relevance	Filename and directory names matching query tokens
Git recency	Files touched in recent commits score higher
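To make the table concrete, here is a minimal sketch of how such signals could be combined into a single 0 to 1 score. It is not ctxeng's implementation: the function names, the tokenizer, and especially the weights are hypothetical, and git recency is passed in as a precomputed number between 0 and 1.

```python
import ast
import re


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric tokens; splits snake_case and paths, not camelCase."""
    return {t for t in re.split(r"[^a-zA-Z0-9]+", text.lower()) if len(t) > 1}


def symbol_names(source: str) -> set[str]:
    """Class, function, and import names found in Python source."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return set()
    names: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            names.add(node.name)
        elif isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            if node.module:
                names.add(node.module)
            names.update(alias.name for alias in node.names)
    return names


def overlap(query_tokens: set[str], candidate_tokens: set[str]) -> float:
    """Fraction of query tokens that also appear in the candidate."""
    if not query_tokens:
        return 0.0
    return len(query_tokens & candidate_tokens) / len(query_tokens)


def score_file(query: str, path: str, source: str, recency: float = 0.0) -> float:
    """Weighted sum of the four signals; the weights are hypothetical."""
    q = tokenize(query)
    keyword = overlap(q, tokenize(source))
    symbols = overlap(q, tokenize(" ".join(symbol_names(source))))
    path_score = overlap(q, tokenize(path))
    # Weights sum to 1.0, so the result stays in [0, 1] when recency does.
    return 0.35 * keyword + 0.30 * symbols + 0.20 * path_score + 0.15 * recency


if __name__ == "__main__":
    code = "class LoginHandler:\n    def authenticate(self, user):\n        return user\n"
    print(score_file("Fix the login authenticate bug", "src/auth/login.py", code, 1.0))
```

Each signal is normalised before weighting so that no single one can dominate just because it counts on a larger scale.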

Token budget optimization

Files are ranked by score and added greedily until the token budget is full. A file that does not fit is smart-truncated rather than dropped outright: ctxeng keeps the head and the tail and cuts from the middle. The top of a file holds imports and class definitions, and the tail often holds recent changes, so both ends are high-signal. Files that still cannot fit are reported as skipped.
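The greedy fill and head-plus-tail truncation can be sketched as below. This is an illustration under stated assumptions, not the library's code: `count_tokens` is a crude whitespace counter standing in for a real tokenizer, and the `MARKER` text and `min_chunk` threshold are invented for the example.

```python
MARKER = "[... truncated ...]"


def count_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


def truncate_head_tail(text: str, max_tokens: int) -> str:
    """Keep lines from both ends of the file and drop lines from the middle."""
    if count_tokens(text) <= max_tokens:
        return text
    lines = text.splitlines()
    head: list[str] = []
    tail: list[str] = []
    used = count_tokens(MARKER)
    i, j = 0, len(lines) - 1
    take_head = True
    while i <= j:
        # Alternate between the next head line and the next tail line.
        line = lines[i] if take_head else lines[j]
        cost = count_tokens(line)
        if used + cost > max_tokens:
            break
        used += cost
        if take_head:
            head.append(line)
            i += 1
        else:
            tail.append(line)
            j -= 1
        take_head = not take_head
    return "\n".join(head + [MARKER] + tail[::-1])


def fill_budget(scored, budget: int, min_chunk: int = 8):
    """scored is a list of (score, path, text); returns (included, skipped)."""
    included, skipped = [], []
    remaining = budget
    for _score, path, text in sorted(scored, key=lambda item: item[0], reverse=True):
        cost = count_tokens(text)
        if cost <= remaining:
            included.append((path, text))
            remaining -= cost
        elif remaining >= min_chunk:
            # Not enough room for the whole file: keep its two ends.
            clipped = truncate_head_tail(text, remaining)
            included.append((path, clipped))
            remaining -= count_tokens(clipped)
        else:
            skipped.append(path)
    return included, skipped
```

Because the highest-scoring files are placed first, truncation and skipping fall on the least relevant files.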

Examples

Debug a failing test

from ctxeng import ContextBuilder
from ctxeng.integrations import ask_claude

ctx = (
    ContextBuilder(".")
    .for_model("claude-sonnet-4")
    .include_files("tests/test_payment.py", "src/payment/service.py")
    .with_system("You are a Python debugging expert.")
    .build("test_charge_user is failing with a KeyError on 'amount'")
)
response = ask_claude(ctx)

Code review on a PR

from ctxeng import ContextBuilder

# Only include what changed in this branch vs main
ctx = (
    ContextBuilder(".")
    .for_model("gpt-4o")
    .from_git_diff(base="main")
    .with_system("Do a thorough code review. Flag security issues first.")
    .build("Review this pull request")
)
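For intuition about what a git-diff filter involves, the changed-file list can be obtained with `git diff --name-only`. The sketch below is an assumption about the general approach, not a description of how `from_git_diff` is implemented; `changed_files` and `parse_name_only` are hypothetical helper names.

```python
import subprocess


def parse_name_only(output: str) -> list[str]:
    """Turn `git diff --name-only` output into a list of paths."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def changed_files(root: str = ".", base: str = "main") -> list[str]:
    """Paths that differ between the working tree and `base`."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base],
        cwd=root,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails
    )
    return parse_name_only(result.stdout)
```

Running `changed_files(".", base="main")` inside a repository returns the paths git reports as different from `main`; outside a repository it raises `subprocess.CalledProcessError`.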

Explain an unfamiliar codebase

from ctxeng import ContextEngine

engine = ContextEngine(
    root="/path/to/project",
    model="gemini-1.5-pro",  # 1M token window → include everything
)
ctx = engine.build("Give me a high-level architecture overview")
print(ctx.to_string())

Targeted refactor

from ctxeng import ContextBuilder

ctx = (
    ContextBuilder(".")
    .for_model("claude-sonnet-4")
    .only("src/database/**/*.py")
    .exclude("**/*_test.py")
    .build("Convert all raw SQL queries to use SQLAlchemy ORM")
)
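The `only` and `exclude` patterns above read like recursive globs, where `**/` matches zero or more directories. A small sketch of that matching logic, assuming this interpretation (the `glob_to_regex` and `select` helpers are hypothetical and ctxeng's own matcher may behave differently at the edges):

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob with ** support into a compiled regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:[^/]+/)*")  # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def select(paths, only=(), exclude=()):
    """Keep paths matching any `only` pattern and no `exclude` pattern."""
    only_res = [glob_to_regex(p) for p in only]
    exclude_res = [glob_to_regex(p) for p in exclude]
    kept = []
    for path in paths:
        if only_res and not any(r.fullmatch(path) for r in only_res):
            continue
        if any(r.fullmatch(path) for r in exclude_res):
            continue
        kept.append(path)
    return kept
```

With the patterns from the refactor example, `src/database/models.py` is kept while `src/database/models_test.py` and anything outside `src/database/` are filtered out.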

API Reference

`ContextEngine`

ContextEngine(
    root=".",                 # Project root
    model="claude-sonnet-4",  # Sets token budget automatically
    budget=None,              # Or explicit TokenBudget(total=50_000)
    max_file_size_kb=500,     # Skip files larger than this
    include_patterns=None,    # ["**/*.py"]: only these files
    exclude_patterns=None,    # ["tests/**"]: skip these
    use_git=True,             # Use git recency signal
)

engine.build(
    query="",               # What you want the LLM to do
    files=None,             # Explicit list of paths (skips auto-discovery)
    git_diff=False,         # Only changed files
    git_base="HEAD",        # Diff base ref
    system_prompt="",       # System prompt (counts against budget)
    fmt="xml",              # "xml" | "markdown" | "plain"
)
# → Context

`ContextBuilder` (fluent API)

ctx = (
    ContextBuilder(root=".")
    .for_model("gpt-4o")
    .with_budget(total=50_000, reserved_output=4096)
    .only("**/*.py", "**/*.yaml")
    .exclude("tests/**", "migrations/**")
    .include_files("src/specific.py")
    .from_git_diff(base="main")
    .with_system("You are an expert Python engineer.")
    .max_file_size(200)     # KB
    .no_git()
    .build("query")
)
# → Context

`Context`

ctx.to_string(fmt="xml")    # → str ready to paste into an LLM
ctx.summary()               # → human-readable summary with token counts
ctx.files                   # → list[ContextFile], sorted by relevance
ctx.skipped_files           # → files that didn't fit the budget
ctx.total_tokens            # → estimated token usage
ctx.budget.available        # → remaining token budget

`TokenBudget`

TokenBudget.for_model("claude-sonnet-4")  # auto-detect limit
TokenBudget(total=50_000, reserved_output=2048, reserved_system=512)

Supported models (auto-detected): claude-opus-4, claude-sonnet-4, claude-haiku-4, gpt-4o, gpt-4-turbo, gpt-4, gpt-3.5-turbo, gemini-1.5-pro, gemini-1.5-flash, llama-3.
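The quickstart summary shows a budget of 197,440 tokens for claude-sonnet-4. That figure is consistent with a 200,000-token window minus the two reserves shown above (200,000 - 2,048 - 512), assuming those reserves are the defaults. The sketch below illustrates that arithmetic only; `BudgetSketch` is a hypothetical stand-in, not the library's `TokenBudget` class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetSketch:
    total: int
    reserved_output: int = 2048  # assumed default, taken from the example above
    reserved_system: int = 512   # assumed default, taken from the example above

    @property
    def available(self) -> int:
        """Tokens left for file content after both reserves are set aside."""
        return max(0, self.total - self.reserved_output - self.reserved_system)


if __name__ == "__main__":
    print(BudgetSketch(total=200_000).available)  # 197440
```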

CLI Reference

ctxeng [--root PATH] <command> [options]

Commands:
  build   Build context for a query
  info    Show project info and file stats

build options:
  --model, -m     Target model (default: claude-sonnet-4)
  --fmt, -f       Output format: xml | markdown | plain (default: xml)
  --output, -o    Write to file instead of stdout
  --only          Glob patterns to include
  --exclude       Glob patterns to exclude
  --files         Explicit file list
  --git-diff      Only include git-changed files
  --git-base      Git base ref (default: HEAD)
  --system        System prompt text
  --budget        Override total token budget
  --no-git        Disable git recency scoring
  --max-size      Max file size in KB (default: 500)

Supported Models

Model	Context window	Auto-detected
claude-opus-4, claude-sonnet-4, claude-haiku-4	200K	✓
gpt-4o, gpt-4-turbo	128K	✓
gpt-4	8K	✓
gpt-3.5-turbo	16K	✓
gemini-1.5-pro, gemini-1.5-flash	1M	✓
llama-3	32K	✓
any other	32K (safe default)	—
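The auto-detection in the table amounts to a lookup with a conservative fallback. A minimal sketch, assuming "K" means thousands and "1M" means one million tokens (the exact limits ctxeng uses may differ, and `window_for` is a hypothetical name):

```python
# Values read off the table above, with K taken as x1000 (an assumption).
CONTEXT_WINDOWS = {
    "claude-opus-4": 200_000,
    "claude-sonnet-4": 200_000,
    "claude-haiku-4": 200_000,
    "gpt-4o": 128_000,
    "gpt-4-turbo": 128_000,
    "gpt-4": 8_000,
    "gpt-3.5-turbo": 16_000,
    "gemini-1.5-pro": 1_000_000,
    "gemini-1.5-flash": 1_000_000,
    "llama-3": 32_000,
}
DEFAULT_WINDOW = 32_000  # the "safe default" row


def window_for(model: str) -> int:
    """Context window for a known model name, else the safe default."""
    return CONTEXT_WINDOWS.get(model.strip().lower(), DEFAULT_WINDOW)


if __name__ == "__main__":
    print(window_for("gpt-4o"), window_for("some-new-model"))
```

Falling back to a small window for unknown models errs on the side of a context that fits, at the cost of leaving capacity unused on larger models.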

Why not just paste files manually?

You could. But you'll hit these problems immediately:

Token limit errors — too many files, context overflows
Irrelevant noise — wrong files dilute signal, hurt output quality
Stale context — you forget to update when code changes
Manual effort — figuring out which files matter takes time

ctxeng solves all four. The right files, in the right order, trimmed to fit, every time.

Roadmap

Semantic similarity scoring (optional embedding model)
ctxeng watch — auto-rebuild context on file changes
VSCode extension
Import graph analysis (include files imported by relevant files)
.ctxengignore file support
Streaming context into LLM APIs

Contributing

PRs welcome! See CONTRIBUTING.md.

git clone https://github.com/sayeem3051/python-context-engineer
cd python-context-engineer
pip install -e ".[dev]"
pytest

License

MIT. Use freely, modify as needed, contribute back if you can.

If ctxeng saved you time, please ⭐ the repo — it helps others find it.


Release history

0.1.9 (Apr 13, 2026)
0.1.8 (Apr 13, 2026)
0.1.7 (Apr 13, 2026)
0.1.6 (Apr 8, 2026)
0.1.5 (Apr 7, 2026)
0.1.4 (Apr 7, 2026)
0.1.3 (Apr 5, 2026)
0.1.2 (Apr 5, 2026)
0.1.0 (Apr 4, 2026), this version

