
Context Rail — agentic task-management that keeps your project on rails.


Context-Rail

A local project rail for long-running agentic work.

AI agents are good at individual tasks. Long projects are where they drift: they forget the plan, lose context between sessions, repeat bad actions, mark work done without review, and start every session as if the project were new.

Context-Rail keeps the project on the rails by giving both the human and the agent the same structured project state. It is not a chatbot memory file. It is not a code indexer. It is a local operating loop for agent work.

PLAN → EXECUTE → REVIEW → ACT

Built with Python + SQLite + MCP.


Install

Core (MCP server + CLI — lightweight, no web deps):

pip install context-rail

With web UI (adds FastAPI + Uvicorn):

pip install "context-rail[web]"

Initialize:

ctx-rail init

This creates the local SQLite database, generates an AGENTS.md in the current directory so your agent knows the Context-Rail (CR) workflow, and installs the CR skills to .agents/skills/context-rail/.

Connect to your MCP client:

ctx-rail config

This prints the correct config with the absolute Python path — important on Windows where bare python may resolve to the wrong interpreter. Paste it into your MCP client config:

{
  "mcpServers": {
    "context-rail": {
      "command": "C:/Python313/python.exe",
      "args": ["-m", "context_rail", "serve"]
    }
  }
}
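Why the absolute path matters can be sketched in a few lines of Python. This is only an illustration of producing the same JSON shape from `sys.executable`; it is not the code behind `ctx-rail config`, and `build_mcp_config` is a hypothetical helper name.

```python
import json
import sys


def build_mcp_config(python_path: str = sys.executable) -> dict:
    """Return an MCP client config entry shaped like the example above."""
    return {
        "mcpServers": {
            "context-rail": {
                # An absolute interpreter path avoids relying on PATH lookup.
                "command": python_path,
                "args": ["-m", "context_rail", "serve"],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_mcp_config(), indent=2))
```

Because `sys.executable` is the interpreter that is actually running, the generated entry points at the environment where the package is installed.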

For Codex (TOML format):

ctx-rail config --client codex

Update later (upgrades the package + refreshes AGENTS.md + skills):

ctx-rail update

Start the MCP server:

ctx-rail serve

Start the web platform (requires [web] install):

ctx-rail web                # API + React UI (default)
ctx-rail web --no-frontend  # API only

A session with Context-Rail

This is what it feels like to work with CR — not a tool catalog, but the flow an agent actually follows.

1. Start — the agent already knows

The agent opens a session and calls one tool:

session_start(agent_id="codex-1", workdir="/path/to/repo")

It gets back everything it needs in one response:

  • which project is active (via workspace binding — survives across sessions);
  • the roadmap story — intent, success criteria, non-goals;
  • the current phase and its progress;
  • the next task with dependencies and context items;
  • open warnings (blocked tasks, tasks needing evidence);
  • governance status — is there an intent tree? an approved plan contract?

If the workspace isn't bound yet, the agent binds it once:

project_login("project-id", agent_id="codex-1", workdir="/path/to/repo")

From then on, every session in that workspace opens with the correct project automatically.

The agent does not guess what to do next. It does not start from zero.
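One way to picture a workspace binding is a small SQLite table keyed by working directory. The sketch below is an assumption for illustration: the table name `workspace_binding`, its columns, and the helper functions are hypothetical and do not describe CR's real schema.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and make sure the (hypothetical) binding table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS workspace_binding ("
        " workdir TEXT PRIMARY KEY,"
        " project_id TEXT NOT NULL)"
    )
    return conn


def bind_workspace(conn: sqlite3.Connection, workdir: str, project_id: str) -> None:
    """Bind a workdir to a project; re-binding replaces the previous project."""
    conn.execute(
        "INSERT OR REPLACE INTO workspace_binding (workdir, project_id) VALUES (?, ?)",
        (workdir, project_id),
    )
    conn.commit()


def resolve_project(conn: sqlite3.Connection, workdir: str):
    """Return the bound project id, or None when the workspace is unbound."""
    row = conn.execute(
        "SELECT project_id FROM workspace_binding WHERE workdir = ?", (workdir,)
    ).fetchone()
    return row[0] if row else None
```

With a file path instead of `:memory:`, the binding survives process restarts, which is the property that lets a later session open with the right project.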

2. Plan — natural brainstorm, not a blank form

Before jumping into tasks, CR runs a brainstorm — a natural conversation proportional to the idea. It can shape:

  • a whole project (vision, phases, roadmap, risks);
  • one phase or one goal;
  • one feature or task;
  • a refinement or an option comparison;
  • an execution plan.

The conversation stays friendly. The agent drafts the plan. The human reviews it. Only approved plans become committed project work.

Approval creates a plan contract — not just old context. That contract carries the goal, quality bar, autonomy boundaries, evidence rules, stop rules, plan snapshot, drift rules, and rail-check policy that later execution must follow.

3. Focus — context sticks to the task

The agent calls:

focus(agent_id="codex-1", agent_name="Codex", harness="codex", role="builder")

It gets the execution packet for the next task:

  • task description, dependencies, notes, handoffs;
  • risks, lessons, artifacts, related goals;
  • active agents working on the task;
  • review state;
  • PDCA (plan-do-check-act) stage, contract constraints, required evidence, stop conditions, rail-check status.

Context is not floating in a chat log. It is attached to the task. The agent loads what it needs, not the whole project.

When agent_id is provided, CR clocks that agent into the task. If the agent was active on another task, the older session is released. When the task is done or blocked, active work sessions close. This keeps ownership flexible while answering: "which agent is doing this right now?"
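The clock-in behavior described above can be sketched with an in-memory map from agent to task. `WorkSessions` and its methods are hypothetical names used only for this illustration; CR's own session tracking may differ.

```python
from dataclasses import dataclass, field


@dataclass
class WorkSessions:
    """Tracks which agent is active on which task (illustrative only)."""

    active: dict = field(default_factory=dict)  # agent_id -> task_id

    def clock_in(self, agent_id: str, task_id: str):
        """Clock the agent into task_id; return the task it was released from, if any."""
        previous = self.active.get(agent_id)
        self.active[agent_id] = task_id
        return previous if previous != task_id else None

    def close_task(self, task_id: str) -> list:
        """Close every active session on a task that is done or blocked."""
        closed = [a for a, t in self.active.items() if t == task_id]
        for agent_id in closed:
            del self.active[agent_id]
        return closed

    def agents_on(self, task_id: str) -> list:
        """Answer "which agents are doing this right now?" for one task."""
        return sorted(a for a, t in self.active.items() if t == task_id)
```

Keeping one active task per agent is what makes the older session release automatically when the agent focuses elsewhere.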

4. Work — guardrails that adapt to risk

The agent works freely — editing, testing, browsing, debugging. CR does not micromanage every tool call.

For risky or repeated actions, the agent can call:

check_action(kind="retry", context_json='{"path": "/abs/path", "error": "..."}')

CR returns Allow, Warn, or Block based on project rules.
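A retry rule is a simple case to reason about. The sketch below assumes a JSON threshold of the form `{"max": 3}` and an invented policy (warn on the last permitted attempt, block beyond it); `check_retry` and `Verdict` are hypothetical names, not CR's rule engine.

```python
import json
from enum import Enum


class Verdict(Enum):
    ALLOW = "Allow"
    WARN = "Warn"
    BLOCK = "Block"


def check_retry(attempts: int, rule_threshold: str) -> Verdict:
    """Evaluate a retry count against a JSON threshold such as '{"max": 3}'."""
    max_retries = json.loads(rule_threshold)["max"]
    if attempts > max_retries:
        return Verdict.BLOCK
    if attempts == max_retries:
        return Verdict.WARN  # assumed policy: last permitted attempt gets a warning
    return Verdict.ALLOW
```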

The key design point: guardrails adapt to risk. Not every task needs the same ceremony:

Lane       Tier  Behavior
tiny       T0    Skip PDCA entirely; just flip status to done
normal     T1    Lightweight cycle check, evidence validation relaxed
high_risk  T2    Full governance: evidence, review, drift, rail-check

This means a typo fix doesn't go through the same pipeline as an architecture change. The agent picks the lane when adding the task; CR handles the rest.
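The lane table can be read as a lookup from lane to the checks that apply. In the sketch below the lane and tier names come from the table, while the field names (`pdca`, `evidence`, `review`) and their values are assumptions made for illustration.

```python
# Lane and tier names follow the table; the per-lane fields are hypothetical.
LANE_POLICY = {
    "tiny": {"tier": "T0", "pdca": False, "evidence": "none", "review": False},
    "normal": {"tier": "T1", "pdca": True, "evidence": "relaxed", "review": False},
    "high_risk": {"tier": "T2", "pdca": True, "evidence": "strict", "review": True},
}


def policy_for(lane: str) -> dict:
    """Return the checks that apply to a task in the given lane."""
    try:
        return LANE_POLICY[lane]
    except KeyError:
        raise ValueError(f"unknown lane: {lane!r}") from None
```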

5. Done — evidence is harvested, not asked for

When the agent calls done(), CR does not ask "please attach evidence." It harvests evidence automatically — git diffs, test results, file changes — and records it to the PDCA cycle.

If evidence is insufficient, the task is flagged needs_evidence — visible in the next session_start() or focus() call. The agent can supplement it or the human can override.

Evidence is validated before it satisfies a required rule. A failing test result or empty manual check does not count as proof.
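The validation rule can be sketched as a filter applied before required evidence is counted. The evidence kinds, dictionary fields, and function names below are hypothetical; only the behavior (a failing test or an empty manual check does not count, and insufficient evidence yields `needs_evidence`) comes from the text above.

```python
def is_valid_evidence(item: dict) -> bool:
    """Decide whether one evidence item counts as proof."""
    kind = item.get("kind")
    if kind == "test_result":
        return item.get("passed") is True  # a failing run is not proof
    if kind == "manual_check":
        return bool(str(item.get("notes", "")).strip())  # empty notes do not count
    if kind in ("git_diff", "file_change"):
        return bool(item.get("paths"))
    return False


def task_status(required_kinds: list, evidence: list) -> tuple:
    """Return ("done", []) or ("needs_evidence", [missing kinds])."""
    valid = {e["kind"] for e in evidence if is_valid_evidence(e)}
    missing = [k for k in required_kinds if k not in valid]
    return ("done", []) if not missing else ("needs_evidence", missing)
```

Validating first and counting second means a task cannot be closed by attaching an item of the right kind with the wrong outcome.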

done() also returns the next task with its context — so the agent can continue without a separate focus() call.

6. Review — lessons become future work

After a phase, the agent or human runs a retro:

retro(phase_id="...")

Then turns improvements into next-phase tasks:

retro(action="act")

Lessons don't stay as passive notes. They affect future tasks, risks, rules, and reviews. Decisions are recorded with decision_create() so direction changes are visible and queryable.

7. Resume — next session picks up where this one left off

CR stores project state locally in SQLite. The next session recovers:

  • current project (via workspace binding);
  • active plan contract and autonomy run;
  • PDCA cycle state, rail-check result;
  • active phase, task queue, blocked work;
  • notes, lessons, risks, artifacts, rules, review state, retros.

This makes CR useful for multi-day, multi-session, and multi-agent work.


How Context-Rail adapts

Governance on/off

Not every project needs heavy governance. project_create(governance=False) switches to lightweight mode:

  • check_action returns Allow or Warn — never Block;
  • PDCA runs non-blocking — warnings instead of gates;
  • direction guard is off;
  • auto-evidence harvesting still works (always on).
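The first bullet amounts to a downgrade step applied to every verdict. A minimal sketch, assuming verdicts are plain strings; `apply_governance` is a hypothetical helper, not part of CR's API.

```python
def apply_governance(verdict: str, governance: bool) -> str:
    """Downgrade Block to Warn when the project runs in lightweight mode."""
    if verdict not in ("Allow", "Warn", "Block"):
        raise ValueError(f"unknown verdict: {verdict!r}")
    if not governance and verdict == "Block":
        return "Warn"
    return verdict
```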

This is the "OS background" metaphor: CR runs quietly, records what happened, warns when something looks off — but doesn't get in the way.

Intent tree — the living plan

Plans are not frozen at approval. CR maintains an intent tree — a living structure of what the project is trying to achieve:

intent_plant(title="User auth", intent="Secure login without friction",
             success_criteria="2FA optional, <3s login", non_goals="No SSO")
intent_list()
intent_check(task_title="Add OAuth provider")  →  matches or warns off-plan

When a new task doesn't match any intent node, CR warns. The tree grows with the project — plant new intents as scope evolves.
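How CR decides that a task matches an intent is not described here. Purely to make the idea concrete, the sketch below uses naive keyword overlap, which is an assumption and almost certainly simpler than the real check; `IntentNode` and `intent_check_sketch` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass
class IntentNode:
    title: str
    intent: str
    non_goals: str = ""


def _words(text: str) -> set:
    """Lowercased words longer than two characters."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 2}


def intent_check_sketch(task_title: str, nodes: list) -> str:
    """Match a task title against intent nodes by shared keywords."""
    task_words = _words(task_title)
    for node in nodes:
        if task_words & _words(node.non_goals):
            return f"warn: touches non-goal of '{node.title}'"
        if task_words & _words(node.title + " " + node.intent):
            return f"match: {node.title}"
    return "warn: off-plan"
```

Keyword overlap is brittle: it does not relate "OAuth" to "auth", so a task like "Add OAuth provider" would be reported as off-plan by this sketch.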

Context items — knowledge that sticks

CR has a scoped knowledge store (context_items) that attaches to tasks, phases, goals, and roadmaps:

context_add(scope_type="task", scope_id="...", content="API uses Bearer token", kind="reference")
context_pack(task_id="...")     →  relevant items for that task
context_wiki()                  →  browse all knowledge grouped by kind
context_verify(id="...")        →  refresh stale items
context_health()                →  coverage and freshness report

Context items have a lifecycle: seed → grow → maintain → serve. Stale items are flagged. The agent uses context_apply() to mark items as used, keeping the knowledge loop alive.
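Staleness flagging can be sketched as an age check against the last verification time. The 30-day window, the `verified_at` field, and both function names are assumptions for illustration; CR's actual freshness rules are not specified here.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=30)  # assumed freshness window


def stale_items(items: list, now: datetime, max_age: timedelta = MAX_AGE) -> list:
    """Return ids of items whose last verification is older than max_age."""
    return [item["id"] for item in items if now - item["verified_at"] > max_age]


def verify(item: dict, now: datetime) -> dict:
    """Refresh an item by stamping it as verified now."""
    item["verified_at"] = now
    return item
```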

Roadmaps — project genesis

Roadmaps are the entry point for every project — high-level initiatives, milestones, or bets that define the vision. Each roadmap carries:

  • intent — why it matters strategically;
  • success_criteria — what done looks like;
  • background — context for the initiative;
  • non_goals — what's explicitly out of scope;
  • stakeholders_note — who cares and why;
  • progress — auto-rolled up from phases → tasks.

Stage 0 of the brainstorm conductor asks for initiatives before phases. Roadmaps link to phases; phases link to tasks; completed tasks roll progress up to the roadmap automatically.
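The roll-up can be sketched as two averages: done tasks per phase, then phases per roadmap. Treating every phase as equally weighted is an assumption of this sketch, and the function names are hypothetical.

```python
def phase_progress(task_statuses: list) -> float:
    """Fraction of tasks in a phase whose status is "done"."""
    if not task_statuses:
        return 0.0
    return sum(1 for s in task_statuses if s == "done") / len(task_statuses)


def roadmap_progress(phases: dict) -> float:
    """Unweighted mean of phase progress; phases maps name -> task statuses."""
    if not phases:
        return 0.0
    return sum(phase_progress(t) for t in phases.values()) / len(phases)
```

For example, a roadmap with one finished phase and one half-finished phase comes out at 0.75 under this weighting.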

Multi-agent lanes

Different agents can have narrower rails:

rail_policy_set(agent_id="ui-bot", task_tags="ui")
rail_policy_set(agent_id="api-bot", task_tags="backend")

An agent-specific rail can restrict what that agent touches. It cannot loosen stop rules, remove evidence requirements, or bypass phase gates.
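The "restrict but never loosen" property can be sketched with set operations: allowed tags are intersected, stop rules are unioned, and an evidence requirement can only be switched on. The dictionary layout and `effective_rail` are hypothetical and only illustrate the constraint.

```python
def effective_rail(project: dict, agent: dict) -> dict:
    """Combine a project rail with an agent rail; the result is never looser."""
    agent_tags = agent.get("task_tags")
    if agent_tags is None:
        tags = set(project["task_tags"])
    else:
        tags = project["task_tags"] & agent_tags  # agent can only narrow scope
    return {
        "task_tags": tags,
        # Stop rules accumulate; an agent rail cannot drop a project rule.
        "stop_rules": project["stop_rules"] | agent.get("stop_rules", set()),
        # Evidence can be added by the agent rail, never removed.
        "evidence_required": project["evidence_required"]
        or agent.get("evidence_required", False),
    }
```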

runtime_status() is the single source of truth — it reports whether the agent can proceed and, if not, an ordered list of blockers across all enforcement layers (rail-check, PDCA, lease, drift).
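An ordered blocker list could be assembled by walking the enforcement layers in a fixed sequence. The layer order below simply follows the order in which the layers are named above, and the return shape is invented for this sketch; neither is confirmed as what `runtime_status()` actually returns.

```python
# Assumed order, taken from the sentence above.
LAYER_ORDER = ("rail-check", "pdca", "lease", "drift")


def runtime_status_sketch(layer_results: dict) -> dict:
    """layer_results maps a layer name to a blocking reason, or None if clear."""
    blockers = [
        {"layer": layer, "reason": layer_results[layer]}
        for layer in LAYER_ORDER
        if layer_results.get(layer)
    ]
    return {"can_proceed": not blockers, "blockers": blockers}
```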

Unified configuration

One tool replaces six:

configure(action="update_project", name="...", budget="high")
configure(action="add_rule", rule_kind="retry", rule_threshold='{"max": 3}')
configure(action="list_rules")
configure(action="remove_rule", rule_id="...")
configure(action="reset_session")

For humans

CR answers the questions a human actually has:

What are we building?           → roadmap story
Where are we now?               → phase + progress
What should happen next?        → next task
What is blocked?                → warnings in session_start
What needs review?              → review state on tasks
What did we learn?              → lessons, retros, decisions
What should change next phase?  → retro → act

The human does not micromanage every tool call. They review direction, approve plans, resolve blockers, and keep authority over important decisions.


For agents

1. session_start(agent_id="...", workdir="/path/to/repo")
   If unbound, call project_login() to bind the workspace.

2. Read the response — project, phase, next task, context, warnings, governance.

3. Call autonomy_start() to activate the PDCA evidence loop and lease tracking.
   (Without this, done() and block() skip the evidence gate.)

4. focus() on the next task. Pass agent_id if the harness has identity.

5. Work freely. Save findings with note(). Save lessons with learn().
   Record direction changes with decision_create().

6. done() when complete — evidence is auto-harvested. Next task is returned.
   block() when stuck or a stop rule fires.

7. retro() when a phase ends. Convert improvements into future tasks.

If unsure whether a stop is in effect, call runtime_status().
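The steps above can be sketched as a loop against a stand-in client. `FakeRail` is an in-memory stub written for this example: the tool names match the list, but the return shapes are invented and the real tools are reached through MCP rather than a Python object.

```python
class FakeRail:
    """In-memory stand-in for the CR tools; return shapes are invented."""

    def __init__(self, tasks):
        self.queue = list(tasks)
        self.calls = []

    def session_start(self, agent_id, workdir):
        self.calls.append("session_start")
        return {"bound": True, "next_task": self.queue[0] if self.queue else None}

    def autonomy_start(self):
        self.calls.append("autonomy_start")

    def focus(self, agent_id):
        self.calls.append("focus")
        return self.queue[0] if self.queue else None

    def done(self):
        self.calls.append("done")
        self.queue.pop(0)
        return self.queue[0] if self.queue else None  # next task, if any


def run_session(rail, agent_id, workdir, do_work):
    """Drive one session: start, activate autonomy, focus, then work until empty."""
    rail.session_start(agent_id=agent_id, workdir=workdir)
    rail.autonomy_start()
    task = rail.focus(agent_id=agent_id)
    finished = []
    while task is not None:
        do_work(task)
        finished.append(task)
        task = rail.done()  # done() hands back the next task, so no extra focus()
    return finished
```

Note that `focus()` is called once; after that the loop relies on `done()` returning the next task, as described in step 6.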

Core objects

Project
  ├─ Roadmap (vision, intent, success criteria, progress rollup)
  │   └─ Phase
  │       └─ Task (lane: tiny | normal | high_risk)
  │           ├─ Context items (scoped knowledge)
  │           ├─ Dependencies
  │           ├─ Handoffs
  │           ├─ Active agent sessions
  │           └─ Review state
  ├─ Goals ←→ Tasks (N:N links)
  ├─ Intent nodes (living plan tree)
  ├─ Plan contracts (approved plan → execution binding)
  ├─ Autonomy runs / PDCA cycles (lease, evidence, drift)
  ├─ Categories
  │   ├─ Risks
  │   ├─ Lessons
  │   └─ Artifacts
  ├─ Rules
  ├─ Gates
  └─ Retros
Diagram: data model
erDiagram
    PROJECT ||--o{ ROADMAP : defines
    ROADMAP ||--o{ PHASE : scopes
    PROJECT ||--o{ PHASE : contains
    PHASE ||--o{ TASK : tracks
    PROJECT ||--o{ GOAL : defines
    GOAL }o--o{ TASK : links
    PROJECT ||--o{ INTENT_NODE : intends
    PROJECT ||--o{ PLAN_CONTRACT : binds
    PROJECT ||--o{ AUTONOMY_RUN : executes
    AUTONOMY_RUN ||--o{ PDCA_CYCLE : cycles
    TASK ||--o{ CONTEXT_ITEM : carries
    TASK ||--o{ AGENT_WORK_SESSION : has
    PROJECT ||--o{ RULE : guards
    PROJECT ||--o{ GATE : reviews
    PHASE ||--o{ RETRO : reviews
    CATEGORY ||--o{ RISK : owns
    CATEGORY ||--o{ LESSON : owns
    CATEGORY ||--o{ ARTIFACT : owns
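The N:N link between goals and tasks in the diagram is typically stored as a join table. The SQLite schema below is a hypothetical sketch of that one relationship, with invented table and column names; it is not CR's actual schema.

```python
import sqlite3

# Hypothetical schema covering only goals, tasks, and their N:N link.
SCHEMA = """
CREATE TABLE goal (id TEXT PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE task (
    id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    lane TEXT NOT NULL CHECK (lane IN ('tiny', 'normal', 'high_risk'))
);
CREATE TABLE goal_task (
    goal_id TEXT NOT NULL REFERENCES goal(id),
    task_id TEXT NOT NULL REFERENCES task(id),
    PRIMARY KEY (goal_id, task_id)
);
"""


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def tasks_for_goal(conn: sqlite3.Connection, goal_id: str) -> list:
    """Return the ids of tasks linked to a goal, sorted by id."""
    rows = conn.execute(
        "SELECT t.id FROM task t JOIN goal_task gt ON gt.task_id = t.id "
        "WHERE gt.goal_id = ? ORDER BY t.id",
        (goal_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

The `CHECK` constraint mirrors the three lanes in the object tree, so an unknown lane is rejected at insert time.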

Useful commands

ctx-rail init       # Initialize database + AGENTS.md + skills
ctx-rail serve      # Start MCP server (stdio)
ctx-rail config     # Print MCP client config (correct Python path)
ctx-rail config --client codex  # TOML format for Codex
ctx-rail status     # Show active project, DB location, counts
ctx-rail doctor     # Health check: DB, schema, skills, AGENTS.md, MCP config
ctx-rail web        # Start web platform (requires [web] install)
ctx-rail update     # Upgrade package + refresh AGENTS.md + skills

What Context-Rail is not

  • a code indexer;
  • a semantic search engine;
  • a replacement for Git, Jira, or Linear;
  • a replacement for Cursor, Claude Code, Codex, or GitNexus.

Context-Rail sits beside those tools as the agent work rail. Use code intelligence tools to understand the codebase. Use Context-Rail to keep the project plan, task flow, reviews, risks, lessons, and execution memory on track.


License

MIT
