Python implementation of Git-Context-Controller (GCC)
contexa
contexa is a Python implementation of the Git Context Controller (GCC) framework -- Git-inspired context management for LLM agents.
Based on: arXiv:2508.00031 -- "Git Context Controller: Manage the Context of LLM-based Agents like Git" (Junde Wu et al., 2025)
Table of Contents
- The Problem
- How GCC Solves It
- Installation
- Quick Start
- Core Concepts
- API Reference
- Directory Structure
- Data Models
- Real-World Example
- Running Tests
- Contributing
- Requirements
- License
- Citation
- Links
The Problem
LLM-based agents (like coding assistants, research agents, or autonomous planners) accumulate observations, thoughts, and actions over time. But context windows are finite. As conversations grow, agents lose track of earlier reasoning, repeat mistakes, or forget prior decisions.
Current approaches either:
- Dump the entire history into the prompt (expensive, hits token limits)
- Use simple summarization (loses critical details)
- Have no structured way to explore alternative strategies
How GCC Solves It
GCC borrows Git's branching model to give agents structured, versioned memory:
main
|
init ──> log OTA ──> COMMIT ──> COMMIT ──> MERGE <──┐
                              |                     |
                              BRANCH ──> COMMIT ────┘
                              (experiment)
| Concept | Git Equivalent | What It Does |
|---|---|---|
| OTA Log | Working directory | Continuous trace of Observation-Thought-Action cycles |
| COMMIT | git commit | Saves a milestone summary, compressing older OTA steps |
| BRANCH | git branch | Creates an isolated workspace for alternative reasoning |
| MERGE | git merge | Integrates a successful branch back into main |
| CONTEXT | git log | Retrieves historical context at varying resolutions (K commits) |
The key insight from the paper: by controlling how much history the agent sees (the K parameter in CONTEXT), you can balance between detailed recent context and compressed older summaries.
Installation
pip install contexa
Or with uv:
uv add contexa
Quick Start
from contexa import GCCWorkspace
# 1. Initialize a workspace
ws = GCCWorkspace("/path/to/project")
ws.init("Build a REST API service with user auth")
# 2. Agent logs its reasoning as it works
ws.log_ota(
    observation="Project directory is empty",
    thought="Need to scaffold the project structure first",
    action="create_files(['main.py', 'requirements.txt', 'models.py'])"
)
ws.log_ota(
    observation="Files created successfully",
    thought="Now implement the user model",
    action="write_code('models.py', user_model_code)"
)
# 3. Commit a milestone (compresses OTA history)
ws.commit("Project scaffold and User model complete")
# 4. Branch to explore an alternative approach
ws.branch("auth-jwt", "Explore JWT-based authentication instead of sessions")
ws.log_ota("Reading JWT docs", "JWT is stateless, good for APIs", "implement_jwt()")
ws.commit("JWT auth middleware implemented")
# 5. Merge the successful branch back
ws.merge("auth-jwt")
# 6. Retrieve context for the agent's next step
ctx = ws.context(k=1) # K=1: only the most recent commit (paper default)
print(ctx.summary())
Core Concepts
1. OTA Logging (Observation-Thought-Action)
Every reasoning step an agent takes is an OTA cycle. These are logged continuously in log.md:
rec = ws.log_ota(
    observation="API returns 500 error on /users endpoint",
    thought="The database connection might not be initialized",
    action="check_db_connection()"
)
print(rec.step) # 1 (auto-incremented)
print(rec.timestamp) # 2025-03-04T12:00:00+00:00
This produces a markdown entry:
### Step 1-2025-03-04T12:00:00+00:00
**Observation:** API returns 500 error on /users endpoint
**Thought:** The database connection might not be initialized
**Action:** check_db_connection()
--------
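The entry layout above can be reproduced with a small formatter. The sketch below is a stand-in rather than contexa's own code: the `OTAEntry` class and its `to_markdown` method are hypothetical names, and the layout is copied from the sample entry above.

```python
from dataclasses import dataclass


@dataclass
class OTAEntry:
    """Hypothetical stand-in for one Observation-Thought-Action step."""
    step: int
    timestamp: str
    observation: str
    thought: str
    action: str

    def to_markdown(self) -> str:
        # Mirror the layout of the sample log entry shown above.
        return "\n".join([
            f"### Step {self.step}-{self.timestamp}",
            f"**Observation:** {self.observation}",
            f"**Thought:** {self.thought}",
            f"**Action:** {self.action}",
            "--------",
        ])


entry = OTAEntry(
    step=1,
    timestamp="2025-03-04T12:00:00+00:00",
    observation="API returns 500 error on /users endpoint",
    thought="The database connection might not be initialized",
    action="check_db_connection()",
)
print(entry.to_markdown())
```

Appending the returned string to a file would give a log in the same shape as log.md, which keeps the trace readable without any tooling.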
2. COMMIT - Save Milestones
When the agent reaches a significant checkpoint, commit it. This creates a structured summary that can be retrieved later without replaying every OTA step:
commit = ws.commit(
    contribution="Fixed database connection and /users endpoint now returns 200",
    update_roadmap="Database layer is stable, move to auth next"  # optional
)
print(commit.commit_id) # "a3f2b1c4" (8-char UUID)
print(commit.branch_name) # "main"
If previous_summary is not passed, the commit's previous_progress_summary field is auto-populated from the last commit.
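One way to picture the auto-populated summary is the sketch below. It is not contexa's implementation: `Commit` and `make_commit` are hypothetical, and reusing the last commit's contribution as the previous summary is an assumption about what "auto-populated from the last commit" means. The 8-character ID is taken here from the start of a random UUID, which is also an assumption.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Commit:
    """Hypothetical commit record; field names follow the Data Models table."""
    commit_id: str
    this_commit_contribution: str
    previous_progress_summary: str


def make_commit(history, contribution, previous_summary=None):
    """Append a commit, defaulting the previous summary to the last commit."""
    if previous_summary is None:
        # Assumption: reuse the last commit's contribution as the summary.
        previous_summary = history[-1].this_commit_contribution if history else ""
    commit = Commit(
        commit_id=uuid.uuid4().hex[:8],  # first 8 hex chars of a random UUID
        this_commit_contribution=contribution,
        previous_progress_summary=previous_summary,
    )
    history.append(commit)
    return commit


history = []
first = make_commit(history, "Project scaffold complete")
second = make_commit(history, "Fixed database connection")
print(second.previous_progress_summary)  # Project scaffold complete
```

Passing `previous_summary` explicitly overrides the default, which mirrors the optional parameter in the call above.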
3. BRANCH - Explore Alternatives
When an agent wants to explore a different strategy without risking the main trajectory:
# Creates isolated workspace with fresh OTA log
ws.branch("redis-cache", "Try Redis caching instead of in-memory")
# Agent works in the branch
ws.log_ota("Redis docs reviewed", "Need redis-py package", "pip_install('redis')")
ws.commit("Redis caching layer implemented")
# Check what branches exist
print(ws.list_branches()) # ['main', 'redis-cache']
print(ws.current_branch) # 'redis-cache'
Each branch gets its own:
- log.md -- fresh OTA trace (no carry-over from parent)
- commit.md -- independent commit history
- metadata.yaml -- records why the branch was created and from where
4. MERGE - Integrate Results
When a branch's exploration succeeds, merge it back:
merge_commit = ws.merge("redis-cache", target="main")
# - Appends the branch's OTA trace to main's log
# - Creates a merge commit on main
# - Marks the branch as "merged" in its metadata
After merging, ws.current_branch automatically switches back to the target.
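The merge bookkeeping described above can be modeled in memory. This is a toy sketch under stated assumptions, not the library's code: `Branch` and `ToyWorkspace` are hypothetical classes, commits are plain strings, and the initial status value "active" is a guess (the source only says a merged branch is marked "merged").

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """Hypothetical in-memory branch: an OTA log, commits, and a status."""
    name: str
    log: list = field(default_factory=list)
    commits: list = field(default_factory=list)
    status: str = "active"  # assumed initial value


class ToyWorkspace:
    """Toy workspace that models only the merge bookkeeping."""

    def __init__(self):
        self.branches = {"main": Branch("main")}
        self.current_branch = "main"

    def branch(self, name):
        # New branches start with an empty log and become the active branch.
        self.branches[name] = Branch(name)
        self.current_branch = name

    def merge(self, name, target="main"):
        source, dest = self.branches[name], self.branches[target]
        dest.log.extend(source.log)                         # append the OTA trace
        dest.commits.append(f"Merge {name} into {target}")  # record a merge commit
        source.status = "merged"                            # mark the source branch
        self.current_branch = target                        # switch back to target
        return dest.commits[-1]


toy = ToyWorkspace()
toy.branches["main"].log.append("scaffold project")
toy.branch("redis-cache")
toy.branches["redis-cache"].log.append("install redis-py")
merge_commit = toy.merge("redis-cache")
print(toy.current_branch, toy.branches["main"].log)
```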
5. CONTEXT - Retrieve History
The CONTEXT command is the agent's way of "remembering". The K parameter controls resolution:
# K=1: Only the most recent commit (paper's recommended default)
ctx = ws.context(k=1)
# K=3: Last 3 commits for more detailed history
ctx = ws.context(k=3)
# Access the structured result
print(ctx.branch_name) # "main"
print(ctx.main_roadmap) # Global project roadmap from main.md
print(ctx.commits) # List of last K CommitRecord objects
print(ctx.ota_records) # All OTA records on the branch
print(ctx.metadata) # BranchMetadata object
# Get a formatted markdown summary ready to inject into an LLM prompt
prompt_context = ctx.summary()
The paper's experiments (Table 2, Section 4) show that K=1 performs best in most benchmarks -- agents do better with compressed recent context than with full history dumps.
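The effect of K can be illustrated without the library. In the sketch below, `build_context` is a hypothetical helper that treats commits as plain strings and keeps only the K most recent ones next to the roadmap; the real `ctx.summary()` output format is not documented here and may differ.

```python
def build_context(roadmap, commits, k=1):
    """Return a Markdown summary of the roadmap plus the last k commits."""
    if k < 1:
        raise ValueError("k must be at least 1")
    recent = commits[-k:]  # the k most recent commits, oldest first
    lines = ["## Roadmap", roadmap, "", f"## Last {len(recent)} commit(s)"]
    lines += [f"- {summary}" for summary in recent]
    return "\n".join(lines)


commits = [
    "Project scaffold with Flask + SQLAlchemy models",
    "Session auth prototype",
    "Blog post CRUD with auth-protected routes",
]
print(build_context("Build a Flask web app", commits, k=1))
print(build_context("Build a Flask web app", commits, k=3))
```

A smaller K yields a shorter prompt, so the agent spends fewer tokens on history at the cost of seeing fewer milestones.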
API Reference
GCCWorkspace
| Method | Parameters | Returns | Description |
|---|---|---|---|
| __init__ | project_root: str | -- | Set the project root directory |
| init | project_roadmap: str = "" | None | Create .GCC/ structure with main branch |
| load | -- | None | Load an existing workspace |
| log_ota | observation, thought, action | OTARecord | Append OTA step to current branch |
| commit | contribution, previous_summary=None, update_roadmap=None | CommitRecord | Create milestone checkpoint |
| branch | name, purpose | GCCWorkspace | Create and switch to new branch |
| merge | branch_name, summary=None, target="main" | CommitRecord | Merge branch into target |
| context | branch=None, k=1 | ContextResult | Retrieve historical context |
| switch_branch | name | None | Switch active branch |
| list_branches | -- | list[str] | List all branch names |
| update_roadmap | content | None | Append to global roadmap |
| current_branch | (property) | str | Get current active branch name |
Directory Structure
When you call ws.init(), the following structure is created on disk:
your-project/
  .GCC/
    main.md                  # Global roadmap / planning artifact
    branches/
      main/
        log.md               # Continuous OTA trace
        commit.md            # Milestone-level commit summaries
        metadata.yaml        # Branch intent, status, creation info
      feature-branch/        # Created by ws.branch()
        log.md               # Independent OTA trace
        commit.md            # Independent commit history
        metadata.yaml        # Why this branch exists
All data is stored as human-readable Markdown and YAML -- you can inspect and debug the agent's memory directly in your editor.
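Because the layout is plain files and folders, it can be inspected with nothing but the standard library. The sketch below recreates the tree with empty placeholder files in a temporary directory and then lists branches by scanning the disk. `init_layout` and `list_branch_dirs` are hypothetical helpers, not part of contexa, and the real `ws.init()` writes actual content rather than empty files.

```python
import tempfile
from pathlib import Path

BRANCH_FILES = ("log.md", "commit.md", "metadata.yaml")


def init_layout(project_root: Path, branch_names=("main",)) -> Path:
    """Create the .GCC tree shown above with empty placeholder files."""
    gcc = project_root / ".GCC"
    gcc.mkdir(parents=True)
    (gcc / "main.md").write_text("", encoding="utf-8")
    for name in branch_names:
        branch_dir = gcc / "branches" / name
        branch_dir.mkdir(parents=True)
        for filename in BRANCH_FILES:
            (branch_dir / filename).write_text("", encoding="utf-8")
    return gcc


def list_branch_dirs(gcc: Path) -> list[str]:
    """List branch names by scanning .GCC/branches/ on disk."""
    return sorted(p.name for p in (gcc / "branches").iterdir() if p.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    gcc = init_layout(Path(tmp), branch_names=("main", "feature-branch"))
    found = list_branch_dirs(gcc)
    # Collect every file path relative to .GCC/ for a quick overview.
    tree = sorted(p.relative_to(gcc).as_posix() for p in gcc.rglob("*") if p.is_file())
    print(found)
    print(tree)
```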
Data Models
| Class | Description | Key Fields |
|---|---|---|
| OTARecord | Single Observation-Thought-Action cycle | timestamp, observation, thought, action, step |
| CommitRecord | Milestone commit snapshot | commit_id, branch_name, branch_purpose, previous_progress_summary, this_commit_contribution, timestamp |
| BranchMetadata | Branch creation intent and status | name, purpose, created_from, created_at, status, merged_into, merged_at |
| ContextResult | Result of CONTEXT retrieval | branch_name, k, commits, ota_records, main_roadmap, metadata |
All models support serialization:
from contexa import OTARecord, BranchMetadata
# OTARecord <-> dict
record = OTARecord.from_dict({
    "timestamp": "...",
    "observation": "...",
    # remaining fields (thought, action, step) elided
})
# BranchMetadata <-> YAML
meta = BranchMetadata(
    name="main",
    purpose="Primary trajectory",
    # remaining fields elided
)
yaml_str = meta.to_yaml()
meta_back = BranchMetadata.from_yaml(yaml_str)
# All records can be rendered as Markdown
print(record.to_markdown())
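The dict round trip can be sketched with a plain dataclass. `Step` below is a hypothetical stand-in that borrows the OTARecord field names from the table above; how contexa implements `from_dict` internally is not shown in this README, so treat this as one possible approach.

```python
from dataclasses import asdict, dataclass


@dataclass
class Step:
    """Hypothetical record with the OTARecord fields from the table above."""
    timestamp: str
    observation: str
    thought: str
    action: str
    step: int

    def to_dict(self) -> dict:
        # asdict converts the dataclass into a plain dictionary.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "Step":
        # Unknown or missing keys raise TypeError, surfacing bad input early.
        return cls(**data)


original = Step(
    timestamp="2025-03-04T12:00:00+00:00",
    observation="Files created successfully",
    thought="Now implement the user model",
    action="write_code('models.py', user_model_code)",
    step=2,
)
restored = Step.from_dict(original.to_dict())
print(restored == original)  # True
```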
Real-World Example
Here's how an autonomous coding agent might use contexa to manage its memory while building a web application:
from contexa import GCCWorkspace
ws = GCCWorkspace("./my-webapp")
ws.init("Build a Flask web app with user auth, blog posts, and admin panel")
# === Phase 1: Project Setup ===
ws.log_ota("No project files exist", "Start with Flask boilerplate", "scaffold_project()")
ws.log_ota("Flask app created", "Need database models", "create_models()")
ws.log_ota("Models created", "Database migrations needed", "run_migrations()")
ws.commit("Project scaffold with Flask + SQLAlchemy models")
# === Phase 2: Explore auth strategies in parallel branches ===
# Try JWT auth
ws.branch("auth-jwt", "Explore stateless JWT authentication")
ws.log_ota("JWT docs reviewed", "Good for API, complex for sessions", "implement_jwt()")
ws.commit("JWT auth prototype -- works but session handling is messy")
# Go back and try session auth
ws.switch_branch("main")
ws.branch("auth-session", "Explore Flask-Login session authentication")
ws.log_ota("Flask-Login docs reviewed", "Simple, works well with templates", "implement_sessions()")
ws.commit("Session auth prototype -- clean integration with Flask")
# Session auth won, merge it
ws.merge("auth-session")
# === Phase 3: Continue on main with context ===
ctx = ws.context(k=2) # See last 2 commits: the merge + scaffold
# Feed ctx.summary() to the LLM as its "memory"
ws.log_ota("Auth is done", "Now build blog post CRUD", "implement_blog()")
ws.commit("Blog post CRUD with auth-protected routes")
# The agent always knows where it's been, without replaying everything
Running Tests
# Clone the repository
git clone https://github.com/swadhinbiswas/Cortexa.git
cd Cortexa
# Install dev dependencies and run tests
uv sync
uv run pytest -v
All 13 tests cover the core GCC commands:
test_init_creates_gcc_directory # Workspace initialization
test_log_ota # OTA logging
test_commit # Milestone commits
test_branch_creates_isolated_workspace # Branch creation
test_branch_has_fresh_ota_log # Branch isolation
test_merge_integrates_branch # Branch merging
test_context_k1_returns_last_commit # Context retrieval (K=1)
test_context_k3_returns_last_three # Context retrieval (K=3)
test_context_includes_roadmap # Roadmap in context
test_branch_metadata_records_purpose # Metadata persistence
test_merge_marks_branch_as_merged # Post-merge metadata
test_switch_branch # Branch switching
test_ota_step_increments # Step auto-increment
Contributing
Contributions are welcome! Here's how to get started:
- Fork the repository: https://github.com/swadhinbiswas/Cortexa
- Create a feature branch: git checkout -b feature/my-feature
- Make your changes and add tests
- Run the test suite: uv run pytest -v
- Submit a pull request
Please open an issue first for major changes to discuss the approach.
Requirements
- Python >= 3.10
- PyYAML >= 6.0
No other dependencies. The entire implementation uses Python's standard library (dataclasses, pathlib, uuid, datetime) plus PyYAML for metadata serialization.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Citation
If you use this in research, please cite the original paper:
@article{wu2025gcc,
title={Git Context Controller: Manage the Context of LLM-based Agents like Git},
author={Wu, Junde and others},
journal={arXiv preprint arXiv:2508.00031v2},
year={2025}
}
Links
- GitHub Repository: https://github.com/swadhinbiswas/Cortexa
- PyPI Package: https://pypi.org/project/contexa/
- Issue Tracker: https://github.com/swadhinbiswas/Cortexa/issues
- Original Paper: arXiv:2508.00031v2
- Author: Swadhin Biswas