A coding agent that learns from your corrections in real time via on-policy self-distillation

continualcode

A coding agent that learns from your corrections in real time. Built on Tinker.

When you deny a tool call and give feedback, the model treats your correction as context for teaching itself through on-policy self-distillation, takes one gradient step on its LoRA weights, and retries with the updated weights.

You: "fix the test"
Agent: write(test.py, ...)       # overwrites the file
You: n → "use edit_lines; don't overwrite"
  → SDPO update runs immediately
  → agent retries with updated weights
Agent: edit_lines(test.py, 14, 17, ...)
You: y
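
The exchange above can be read as a small control loop. The following is a minimal sketch, not the package's actual code: `propose`, `review`, and `train_on_correction` are hypothetical callables standing in for the model's sampler, the approve/deny prompt, and the SDPO update, and `max_retries` is an assumed safety limit.

```python
from typing import Callable, Optional, Tuple

# All names below are hypothetical stand-ins, not continualcode's real API.
ToolCall = str
Review = Tuple[bool, Optional[str]]  # (approved, correction text if denied)


def run_with_corrections(
    task: str,
    propose: Callable[[str], ToolCall],
    review: Callable[[ToolCall], Review],
    train_on_correction: Callable[[str, ToolCall, str], None],
    max_retries: int = 3,
) -> Optional[ToolCall]:
    """Propose a tool call; on each denial, train once and retry."""
    for _ in range(max_retries + 1):
        call = propose(task)
        approved, correction = review(call)
        if approved:
            return call
        if correction:
            # One update per denial, so the next proposal uses new weights.
            train_on_correction(task, call, correction)
    return None  # gave up after max_retries denied retries
```

Passing the three steps in as callables keeps the loop independent of any particular model or training backend, which also makes it easy to exercise with stubs.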

Install

pip install continualcode
export TINKER_API_KEY=<your-key>
continualcode

How it works

Four feedback types, one training signal. Your correction becomes privileged context for a self-teacher: the same model with richer input. The per-token KL divergence between teacher and student gives a dense training signal, O(N) bits per correction rather than O(1). The agent then takes one gradient step on LoRA and retries with the updated weights.
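
To make the per-token signal concrete, here is a minimal sketch under stated assumptions. It assumes the student scores its own sampled tokens from the original prompt, the teacher scores the same tokens with the correction added to its prompt, and the per-token signal is the log-probability gap (the negative of a single-sample reverse-KL estimate). The function names are hypothetical and the numbers are made up for illustration; the package's `train.py` may compute this differently.

```python
import math
from typing import List, Sequence


def per_token_advantages(
    student_logprobs: Sequence[float],
    teacher_logprobs: Sequence[float],
) -> List[float]:
    """log p_teacher(y_t) - log p_student(y_t) for each sampled token y_t."""
    if len(student_logprobs) != len(teacher_logprobs):
        raise ValueError("both models must score the same token sequence")
    return [t - s for s, t in zip(student_logprobs, teacher_logprobs)]


def reverse_kl_estimate(advantages: Sequence[float]) -> float:
    """Mean of log p_student - log p_teacher over the sampled tokens."""
    return -sum(advantages) / len(advantages)


# Toy numbers (assumed): the teacher, having seen the correction, finds
# token 0 less likely and token 1 more likely than the student does.
student = [math.log(0.8), math.log(0.1), math.log(0.5)]
teacher = [math.log(0.2), math.log(0.2), math.log(0.5)]
advantages = per_token_advantages(student, teacher)
```

Each token gets its own number, which is what makes the signal dense: a negative value marks a token the teacher finds less likely once it has seen the correction, and a positive value marks one it finds more likely.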

Why not DPO / GRPO / PPO / SFT

DPO needs preference pairs and assigns no per-token credit. GRPO needs 64 samples per prompt, which is impractical for an interactive CLI. PPO doubles memory with a critic network. SFT on corrections is off-policy and causes catastrophic forgetting. Self-distillation is the one approach that combines all four properties: a dense signal, on-policy training, no extra models, and mode-seeking stability.

Code layout

train.py — SDPO core: teacher prompt, logprob scoring, IS update, sampler refresh
tui.py — interactive CLI: approve/deny/edit, correction prompt, /metrics
tools.py — tool implementations + structured feedback
benchmarks/auto_train.py — automated training loop (LCB, multi-rollout GRPO + SDPO)
demo/ — deny → train → retry end-to-end

References

SDPO — Hübotter et al. 2026
SDFT — Shenfeld, Damani, Guestrin 2026
GKD — Agarwal et al. 2023
Tinker — training API

Release history

0.6.0 (this version): Feb 8, 2026
0.2.0: Feb 5, 2026
0.1.0: Feb 3, 2026

Download files

Source Distribution

continualcode-0.6.0.tar.gz (10.6 MB)

Uploaded Feb 8, 2026 Source

Built Distribution

continualcode-0.6.0-py3-none-any.whl (37.5 kB)

Uploaded Feb 8, 2026 Python 3

File details

Details for the file continualcode-0.6.0.tar.gz.

File metadata

Download URL: continualcode-0.6.0.tar.gz
Upload date: Feb 8, 2026
Size: 10.6 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for continualcode-0.6.0.tar.gz
Algorithm	Hash digest
SHA256	`f974e4e3a9ee9ee4a05bff4dddbc17ffe6afe2da0bfc765b829532b0e3b3f2e0`
MD5	`238a2f073f330d5d2e6def09d76584c0`
BLAKE2b-256	`d50c6325a31fc1cceba5ac58620edb116a85fca5a006ad7c445525a50dd03002`

File details

Details for the file continualcode-0.6.0-py3-none-any.whl.

File metadata

Download URL: continualcode-0.6.0-py3-none-any.whl
Upload date: Feb 8, 2026
Size: 37.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for continualcode-0.6.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0da9b056f14fc66580522ec31a04dfd25cef89e1541f994009f4ef0e367aa31a`
MD5	`ea89be238a393b14b7318f7f422ae633`
BLAKE2b-256	`8025c928a83cd9ef54596a74e4f53d07b7913184e660d64ed95afa5f1e410fa1`
