Engineering notebook for AI-assisted development

Maintainers

peleke

buildlog

A measurable learning loop for AI-assisted work

Track what works. Prove it. Drop what doesn't.

RE: The art. Yes, it's AI-generated. Yes, that's hypocritical for a project about rigor over vibes. Looking for an actual artist to pay for a real logo. If you know someone good, open an issue or DM me. Budget exists.

Read the full documentation

The Problem

Most AI agents do not learn. They execute without retaining context. You can bolt on memory stores and tool routers, but if the system cannot demonstrably improve its decision-making over time, you have a persistent memory store, not a learning system.

Every AI-assisted work session produces a trajectory: goals, decisions, tool uses, corrections, outcomes. Almost all of this is discarded. The next session starts from scratch with the same blind spots.

buildlog exists to close that gap. It captures structured trajectories from real work, extracts decision patterns, and uses statistical methods to select which patterns to surface in future sessions, then measures whether that selection actually reduced mistakes.

buildlog measures whether the system actually got better, and proves it.

How It Works

1. Capture structured work trajectories

Each session is a dated entry documenting what you did, what went wrong, and what you learned. It is a structured record of decisions and outcomes, not a chat transcript.

buildlog init          # scaffold a project
buildlog new my-feature   # start a session
# ... work ...
buildlog commit -m "feat: add auth"
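
As a rough illustration of what "a structured record, not a chat transcript" could look like, here is a minimal Python sketch. The class and field names (`SessionEntry`, `Decision`, `goal`, `lessons`) are assumptions made for this example; they are not buildlog's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class Decision:
    """One decision made during a session and how it turned out."""
    description: str
    outcome: str          # e.g. "worked", "reverted", "corrected"
    correction: str = ""  # what was changed afterwards, if anything


@dataclass
class SessionEntry:
    """Hypothetical shape of a dated session entry (illustrative only)."""
    slug: str
    day: date
    goal: str
    decisions: list[Decision] = field(default_factory=list)
    lessons: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        data = asdict(self)               # recurses into nested dataclasses
        data["day"] = self.day.isoformat()  # dates are not JSON-serializable
        return json.dumps(data, indent=2)


entry = SessionEntry(
    slug="my-feature",
    day=date(2026, 2, 4),
    goal="add auth",
    decisions=[
        Decision(
            description="mocked the HTTP client internals",
            outcome="corrected",
            correction="mock at the boundary instead",
        )
    ],
    lessons=["mock at the boundary, not the implementation"],
)
print(entry.to_json())
```

The point of the sketch is that decisions, outcomes, and corrections are separate fields, so later stages can query them without parsing free text.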

2. Extract decision patterns as seeds

The seed engine watches your development patterns and extracts seeds: atomic observations about what works. A seed might be "always define interfaces before implementations" or "mock at the boundary, not the implementation." Each seed carries a category, a confidence score, and source provenance.

Extraction runs through a pipeline: sources -> extractors -> categorizers -> generators. Extractors range from regex-based (fast, cheap, brittle) to LLM-backed (accurate, expensive). The pipeline deduplicates semantically using embeddings.
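
The pipeline stages can be sketched in a few lines of Python. Everything here is a stand-in chosen for illustration: the `Lesson:` line convention, the keyword categorizer, the fixed confidence of 0.5, and especially `embed`, which uses a bag-of-words vector where a real system would call an embedding model. None of these names come from buildlog itself.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class Seed:
    text: str
    category: str
    confidence: float
    source: str  # provenance: which input the seed came from


# Hypothetical regex extractor: picks up lines of the form "Lesson: ...".
LESSON_RE = re.compile(r"^lesson:\s*(.+)$", re.IGNORECASE | re.MULTILINE)


def extract(log: str) -> list[str]:
    return [match.strip() for match in LESSON_RE.findall(log)]


def categorize(text: str) -> str:
    # Toy keyword categorizer, assumed for the example.
    return "testing" if re.search(r"\b(mock|test)", text, re.IGNORECASE) else "design"


def embed(text: str) -> dict[str, int]:
    # Stand-in for an embedding model: a bag-of-words count vector.
    counts: dict[str, int] = {}
    for word in re.findall(r"[a-z]+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
    return counts


def cosine(a: dict[str, int], b: dict[str, int]) -> float:
    dot = sum(count * b.get(word, 0) for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def run_pipeline(sources: dict[str, str], threshold: float = 0.8) -> list[Seed]:
    """sources -> extractor -> categorizer -> deduplicated seeds."""
    seeds: list[Seed] = []
    for name, log in sources.items():
        for text in extract(log):
            vector = embed(text)
            # Skip anything too similar to a seed we already kept.
            if any(cosine(vector, embed(s.text)) >= threshold for s in seeds):
                continue
            seeds.append(Seed(text, categorize(text), confidence=0.5, source=name))
    return seeds


sources = {
    "a.md": "Lesson: mock at the boundary, not the implementation\n"
            "unrelated note\n"
            "Lesson: define interfaces before implementations",
    "b.md": "lesson: Mock at the boundary not the implementation.",
}
for seed in run_pipeline(sources):
    print(seed.category, "|", seed.text, "|", seed.source)
```

The second file's lesson is dropped as a near-duplicate of the first, which is the behavior the deduplication step is meant to provide.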

3. Select which patterns to surface using Thompson Sampling

Seeds compete for inclusion in your agent's instruction set. The system treats each seed as an arm in a contextual bandit and uses Thompson Sampling to balance exploration (trying under-tested rules) against exploitation (surfacing rules with strong track records).

Each seed maintains a Beta posterior updated by observed outcomes. Over time, the system converges on the rules that actually reduce mistakes in your specific codebase and workflow, not rules that sound good in the abstract.
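
A minimal sketch of the selection step, assuming a binary "helped / did not help" outcome per session and a uniform Beta(1, 1) prior. It is deliberately non-contextual (it ignores any session features), and the `SeedArm`, `select`, and `simulate` names, as well as the simulated success rates, are invented for the example rather than taken from buildlog.

```python
import random
from dataclasses import dataclass


@dataclass
class SeedArm:
    """One seed treated as a bandit arm with a Beta(alpha, beta) posterior."""
    name: str
    alpha: float = 1.0  # 1 + number of times the rule helped
    beta: float = 1.0   # 1 + number of times it did not
    pulls: int = 0

    def update(self, helped: bool) -> None:
        self.pulls += 1
        if helped:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def select(arms: list[SeedArm], k: int, rng: random.Random) -> list[SeedArm]:
    """Thompson step: draw once from each posterior, keep the k largest draws."""
    draws = [(rng.betavariate(arm.alpha, arm.beta), arm) for arm in arms]
    draws.sort(key=lambda pair: pair[0], reverse=True)
    return [arm for _, arm in draws[:k]]


def simulate(true_rates: dict[str, float], rounds: int = 2000, seed: int = 0) -> list[SeedArm]:
    """Run the loop against simulated outcomes (the rates are made up)."""
    rng = random.Random(seed)
    arms = [SeedArm(name) for name in true_rates]
    for _ in range(rounds):
        for arm in select(arms, k=1, rng=rng):
            arm.update(helped=rng.random() < true_rates[arm.name])
    return arms


arms = simulate({"interfaces-first": 0.8, "mock-at-boundary": 0.5, "vague-rule": 0.2})
for arm in arms:
    mean = arm.alpha / (arm.alpha + arm.beta)
    print(f"{arm.name}: pulls={arm.pulls}, posterior mean={mean:.2f}")
```

Arms with little data have wide posteriors, so they still win a draw now and then (exploration); arms with many successes have posteriors concentrated near a high value, so they win most draws (exploitation).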

4. Render to every agent format

Selected rules are written into the instruction files your agents actually read:

- CLAUDE.md (Claude Code)
- .cursorrules (Cursor)
- .github/copilot-instructions.md (GitHub Copilot)
- Windsurf, Continue.dev, generic settings.json

The same knowledge base renders to every agent format.

buildlog skills   # render current policy to agent files
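
To make "one knowledge base, many targets" concrete, here is a small sketch that writes the same rule list to three of the file paths named above. The rendered layout (a heading plus bullets) and the `render_rules` / `write_all` helpers are assumptions for illustration; the real `buildlog skills` output may look different, and a real renderer would likely merge with existing file content rather than overwrite it.

```python
import tempfile
from pathlib import Path

# Paths taken from the list above; the file format below is assumed.
TARGETS = ["CLAUDE.md", ".cursorrules", ".github/copilot-instructions.md"]


def render_rules(rules: list[str]) -> str:
    lines = ["# Selected rules (illustrative format)", ""]
    lines += [f"- {rule}" for rule in rules]
    return "\n".join(lines) + "\n"


def write_all(root: Path, rules: list[str]) -> list[Path]:
    """Write one rendered body to every target path under root."""
    body = render_rules(rules)
    written = []
    for relative in TARGETS:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)  # e.g. .github/
        path.write_text(body, encoding="utf-8")
        written.append(path)
    return written


rules = [
    "always define interfaces before implementations",
    "mock at the boundary, not the implementation",
]
# A temporary directory keeps the demo from touching a real project.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_all(Path(tmp), rules)
    contents = {p.name: p.read_text(encoding="utf-8") for p in paths}

for name in contents:
    print(name)
```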

5. Close the loop with experiments

Track whether the selected rules are working. Run experiments, measure Repeated Mistake Rate (RMR) across sessions, and get statistical evidence, not feelings, about what improved.

buildlog experiment start
# ... work across sessions ...
buildlog experiment end
buildlog experiment report
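
This README does not spell out the RMR formula, so the sketch below assumes one plausible definition: the share of logged mistakes whose tag had already appeared in an earlier session. The function name, the tag strings, and the two toy session lists are all invented for illustration and are not experiment results.

```python
def repeated_mistake_rate(sessions: list[list[str]]) -> float:
    """Fraction of mistakes whose tag was already seen in a previous session.

    `sessions` is an ordered list; each item holds the mistake tags
    logged in that session. This definition is an assumption.
    """
    seen: set[str] = set()
    repeated = 0
    total = 0
    for mistakes in sessions:
        for tag in mistakes:
            total += 1
            if tag in seen:
                repeated += 1
        # Only mark tags as seen once the session is over, so a tag
        # logged twice in the same session is not counted as a repeat.
        seen.update(mistakes)
    return repeated / total if total else 0.0


# Toy inputs, not real measurements.
baseline = [
    ["missing-null-check", "mocked-internals"],
    ["missing-null-check"],
    ["mocked-internals", "missing-null-check"],
]
with_rules = [
    ["missing-null-check", "mocked-internals"],
    ["off-by-one"],
    ["missing-null-check"],
]

print(f"baseline RMR:   {repeated_mistake_rate(baseline):.2f}")    # 3 of 5 repeated
print(f"with rules RMR: {repeated_mistake_rate(with_rules):.2f}")  # 1 of 4 repeated
```

A drop like this on two small samples would not be evidence by itself; the experiment commands exist so the comparison can be made across enough sessions to support a statistical claim.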

What Else Is In the Box

- Review gauntlet: automated quality gate with curated reviewer personas. Runs on commits (via Claude Code hooks or CI) and files GitHub issues for findings, categorized by severity.
- LLM-backed extraction: when regex isn't enough, the seed engine can use OpenAI, Anthropic, or Ollama to extract patterns from code and logs. Metered backend tracks token usage and cost.
- MCP server: buildlog exposes itself as an MCP server so agents can query seeds, skills, and build history programmatically during sessions.
- npm wrapper: npx @peleke.s/buildlog for JS/TS projects. Thin shim that finds and invokes the Python CLI.

Current Limits

This is v0.10, not the end state.

- Extraction quality is uneven. Regex extractors miss nuance; LLM extractors are accurate but expensive. The middle ground is still being found.
- Feedback signals are coarse. Repeated Mistake Rate works but requires manual tagging. Richer automatic signals (test outcomes, review results, revision distance) are on the roadmap.
- Credit assignment is limited. When multiple rules are active, the system doesn't yet isolate which one was responsible for an outcome.
- Single-agent only. Multi-agent coordination (shared learning across agents) is designed but not implemented.
- Long-horizon learning is not modeled. The bandit operates per-session. Longer arcs of competence building need richer policy models.

The roadmap: contextual bandits (now) -> richer policy models -> longer-horizon RL -> multi-agent coordination. Each step builds on the same foundation: measuring whether rule changes actually reduce mistakes.

Installation

Always-On Mode (recommended)

We run buildlog as an ambient data capture layer across all projects. Two commands, and it works everywhere:

pipx install buildlog         # or: uv tool install buildlog
buildlog init-mcp --global    # registers MCP + writes instructions to ~/.claude/CLAUDE.md

That's it. Claude Code now has all 29 buildlog tools and knows how to use them in every project you open. No per-project setup needed.

The --global flag:

- Registers the MCP server in ~/.claude/settings.json
- Creates ~/.claude/CLAUDE.md with usage instructions so Claude proactively uses buildlog
- Works immediately in any repo, even without a local buildlog/ directory

This is how we use buildlog ourselves: always on, capturing structured trajectories from every session, feeding downstream systems that generate engineering logs, courses, and content.

Per-project setup

If you prefer explicit per-project control:

pip install buildlog          # MCP server included by default
buildlog init --defaults      # scaffold buildlog/, register MCP, update CLAUDE.md

This creates a buildlog/ directory with templates and configures Claude Code for that specific project.

For JS/TS projects

npx @peleke.s/buildlog init

Verify installation

buildlog mcp-test          # verify all 29 tools are registered
buildlog overview          # check project state (works without init in global mode)

Quick Start

buildlog init --defaults      # scaffold + MCP + CLAUDE.md
buildlog new my-feature       # start a session
# ... work ...
buildlog commit -m "feat: add auth"
buildlog experiment start
# ... work across sessions ...
buildlog experiment end
buildlog experiment report

Documentation

Section	Description
Installation	Setup, extras, and initialization
Quick Start	Full pipeline walkthrough
Core Concepts	The problem, the claim, and the metric
CLI Reference	Every command documented
MCP Integration	Claude Code setup and available tools
Experiments	Running and measuring experiments
Review Gauntlet	Reviewer personas and the gauntlet loop
Multi-Agent Setup	Render rules to any AI coding agent
Theory	The math behind Thompson Sampling
Philosophy	Principles and honest limitations

Contributing

git clone https://github.com/Peleke/buildlog-template
cd buildlog-template
uv venv && source .venv/bin/activate
uv pip install -e ".[dev]"
pytest

We're especially interested in better context representations, credit assignment approaches, statistical methodology improvements, and real-world experiment results (positive or negative).

License

MIT License. See LICENSE

Release history

Version	Released
0.23.0	Mar 14, 2026
0.22.0	Mar 14, 2026
0.21.1	Mar 12, 2026
0.21.0	Mar 12, 2026
0.20.0	Mar 7, 2026
0.18.4	Feb 15, 2026
0.18.2	Feb 14, 2026
0.18.1	Feb 14, 2026
0.15.0	Feb 13, 2026
0.13.1	Feb 7, 2026
0.13.0	Feb 6, 2026
0.12.0	Feb 5, 2026
0.11.1	Feb 5, 2026
0.11.0	Feb 5, 2026
0.10.5	Feb 4, 2026
0.10.4	Feb 4, 2026 (this version)
0.10.3	Feb 4, 2026
0.10.2	Feb 4, 2026
0.10.1	Feb 4, 2026
0.10.0	Feb 4, 2026
0.9.0	Feb 2, 2026
0.8.0	Jan 31, 2026
0.7.0	Jan 22, 2026
0.6.1	Jan 22, 2026
0.6.0	Jan 22, 2026
0.5.0	Jan 22, 2026
0.4.0	Jan 17, 2026
0.3.0	Jan 17, 2026
0.2.0	Jan 16, 2026
0.1.0	Jan 16, 2026

Download files

Source Distribution

buildlog-0.10.4.tar.gz (126.2 kB)

Uploaded Feb 4, 2026 Source

Built Distribution

buildlog-0.10.4-py3-none-any.whl (154.6 kB)

Uploaded Feb 4, 2026 Python 3

File details

Details for the file buildlog-0.10.4.tar.gz.

File metadata

Download URL: buildlog-0.10.4.tar.gz
Upload date: Feb 4, 2026
Size: 126.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for buildlog-0.10.4.tar.gz
Algorithm	Hash digest
SHA256	`491f2e3e53394fb5ad2e42f288930332bb7e67da260eb6f1830903a38c8df2b6`
MD5	`7e2333ecf361d7db23cfaf3fa3772bd5`
BLAKE2b-256	`9ffc7992c7dd2d156963f74e928453f60161f7a73c6fd470a7c4a3a36c96bc74`
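
The SHA256 digest listed above can be checked against a downloaded copy with Python's standard `hashlib`. This is a generic sketch, not a buildlog feature: the helper names are made up, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest for buildlog-0.10.4.tar.gz, copied from the table above.
EXPECTED_SHA256 = "491f2e3e53394fb5ad2e42f288930332bb7e67da260eb6f1830903a38c8df2b6"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str) -> bool:
    """Return True when the file's SHA256 matches the published digest."""
    return sha256_of(path) == expected.strip().lower()
```

With the source distribution saved in the current directory (an assumed location), `verify(Path("buildlog-0.10.4.tar.gz"), EXPECTED_SHA256)` should return `True` for an intact download.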

Provenance

The following attestation bundles were made for buildlog-0.10.4.tar.gz:

Publisher: publish.yml on Peleke/buildlog-template

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: buildlog-0.10.4.tar.gz
- Subject digest: 491f2e3e53394fb5ad2e42f288930332bb7e67da260eb6f1830903a38c8df2b6
- Sigstore transparency entry: 910918688
- Sigstore integration time: Feb 4, 2026
Source repository:
- Permalink: Peleke/buildlog-template@483350b3aa8f8f9b2d129c505cf901713c165c05
- Branch / Tag: refs/tags/v0.10.4
- Owner: https://github.com/Peleke
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@483350b3aa8f8f9b2d129c505cf901713c165c05
- Trigger Event: push

File details

Details for the file buildlog-0.10.4-py3-none-any.whl.

File metadata

Download URL: buildlog-0.10.4-py3-none-any.whl
Upload date: Feb 4, 2026
Size: 154.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for buildlog-0.10.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b7981a4338b0352d231a0021d82d2604c042ebe0524c95f4186ed715105e7482`
MD5	`82497a1e93b85f5d2e25c4659d772122`
BLAKE2b-256	`14c5c2c1b27f0b920c14ff3fec2577b4823721ed4d9d5ed33ba372d041a05e2c`

Provenance

The following attestation bundles were made for buildlog-0.10.4-py3-none-any.whl:

Publisher: publish.yml on Peleke/buildlog-template

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: buildlog-0.10.4-py3-none-any.whl
- Subject digest: b7981a4338b0352d231a0021d82d2604c042ebe0524c95f4186ed715105e7482
- Sigstore transparency entry: 910918697
- Sigstore integration time: Feb 4, 2026
Source repository:
- Permalink: Peleke/buildlog-template@483350b3aa8f8f9b2d129c505cf901713c165c05
- Branch / Tag: refs/tags/v0.10.4
- Owner: https://github.com/Peleke
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@483350b3aa8f8f9b2d129c505cf901713c165c05
- Trigger Event: push

buildlog 0.10.4
