
Git bisection with Bayesian statistics


git bayesect

Bayesian git bisection!

Use this to detect changes in likelihoods of events, for instance, to isolate a commit where a slightly flaky test became very flaky.

You don't need to know the likelihoods (although you can provide priors), only that something changed at some point, in some direction.

Installation

pip install git_bayesect

Or:

uv tool install git_bayesect

How it works

git_bayesect uses Bayesian inference to identify the commit that introduced a change. It selects the next commit to test by greedily minimising expected entropy, and it uses a Beta-Bernoulli conjugacy trick when calculating posterior probabilities so that unknown failure rates stay tractable.

See https://hauntsaninja.github.io/git_bayesect.html for a write-up.
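To make the Beta-Bernoulli idea concrete, here is a minimal sketch of how a posterior over "which commit introduced the change" could be computed. It is not taken from git_bayesect's source: the function names, the single change-point model, and the default Beta parameters are all assumptions made for illustration.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(fails, passes, alpha, beta):
    """Log-probability of an observed fail/pass sequence with the unknown
    failure rate integrated out under a Beta(alpha, beta) prior."""
    return log_beta(alpha + fails, beta + passes) - log_beta(alpha, beta)


def changepoint_posterior(fails, passes, weights,
                          old=(0.05, 0.95), new=(0.9, 0.1)):
    """Posterior over "commit b is the first changed commit".

    fails[i] and passes[i] are observation counts at commit i (oldest
    first); weights[i] is the relative prior weight of commit i.
    All names here are hypothetical, not git_bayesect's API.
    """
    log_post = []
    for b in range(len(fails)):
        lp = log(weights[b])
        # Commits before b share the "old" rate, the rest share the "new" rate.
        lp += log_marginal(sum(fails[:b]), sum(passes[:b]), *old)
        lp += log_marginal(sum(fails[b:]), sum(passes[b:]), *new)
        log_post.append(lp)
    top = max(log_post)  # subtract the max before exponentiating for stability
    unnorm = [exp(lp - top) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


posterior = changepoint_posterior(
    fails=[0, 0, 0, 5, 0], passes=[0, 0, 5, 0, 0], weights=[1] * 5)
print(max(range(5), key=posterior.__getitem__))
```

Conjugacy is what keeps this cheap: each hypothesis needs only the fail and pass counts on either side of the candidate commit, never an explicit integral over the failure rates.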

Usage

Start a Bayesian bisection:

git bayesect start --old $COMMIT

Record an observation on the current commit:

git bayesect fail

Or on a specific commit:

git bayesect pass --commit $COMMIT

Check the overall status of the bisection:

git bayesect status

Reset:

git bayesect reset

More usage

Set the prior for a given commit:

git bayesect prior --commit $COMMIT --weight 10

Set the prior for all commits based on filenames:

git bayesect priors_from_filenames --filenames-callback "return 10 if any('suspicious' in f for f in filenames) else 1"

Set the prior for all commits based on the text in the commit message + diff:

git bayesect priors_from_text --text-callback "return 10 if 'timeout' in text.lower() else 1"
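As a rough illustration of what such weights mean, the sketch below writes the filenames callback out as an ordinary Python function and normalises the resulting weights into a prior distribution. It assumes that weights are relative (a weight of 10 makes a commit ten times as likely a priori as a weight of 1). The commit names and helper functions are hypothetical, and the way git_bayesect actually evaluates the callback string may differ.

```python
def filenames_weight(filenames):
    # Same expression as the --filenames-callback example above.
    return 10 if any('suspicious' in f for f in filenames) else 1


def normalise(weights):
    """Turn relative weights into probabilities that sum to one."""
    total = sum(weights.values())
    return {commit: w / total for commit, w in weights.items()}


# Hypothetical commits mapped to the files each one touched.
touched = {
    "c1": ["README.md"],
    "c2": ["src/suspicious_cache.py", "tests/test_cache.py"],
    "c3": ["setup.py"],
}
prior = normalise({c: filenames_weight(fs) for c, fs in touched.items()})
print(prior)
```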

Set the beta priors:

# We expect "fail" observations 90% of the time at new commit, 5% of the time at old commit
git bayesect beta_priors --alpha-new 0.9 --beta-new 0.1 --alpha-old 0.05 --beta-old 0.95
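The comment above reads the parameters as expected failure rates. Assuming they are the usual Beta distribution parameters, that reading matches the Beta mean, alpha / (alpha + beta), and alpha + beta acts as the weight of the prior measured in pseudo-observations. A small sketch with hypothetical helper names:

```python
def beta_mean(alpha, beta):
    """Expected failure rate under a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def update(alpha, beta, fails, passes):
    """Conjugate update: observed fails add to alpha, passes add to beta."""
    return alpha + fails, beta + passes


new_prior = (0.9, 0.1)  # mean 0.9, worth about one observation in total
print(beta_mean(*new_prior))

# One fail and three passes quickly outweigh such a weak prior.
after = update(*new_prior, fails=1, passes=3)
print(beta_mean(*after))
```

Scaling both parameters up by the same factor keeps the expected rate but makes the prior harder for observations to move.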

Get a log of commands to let you reconstruct the state:

git bayesect log

Undo the last observation:

git bayesect undo

Run the bisection automatically using a command to make observations:

git bayesect run $CMD

Check out the best commit to test:

git bayesect checkout
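One way to read "best" here is the greedy rule from "How it works": pick the commit whose test result is expected to leave the least entropy in the posterior. The sketch below is a simplified illustration, not git_bayesect's implementation. It assumes fixed, known failure rates p_old and p_new rather than the Beta-Bernoulli treatment, and all function names are hypothetical.

```python
from math import log


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * log(p) for p in dist if p > 0)


def expected_entropy(post, c, p_old, p_new):
    """Expected posterior entropy after one more observation at commit c.

    post[b] is the current probability that commit b is the first changed
    commit, so commit c has the new failure rate exactly when c >= b.
    """
    fail_given_b = [p_new if c >= b else p_old for b in range(len(post))]
    p_fail = sum(p * f for p, f in zip(post, fail_given_b))
    pass_given_b = [1 - f for f in fail_given_b]
    total = 0.0
    for outcome_prob, likes in ((p_fail, fail_given_b),
                                (1 - p_fail, pass_given_b)):
        if outcome_prob <= 0:
            continue  # impossible outcome contributes nothing
        updated = [p * l / outcome_prob for p, l in zip(post, likes)]
        total += outcome_prob * entropy(updated)
    return total


def best_commit(post, p_old, p_new):
    """Index of the commit that minimises expected posterior entropy."""
    return min(range(len(post)),
               key=lambda c: expected_entropy(post, c, p_old, p_new))


print(best_commit([0.25] * 4, p_old=0.0, p_new=1.0))
```

With a deterministic test (p_old = 0, p_new = 1) and a uniform posterior, this rule reduces to ordinary bisection: it picks the commit that splits the candidates in half.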

Demo

This repository contains a little demo, in case you'd like to play around:

# Create a fake repository with a history to bayesect over
python scripts/generate_fake_repo.py
cd fake_repo

# The fake repo contains a script called flaky.py
# This is a simple script that fails some fraction of the time
# At some point in the history of the repo, that fraction was changed
python flaky.py
git log --oneline

# Start the bayesection
OLD_COMMIT=$(git rev-list HEAD --reverse | head -n 2 | tail -n 1)
git bayesect start --new main --old $OLD_COMMIT

# Run a bayesection to find the commit that introduced the change
git bayesect run python flaky.py
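For orientation, a flaky script of the kind described above could be as small as the following. This is a hypothetical stand-in, not the file generated by scripts/generate_fake_repo.py: the failure rate is made up, and it assumes that a non-zero exit status is what gets recorded as a "fail" observation, as with git bisect run.

```python
import random

FAILURE_RATE = 0.3  # hypothetical value; the demo's real script may differ


def run_once(rng=random):
    """Return an exit code: 1 (fail) with probability FAILURE_RATE, else 0."""
    return 1 if rng.random() < FAILURE_RATE else 0
```

To use it as a command, the file would end with `raise SystemExit(run_once())`, so that each invocation exits with a pass or fail status.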


