SIB: Self-Improving Benchmark agent framework
SIB (Self-Improving Benchmark)
A framework for building self-improving AI agents that autonomously refine their performance on benchmark tasks.
Installation
pip install sib-agent
Usage
from sib import Generation, GenerationLog, ScoreTracker

# Track generations in a self-improvement loop
log = GenerationLog()
gen = log.new_generation(config={"model": "claude-sonnet", "temperature": 0.7})
# ... run your agent ...
gen.finish(result={"accuracy": 0.82})

gen = log.new_generation(config={"model": "claude-sonnet", "temperature": 0.5})
# ... run improved agent ...
gen.finish(result={"accuracy": 0.91})

# Find the best generation
best = log.best("accuracy")
print(best.generation_id, best.result)  # 1 {'accuracy': 0.91}

# Save and reload logs
log.save("run_log.json")
log = GenerationLog.load("run_log.json")

# Track scores across generations
tracker = ScoreTracker()
tracker.record("accuracy", 0.82)
tracker.record("accuracy", 0.91)
summary = tracker.summarise("accuracy")
print(summary.improvement)  # 0.09
print(summary.is_improving)  # True
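The bookkeeping pattern above (record a config and a result per generation, pick the best, compare first and last scores) can be sketched without the package installed. This is a minimal illustration, not how `sib` is implemented: `GenerationRecord`, `run_loop`, `best_by`, `improvement`, and `toy_evaluate` are hypothetical names, and "improvement = last score minus first score" is an assumption about what such a summary might report.

```python
from dataclasses import dataclass, field


@dataclass
class GenerationRecord:
    """Hypothetical stand-in for one logged generation (not a sib class)."""
    generation_id: int
    config: dict
    result: dict = field(default_factory=dict)


def run_loop(evaluate, configs):
    """Evaluate each config in order and record one generation per config."""
    records = []
    for generation_id, config in enumerate(configs):
        record = GenerationRecord(generation_id, config)
        record.result = {"accuracy": evaluate(config)}
        records.append(record)
    return records


def best_by(records, metric):
    """Return the record with the highest value for the given metric."""
    return max(records, key=lambda r: r.result[metric])


def improvement(records, metric):
    """Assumed definition: last score minus first score.

    Rounded to six places to hide floating-point noise.
    """
    return round(records[-1].result[metric] - records[0].result[metric], 6)


def toy_evaluate(config):
    """Toy scorer standing in for a real agent run: peaks at temperature 0.5."""
    return round(1.0 - abs(config["temperature"] - 0.5), 2)


configs = [{"temperature": 0.7}, {"temperature": 0.6}, {"temperature": 0.5}]
records = run_loop(toy_evaluate, configs)

best = best_by(records, "accuracy")
gain = improvement(records, "accuracy")
print(best.generation_id, best.result)
print(gain, gain > 0)
```

In a real run, `toy_evaluate` would be replaced by whatever executes the agent and scores it on the benchmark. Keeping the config next to the result in each record is what makes the best generation reproducible later.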
License
MIT