Bit-calibrated test-time scaling for quantized reasoning models
BitCal-TTS
Bit-Calibrated Test-Time Scaling for Quantized Reasoning Models
Lightweight, model-agnostic control loop for budgeted reasoning with quantized LLMs: online uncertainty signals, bit-aware confidence calibration, and continue / stop / escalate decisions—without retraining the base model.
Current release: v0.1.0 (research / alpha). Install from PyPI after the first upload, or from this repository (see below). Maintainer notes: RELEASING.md.
For new users (quick checklist)
| Step | Command | What success looks like |
|---|---|---|
| 1. Get the code | Clone or pip install from Git (below) | You have bitcal_tts importable |
| 2. Install deps | pip install -e ".[dev,research]" from repo root | No install errors |
| 3. Verify | bitcal-tts doctor | Prints Python, torch, transformers, PyYAML versions |
| 4. Run demo | bitcal-tts demo --max-steps 2 | Prints steps, metrics, halting actions |
| 5. Run tests | python -m pytest tests/ -q --no-cov | passed (or use default pytest for coverage gate) |
If all five pass on your machine, your environment matches what we test in CI (Ubuntu, Python 3.10–3.13).
Requirements
- Python 3.10, 3.11, 3.12, or 3.13
- OS: Linux, macOS, or Windows
- Hardware: CPU is enough for tests and the mock demo; a GPU is optional for real model runs (hf-smoke, future experiments)
- Disk / network: optional Hugging Face commands download model weights on first use
Installation
Option A — PyPI (after the first upload)
pip install "bitcal-tts[research]"
Library only (no optional research stack):
pip install bitcal-tts
Option B — Clone (recommended for development)
git clone https://github.com/Saibabu7770/bitcal-tts.git
cd bitcal-tts
python -m venv .venv
Activate the venv:
- Linux / macOS: source .venv/bin/activate
- Windows (PowerShell): .\.venv\Scripts\Activate.ps1
- Windows (cmd): .\.venv\Scripts\activate.bat
Then:
python -m pip install --upgrade pip
pip install -e ".[dev,research]"
Option C — Install from GitHub without cloning (pip)
Install the package directly (non-editable):
pip install "bitcal-tts[dev,research] @ git+https://github.com/Saibabu7770/bitcal-tts.git"
Minimal (library + tests only):
pip install "bitcal-tts[dev] @ git+https://github.com/Saibabu7770/bitcal-tts.git"
Option D — Flat requirements.txt
From a clone:
pip install -r requirements.txt
For development, you should still install the package in editable mode: pip install -e ".[dev,research]".
Quick start
Mock demo (no GPU, no model download):
python -m bitcal_tts demo
# or
bitcal-tts demo --config configs/default.yaml
Environment check:
bitcal-tts doctor
Optional — Hugging Face smoke test (downloads a small model; needs [research]):
bitcal-tts hf-smoke --model gpt2 --prompt "Hello"
On a machine with CUDA and a proper PyTorch build, hf-smoke can use the GPU automatically.
Legacy entry point (same as demo):
python scripts/run_baseline_demo.py --config configs/default.yaml
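If bitcal-tts doctor is not yet on your PATH, a rough stand-in for the environment check can be written with the standard library alone. The sketch below is not the project's doctor implementation; the function name report_versions is hypothetical, and the package list is simply the one named in the checklist above.

```python
import importlib.metadata as md
import platform


def report_versions(packages=("torch", "transformers", "PyYAML")):
    """Return a mapping of name -> installed version, or 'not installed'."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = md.version(name)
        except md.PackageNotFoundError:
            # The distribution is missing from the active environment.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version}")
```

A "not installed" entry for torch or transformers usually means the [research] extra was skipped.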
Testing
Run the full suite without the coverage gate (fast):
python -m pytest tests/ -q --no-cov
Run with coverage (same as default pytest in this repo; enforces ≥90% line coverage on bitcal_tts):
python -m pytest tests/ -q
Project layout
bitcal-tts/
src/bitcal_tts/ # Package: runner, signals, calibrator, policy, eval, integrations, CLI
configs/ # YAML experiment templates
benchmarks/ # JSONL task loader + example tasks
scripts/ # Convenience runners
tests/ # Pytest suite (CPU-safe)
results/ # Local experiment outputs (gitignored except .gitkeep)
.github/workflows/ # CI
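For orientation, JSONL means one JSON object per line. The loader below is a minimal sketch of reading such a file, not the code in benchmarks/; the function name load_jsonl is hypothetical and no particular task schema is assumed.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read one JSON object per non-empty line; report the line number on bad input."""
    tasks = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                tasks.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({exc.msg})"
                ) from exc
    return tasks
```

Raising with the file name and line number makes a malformed benchmark file easy to locate.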
Configuration
Edit configs/default.yaml for token budgets, policy thresholds, and calibrator settings (bit width, temperature). The demo merges CLI flags with YAML when --config is passed.
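One common way to merge CLI flags with a YAML file is a recursive dictionary merge in which flags that were not passed (value None) leave the YAML value untouched. The sketch below illustrates that idea under two assumptions: that CLI flags take precedence over YAML, and that the keys shown (budget, calibrator, max_tokens, and so on) are hypothetical placeholders rather than the actual schema of configs/default.yaml.

```python
from copy import deepcopy


def merge_config(base: dict, overrides: dict) -> dict:
    """Recursively merge overrides into a copy of base; None overrides are skipped."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if value is None:
            continue  # flag not given on the command line: keep the YAML value
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical values standing in for a parsed YAML file.
yaml_config = {
    "budget": {"max_tokens": 512, "max_steps": 8},
    "calibrator": {"bits": 4, "temperature": 1.5},
}
# Hypothetical overrides, as if only --max-steps 2 had been passed.
cli_overrides = {"budget": {"max_steps": 2, "max_tokens": None}}

config = merge_config(yaml_config, cli_overrides)
print(config["budget"])  # {'max_tokens': 512, 'max_steps': 2}
```

Because the merge works on a deep copy, the parsed YAML stays unchanged and can be reused across runs.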
Troubleshooting
| Problem | What to try |
|---|---|
| git push asks for password and fails | GitHub requires a Personal Access Token (or SSH), not your account password. See GitHub docs on HTTPS. |
| transformers / model download errors | Install extras: pip install -e ".[research]". Check network and disk space. |
| bitsandbytes / 4-bit load fails | Optional dependency; install per bitsandbytes for your OS/GPU. Not required for tests. |
| Tests fail only with coverage | Run pytest tests/ --no-cov first. If that passes, failures are coverage-related or environment-specific. |
| CUDA not seen | Install the PyTorch build that matches your CUDA from pytorch.org. bitcal-tts doctor shows cuda available: True/False. |
Why BitCal-TTS?
Quantized reasoning models are efficient, but the confidence signals used for adaptive inference (entropy, trace stability, hidden-state agreement) can be miscalibrated relative to full precision. Under a fixed token budget, that miscalibration hurts the accuracy–efficiency tradeoff.
BitCal-TTS adds quantization-aware calibration on top of standard signals so halting respects effective precision (e.g., 4-bit vs 8-bit vs 16-bit), in a reproducible open-source pipeline.
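To make "more conservative at lower effective precision" concrete, here is a minimal sketch using temperature scaling with a temperature that grows as the bit width shrinks. This is an illustration, not the package's calibrator: the function names and the logarithmic schedule in bit_temperature are assumptions chosen for simplicity.

```python
import math


def bit_temperature(bits: int, base_temperature: float = 1.0, full_bits: int = 16) -> float:
    """Hypothetical schedule: temperature rises by 1 for each halving of precision."""
    bits = max(1, min(bits, full_bits))
    return base_temperature * (1.0 + math.log2(full_bits / bits))


def calibrate(confidence: float, bits: int) -> float:
    """Temperature-scale a raw confidence in (0, 1); lower bit widths pull it toward 0.5."""
    eps = 1e-6
    p = min(max(confidence, eps), 1.0 - eps)  # keep the logit finite
    logit = math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-logit / bit_temperature(bits)))


for bits in (16, 8, 4):
    print(bits, round(calibrate(0.9, bits), 3))
```

Under this schedule, the same raw confidence yields a lower calibrated value at 4-bit than at 8-bit or 16-bit, so a halting rule with a fixed threshold stops later on lower-precision models.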
Features
| Component | Description |
|---|---|
| Signals | Token entropy, reasoning-trace stability, optional hidden-state stability |
| Calibration | Bit-aware confidence mapping (more conservative at lower effective precision) |
| Policy | Halting: continue, stop, escalate |
| Evaluation | Trace summaries and halting metrics (tokens, escalations, efficiency) |
| Integration | Optional Hugging Face forward pass (hf-smoke) |
Core tests run on CPU with mocks; large-model experiments are optional.
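The signal and policy components can be pictured with a small, dependency-free sketch: compute the entropy of a next-token distribution, turn it into a confidence score, and map that score to one of the three halting actions. The thresholds, function names, and the rule that an exhausted budget triggers escalation are all assumptions made for illustration; the package's own policy may differ.

```python
import math
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    STOP = "stop"
    ESCALATE = "escalate"


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_confidence(probs):
    """Map entropy to [0, 1]: 1 for a one-hot distribution, 0 for a uniform one."""
    if len(probs) < 2:
        return 1.0
    return 1.0 - token_entropy(probs) / math.log(len(probs))


def decide(confidence, tokens_used, token_budget,
           stop_threshold=0.8, escalate_threshold=0.3):
    """Hypothetical rule: stop when confident, escalate when unsure or out of budget."""
    if confidence >= stop_threshold:
        return Action.STOP
    if tokens_used >= token_budget or confidence <= escalate_threshold:
        return Action.ESCALATE
    return Action.CONTINUE


peaked = [0.97, 0.01, 0.01, 0.01]
mixed = [0.8, 0.1, 0.05, 0.05]
flat = [0.25, 0.25, 0.25, 0.25]
for probs in (peaked, mixed, flat):
    print(decide(entropy_confidence(probs), tokens_used=40, token_budget=256))
```

In a full loop, the confidence passed to decide would first go through the bit-aware calibration step, so the same thresholds behave more cautiously for lower-precision models.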
Roadmap
- First paper milestone: docs/MINIMAL_EXPERIMENT.md — one model, one benchmark (e.g. GSM8K subset), three methods, ~8 GB VRAM
- Baseline sweeps on public reasoning benchmarks
- Optional vLLM / server-style integration
- Paper + artifact bundle when results meet PROJECT_PLAN.md criteria
Contributing
See CONTRIBUTING.md.
Security
See SECURITY.md.
Citation
If you use this code in research:
@software{bitcal_tts2026,
title = {BitCal-TTS: Bit-Calibrated Test-Time Scaling for Quantized Reasoning Models},
year = {2026},
url = {https://github.com/Saibabu7770/bitcal-tts},
note = {Open-source research implementation}
}
License
Copyright
Copyright (c) 2026, Sai Babu All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.