Structured payload compaction and manifold-aware analytics for LLM tool outputs
BinomialHash
Content-addressed, schema-aware structured data compaction for LLM tool outputs.
BinomialHash intercepts large JSON payloads from tool calls, infers schema and statistics, deduplicates by content fingerprint, and returns compact summaries that fit in LLM context windows. Agent tools let the model retrieve, aggregate, query, group, and export data on demand without blowing the token budget.
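The pipeline described above (fingerprint the content, infer a schema, compute statistics) can be illustrated with a minimal standard-library sketch. This is not BinomialHash's implementation: the function names `fingerprint` and `summarize`, the 12-character key length, and the two-type schema are all assumptions made for illustration.

```python
import hashlib
import json
from statistics import mean


def fingerprint(rows):
    """Hash a canonical JSON encoding so identical payloads map to one key."""
    # sort_keys makes the encoding independent of dict key order.
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def summarize(rows):
    """Infer a coarse per-column type and basic statistics from a list of dicts."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)

    summary = {}
    for name, values in columns.items():
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if numeric:
            summary[name] = {
                "type": "numeric",
                "min": min(values),
                "max": max(values),
                "mean": mean(values),
            }
        else:
            summary[name] = {"type": "string", "distinct": len({str(v) for v in values})}
    return {"row_count": len(rows), "columns": summary}


rows = [
    {"ticker": "AAPL", "price": 189.5},
    {"ticker": "MSFT", "price": 378.2},
]
key = fingerprint(rows)
# Same content with keys in a different order yields the same fingerprint.
same_key = fingerprint([dict(reversed(list(r.items()))) for r in rows])
compact = summarize(rows)
```

The point of the canonical encoding is that deduplication depends on content, not on incidental formatting such as key order or whitespace.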
Install
pip install binomialhash
With exact token counting (OpenAI / xAI):
pip install binomialhash[openai]
All optional dependencies:
pip install binomialhash[all]
Quickstart
import json
from binomialhash import BinomialHash
bh = BinomialHash()
data = [
    {"ticker": "AAPL", "price": 189.50, "volume": 54_000_000, "sector": "Technology"},
    {"ticker": "MSFT", "price": 378.20, "volume": 28_000_000, "sector": "Technology"},
    {"ticker": "JPM", "price": 195.30, "volume": 12_000_000, "sector": "Financials"},
    # ... hundreds more rows ...
]
raw = json.dumps(data)
summary = bh.ingest(raw, "market_data")
# If len(raw) > 3000 chars: returns a compact schema + stats summary
# If small: passes through unchanged
# Query stored data ("market_data_abc123" is a placeholder key; use the key of your stored dataset)
rows = bh.retrieve("market_data_abc123", offset=0, limit=10)
agg = bh.aggregate("market_data_abc123", "price", "mean")
Provider Adapters
BinomialHash ships with 68 provider-neutral tool definitions that expose its full API (retrieve, aggregate, query, group, statistical analysis, causal inference, manifold navigation, spatial reasoning, export, etc.) to any LLM. Adapters translate these into provider-specific formats.
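To make the translation step concrete, here is a hedged sketch of how one neutral tool description could be mapped onto three provider schemas. The `NeutralTool` dataclass, the converter functions, and the `bh_retrieve` tool name are hypothetical and are not the library's `ToolSpec` or adapter code; only the target dictionary shapes (OpenAI Responses, OpenAI Chat Completions, Anthropic) follow the providers' published function-tool formats.

```python
from dataclasses import dataclass, field


@dataclass
class NeutralTool:
    """Hypothetical provider-neutral tool description (not the library's ToolSpec)."""
    name: str
    description: str
    parameters: dict = field(default_factory=dict)  # JSON Schema object


def to_openai_responses(tool):
    # Responses API: function fields sit at the top level of the tool entry.
    return {
        "type": "function",
        "name": tool.name,
        "description": tool.description,
        "parameters": tool.parameters,
    }


def to_openai_chat_completions(tool):
    # Chat Completions: function fields are nested under a "function" key.
    return {
        "type": "function",
        "function": {
            "name": tool.name,
            "description": tool.description,
            "parameters": tool.parameters,
        },
    }


def to_anthropic(tool):
    # Anthropic Messages API: the JSON Schema goes under "input_schema".
    return {
        "name": tool.name,
        "description": tool.description,
        "input_schema": tool.parameters,
    }


tool = NeutralTool(
    name="bh_retrieve",  # hypothetical tool name
    description="Fetch rows from a stored dataset.",
    parameters={
        "type": "object",
        "properties": {"key": {"type": "string"}, "limit": {"type": "integer"}},
        "required": ["key"],
    },
)
responses_tool = to_openai_responses(tool)
chat_tool = to_openai_chat_completions(tool)
anthropic_tool = to_anthropic(tool)
```

Keeping one neutral definition and deriving each provider format from it means a tool's name, description, and parameter schema are written once.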
OpenAI
from binomialhash import BinomialHash
from binomialhash.tools import get_all_tools
from binomialhash.adapters.openai import get_openai_tools, handle_openai_tool_call
bh = BinomialHash()
specs = get_all_tools(bh)
tools = get_openai_tools(specs) # Responses API format (default)
# Pass to the API (client is an openai.OpenAI() instance; messages is your conversation input)
response = client.responses.create(model="gpt-4o", tools=tools, input=messages)
# Handle function calls
for item in response.output:
    if item.type == "function_call":
        result = handle_openai_tool_call(specs, item.name, item.arguments)
For Chat Completions (legacy):
tools = get_openai_tools(specs, format="chat_completions")
Anthropic
from binomialhash.adapters.anthropic import get_anthropic_tools, handle_anthropic_tool_use
tools = get_anthropic_tools(specs)
# Pass to client.messages.create(tools=tools, ...)
# Handle tool_use blocks
result = handle_anthropic_tool_use(specs, block.name, block.input)
Google Gemini
from google.genai import types
from binomialhash.adapters.gemini import get_gemini_tools, handle_gemini_tool_call
decls = get_gemini_tools(specs)
gemini_tools = types.Tool(function_declarations=decls)
# Handle function_call parts
result = handle_gemini_tool_call(specs, fc.name, fc.args)
xAI / Grok
from binomialhash.adapters.xai import get_xai_tools, handle_xai_tool_call
tools = get_xai_tools(specs)
# Uses OpenAI-compatible format
Provider Router
from binomialhash.adapters import get_tools_for_provider
tools = get_tools_for_provider(specs, provider="openai")
tools = get_tools_for_provider(specs, provider="anthropic")
Middleware
Auto-intercept large tool outputs without modifying tool functions:
from binomialhash.middleware import bh_intercept, raw_mode
@bh_intercept(label="market_data")
def fetch_data(ticker: str) -> dict:
    return huge_json_response  # auto-compacted if > 3000 chars
# Bypass interception when you need the raw payload
with raw_mode():
    native = fetch_data("AAPL")  # returns original dict
Wrapper form for third-party functions:
from binomialhash.middleware import wrap_tool_with_bh
wrapped = wrap_tool_with_bh(third_party_fetch, label="external_data")
Both sync and async functions are supported.
Token Counting
from binomialhash.tokenizers import count_tokens, is_exact
n = count_tokens("Hello world", provider="openai") # exact with tiktoken
n = count_tokens("Hello world", provider="anthropic") # heuristic (chars/4)
if is_exact("openai"):
    print("Using tiktoken")
Built-in context stats on every BinomialHash instance:
stats = bh.context_stats()
# {"tool_calls": 5, "chars_in_raw": 120000, "chars_out_to_llm": 8000,
# "compression_ratio": 15.0, "est_tokens_out": 2000, ...}
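The figures in the example stats above are consistent with two simple calculations: a compression ratio of raw characters over characters sent to the LLM, and the chars/4 token heuristic mentioned earlier. The helper names below are hypothetical, and whether the library computes these fields exactly this way is an assumption.

```python
def heuristic_tokens(num_chars: int) -> int:
    """Approximate a token count as characters divided by four."""
    return num_chars // 4


def compression_ratio(chars_in_raw: int, chars_out_to_llm: int) -> float:
    """Ratio of raw input size to compacted output size (0.0 if nothing was sent)."""
    if chars_out_to_llm == 0:
        return 0.0
    return chars_in_raw / chars_out_to_llm


ratio = compression_ratio(120_000, 8_000)   # 120000 / 8000
tokens_out = heuristic_tokens(8_000)        # 8000 // 4
```

With the example values, `ratio` is 15.0 and `tokens_out` is 2000, matching `compression_ratio` and `est_tokens_out` in the sample output.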
Package Structure
binomialhash/
    core.py          # BinomialHash class: ingest, retrieve, aggregate, query
    schema.py        # Schema inference and column typing
    extract.py       # Row extraction from nested JSON
    predicates.py    # Predicate building and row filtering
    context.py       # Request-scoped contextvar helpers
    insights.py      # Objective-driven insight extraction
    middleware.py    # Auto-interception decorator and raw-mode bypass
    stats/           # 39 statistical tools across 7 stages
        regression.py    # OLS, partial correlation, PCA
        quality.py       # Outlier detection, missing data, distribution tests
        dependency.py    # Correlation matrices, mutual information, Granger
        drivers.py       # Feature importance, SHAP-style, interaction screening
        structure.py     # Clustering, segmentation, latent structure
        causal.py        # ATE estimation, synthetic control, counterfactuals
        dynamics.py      # Trend decomposition, change-point, recurrence
        laws.py          # Benford, Zipf, power-law diagnostics
    manifold/        # Manifold surface construction and navigation (14 tools)
        spatial.py       # 6 spatial reasoning tools (HKS, diffusion, Reeb, etc.)
    tools/           # 68 provider-neutral ToolSpec definitions
    adapters/        # OpenAI, Anthropic, Gemini, xAI schema translators
    exporters/       # Markdown, CSV, Excel, chunked artifacts
    tokenizers/      # Provider-aware token counting
Scope and Limitations
BinomialHash is a structured data compaction and analytics engine. Its manifold topology outputs are operational structural diagnostics, not proofs of the true topology of an underlying manifold. The edge-incidence manifoldness gate is implemented; vertex-link validation and combinatorial orientability checks are on the roadmap.
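For readers unfamiliar with the term, an edge-incidence check on a triangle mesh tests that no edge is shared by more than two faces. The sketch below shows the textbook version of that test; it is an assumption that the library's gate resembles it, and the function names and the `allow_boundary` flag are hypothetical.

```python
from collections import Counter
from itertools import combinations


def edge_incidence(triangles):
    """Count how many triangles touch each undirected edge."""
    counts = Counter()
    for tri in triangles:
        for a, b in combinations(sorted(tri), 2):
            counts[(a, b)] += 1
    return counts


def passes_edge_gate(triangles, allow_boundary=True):
    """True if every edge has 2 incident faces (or 1 to 2 when boundaries are allowed)."""
    low = 1 if allow_boundary else 2
    return all(low <= n <= 2 for n in edge_incidence(triangles).values())


# Closed surface: every edge of a tetrahedron lies in exactly two faces.
tetrahedron = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
# Non-manifold fan: edge (0, 1) is shared by three faces.
fan = [(0, 1, 2), (0, 1, 3), (0, 1, 4)]
# Two triangles that touch only at vertex 0.
bowtie = [(0, 1, 2), (0, 3, 4)]

closed_ok = passes_edge_gate(tetrahedron, allow_boundary=False)
fan_ok = passes_edge_gate(fan)
bowtie_ok = passes_edge_gate(bowtie)
```

The `bowtie` case passes the edge test even though the surface is pinched at vertex 0, which is the kind of defect a vertex-link validation would catch and an edge-only check cannot.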
Development
cd binomialhash
pip install -e ".[dev]"
python -m pytest tests/ -v
License
MIT