
Information-theoretic context optimization for AI coding agents. Knapsack-optimal token budgeting, Shannon entropy scoring, SimHash dedup, predictive pre-fetch. MCP server.


Entroly

Information-theoretic context compression for AI coding agents.



Every AI coding tool (Cursor, Copilot, Claude Code, Cody) manages context with dumb heuristics: stuff tokens until the window fills, then cut. Entroly uses information theory and combinatorial optimization to compress an entire codebase into the optimal context window.

🛑 The Problem

Current AI tools use Cosine Similarity (Vector Search). It's great for finding "things that look like my query," but terrible for coding because:

  1. Context Blindness: It finds the "top 5 files" but misses the 6th file that contains the critical interface definition.
  2. Boilerplate Waste: 40% of retrieved code is often imports or repetitive boilerplate, wasting expensive tokens.
  3. Correlation vs Causation: Vector search finds related code, not causally necessary code.

✅ The Solution: Entroly

Entroly replaces "dumb search" with Information-Theoretic Compression. It treats your context window as a finite resource and uses Knapsack Optimization to pack the most "informative" (highest entropy) and "causally relevant" (dependency-linked) fragments.


pip install entroly

How It's Different

Sourcegraph Cody does search: "Find 5–10 files that look relevant." Entroly does compression: "Show the LLM the entire codebase at variable resolution."

|             | Cody / Copilot                     | Entroly                                      |
|-------------|------------------------------------|----------------------------------------------|
| Approach    | Embedding similarity search        | Information-theoretic compression            |
| Coverage    | 5–10 files (the rest is invisible) | 100% codebase visible via 3-level hierarchy  |
| Selection   | Top-K by cosine distance           | Knapsack-optimal with submodular diversity   |
| Dedup       | None                               | SimHash + LSH in O(1)                        |
| Learning    | Static                             | Online Wilson-score feedback + autotune      |
| Security    | None                               | Built-in SAST (55 rules, taint-aware)        |
| Temperature | User-set or model default          | Auto-calibrated via Fisher information       |

Architecture

Hybrid Rust + Python. All math runs in Rust via PyO3 (50–100× faster). MCP protocol and orchestration run in Python. Pure Python fallbacks activate automatically if the Rust extension isn't available.

┌─────────────────────────────────────────────────────────┐
│  IDE (Cursor / Claude Code / Cline / Copilot)           │
│                                                         │
│  ┌───── MCP mode ─────┐    ┌───── Proxy mode ─────┐     │
│  │ entroly MCP server │    │ localhost:9377       │     │
│  │ (JSON-RPC stdio)   │    │ (HTTP reverse proxy) │     │
│  └─────────┬──────────┘    └──────────┬───────────┘     │
│            │                          │                 │
│  ┌─────────▼──────────────────────────▼──────┐          │
│  │          Entroly Engine (Python)          │          │
│  │  ┌─────────────────────────────────┐      │          │
│  │  │  entroly-core (Rust via PyO3)   │      │          │
│  │  │  14 modules · 330 KB · 93 tests │      │          │
│  │  └─────────────────────────────────┘      │          │
│  └───────────────────────────────────────────┘          │
└─────────────────────────────────────────────────────────┘

Two deployment modes:

  • MCP Server: the IDE calls remember_fragment, optimize_context, etc. via the MCP protocol
  • Prompt Compiler Proxy: an invisible HTTP proxy at localhost:9377 that intercepts every LLM request and auto-optimizes it (zero IDE changes beyond the API base URL)

Engines

Rust Core (14 modules)

| Module          | What                                  | How                                                                                                                          |
|-----------------|---------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
| hierarchical.rs | 3-level codebase compression (ECC)    | L1: skeleton map of ALL files · L2: dep-graph cluster expansion · L3: knapsack-optimal full fragments with submodular diversity |
| knapsack.rs     | Optimal context subset selection      | 0/1 knapsack DP (N ≤ 2000) · greedy with Dantzig 0.5-guarantee (N > 2000)                                                    |
| entropy.rs      | Information density scoring           | Shannon entropy (40%) + boilerplate detection (30%) + cross-fragment n-gram redundancy (30%)                                 |
| depgraph.rs     | Dependency graph + symbol table       | Auto-linking: imports (1.0) · type refs (0.9) · function calls (0.7) · same-module (0.5)                                     |
| skeleton.rs     | AST-lite code skeleton extraction     | Preserves signatures and class/struct/trait layouts, strips bodies → 60–80% token reduction                                  |
| dedup.rs        | Near-duplicate detection              | 64-bit SimHash fingerprints · Hamming threshold ≤ 3 · 4-band LSH buckets                                                     |
| lsh.rs          | Semantic recall index                 | 12-table multi-probe LSH · 10-bit sampling · ~3 μs over 100K fragments                                                       |
| sast.rs         | Static Application Security Testing   | 55 rules across 8 CWE categories · taint-flow analysis · severity scoring                                                    |
| health.rs       | Codebase health analysis              | Clone detection (Type-1/2/3) · dead symbol finder · god file detector · arch violation checker                               |
| guardrails.rs   | Safety-critical file pinning          | Criticality levels (Safety/Critical/Important/Normal) · task-aware budget multipliers                                        |
| prism.rs        | Spectral weight optimizer             | Jacobi eigendecomposition on 4×4 covariance matrix · anisotropic gain adaptation                                             |
| query.rs        | Query analysis + refinement           | Vagueness scoring · keyword extraction · intent classification                                                               |
| fragment.rs     | Core data structure                   | Content, metadata, scoring dimensions, skeleton, SimHash fingerprint                                                         |
| lib.rs          | PyO3 bridge + orchestrator            | All modules exposed to Python · 93 tests                                                                                     |
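The SimHash scheme behind dedup.rs is easy to sketch in Python. This is an illustration of the technique, not the Rust implementation: the word-trigram shingling and the blake2b hash are assumptions.

```python
import hashlib

def simhash64(text: str) -> int:
    """64-bit SimHash over word-trigram shingles (illustrative sketch)."""
    tokens = text.split()
    shingles = [" ".join(tokens[i:i + 3]) for i in range(max(1, len(tokens) - 2))]
    bits = [0] * 64
    for sh in shingles:
        h = int.from_bytes(hashlib.blake2b(sh.encode(), digest_size=8).digest(), "big")
        for b in range(64):
            bits[b] += 1 if (h >> b) & 1 else -1  # vote per bit position
    return sum(1 << b for b in range(64) if bits[b] > 0)

def is_near_duplicate(a: int, b: int, threshold: int = 3) -> bool:
    """Near-duplicate if the fingerprints differ in <= threshold of 64 bits."""
    return bin(a ^ b).count("1") <= threshold
```

The 4-band LSH bucketing mentioned in the table would then index each fingerprint by its four 16-bit quarters, so candidate pairs are found without all-pairs comparison.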

Python Layer

| Module              | What                                                                        |
|---------------------|-----------------------------------------------------------------------------|
| proxy.py            | Invisible HTTP reverse proxy (prompt compiler mode)                         |
| proxy_transform.py  | Request parsing · context formatting (flat + hierarchical) · EGTC · APA     |
| proxy_config.py     | Model context windows · all feature flags · autotune overlay                |
| server.py           | MCP server with 10+ tools · pure Python fallbacks                           |
| long_term_memory.py | Cross-session memory via hippocampus-sharp-memory integration               |
| multimodal.py       | Image OCR · diagram parsing (Mermaid/PlantUML/DOT) · voice transcript extraction |
| autotune.py         | Autonomous hyperparameter optimization (mutate → evaluate → keep/discard)   |
| auto_index.py       | File-system crawler for automatic codebase indexing                         |
| adaptive_pruner.py  | Online RL-based fragment pruning                                            |
| checkpoint.py       | Gzipped JSON state serialization (~100 KB per checkpoint)                   |
| prefetch.py         | Predictive context pre-loading via import analysis + co-access patterns     |
| provenance.py       | Hallucination risk detection via source verification + confidence scoring   |
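The import-analysis half of prefetch.py can be sketched for Python sources (the co-access pattern model is omitted, and the function name here is illustrative, not entroly's API):

```python
import ast

def predict_prefetch(source: str) -> list[str]:
    """Predict modules worth pre-loading from a file's import statements."""
    modules = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.append(node.module)
    return modules
```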

Novel Algorithms

Entropic Context Compression (ECC)

Three-level hierarchical codebase compression. The LLM sees everything at variable resolution:

graph TD
    Query["User Query"] --> L1["L1: Skeleton Map · 5%"]
    Query --> L2["L2: Dependency Cluster · 25%"]
    Query --> L3["L3: Full Fragments · 70%"]

    L1 --> C1["Signatures of all files"]
    L2 --> C2["Expanded skeletons of related modules"]
    L3 --> C3["Submodular diversified full code"]

    C1 --> Context["Optimal Context Window"]
    C2 --> Context
    C3 --> Context

Novel techniques:

  1. Symbol-reachability slicing: BFS through the dep graph from query-relevant symbols (cf. NeurIPS 2025)
  2. Submodular diversity selection: diminishing returns per module (Nemhauser 1978, 1-1/e guarantee)
  3. PageRank centrality: hub files get priority in L2 expansion
  4. Entropy-gated budget allocation: complex codebases get more L3 budget
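The submodular diversity step (item 2 above) can be sketched as greedy density selection with per-module diminishing returns. The 0.7 decay rate and the fragment tuple layout are illustrative assumptions:

```python
def submodular_select(fragments, budget):
    """Greedy pick by value density; each fragment is (id, module, score, cost).
    A fragment's marginal gain decays by 0.7**k after k picks from its module,
    so coverage spreads across modules instead of piling into one."""
    picked, spent, per_module = [], 0, {}
    remaining = list(fragments)
    while remaining:
        best = max(remaining,
                   key=lambda f: f[2] * 0.7 ** per_module.get(f[1], 0) / f[3])
        if spent + best[3] > budget:
            remaining.remove(best)  # over budget: discard and keep scanning
            continue
        picked.append(best[0])
        spent += best[3]
        per_module[best[1]] = per_module.get(best[1], 0) + 1
        remaining.remove(best)
    return picked
```

With a tight budget, a slightly lower-scored fragment from a fresh module can beat a second fragment from an already-covered module, which is exactly the diversity behavior the 1-1/e guarantee applies to.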

EGTC v2 (Entropy-Gap Temperature Calibration)

Automatically derives the optimal LLM sampling temperature from information-theoretic properties of the selected context. Uses Fisher information scaling with 4 signals:

τ = clip(τ_base + Σ signal_weights × [vagueness, entropy_gap, sufficiency, task_type])
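A minimal sketch of this calibration in Python; the signal weights and τ_base below are illustrative assumptions, not entroly's tuned values:

```python
def calibrate_temperature(vagueness, entropy_gap, sufficiency, task_weight,
                          tau_base=0.3, lo=0.0, hi=1.0):
    """Weighted sum of the four signals, clipped to a sane sampling range.
    Weights are placeholders; a negative weight on sufficiency means richer
    context pushes the temperature down."""
    weights = (0.4, 0.3, -0.2, 0.1)  # assumed values
    signals = (vagueness, entropy_gap, sufficiency, task_weight)
    tau = tau_base + sum(w * s for w, s in zip(weights, signals))
    return max(lo, min(hi, tau))
```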

APA (Adaptive Prompt Augmentation)

  1. Calibrated token estimation: per-language chars/token ratios (Python: 3.0, Rust: 3.5, ...)
  2. Task-aware preamble: conditional hints from security findings, vagueness, and task type
  3. Content deduplication: MD5 hash-based dedup saves 10–20% in multi-turn sessions
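Calibrated token estimation (item 1) reduces to a ratio lookup; the 3.2 fallback ratio for unlisted languages is an assumption:

```python
# Ratios from the APA description above; other languages are elided there.
CHARS_PER_TOKEN = {"python": 3.0, "rust": 3.5}

def estimate_tokens(text: str, language: str = "python") -> int:
    """Estimate token count from character count and a per-language ratio."""
    ratio = CHARS_PER_TOKEN.get(language, 3.2)  # assumed fallback
    return max(1, round(len(text) / ratio))
```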

Setup

Cursor

Add to .cursor/mcp.json:

{
  "mcpServers": {
    "entroly": {
      "command": "entroly"
    }
  }
}

Claude Code

claude mcp add entroly -- entroly

Prompt Compiler Proxy (any IDE)

Change your IDE's API base URL to http://localhost:9377:

entroly --proxy
# or
ENTROLY_PROXY_PORT=9377 python -m entroly.proxy

Every LLM request is intercepted, optimized with the full pipeline (ECC + EGTC + APA + SAST), and forwarded transparently, with under 10 ms of overhead.

Docker (cross-platform)

docker pull ghcr.io/juyterman1000/entroly:latest
docker run --rm -p 9377:9377 ghcr.io/juyterman1000/entroly:latest

Multi-arch image: linux/amd64 and linux/arm64 (Apple Silicon, AWS Graviton).

MCP Tools

| Tool              | Purpose                                                                             |
|-------------------|-------------------------------------------------------------------------------------|
| remember_fragment | Store context with auto-dedup, entropy scoring, dep linking, criticality detection  |
| optimize_context  | Select the optimal context subset for a token budget (knapsack + ECC)               |
| recall_relevant   | Sub-linear semantic recall via multi-probe LSH                                      |
| record_outcome    | Feed the Wilson-score feedback loop                                                 |
| explain_context   | Per-fragment scoring breakdown with sufficiency analysis                            |
| checkpoint_state  | Save full session state (gzipped JSON)                                              |
| resume_state      | Restore from a checkpoint                                                           |
| prefetch_related  | Predict and pre-load likely-needed context                                          |
| get_stats         | Session statistics and cost savings                                                 |
| health_check      | Clone detection, dead symbols, god files, arch violations                           |

The Math

Multi-Dimensional Relevance Scoring

r(f) = (w_rec · recency + w_freq · frequency + w_sem · semantic + w_ent · entropy)
       / (w_rec + w_freq + w_sem + w_ent)
       × feedback_multiplier
  • Recency: Ebbinghaus forgetting curve, exp(-ln(2) × Δt / half_life)
  • Frequency: normalized access count with spaced repetition boost
  • Semantic similarity: SimHash Hamming distance to the query, normalized to [0, 1]
  • Information density: Shannon entropy + boilerplate + redundancy
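A sketch of the scoring formula above. The log-scaled frequency normalization is an assumption; entroly's exact normalization may differ:

```python
import math

def relevance(recency_turns, access_count, semantic_sim, entropy,
              weights=(0.30, 0.25, 0.25, 0.20), half_life=15,
              feedback_multiplier=1.0):
    """Weighted mean of the four scoring dimensions, times the feedback term."""
    # Ebbinghaus decay: halves every `half_life` turns
    recency = math.exp(-math.log(2) * recency_turns / half_life)
    # Log-scaled access count, saturating around 100 accesses (assumed form)
    frequency = min(1.0, math.log1p(access_count) / math.log(101))
    w_rec, w_freq, w_sem, w_ent = weights
    num = (w_rec * recency + w_freq * frequency
           + w_sem * semantic_sim + w_ent * entropy)
    return num / sum(weights) * feedback_multiplier
```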

Knapsack Context Selection

Maximize:   Σ r(fᵢ) · x(fᵢ)      over selected fragments
Subject to: Σ c(fᵢ) · x(fᵢ) ≤ B  (token budget)
  • N ≤ 2000: exact DP with budget quantization, O(N × 1000)
  • N > 2000: greedy density sort, O(N log N), with Dantzig's 0.5-optimality guarantee
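The exact-DP branch is a standard 0/1 knapsack; this sketch skips entroly's budget quantization and works in raw token costs for clarity:

```python
def knapsack_select(fragments, budget):
    """Exact 0/1 knapsack over (id, relevance, token_cost) triples.
    best[i][b] = max total relevance using the first i fragments within budget b."""
    n = len(fragments)
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i, (_, value, cost) in enumerate(fragments, 1):
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]
            if cost <= b:
                best[i][b] = max(best[i][b], best[i - 1][b - cost] + value)
    # Backtrack to recover which fragments were taken
    picked, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            fid, _, cost = fragments[i - 1]
            picked.append(fid)
            b -= cost
    return picked[::-1]
```

With a budget of 6 tokens, two cheap complementary fragments beat one expensive high scorer, which is the failure mode of pure top-K selection.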

SAST Security Categories

| Category          | CWE     | Rules                                               |
|-------------------|---------|-----------------------------------------------------|
| Hardcoded Secrets | CWE-798 | API keys, passwords, tokens, private keys           |
| SQL Injection     | CWE-89  | f-strings, concatenation, raw queries (taint-aware) |
| Command Injection | CWE-78  | os.system, subprocess with shell=True               |
| Path Traversal    | CWE-22  | open() with user input, os.path.join                |
| XSS               | CWE-79  | innerHTML, template injection                       |
| SSRF              | CWE-918 | requests with user-controlled URLs                  |
| Insecure Crypto   | CWE-327 | MD5/SHA1 for auth, weak key sizes                   |
| Auth Flaws        | CWE-287 | Hardcoded roles, missing auth checks                |
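Two illustrative regex rules in the spirit of this table. The real 55-rule set and the taint-flow analysis live in sast.rs; these patterns are simplified assumptions, not entroly's rules:

```python
import re

RULES = [
    # CWE-798: assignment of a literal to a secret-looking name
    ("CWE-798", re.compile(r"(?i)(api[_-]?key|password|secret)\s*=\s*['\"][^'\"]+['\"]")),
    # CWE-78: subprocess call with shell=True
    ("CWE-78", re.compile(r"subprocess\.\w+\([^)]*shell\s*=\s*True")),
]

def scan(source: str):
    """Return (cwe_id, line_number) for every rule hit in the source."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), 1):
        for cwe, pattern in RULES:
            if pattern.search(line):
                findings.append((cwe, lineno))
    return findings
```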

Configuration

EntrolyConfig(
    default_token_budget=128_000,     # GPT-4 Turbo equivalent
    max_fragments=10_000,             # session fragment cap
    weight_recency=0.30,              # scoring weights (sum to 1.0)
    weight_frequency=0.25,
    weight_semantic_sim=0.25,
    weight_entropy=0.20,
    decay_half_life_turns=15,         # Ebbinghaus half-life
    enable_hierarchical_compression=True,  # 3-level ECC
    enable_temperature_calibration=True,   # EGTC v2
    enable_prompt_directives=True,         # APA preamble
    enable_security_scan=True,             # SAST
)

References

  • Shannon (1948) – Information Theory
  • Charikar (2002) – SimHash
  • Ebbinghaus (1885) – Forgetting Curve
  • Weiser (1981) – Program Slicing
  • Nemhauser, Wolsey & Fisher (1978) – Submodular Maximization
  • Dantzig (1957) – Greedy Knapsack Approximation
  • Wilson (1927) – Score Confidence Intervals
  • LLMLingua (EMNLP 2023) – Perplexity-based Token Compression
  • LongLLMLingua (ACL 2024) – Query-aware Context Compression
  • RepoFormer (ICML 2024 Oral) – Selective Retrieval for Repo-Level Code
  • FILM-7B (NeurIPS 2024) – Structure-First Layout
  • CodeSage (ICLR 2024) – Code Embedding Representation Learning
  • SWE-bench (ICLR 2024) / SWE-agent (NeurIPS 2024) – Evaluation

Part of the Ebbiforge Ecosystem

Entroly integrates with hippocampus-sharp-memory for persistent cross-session memory and Ebbiforge for TF embeddings and RL weight learning. Both are optional.

License

MIT
