Python bindings for infiniloom - Repository context engine for LLMs

Infiniloom Python Bindings

Python bindings for Infiniloom - a repository context engine for Large Language Models.

Installation

pip install infiniloom

Building from Source

git clone https://github.com/Topos-Labs/infiniloom.git
cd infiniloom/bindings/python
pip install maturin
maturin develop  # For development
maturin build --release  # For production wheel

Quick Start

Functional API

import infiniloom

# Pack a repository into Claude-optimized XML
context = infiniloom.pack("/path/to/repo", format="xml", model="claude")
print(context)

# Scan repository and get statistics
stats = infiniloom.scan("/path/to/repo")
print(f"Files: {stats['total_files']}")
print(f"Languages: {stats['languages']}")

# Count tokens for a specific model
tokens = infiniloom.count_tokens("Hello, world!", model="claude")
print(f"Tokens: {tokens}")

Object-Oriented API

from infiniloom import Infiniloom

# Create an Infiniloom instance
loom = Infiniloom("/path/to/repo")

# Get repository statistics
stats = loom.stats()
print(stats)

# Generate repository context
context = loom.pack(format="xml", model="claude", compression="balanced")

# Get repository map with important symbols
repo_map = loom.map(map_budget=2000, max_symbols=50)
for symbol in repo_map['key_symbols']:
    print(f"{symbol['name']} ({symbol['kind']}) in {symbol['file']}")

# Scan for security issues
findings = loom.scan_security()
for finding in findings:
    print(f"{finding['severity']}: {finding['kind']} at {finding['file']}:{finding['line']}")

# List all files
files = loom.files()
for file in files:
    print(f"{file['path']} - {file['language']} ({file['tokens']} tokens)")

API Reference

Functions

`pack(path, format="xml", model="claude", compression="balanced", map_budget=2000, max_symbols=50)`

Pack a repository into an LLM-optimized format.

Parameters:

path (str): Path to the repository
format (str): Output format - "xml", "markdown", "json", "yaml", "toon", or "plain"
model (str): Target model for token counting. Supports:
- OpenAI GPT-5.x: "gpt-5.2", "gpt-5.2-pro", "gpt-5.1", "gpt-5.1-mini", "gpt-5.1-codex", "gpt-5", "gpt-5-mini", "gpt-5-nano"
- OpenAI O-series: "o4-mini", "o3", "o3-mini", "o1", "o1-mini", "o1-preview"
- OpenAI GPT-4: "gpt-4o", "gpt-4o-mini", "gpt-4", "gpt-3.5-turbo"
- Anthropic: "claude" (default)
- Google: "gemini"
- Meta: "llama", "codellama"
- Others: "deepseek", "mistral", "qwen", "cohere", "grok"
compression (str): Compression level - "none", "minimal", "balanced", "aggressive", "extreme", "focused", "semantic"
map_budget (int): Token budget for repository map (default: 2000)
max_symbols (int): Maximum symbols to include (default: 50)

Returns: str - Formatted repository context
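
The option values above are plain strings, so a typo is easy to make. The sketch below checks `format` and `compression` against the values listed in this section before any call to `pack()`. The helper name `check_pack_options` is hypothetical and not part of infiniloom, and how the library itself reacts to an unknown value is not covered here.

```python
# Option values copied from the parameter list above.
FORMATS = {"xml", "markdown", "json", "yaml", "toon", "plain"}
COMPRESSION_LEVELS = {
    "none", "minimal", "balanced", "aggressive",
    "extreme", "focused", "semantic",
}


def check_pack_options(format="xml", compression="balanced"):
    """Hypothetical helper: return a list of problems (empty if none)."""
    problems = []
    if format not in FORMATS:
        problems.append(f"unknown format: {format!r}")
    if compression not in COMPRESSION_LEVELS:
        problems.append(f"unknown compression: {compression!r}")
    return problems


print(check_pack_options(format="yaml", compression="semantic"))
print(check_pack_options(format="toml", compression="max"))
```

An empty list means both values appear in the documented sets, so the same keyword arguments can be passed on to `pack()`.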

`scan(path, include_hidden=False, respect_gitignore=True)`

Scan a repository and return statistics.

Parameters:

path (str): Path to the repository
include_hidden (bool): Include hidden files (default: False)
respect_gitignore (bool): Respect .gitignore files (default: True)

Returns: dict - Repository statistics including:

name: Repository name
path: Absolute path
total_files: Number of files
total_lines: Total lines of code
total_tokens: Token counts for each model
languages: Language breakdown
branch: Git branch (if available)
commit: Git commit hash (if available)
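
One use for these statistics is deciding whether a repository can fit in a model's context window at all. The sketch below assumes `total_tokens` is a dict that maps a model name to an integer count; the text above only says "token counts for each model", so treat that shape as an assumption. The helper `fits_context`, the `reserve` parameter, and the sample values are all hypothetical.

```python
def fits_context(stats, model, context_window, reserve=4096):
    """Return True if the token count for `model` still leaves
    `reserve` tokens free in a window of `context_window` tokens."""
    tokens = stats["total_tokens"][model]  # assumed shape: {model: int}
    return tokens + reserve <= context_window


# Hypothetical data in the documented shape, not real scan() output.
sample_stats = {
    "name": "demo",
    "total_files": 42,
    "total_lines": 5000,
    "total_tokens": {"claude": 61000, "gpt-4o": 58000},
}

print(fits_context(sample_stats, "claude", context_window=200_000))
print(fits_context(sample_stats, "gpt-4o", context_window=60_000))
```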

`count_tokens(text, model="claude")`

Count tokens in text for a specific model.

Parameters:

text (str): Text to count tokens for
model (str): Target model. Supports all models listed above in pack(), including GPT-5.x series

Returns: int - Number of tokens (exact for OpenAI models via tiktoken, calibrated estimates for others)
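
A token counter is often used to split long text into pieces that each stay under a budget. The sketch below shows one way to do that with any `str -> int` counting function. The helper `split_by_token_budget` is hypothetical, and its default counter is a rough whitespace word count that exists only so the example runs without infiniloom installed.

```python
def split_by_token_budget(lines, budget, count=None):
    """Group lines into chunks whose summed token cost stays within budget.

    `count` is any callable str -> int. A line that alone exceeds the
    budget still becomes its own chunk rather than being dropped.
    """
    if count is None:
        # Stand-in counter for illustration only: whitespace-separated words.
        count = lambda text: len(text.split())
    chunks, current, used = [], [], 0
    for line in lines:
        cost = count(line)
        if current and used + cost > budget:
            chunks.append("\n".join(current))
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append("\n".join(current))
    return chunks


demo_lines = ["a b c", "d e", "f g h i", "j"]
print(split_by_token_budget(demo_lines, budget=5))
```

To use real counts, pass `count=lambda text: infiniloom.count_tokens(text, model="claude")`. Summing per-line counts only approximates the count of the joined chunk, so leave some headroom in `budget`.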

`semantic_compress(text, similarity_threshold=0.7, budget_ratio=0.5)`

Compress text using semantic compression while preserving important content.

Parameters:

text (str): Text to compress
similarity_threshold (float): Threshold for grouping similar chunks (0.0-1.0, default: 0.7)
budget_ratio (float): Target size as ratio of original (0.0-1.0, default: 0.5)

Returns: str - Compressed text

import infiniloom

long_text = "... your long text content ..."
compressed = infiniloom.semantic_compress(long_text, budget_ratio=0.3)
print(compressed)

`scan_security(path)`

Scan repository for security issues.

Parameters:

path (str): Path to the repository

Returns: list[dict] - List of security findings with:

file: File path
line: Line number
severity: Severity level ("Critical", "High", "Medium", "Low", "Info")
kind: Type of finding (e.g., "aws_access_key", "github_token")
pattern: The matched pattern
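
Because each finding is a plain dict, triage needs only the standard library. The sketch below counts findings per severity and filters by a minimum severity, using the five levels listed above. The helper names and the sample findings are hypothetical, not real scanner output.

```python
from collections import Counter

# Severity levels from the list above, most severe first.
SEVERITY_ORDER = ["Critical", "High", "Medium", "Low", "Info"]


def summarize_findings(findings):
    """Count findings per severity, including levels with zero hits."""
    counts = Counter(f["severity"] for f in findings)
    return {level: counts.get(level, 0) for level in SEVERITY_ORDER}


def at_or_above(findings, threshold):
    """Keep findings whose severity is `threshold` or more severe."""
    cutoff = SEVERITY_ORDER.index(threshold)
    return [f for f in findings
            if SEVERITY_ORDER.index(f["severity"]) <= cutoff]


# Hypothetical findings in the documented shape.
sample_findings = [
    {"file": "config/dev.env", "line": 3, "severity": "Critical",
     "kind": "aws_access_key", "pattern": "<redacted>"},
    {"file": "ci/deploy.sh", "line": 12, "severity": "High",
     "kind": "github_token", "pattern": "<redacted>"},
    {"file": "docs/notes.md", "line": 40, "severity": "Info",
     "kind": "github_token", "pattern": "<redacted>"},
]

print(summarize_findings(sample_findings))
for finding in at_or_above(sample_findings, "High"):
    print(f"{finding['file']}:{finding['line']} {finding['kind']}")
```

A CI job could, for example, exit non-zero whenever `at_or_above(findings, "High")` is non-empty.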

`is_git_repo(path)`

Check if a path is a git repository.

Parameters:

path (str): Path to check

Returns: bool - True if path is a git repository, False otherwise

from infiniloom import is_git_repo

if is_git_repo("/path/to/repo"):
    print("This is a git repository")

Classes

`Infiniloom(path)`

Object-oriented interface for repository analysis.

Methods:

`load(include_hidden=False, respect_gitignore=True)`

Load the repository into memory.

`stats()`

Get repository statistics. Returns the same structure as the scan() function.

`pack(format="xml", model="claude", compression="balanced", map_budget=2000)`

Pack the repository. Returns formatted string.

`map(map_budget=2000, max_symbols=50)`

Get repository map with key symbols. Returns dict with:

summary: Text summary
token_count: Estimated tokens
key_symbols: List of important symbols

`scan_security()`

Scan for security issues. Returns list of findings.

`files()`

Get list of all files. Returns list of dicts with file metadata.

`GitRepo(path)`

Git repository wrapper for accessing git operations like status, diff, log, and blame.

Constructor:

path (str): Path to the git repository

Raises: InfiniloomError if path is not a git repository

Methods:

`current_branch()`

Get the current branch name.

Returns: str - Current branch name (e.g., "main", "feature/xyz")

`current_commit()`

Get the current commit hash.

Returns: str - Full SHA-1 hash of HEAD commit (40 characters)

`status()`

Get working tree status (both staged and unstaged changes).

Returns: list[dict] - List of file status objects with:

path: File path
status: Status type ("Added", "Modified", "Deleted", "Renamed", "Copied", "Unknown")
old_path: Old path for renames (optional)
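
The status entries can be grouped by status type for a compact overview. This is a minimal sketch over dicts in the shape described above; `group_by_status` and the sample entries are hypothetical. Since `old_path` is optional, the helper handles it being absent or empty.

```python
from collections import defaultdict


def group_by_status(entries):
    """Map each status type to the paths that carry it."""
    grouped = defaultdict(list)
    for entry in entries:
        label = entry["path"]
        old_path = entry.get("old_path")  # optional, present for renames
        if old_path:
            label = f"{old_path} -> {label}"
        grouped[entry["status"]].append(label)
    return dict(grouped)


# Hypothetical entries in the documented shape.
sample_status = [
    {"path": "src/app.py", "status": "Modified"},
    {"path": "src/new_name.py", "status": "Renamed",
     "old_path": "src/old_name.py"},
    {"path": "README.md", "status": "Modified"},
]

for status, paths in group_by_status(sample_status).items():
    print(status, paths)
```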

`log(count=10)`

Get recent commits.

Parameters:

count (int): Maximum number of commits to return (default: 10)

Returns: list[dict] - List of commit objects with:

hash: Full commit hash
short_hash: Short commit hash (7 characters)
author: Author name
email: Author email
date: Commit date (ISO 8601 format)
message: Commit message (first line)
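
Since `date` is an ISO 8601 string, commit objects can be bucketed by day with the standard library. The sketch below assumes timestamps carry a numeric UTC offset such as `+00:00`; the exact string produced by `log()` is not specified here, so that is an assumption. `commits_per_day` and the sample commits are hypothetical.

```python
from collections import Counter
from datetime import datetime


def commits_per_day(commits):
    """Count commits per calendar day, keyed by YYYY-MM-DD."""
    days = Counter()
    for commit in commits:
        # Assumes a form datetime.fromisoformat() accepts,
        # e.g. "2025-12-23T10:15:00+00:00".
        day = datetime.fromisoformat(commit["date"]).date().isoformat()
        days[day] += 1
    return dict(days)


# Hypothetical commits in the documented shape (hashes are placeholders).
sample_commits = [
    {"hash": "a" * 40, "short_hash": "a" * 7, "author": "Ada",
     "email": "ada@example.com", "date": "2025-12-23T10:15:00+00:00",
     "message": "Add parser"},
    {"hash": "b" * 40, "short_hash": "b" * 7, "author": "Lin",
     "email": "lin@example.com", "date": "2025-12-23T18:02:00+00:00",
     "message": "Fix tests"},
    {"hash": "c" * 40, "short_hash": "c" * 7, "author": "Ada",
     "email": "ada@example.com", "date": "2025-12-24T09:30:00+00:00",
     "message": "Update docs"},
]

print(commits_per_day(sample_commits))
```

If the timestamps end in `Z`, note that `datetime.fromisoformat()` accepts that suffix only from Python 3.11 onward; on older versions, replace the trailing `Z` with `+00:00` first.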

`file_log(path, count=10)`

Get commits that modified a specific file.

Parameters:

path (str): File path relative to repo root
count (int): Maximum number of commits to return (default: 10)

Returns: list[dict] - List of commits that modified the file

`blame(path)`

Get blame information for a file.

Parameters:

path (str): File path relative to repo root

Returns: list[dict] - List of blame line objects with:

commit: Commit hash that introduced the line
author: Author who wrote the line
date: Date when line was written
line_number: Line number (1-indexed)
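
Blame output lends itself to a quick ownership summary. The sketch below counts lines per author over dicts in the shape listed above; `lines_per_author` and the sample data are hypothetical.

```python
from collections import Counter


def lines_per_author(blame_lines):
    """Return (author, line_count) pairs, most lines first."""
    return Counter(line["author"] for line in blame_lines).most_common()


# Hypothetical blame lines in the documented shape.
sample_blame = [
    {"commit": "a" * 40, "author": "Ada", "date": "2025-12-23", "line_number": 1},
    {"commit": "a" * 40, "author": "Ada", "date": "2025-12-23", "line_number": 2},
    {"commit": "b" * 40, "author": "Lin", "date": "2025-12-24", "line_number": 3},
    {"commit": "a" * 40, "author": "Ada", "date": "2025-12-23", "line_number": 4},
]

for author, count in lines_per_author(sample_blame):
    print(f"{author}: {count} lines")
```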

`ls_files()`

Get list of files tracked by git.

Returns: list[str] - List of file paths tracked by git

`diff_files(from_ref, to_ref)`

Get files changed between two commits.

Parameters:

from_ref (str): Starting commit/branch/tag
to_ref (str): Ending commit/branch/tag

Returns: list[dict] - List of changed files with:

path: File path
status: Status ("Added", "Modified", "Deleted", "Renamed", "Copied")
additions: Number of lines added
deletions: Number of lines deleted
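
The per-file counts can be rolled up into a short change summary, for instance before deciding how much diff context to send to a model. This is a minimal sketch over the documented fields; `summarize_diff` and the sample data are hypothetical.

```python
def summarize_diff(changes):
    """Total the additions and deletions and name the most-changed file."""
    busiest = max(
        changes,
        key=lambda c: c["additions"] + c["deletions"],
        default=None,  # empty input: no busiest file
    )
    return {
        "files": len(changes),
        "additions": sum(c["additions"] for c in changes),
        "deletions": sum(c["deletions"] for c in changes),
        "busiest": busiest["path"] if busiest else None,
    }


# Hypothetical diff entries in the documented shape.
sample_changes = [
    {"path": "src/a.py", "status": "Modified", "additions": 10, "deletions": 2},
    {"path": "src/b.py", "status": "Added", "additions": 40, "deletions": 5},
    {"path": "src/c.py", "status": "Deleted", "additions": 0, "deletions": 30},
]

print(summarize_diff(sample_changes))
```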

`uncommitted_diff(path)`

Get diff content for uncommitted changes in a file.

Parameters:

path (str): File path relative to repo root

Returns: str - Unified diff content

`all_uncommitted_diffs()`

Get diff for all uncommitted changes.

Returns: str - Combined unified diff for all changed files

`has_changes(path)`

Check if a file has uncommitted changes.

Parameters:

path (str): File path relative to repo root

Returns: bool - True if file has changes

`last_modified_commit(path)`

Get the last commit that modified a file.

Parameters:

path (str): File path relative to repo root

Returns: dict - Commit information object

`file_change_frequency(path, days=30)`

Get file change frequency in recent days.

Parameters:

path (str): File path relative to repo root
days (int): Number of days to look back (default: 30)

Returns: int - Number of commits that modified the file in the period

Example:

from infiniloom import GitRepo, is_git_repo

# Check if path is a git repo first
if is_git_repo("/path/to/repo"):
    repo = GitRepo("/path/to/repo")

    # Get current state
    print(f"Branch: {repo.current_branch()}")
    print(f"Commit: {repo.current_commit()}")

    # Get recent commits
    for commit in repo.log(count=5):
        print(f"{commit['short_hash']}: {commit['message']}")

    # Get file history
    for commit in repo.file_log("src/main.py", count=3):
        print(f"{commit['date']}: {commit['message']}")

    # Get blame information
    for line in repo.blame("src/main.py")[:10]:
        print(f"Line {line['line_number']}: {line['author']}")

    # Check for uncommitted changes
    if repo.has_changes("src/main.py"):
        diff = repo.uncommitted_diff("src/main.py")
        print(diff)

Formats

XML (Claude-optimized)

Best for Claude models. Uses XML structure that Claude understands well.

context = infiniloom.pack("/path/to/repo", format="xml", model="claude")

Markdown (GPT-optimized)

Best for GPT models. Uses Markdown with clear hierarchical structure.

context = infiniloom.pack("/path/to/repo", format="markdown", model="gpt-4o")

JSON

Generic JSON format for programmatic processing.

context = infiniloom.pack("/path/to/repo", format="json")

YAML (Gemini-optimized)

Best for Gemini. Place your query at the end of the prompt, after the packed context.

context = infiniloom.pack("/path/to/repo", format="yaml", model="gemini")

TOON (Token-Efficient)

Most token-efficient format (~40% smaller than JSON). Best for limited context windows.

context = infiniloom.pack("/path/to/repo", format="toon")
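
The pairings in this section can be captured in a small lookup so calling code does not hard-code a format per model. The sketch below is one possible reading of the recommendations above. `recommended_format` is hypothetical, the prefix matching is an assumption, and models without a stated recommendation fall back to `"xml"` only because that is the documented default of `pack()`.

```python
def recommended_format(model, fallback="xml"):
    """Suggest an output format for a model name, per the pairings above."""
    name = model.lower()
    if name.startswith("claude"):
        return "xml"
    if name.startswith("gpt"):
        return "markdown"
    if name.startswith("gemini"):
        return "yaml"
    # No recommendation is stated for other models; let the caller decide.
    return fallback


for model in ["claude", "gpt-4o", "gemini", "llama"]:
    print(model, recommended_format(model))
```

When context space is the main constraint, passing `fallback="toon"` selects the most compact format for models without a stated pairing.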

Compression Levels

none: No compression (0% reduction)
minimal: Remove empty lines, trim whitespace (15% reduction)
balanced: Remove comments, normalize whitespace (35% reduction, default)
aggressive: Remove docstrings, keep signatures only (60% reduction)
extreme: Key symbols only (80% reduction)
focused: Key symbols with small context (75% reduction)
semantic: Heuristic semantic compression (~60-70% reduction)
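
The reduction figures above allow a rough estimate of output size before packing. The sketch below applies those percentages to a raw token count and picks the lightest level that fits a budget. The helper names are hypothetical, the `semantic` entry uses the lower bound of its stated 60-70% range, and real reductions depend on the repository, so treat the results as estimates only.

```python
# Approximate reductions (percent) copied from the list above.
# "semantic" is listed as ~60-70%; the lower bound is used here.
REDUCTION_PCT = {
    "none": 0,
    "minimal": 15,
    "balanced": 35,
    "aggressive": 60,
    "semantic": 60,
    "focused": 75,
    "extreme": 80,
}


def estimate_tokens(raw_tokens, level):
    """Estimate the token count left after compressing at `level`."""
    return raw_tokens * (100 - REDUCTION_PCT[level]) // 100


def lightest_level_that_fits(raw_tokens, budget):
    """Return the least aggressive level whose estimate fits, else None."""
    for level in sorted(REDUCTION_PCT, key=REDUCTION_PCT.get):
        if estimate_tokens(raw_tokens, level) <= budget:
            return level
    return None


print(estimate_tokens(100_000, "balanced"))
print(lightest_level_that_fits(100_000, budget=50_000))
```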

Integration Examples

With Anthropic Claude

import infiniloom
import anthropic

# Generate context
context = infiniloom.pack(
    "/path/to/repo",
    format="xml",
    model="claude",
    compression="balanced"
)

# Send to Claude
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    messages=[{
        "role": "user",
        "content": f"{context}\n\nExplain the architecture of this codebase."
    }]
)
print(response.content[0].text)

With OpenAI GPT

import infiniloom
import openai

context = infiniloom.pack("/path/to/repo", format="markdown", model="gpt-4o")

client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": f"{context}\n\nWhat are the main components?"
    }]
)
print(response.choices[0].message.content)

With Google Gemini

import infiniloom
import google.generativeai as genai

context = infiniloom.pack("/path/to/repo", format="yaml", model="gemini")

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(f"{context}\n\nSummarize this codebase")
print(response.text)

Advanced Usage

Custom Token Budget

from infiniloom import Infiniloom

loom = Infiniloom("/large/repo")

# Generate smaller context for models with limited context windows
compact_map = loom.map(map_budget=1000, max_symbols=25)

# Generate larger context for models with large context windows
detailed_map = loom.map(map_budget=5000, max_symbols=200)

Security Scanning

from infiniloom import Infiniloom

loom = Infiniloom("/path/to/repo")
findings = loom.scan_security()

# Filter by severity
critical = [f for f in findings if f['severity'] == 'Critical']
high = [f for f in findings if f['severity'] == 'High']

print(f"Critical: {len(critical)}, High: {len(high)}")

for finding in critical:
    print(f"{finding['file']}:{finding['line']}")
    print(f"  {finding['kind']}: {finding['pattern']}")

File Filtering

from infiniloom import Infiniloom

loom = Infiniloom("/path/to/repo")
files = loom.files()

# Get Python files only
python_files = [f for f in files if f['language'] == 'python']

# Get high-importance files
important_files = [f for f in files if f['importance'] > 0.7]

# Get large files
large_files = [f for f in files if f['tokens'] > 1000]
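
The filters above can be combined into a simple selection step: take files in order of importance until a token budget is used up. The sketch below is a greedy pass over dicts with the `path`, `importance`, and `tokens` keys used in the examples above. `select_files` and the sample data are hypothetical, and greedy selection is not guaranteed to be optimal.

```python
def select_files(files, token_budget):
    """Greedily pick the most important files that fit in the budget.

    Returns (paths, tokens_used). Files too large for the remaining
    budget are skipped, so smaller, less important files can still fit.
    """
    chosen, used = [], 0
    ranked = sorted(files, key=lambda f: f["importance"], reverse=True)
    for entry in ranked:
        if used + entry["tokens"] <= token_budget:
            chosen.append(entry["path"])
            used += entry["tokens"]
    return chosen, used


# Hypothetical file metadata.
sample_files = [
    {"path": "src/core.py", "language": "python", "importance": 0.9, "tokens": 1200},
    {"path": "src/api.py", "language": "python", "importance": 0.8, "tokens": 900},
    {"path": "src/util.py", "language": "python", "importance": 0.4, "tokens": 300},
    {"path": "tests/big.py", "language": "python", "importance": 0.2, "tokens": 2000},
]

paths, used = select_files(sample_files, token_budget=1600)
print(paths, used)
```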

Performance

Infiniloom is built in Rust for maximum performance:

Fast scanning: Parallel file processing with ignore patterns
Memory efficient: Streaming processing, optional content loading
Native speed: No Python overhead for core operations

Requirements

Python 3.8+
Rust 1.91+ (for building from source)

License

MIT License - see LICENSE for details.

Release history

0.7.0: Mar 15, 2026
0.6.3: Feb 1, 2026
0.6.2: Jan 9, 2026
0.6.1: Jan 4, 2026
0.6.0: Jan 2, 2026
0.5.4: Dec 30, 2025
0.5.3: Dec 30, 2025
0.5.2: Dec 29, 2025
0.5.1: Dec 29, 2025
0.5.0: Dec 29, 2025
0.4.11: Dec 27, 2025
0.4.10: Dec 26, 2025
0.4.8: Dec 26, 2025
0.4.7: Dec 25, 2025
0.4.6: Dec 25, 2025
0.4.5: Dec 25, 2025
0.4.4: Dec 25, 2025
0.4.3: Dec 25, 2025
0.4.1: Dec 25, 2025
0.4.0: Dec 24, 2025
0.3.4: Dec 24, 2025
0.3.3: Dec 24, 2025
0.3.2: Dec 23, 2025
0.3.1: Dec 23, 2025
0.3.0: Dec 23, 2025 (this version)
0.1.0: Dec 23, 2025

Download files

Source Distribution

infiniloom-0.3.0.tar.gz (253.5 kB)

Uploaded Dec 23, 2025 Source

Built Distributions

infiniloom-0.3.0-cp38-abi3-win_amd64.whl (7.7 MB)

Uploaded Dec 23, 2025 CPython 3.8+, Windows x86-64

infiniloom-0.3.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)

Uploaded Dec 23, 2025 CPython 3.8+, manylinux: glibc 2.17+ x86-64

infiniloom-0.3.0-cp38-abi3-macosx_11_0_arm64.whl (7.9 MB)

Uploaded Dec 23, 2025 CPython 3.8+, macOS 11.0+ ARM64

File details

Details for the file infiniloom-0.3.0.tar.gz.

File metadata

Download URL: infiniloom-0.3.0.tar.gz
Upload date: Dec 23, 2025
Size: 253.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for infiniloom-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`de0c4078eeee7c2d3c6d2981ed6610df75d96bb20a9d30c6cf403c272a031399`
MD5	`75fbaa1b5fcf7e0aa5d0653d6f611c24`
BLAKE2b-256	`5affe80d2dbf0f2a0c7b71c53b8657201e5a63183cc22e851393c0f8ebbee440`

File details

Details for the file infiniloom-0.3.0-cp38-abi3-win_amd64.whl.

File metadata

Download URL: infiniloom-0.3.0-cp38-abi3-win_amd64.whl
Upload date: Dec 23, 2025
Size: 7.7 MB
Tags: CPython 3.8+, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for infiniloom-0.3.0-cp38-abi3-win_amd64.whl
Algorithm	Hash digest
SHA256	`c5e600cb53c88662a17703a990ae69ab2c65c3c0e334ba6e9cf92ccf9230c579`
MD5	`34f793a3c8771ac34b152102130d8ba2`
BLAKE2b-256	`c47adcf2860c96a0f427bcf9592fc32410dff76e6cb8f2417f0e9c99e4ce86e6`

File details

Details for the file infiniloom-0.3.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: infiniloom-0.3.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Dec 23, 2025
Size: 7.8 MB
Tags: CPython 3.8+, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for infiniloom-0.3.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`72b12918ba115d39f09aa8cdfadec915b8e0efc91b81b7ce3df8b51406f799fa`
MD5	`36478ea1e2261efca722c6df98315dcb`
BLAKE2b-256	`10e31e7818a113fe5a67866693e085cba5f6c9af65c61b5c498d7271d997e1b8`

File details

Details for the file infiniloom-0.3.0-cp38-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: infiniloom-0.3.0-cp38-abi3-macosx_11_0_arm64.whl
Upload date: Dec 23, 2025
Size: 7.9 MB
Tags: CPython 3.8+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for infiniloom-0.3.0-cp38-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`6088acd4fc340168fc29a63e0f28c84cb59fd53e519f3014d7c6855636af4702`
MD5	`f914fa3400fb7ec19637ad8056506bcc`
BLAKE2b-256	`43be5d3514e6b638540e84c3bbeb024b38ef6192d6aa4199e8d2af7859b61919`
