Semantic Integrity and Orchestration Framework - AI-native Python toolkit for maintaining codebase integrity

SIOF (Semantic Integrity and Orchestration Framework) is a toolkit for AI-native Python development.

It provides:

  • Data Transformation Graph (DTG) indexing - Map your codebase as data lineage, not control flow
  • AI slop detection - Deterministic pattern matching for machine-generated anti-patterns
  • MCP graph server - Expose your codebase to LLM agents via Model Context Protocol
  • Developer intent extraction (Memex) - Preserve architectural reasoning across AI-generated mutations
  • Sustainability tracking (Green Guard) - Monitor energy consumption and enforce carbon thresholds

Installation

pip install siof

Quick Start

Index Your Repository

siof index build --repo /path/to/repo

Detect AI-Generated Anti-Patterns

siof slop audit --repo /path/to/repo
siof slop fix --repo /path/to/repo

Start MCP Server for AI Agents

siof mcp serve --db siof.db

Python API

from siof.orchestrator import SIOFOrchestrator

# Run complete pipeline
orch = SIOFOrchestrator(repo=".", db_path="siof.db")
result = orch.run_full_pipeline(
    index_mode="build",
    slop_mode="audit",
    enable_memex=True,
    enable_green_guard=True,
)

print(f"Success: {result.success}")
print(f"Duration: {result.total_duration_s:.2f}s")

Core Features

1. DTG Indexer

Parses Python repositories into Data Transformation Graphs, mapping data lineage instead of control flow:

from siof.indexer import PythonIndexer

indexer = PythonIndexer(repo=".", db_path="siof.db")
indexer.init()
result = indexer.build()
print(f"Indexed {result['nodes']} nodes and {result['edges']} edges")
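
To make "data lineage instead of control flow" concrete, here is a minimal sketch built only on the standard library `ast` module. It is not SIOF's indexer: the function name `lineage_edges` and the edge format are assumptions for illustration. It records, for each simple assignment, which variables feed which other variables.

```python
import ast


def lineage_edges(source: str) -> set[tuple[str, str]]:
    """Return (read_name, written_name) pairs for plain assignments.

    Hypothetical helper, not part of the SIOF API.
    """
    edges: set[tuple[str, str]] = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        # Skip names that are only used as the function being called.
        callees = {id(n.func) for n in ast.walk(node.value) if isinstance(n, ast.Call)}
        reads = {
            n.id
            for n in ast.walk(node.value)
            if isinstance(n, ast.Name) and id(n) not in callees
        }
        for target in node.targets:
            writes = {n.id for n in ast.walk(target) if isinstance(n, ast.Name)}
            # Every name read on the right feeds every name bound on the left.
            edges.update((r, w) for r in reads for w in writes)
    return edges


sample = """
raw = load()
clean = strip(raw)
total = clean + offset
"""

for src, dst in sorted(lineage_edges(sample)):
    print(f"{src} -> {dst}")
```

The edges describe where values come from (`raw` feeds `clean`, which feeds `total`) regardless of the order in which statements execute, which is the distinction the DTG draws.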

2. De-Slopper Engine

Detects and fixes AI-generated code anti-patterns:

  • NakedExceptionPass - Bare except: pass blocks that swallow errors
  • BroadExceptionPass - Overly broad exception handlers
  • HedgeComment - LLM-generated hedge words ("robust", "comprehensive")
  • EchoComment - Comments that merely restate code mechanics
  • SuspiciousImport - Hallucinated dependencies
  • UnusedImport - Dead imports

from siof.deslopper import DeSlopper

deslopper = DeSlopper(repo=".", db_path="siof.db")
result = deslopper.run(mode="fix")  # audit, fix, or strict
print(f"Found {len(result.findings)} issues")
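
As an illustration of what deterministic detection of one of these patterns can look like, the sketch below finds bare `except: pass` handlers with the standard library `ast` module. It is an assumption about the general approach, not the De-Slopper's actual rule; `find_naked_except_pass` is a hypothetical name.

```python
import ast


def find_naked_except_pass(source: str) -> list[int]:
    """Return line numbers of `except:` handlers whose only statement is `pass`.

    Hypothetical helper, not part of the SIOF API.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.ExceptHandler)
            and node.type is None  # no exception class given: a bare `except:`
            and len(node.body) == 1
            and isinstance(node.body[0], ast.Pass)
        ):
            hits.append(node.lineno)
    return sorted(hits)


sample = """try:
    risky()
except:
    pass
try:
    risky()
except ValueError:
    pass
"""

print(find_naked_except_pass(sample))
```

Because the check walks the syntax tree rather than matching text, it ignores handlers that name an exception type and is not confused by comments or string literals containing `except:`.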

3. MCP Graph Server

Exposes your DTG to LLM agents via Model Context Protocol:

from siof.mcp_server import MCPGraphServer

server = MCPGraphServer("siof.db")
# Provides tools: find_data_lineage, impact_of_change, get_dead_paths, etc.
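
A tool such as impact_of_change can be thought of as a reachability query over lineage edges. The sketch below is one way to express that idea in plain Python; the function `downstream_of` and the edge-list input are assumptions for illustration and do not reflect the server's actual implementation or storage schema.

```python
from collections import deque


def downstream_of(edges, start):
    """Return every node reachable from `start` by following edges forward.

    `edges` is an iterable of (source, target) pairs. Hypothetical helper.
    """
    downstream = {}
    for src, dst in edges:
        downstream.setdefault(src, set()).add(dst)

    seen = set()
    queue = deque([start])
    while queue:
        # Breadth-first walk; `seen` guards against cycles.
        for nxt in downstream.get(queue.popleft(), ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


edges = [("raw", "clean"), ("clean", "total"), ("offset", "total")]
print(sorted(downstream_of(edges, "raw")))
```

An agent asking "what breaks if I change `raw`?" receives a short list of affected symbols instead of having to read every file that might touch it.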

Features:

  • RBAC with role hierarchy (viewer/analyst/admin/service)
  • Rate limiting per role and organization
  • Distributed tracing with trace IDs
  • Schema validation for all tool inputs
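
The following sketch shows how a role hierarchy check and a per-role rate limit might fit together. Everything in it is an assumption: the rank values, the `is_allowed` and `TokenBucket` names, and the token-bucket algorithm itself are illustrative and are not taken from SIOF. The source also lists a service role; its position in the hierarchy is not stated, so it is left out here.

```python
import time

# Assumed ordering for illustration only: higher rank includes lower ranks.
ROLE_RANK = {"viewer": 0, "analyst": 1, "admin": 2}


def is_allowed(role: str, required: str) -> bool:
    """True if `role` ranks at or above `required`; unknown roles are denied."""
    return ROLE_RANK.get(role, -1) >= ROLE_RANK[required]


class TokenBucket:
    """Simple token bucket: `capacity` burst size, refilled at `refill_per_s`."""

    def __init__(self, capacity: int, refill_per_s: float, now=time.monotonic):
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self._now = now  # injectable clock, which makes the limiter testable
        self.tokens = float(capacity)
        self.updated = now()

    def allow(self) -> bool:
        current = self._now()
        elapsed = current - self.updated
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_s)
        self.updated = current
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per role; the limits are placeholders.
buckets = {"viewer": TokenBucket(5, 1.0), "admin": TokenBucket(50, 10.0)}
if is_allowed("admin", "analyst") and buckets["admin"].allow():
    print("request accepted")
```

Keeping authorization and rate limiting as separate checks means a denied role never consumes a token.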

4. Memex Intent Layer

Extracts and preserves developer intent from commits, PRs, and prompts:

from siof.memex import Memex

memex = Memex(repo=".", db_path="siof.db")
result = memex.ingest()  # Extracts from git commits, PRs, prompts
print(f"Ingested {result['ingested']} intent records")

# Query intent
records = memex.query_intent(symbol="authenticate")
scores = memex.score_relevance("authenticate", records)
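
For intuition about intent extraction from commits, here is a small standalone sketch that turns `git log` output into records and filters them by symbol. It is not how Memex works internally: the `IntentRecord` type, `parse_log`, and `query_by_symbol` are hypothetical, and the input format assumes the log was produced with `git log --pretty=format:%h%x09%s` (abbreviated hash, a tab, then the subject line).

```python
import re
from dataclasses import dataclass


@dataclass
class IntentRecord:
    commit: str
    message: str


def parse_log(log_text: str) -> list[IntentRecord]:
    """Parse lines of the form '<hash>\\t<subject>' into records."""
    records = []
    for line in log_text.splitlines():
        commit, sep, message = line.partition("\t")
        if sep and commit and message:
            records.append(IntentRecord(commit, message))
    return records


def query_by_symbol(records: list[IntentRecord], symbol: str) -> list[IntentRecord]:
    """Keep records whose message mentions `symbol` as a whole word."""
    pattern = re.compile(rf"\b{re.escape(symbol)}\b")
    return [r for r in records if pattern.search(r.message)]


log_text = "a1b2c3d\tRefactor authenticate to use signed tokens\ne4f5a6b\tFix typo in README\n"
for record in query_by_symbol(parse_log(log_text), "authenticate"):
    print(record.commit, record.message)
```

A whole-word match keeps `authenticate` from matching `authenticated_user`; a real system would likely need something richer than keyword matching to rank relevance.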

5. Green Guard

Tracks energy consumption and enforces sustainability thresholds:

from siof.green_guard import GreenGuard

guard = GreenGuard("siof.db")
result = guard.run_command("pytest", hard_co2_kg=0.1)
print(f"Energy: {result.energy_wh:.4f} Wh, CO2: {result.co2_kg:.6f} kg")

# Sustainability report
report = guard.sustainability_report()
print(f"Total runs: {report['total_runs']}")
print(f"Total CO2: {report['total_co2_kg']:.6f} kg")
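
The relationship between the two reported quantities is simple arithmetic: energy in kWh multiplied by the carbon intensity of the electricity supply gives kilograms of CO2. The sketch below works through it. The intensity value is a placeholder assumption, not a figure from SIOF, and the helper names are hypothetical; how Green Guard measures energy or chooses an intensity is not described in this document.

```python
# Placeholder grid carbon intensity in kg CO2 per kWh (assumed for illustration).
GRID_INTENSITY_KG_PER_KWH = 0.4


def co2_kg(energy_wh: float, intensity: float = GRID_INTENSITY_KG_PER_KWH) -> float:
    """Convert energy in watt-hours to kilograms of CO2."""
    return energy_wh / 1000.0 * intensity


def exceeds_threshold(
    energy_wh: float,
    hard_co2_kg: float,
    intensity: float = GRID_INTENSITY_KG_PER_KWH,
) -> bool:
    """True if the estimated emissions are above the hard limit."""
    return co2_kg(energy_wh, intensity) > hard_co2_kg


energy_wh = 500.0
print(f"{energy_wh} Wh is about {co2_kg(energy_wh):.3f} kg CO2")
print("over limit:", exceeds_threshold(energy_wh, hard_co2_kg=0.1))
```

Under this assumed intensity, a hard limit of 0.1 kg corresponds to a budget of 250 Wh per run.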

Testing

The test suite requires pytest. After installation, run it with:

pytest tests/

All 242 tests pass in ~11 seconds.

Architecture

graph TD
    CLI[CLI Interface<br/>siof index/slop/mcp/memex/green]
    API[Python API<br/>SIOFOrchestrator]
    
    CLI --> ORCH
    API --> ORCH
    
    ORCH[Orchestrator<br/>Pipeline Manager]
    
    ORCH --> IDX[DTG Indexer<br/>Graph Construction]
    ORCH --> SLOP[De-Slopper<br/>Anti-Pattern Detection]
    ORCH --> MCP[MCP Server<br/>Agent Interface]
    ORCH --> MEM[Memex<br/>Intent Extraction]
    ORCH --> GREEN[Green Guard<br/>Sustainability]
    
    IDX --> REPO[Repository Layer<br/>File I/O + AST]
    SLOP --> REPO
    MCP --> REPO
    MEM --> REPO
    GREEN --> REPO
    
    REPO --> DB[(SQLite<br/>DTG + Metadata)]
    
    MCP --> POL[Policy Engine<br/>RBAC + Rate Limit]
    MEM --> INTENT[Intent Extractor<br/>Git + Prompts]
    GREEN --> ENERGY[Energy Calculator<br/>CO2 Tracking]
    
    style CLI fill:#e1f5ff
    style API fill:#e1f5ff
    style ORCH fill:#fff4e1
    style IDX fill:#e8f5e9
    style SLOP fill:#e8f5e9
    style MCP fill:#e8f5e9
    style MEM fill:#e8f5e9
    style GREEN fill:#e8f5e9
    style REPO fill:#ffe4e1
    style DB fill:#f3e5f5
    style POL fill:#fff9e6
    style INTENT fill:#fff9e6
    style ENERGY fill:#fff9e6

Why SIOF?

The AI-native development era (vibe coding) has introduced a new class of technical debt: AI slop. LLMs generate code probabilistically, leading to:

  • Silent error swallowing via bare except: pass
  • Hallucinated imports and dead code paths
  • Verbose, meaningless documentation
  • Loss of architectural intent over time

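Traditional linters (Pylint, Flake8, Ruff) catch style violations and common bugs, including some of the patterns above such as bare except clauses and unused imports, but they do not model data lineage or developer intent. SIOF bridges this gap with: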

  1. DTG-based analysis - Understand data lineage, not just control flow
  2. Deterministic de-slopping - Fix AI-specific anti-patterns automatically
  3. MCP integration - Give AI agents proper context (120x token reduction)
  4. Intent preservation - Maintain the "why" behind the code
  5. Sustainability - Track and limit computational waste

Roadmap

v1.0 (Current) ✅

  • DTG Indexer with incremental updates
  • De-Slopper with audit/fix/strict modes
  • MCP server with RBAC and rate limiting
  • Memex intent extraction
  • Green Guard sustainability tracking

v2.0 (Planned)

  • Free-threaded parsing (10x speedup on Python 3.14+)
  • Distributed graph storage (Neo4j/FalkorDB)
  • Enterprise MCP server (JWT, Redis, stateless)
  • Vector-based semantic search (Milvus)
  • Edge deployment (K3s, regional caching)
  • Kubernetes orchestration (Helm charts)
  • Full observability stack (OpenTelemetry, Prometheus, Grafana)

Contributing

SIOF welcomes contributions! Whether you're fixing bugs, adding features, improving documentation, or reporting issues, your help is appreciated.

Ways to Contribute

  • Report bugs and request features via GitHub Issues
  • Submit pull requests for bug fixes or new features
  • Improve documentation and examples
  • Share your use cases and feedback

Development Setup

git clone https://github.com/Keerthivasan-Venkitajalam/SIOF.git
cd SIOF
pip install -e ".[dev,test]"
pytest tests/

License

SIOF is released under the MIT License.

Author

Created by Keerthivasan S V - Built for the AI-native development era.

Citation

If you use SIOF in your research or project, please cite:

@software{siof2026,
  author = {Keerthivasan S V},
  title = {SIOF: Semantic Integrity and Orchestration Framework},
  year = {2026},
  url = {https://github.com/Keerthivasan-Venkitajalam/SIOF}
}
