Skip to main content

Semantic compiler: Code → AST → CQRS Model → Workflow DAG → Proto/Schema

Project description

code2schema

AI Cost Tracking

PyPI Version Python License AI Cost Human Time Model

  • 🤖 LLM usage: $0.1500 (1 commits)
  • 👤 Human dev: ~$100 (1.0h @ $100/h, 30min dedup)

Generated on 2026-05-04 using openrouter/qwen/qwen3-coder-next


Semantic Compiler for Software Systems

Przekształca kod Python w model semantyczny CQRS → kontrakty API → graf architektury.

CODE (.py)
  ⬇  AST extraction (built-in ast)
  ⬇  CQRS inference (Query / Command / Orchestrator)
  ⬇  Call Graph (NetworkX)
  ⬇  Event Model (DDD / Event Sourcing)
  ⬇  Workflow DAG
  ⬇  Quality Rules
  ↓
JSON schema · .proto · Markdown · GraphML · DOT

Instalacja

pip install code2schema

Opcjonalne backendy:

pip install "code2schema[proto]"   # grpcio-tools → kompilacja .proto
pip install "code2schema[neo4j]"   # eksport do bazy grafowej
pip install "code2schema[viz]"     # pyvis → wizualizacja HTML
pip install "code2schema[dev]"     # pytest + ruff + black

Szybki start

# Pełna analiza projektu
code2schema ./moj_projekt \
  --out schema.json \
  --proto api.proto \
  --md report.md \
  --graphml graph.graphml \
  --dot graph.dot \
  --graph-summary \
  --events \
  --cycles

Przykładowy output:

✅ Gotowe (1.2s)
   Modules  : 87
   Functions: 883
   Queries  : 612
   Commands : 184
   Orchest. : 87
   Workflows: 87
   Rules    : 43
   Graph    : 883N / 1204E
   → schema.json

Użycie w kodzie

from code2schema import extract_project, analyze
from code2schema.analyzer.graph import build_rich_graph, graph_summary
from code2schema.analyzer.events import infer_event_model
from code2schema.codegen import to_proto, to_markdown
from pathlib import Path

# Ekstrakcja + analiza
modules = extract_project(Path("./backend"))
schema  = analyze(modules)

# CQRS
print(f"Queries:      {len(schema.queries())}")
print(f"Commands:     {len(schema.commands())}")
print(f"Orchestrators:{len(schema.orchestrators())}")

# Graf
G = build_rich_graph(schema)
print(graph_summary(G, schema))

# Event Model (DDD)
em = infer_event_model(modules)
print(em.summary())

# Eksport
print(to_proto(schema))
print(to_markdown(schema))

Architektura paczki

code2schema/
├── core/
│   ├── models.py       # IR: FunctionIR, ModuleIR, SchemaIR (Pydantic)
│   └── extractor.py    # AST parser (stdlib ast, bez zależności)
├── analyzer/
│   ├── cqrs.py         # CQRS inference + WorkflowDAG + Rules
│   ├── graph.py        # NetworkX: centrality, hubs, cycles, GraphML/DOT
│   └── events.py       # DDD / Event Sourcing inference
├── codegen/
│   └── __init__.py     # JSON, .proto, Markdown generators
└── cli.py              # entry point: code2schema <path> [flags]

Klasy CQRS

Rola Kryterium
query brak side-effectów, fan-out = 0
command side-effects (IO, network, DB)
orchestrator fan-out ≥ 5 (wywołuje wiele funkcji)

Reguły jakości

Automatycznie generowane:

ID Warunek Akcja
HIGH_FAN_OUT fan-out ≥ 10 refactor_to_service
LONG_FUNCTION lines > 100 split_function
QUERY_WITH_SIDE_EFFECTS query + IO separate_command_from_query

Graph export

GraphML można otworzyć w Gephi, yEd lub zaimportować do Neo4j.
DOT renderuje Graphviz: dot -Tsvg graph.dot -o graph.svg

Biblioteki

Kategoria Biblioteka Po co
Parsing ast (stdlib) Parsowanie Python bez zależności
Parsing libcst Modyfikacja kodu z zachowaniem formatowania
Parsing tree-sitter Multi-language (Go, TS, Rust) — v4
Graf networkx Call graph, centrality, cykle
Graf neo4j Eksport do bazy grafowej
Wizualizacja pyvis Interaktywny HTML
Schema pydantic IR models + walidacja
Proto grpcio-tools Kompilacja .proto → kod

Roadmap

  • v0.1 — AST extraction, CQRS inference, JSON/Proto/MD output
  • v0.1 — NetworkX call graph, GraphML/DOT export, PageRank
  • v0.1 — Event Model (DDD), layer violation detection
  • v0.2 — libcst extractor (zachowanie formatowania, transformacje)
  • v0.2 — tree-sitter multi-language (Go, TypeScript, Rust)
  • v0.3 — Neo4j export, pyvis HTML visualization
  • v0.3 — Cross-language code generator (proto → Go/TS stubs)
  • v0.4 — Data Flow Graph (DFG), State model extraction

Licencja

MIT

License

Licensed under Apache-2.0.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

code2schema-0.1.1.tar.gz (26.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

code2schema-0.1.1-py3-none-any.whl (25.9 kB view details)

Uploaded Python 3

File details

Details for the file code2schema-0.1.1.tar.gz.

File metadata

  • Download URL: code2schema-0.1.1.tar.gz
  • Upload date:
  • Size: 26.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for code2schema-0.1.1.tar.gz
Algorithm Hash digest
SHA256 15b3b614db698453ba144ec07b5a0c8db80130db43cc002f4d4fb619575b30e1
MD5 e122ace5af379f5928b181cd35b7ef82
BLAKE2b-256 f37878dd5ebc37be29004d8440c86500d82a2e0f65e331c09ff5fec7c3a37b1d

See more details on using hashes here.

File details

Details for the file code2schema-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: code2schema-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 25.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for code2schema-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 4cca9cafe3238167d4e3a96b20d1c6bcd707422e7b834a6dfaf51e986689ed02
MD5 5e80e7444fa8e63816f8bb4f1aa05595
BLAKE2b-256 ec4fd913935385a2e70221d1136aec34c8ff66b344a3a4aebfee3bcebebc961e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page