Code2Logic - Source code to logical representation converter for LLM analysis, featuring Tree-sitter parsing, dependency graph analysis, and multi-language support.
Code2Logic
Convert source code to logical representation for LLM analysis.
Code2Logic analyzes codebases and generates compact, LLM-friendly representations with semantic understanding. Perfect for feeding project context to AI assistants, building code documentation, or analyzing code structure.
Features
- Multi-language support - Python, JavaScript, TypeScript, Java, Go, Rust, and more
- Tree-sitter AST parsing - 99% accuracy with graceful fallback
- NetworkX dependency graphs - PageRank, hub detection, cycle analysis
- Rapidfuzz similarity - Find duplicate and similar functions
- NLP intent extraction - Human-readable function descriptions
- Zero dependencies - Core works without any external libs
Installation
Basic (no dependencies)
pip install code2logic
Full (all features)
pip install code2logic[full]
Selective features
pip install code2logic[treesitter] # High-accuracy AST parsing
pip install code2logic[graph] # Dependency analysis
pip install code2logic[similarity] # Similar function detection
pip install code2logic[nlp] # Enhanced intents
Quick Start
# TOON compact (best token efficiency: 5.9x smaller than JSON)
code2logic ./ -f toon --compact --name project -o ./
# TOON with function-logic + structural context
code2logic ./ -f toon --compact --no-repeat-module \
    --function-logic function.toon --function-logic-context minimal --name project -o ./
# TOON-Hybrid (project structure + function details for hub modules)
code2logic ./ -f toon --hybrid --no-repeat-module --name project -o ./
# YAML compact (human-readable, good compromise)
code2logic ./ -f yaml --compact --name project -o ./
Command Line
# Standard Markdown output
code2logic /path/to/project
# If the `code2logic` entrypoint is not available (e.g. running from source without install):
python -m code2logic /path/to/project
# Compact YAML (14% smaller, meta.legend transparency)
code2logic /path/to/project -f yaml --compact -o analysis-compact.yaml
# Ultra-compact TOON (71% smaller, single-letter keys)
code2logic /path/to/project -f toon --ultra-compact -o analysis-ultra.toon
# Generate schema alongside output
code2logic /path/to/project -f yaml --compact --with-schema
# With detailed analysis
code2logic /path/to/project -d detailed
Python API
from code2logic import analyze_project, MarkdownGenerator
# Analyze a project
project = analyze_project("/path/to/project")
# Generate output
generator = MarkdownGenerator()
output = generator.generate(project, detail_level='standard')
print(output)
# Access analysis results
print(f"Files: {project.total_files}")
print(f"Lines: {project.total_lines}")
print(f"Languages: {project.languages}")
# Get hub modules (most important)
hubs = [p for p, n in project.dependency_metrics.items() if n.is_hub]
print(f"Key modules: {hubs}")
Organized Imports
# Core analysis
from code2logic import ProjectInfo, ProjectAnalyzer, analyze_project
# Format generators
from code2logic import (
    YAMLGenerator,
    JSONGenerator,
    TOONGenerator,
    LogicMLGenerator,
    GherkinGenerator,
)
# LLM clients
from code2logic import get_client, BaseLLMClient
# Development tools
from code2logic import run_benchmark, CodeReviewer
Output Formats
Markdown (default)
Human-readable documentation with:
- Project structure tree with hub markers
- Dependency graphs with PageRank scores
- Classes with methods and intents
- Functions with signatures and descriptions
Compact
Ultra-compact format optimized for LLM context:
# myproject | 102f 31875L | typescript:79/python:23
ENTRY: index.ts main.py
HUBS: evolution-manager llm-orchestrator
[core/evolution]
evolution-manager.ts (3719L) C:EvolutionManager | F:createEvolutionManager
task-queue.ts (139L) C:TaskQueue,Task
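The header line of the compact output packs the project name, file count, line count, and per-language file counts into one row. As a minimal sketch, assuming the header always follows the shape shown in the example above (the function name `parse_compact_header` is hypothetical and not part of code2logic), it could be read back like this:

```python
import re
from typing import Dict, Union


def parse_compact_header(line: str) -> Dict[str, Union[str, int, Dict[str, int]]]:
    """Parse a header such as '# myproject | 102f 31875L | typescript:79/python:23'.

    Assumption: three '|'-separated fields in exactly this order.
    """
    name, size, langs = [part.strip() for part in line.lstrip("# ").split("|")]
    match = re.fullmatch(r"(\d+)f\s+(\d+)L", size)
    if match is None:
        raise ValueError(f"unexpected size field: {size!r}")
    languages: Dict[str, int] = {}
    for item in langs.split("/"):
        lang, count = item.split(":")
        languages[lang] = int(count)
    return {
        "name": name,
        "files": int(match.group(1)),
        "lines": int(match.group(2)),
        "languages": languages,
    }


header = parse_compact_header("# myproject | 102f 31875L | typescript:79/python:23")
print(header["name"], header["files"], header["lines"], header["languages"])
```

In the sample header the per-language counts (79 + 23) add up to the total file count (102), which makes a cheap sanity check when consuming this format.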
JSON
Machine-readable format for:
- RAG (Retrieval-Augmented Generation)
- Database storage
- Further analysis
Configuration
Library Status
Check which features are available:
code2logic --status
The command prints a Library Status list with one availability marker per optional library: tree_sitter, networkx, rapidfuzz, nltk, and spacy.
LLM Configuration
Manage LLM providers, models, API keys, and routing priorities:
code2logic llm status
code2logic llm set-provider auto
code2logic llm set-model openrouter nvidia/nemotron-3-nano-30b-a3b:free
code2logic llm key set openrouter <OPENROUTER_API_KEY>
code2logic llm priority set-provider openrouter 10
code2logic llm priority set-mode provider-first
code2logic llm priority set-llm-model nvidia/nemotron-3-nano-30b-a3b:free 5
code2logic llm priority set-llm-family nvidia/ 5
code2logic llm config list
Notes:
- code2logic llm set-provider auto enables automatic fallback selection: providers are tried in priority order.
- API keys should be stored in .env (or environment variables), not in litellm_config.yaml.
- These commands write configuration files:
  - .env in the current working directory
  - litellm_config.yaml in the current working directory
  - ~/.code2logic/llm_config.json in your home directory
Priority modes
You can choose how automatic fallback ordering is computed:
- provider-first: providers are ordered by provider priority (defaults + overrides)
- model-first: providers are ordered by priority rules for the provider's configured model (exact/prefix)
- mixed: providers are ordered by the best (lowest) priority from either provider priority or model rules
Configure the mode:
code2logic llm priority set-mode provider-first
code2logic llm priority set-mode model-first
code2logic llm priority set-mode mixed
Model priority rules are stored in ~/.code2logic/llm_config.json.
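The three priority modes can be illustrated with a small ordering function. This is a sketch of the behavior described above, not code2logic's implementation: the function names, the `DEFAULT_PRIORITY` fallback, the `ollama` entry, and the model name `local/example-model` are all assumptions made for illustration. Lower numbers are treated as higher priority, following the "best (lowest)" wording.

```python
from typing import Dict, List, Optional

DEFAULT_PRIORITY = 100  # hypothetical fallback when no rule matches


def model_priority(model: str, exact: Dict[str, int], prefix: Dict[str, int]) -> Optional[int]:
    """Return the priority for a model: exact rule first, then the best prefix rule."""
    if model in exact:
        return exact[model]
    matches = [prio for pre, prio in prefix.items() if model.startswith(pre)]
    return min(matches) if matches else None


def order_providers(
    mode: str,
    provider_priority: Dict[str, int],
    provider_model: Dict[str, str],
    exact: Dict[str, int],
    prefix: Dict[str, int],
) -> List[str]:
    """Order providers for fallback according to the selected priority mode."""
    if mode not in ("provider-first", "model-first", "mixed"):
        raise ValueError(f"unknown mode: {mode}")

    def key(provider: str) -> int:
        by_provider = provider_priority.get(provider, DEFAULT_PRIORITY)
        by_model = model_priority(provider_model.get(provider, ""), exact, prefix)
        if mode == "provider-first":
            return by_provider
        if mode == "model-first":
            return by_model if by_model is not None else DEFAULT_PRIORITY
        # mixed: take the best (lowest) of the two sources
        return by_provider if by_model is None else min(by_provider, by_model)

    return sorted(provider_priority, key=key)


providers = {"openrouter": 10, "ollama": 20}
models = {
    "openrouter": "nvidia/nemotron-3-nano-30b-a3b:free",
    "ollama": "local/example-model",  # hypothetical model name
}
exact_rules = {"local/example-model": 1}
prefix_rules = {"nvidia/": 5}

for mode in ("provider-first", "model-first", "mixed"):
    print(mode, order_providers(mode, providers, models, exact_rules, prefix_rules))
```

With these sample rules, provider-first keeps openrouter ahead because of its provider priority, while model-first and mixed move ollama to the front because its configured model matches an exact rule with a lower number.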
Python API (Library Status)
from code2logic import get_library_status
status = get_library_status()
# {'tree_sitter': True, 'networkx': True, ...}
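A status dictionary of this shape can be built with the standard library alone. The sketch below shows one common way to detect optional dependencies without importing them; it is an illustration of the idea, not the code behind get_library_status, and `detect_libraries` is a hypothetical name.

```python
from importlib.util import find_spec
from typing import Dict, Iterable

# Optional libraries named in the Library Status section
OPTIONAL_LIBRARIES = ("tree_sitter", "networkx", "rapidfuzz", "nltk", "spacy")


def detect_libraries(names: Iterable[str] = OPTIONAL_LIBRARIES) -> Dict[str, bool]:
    """Map each top-level module name to True if it is importable in this environment."""
    return {name: find_spec(name) is not None for name in names}


status = detect_libraries()
for name, available in status.items():
    print(f"{name}: {'available' if available else 'missing'}")
```

Checking with find_spec avoids the import cost of heavy packages such as spacy when only their presence matters.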
Analysis Features
Dependency Analysis
- PageRank - Identifies most important modules
- Hub detection - Central modules are marked in the output
- Cycle detection - Find circular dependencies
- Clustering - Group related modules
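To make the PageRank and cycle-detection ideas concrete, here is a standard-library sketch on a tiny import graph. code2logic uses NetworkX for this; the power-iteration and depth-first-search code below is a stand-in written for illustration, and the edges between the module names are hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # module -> modules it imports (every target must be a key)


def pagerank(graph: Graph, damping: float = 0.85, iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank; rank flows from an importer to what it imports."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


def find_cycle(graph: Graph) -> List[str]:
    """Return one import cycle as a list of modules, or [] if the graph is acyclic."""
    white, grey, black = 0, 1, 2
    color = {node: white for node in graph}
    stack: List[str] = []

    def visit(node: str) -> List[str]:
        color[node] = grey
        stack.append(node)
        for target in graph[node]:
            if color[target] == grey:
                return stack[stack.index(target):] + [target]
            if color[target] == white:
                cycle = visit(target)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = black
        return []

    for node in graph:
        if color[node] == white:
            cycle = visit(node)
            if cycle:
                return cycle
    return []


# Hypothetical import edges, for illustration only
imports: Graph = {
    "cli": {"analyzer"},
    "analyzer": {"parsers", "models"},
    "parsers": {"models"},
    "models": set(),
}
ranks = pagerank(imports)
print(max(ranks, key=ranks.get))  # the module most depended upon
print(find_cycle(imports))
print(find_cycle({"a": {"b"}, "b": {"c"}, "c": {"a"}}))
```

A module that many others import accumulates rank, which is the intuition behind treating high-PageRank modules as hubs.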
Intent Generation
Functions get human-readable descriptions:
methods:
async findById(id:string) -> Promise<User> # retrieves user by id
async createUser(data:UserDTO) -> Promise<User> # creates user
validateEmail(email:string) -> boolean # validates email
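A very rough approximation of intent generation is to split an identifier into words and rewrite the leading verb. The sketch below is a naive heuristic written for illustration, not the NLP pipeline code2logic uses; `VERB_PHRASES`, `split_identifier`, and `guess_intent` are hypothetical names. It uses only the function name, so it cannot recover details such as the "user" in the findById description above, which would need the return type.

```python
import re
from typing import List

# Hypothetical verb mapping; a real system would use a much richer vocabulary
VERB_PHRASES = {
    "find": "retrieves",
    "get": "retrieves",
    "create": "creates",
    "validate": "validates",
    "delete": "deletes",
}


def split_identifier(name: str) -> List[str]:
    """Split camelCase and snake_case identifiers into lowercase words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name).replace("_", " ")
    return [word.lower() for word in spaced.split()]


def guess_intent(name: str) -> str:
    """Build a short description from a function name alone."""
    words = split_identifier(name)
    if not words:
        return ""
    verb = VERB_PHRASES.get(words[0], words[0])
    return " ".join([verb] + words[1:])


for fn in ("findById", "createUser", "validateEmail", "validate_token"):
    print(f"{fn}: {guess_intent(fn)}")
```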
Similarity Detection
Find duplicate and similar functions:
Similar Functions:
core/auth.ts::validateToken:
- python/auth.py::validate_token (92%)
- services/jwt.ts::verifyToken (85%)
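The idea of scoring function names against each other can be sketched with the standard library. code2logic uses Rapidfuzz for this; the example below substitutes difflib.SequenceMatcher so it runs without extra packages, and its scores will not match Rapidfuzz output or the percentages shown above. The helper names and the threshold value are assumptions.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def normalize(name: str) -> str:
    """Lowercase and drop underscores so validateToken and validate_token compare equal."""
    return name.replace("_", "").lower()


def similarity(a: str, b: str) -> float:
    """Similarity of two function names as a percentage (0 to 100)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() * 100


def similar_functions(target: str, candidates: List[str], threshold: float = 50.0) -> List[Tuple[str, float]]:
    """Return candidates scoring at or above the threshold, best match first."""
    scored = [(name, similarity(target, name)) for name in candidates]
    return sorted(
        [(name, score) for name, score in scored if score >= threshold],
        key=lambda pair: pair[1],
        reverse=True,
    )


matches = similar_functions("validateToken", ["validate_token", "verifyToken", "parseConfig"])
for name, score in matches:
    print(f"{name} ({score:.0f}%)")
```

Normalizing naming conventions before comparison is what lets a Python snake_case function be matched with its TypeScript camelCase counterpart.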
Architecture
code2logic/
├── analyzer.py        # Main orchestrator
├── parsers.py         # Tree-sitter + fallback parser
├── dependency.py      # NetworkX dependency analysis
├── similarity.py      # Rapidfuzz similar detection
├── intent.py          # NLP intent generation
├── generators.py      # Output generators (MD/Compact/JSON/YAML/CSV)
├── toon_format.py     # TOON generator (compact, hybrid)
├── logicml.py         # LogicML generator (typed signatures)
├── function_logic.py  # Function-logic TOON with structural context
├── metrics.py         # AST-based quality metrics
├── models.py          # Data structures
├── cli.py             # Command-line interface
├── benchmarks/        # Benchmark runner, results, common utils
└── llm_clients.py     # Unified LLM client (OpenRouter/Ollama/LiteLLM)
Integration Examples
With Claude/ChatGPT
from code2logic import analyze_project, CompactGenerator
project = analyze_project("./my-project")
context = CompactGenerator().generate(project)
# Use in your LLM prompt
prompt = f"""
Analyze this codebase and suggest improvements:
{context}
"""
With RAG Systems
import json
from code2logic import analyze_project, JSONGenerator
project = analyze_project("./my-project")
data = json.loads(JSONGenerator().generate(project))
# Index in vector DB
for module in data['modules']:
    for func in module['functions']:
        # embed_and_store is a placeholder: supply your own vector-store indexing function
        embed_and_store(
            text=f"{func['name']}: {func['intent']}",
            metadata={'path': module['path'], 'type': 'function'}
        )
Development
Setup
git clone https://github.com/wronai/code2logic
cd code2logic
poetry install --with dev -E full
poetry run pre-commit install
# Alternatively, you can use Makefile targets (prefer Poetry if available)
make install-full
Tests
make test
make test-cov
# Or directly:
poetry run pytest
poetry run pytest --cov=code2logic --cov-report=html
Type Checking
make typecheck
# Or directly:
poetry run mypy code2logic
Linting
make lint
make format
# Or directly:
poetry run ruff check code2logic
poetry run black code2logic
Performance
| Codebase Size | Files | Lines | Time | Output Size |
|---|---|---|---|---|
| Small | 10 | 1K | <1s | ~5KB |
| Medium | 100 | 30K | ~2s | ~50KB |
| Large | 500 | 150K | ~10s | ~200KB |
Compact format is ~10-15x smaller than Markdown.
Code Reproduction Benchmarks
Benchmark results (20 files, model: arcee-ai/trinity-large-preview, 2026-02-25):
Project Benchmark: Format Comparison
| Format | Score | Syntax OK | Runs OK | ~Tokens | Efficiency (p/kT) |
|---|---|---|---|---|---|
| toon | 63.8% | 100% | 60% | 17,875 | 3.57 |
| json | 62.9% | 100% | 60% | 104,914 | 0.60 |
| markdown | 62.5% | 100% | 55% | 36,851 | 1.70 |
| yaml | 62.4% | 100% | 55% | 68,651 | 0.91 |
| logicml | 60.4% | 100% | 55% | ~30,000 | ~2.01 |
| csv | 53.0% | 100% | 40% | 80,779 | 0.66 |
| function.toon | 49.3% | 95% | 35% | 29,271 | 1.68 |
| gherkin | 38.6% | 95% | 30% | ~25,000 | ~1.54 |
Behavioral benchmark: 85.7% (6/7 functions passed).
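The Efficiency (p/kT) column is consistent with score points per thousand tokens. That reading is inferred from the numbers in the table rather than stated by the benchmark, so treat the short check below as a sketch under that assumption; the `efficiency` helper is a hypothetical name.

```python
from typing import Dict, Tuple

# (score in percent, approximate token count) copied from the table above
results: Dict[str, Tuple[float, int]] = {
    "toon": (63.8, 17875),
    "json": (62.9, 104914),
    "markdown": (62.5, 36851),
    "yaml": (62.4, 68651),
}


def efficiency(score_percent: float, tokens: int) -> float:
    """Score points per thousand tokens (assumed meaning of p/kT)."""
    return score_percent / (tokens / 1000)


for name, (score, tokens) in results.items():
    print(f"{name}: {efficiency(score, tokens):.2f} p/kT")

# Token ratio behind the "5.9x smaller than JSON" claim
json_to_toon = results["json"][1] / results["toon"][1]
print(f"json/toon token ratio: {json_to_toon:.1f}x")
```

Under this reading, TOON reaches a similar score to JSON with roughly one sixth of the tokens, which is where its efficiency lead comes from.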
Key Findings
- TOON wins on efficiency: best score (63.8%) at 5.9x fewer tokens than JSON
- Syntax OK = 100% for six of the eight formats (function.toon and gherkin reach 95%), so the LLM almost always generates valid syntax
- function.toon paradox: worse than project.toon despite larger file, due to missing class/module context (fixed in v1.0.43 with --function-logic-context)
- gherkin/csv: poor fit for code description, their structure doesn't map to programming constructs
Run Benchmarks
make benchmark # Full benchmark suite (requires OPENROUTER_API_KEY)
# Or individually:
python examples/15_unified_benchmark.py --type format --folder tests/samples/ --limit 20
python examples/15_unified_benchmark.py --type project --folder tests/samples/ --limit 20
python examples/15_unified_benchmark.py --type function --file tests/samples/sample_functions.py
Contributing
Contributions welcome! Please read our Contributing Guide.
License
Apache License 2.0 - see LICENSE for details.
Companion Packages
logic2test - Generate Tests from Logic
Generate test scaffolds from Code2Logic output:
# Show what can be generated
python -m logic2test out/code2logic/project.c2l.yaml --summary
# Generate unit tests
python -m logic2test out/code2logic/project.c2l.yaml -o out/logic2test/tests/
# Generate all test types (unit, integration, property)
python -m logic2test out/code2logic/project.c2l.yaml -o out/logic2test/tests/ --type all
from logic2test import TestGenerator
generator = TestGenerator('out/code2logic/project.c2l.yaml')
result = generator.generate_unit_tests('out/logic2test/tests/')
print(f"Generated {result.tests_generated} tests")
logic2code - Generate Code from Logic
Generate source code from Code2Logic output:
# Show what can be generated
python -m logic2code out/code2logic/project.c2l.yaml --summary
# Generate Python code
python -m logic2code out/code2logic/project.c2l.yaml -o out/logic2code/generated_code/
# Generate stubs only
python -m logic2code out/code2logic/project.c2l.yaml -o out/logic2code/generated_code/ --stubs-only
from logic2code import CodeGenerator
generator = CodeGenerator('out/code2logic/project.c2l.yaml')
result = generator.generate('out/logic2code/generated_code/')
print(f"Generated {result.files_generated} files")
Full Workflow: Code → Logic → Tests/Code
# 1. Analyze existing codebase
code2logic src/ -f yaml -o out/code2logic/project.c2l.yaml
# 2. Generate tests for the codebase
python -m logic2test out/code2logic/project.c2l.yaml -o out/logic2test/tests/ --type all
# 3. Generate code scaffolds (for refactoring)
python -m logic2code out/code2logic/project.c2l.yaml -o out/logic2code/generated_code/ --stubs-only
Documentation
- 00 - Docs Index - Documentation home (start here)
- 01 - Getting Started - Install and first steps
- 02 - Configuration - API keys, environment setup
- 03 - CLI Reference - Command-line usage
- 04 - Python API - Programmatic usage
- 05 - Output Formats - Format comparison and usage
- 06 - Format Specifications - Detailed format specs
- 07 - TOON Format - Token-Oriented Object Notation
- 08 - LLM Integration - OpenRouter/Ollama/LiteLLM
- 09 - LLM Comparison - Provider/model comparison
- 10 - Benchmarking - Benchmark methodology and results
- 11 - Repeatability - Repeatability testing
- 12 - Examples - Usage workflows and examples
- 13 - Architecture - System design and components
- 14 - Format Analysis - Deeper format evaluation
- 15 - Logic2Test - Test generation from logic files
- 16 - Logic2Code - Code generation from logic files
- 17 - LOLM - LLM provider management
- 18 - Reproduction Testing - Format validation and code regeneration
- 19 - Monorepo Workflow - Managing all packages from repo root
Examples
- examples/ - All runnable examples
- examples/run_examples.sh - Example runner script (multi-command workflows)
- examples/code2logic/ - Minimal project + docker example for code2logic
- examples/logic2test/ - Minimal project + docker example for logic2test
- examples/logic2code/ - Minimal project + docker example for logic2code
Author
Created by Tom Sapletta - tom@sapletta.com