PythoC: A Python DSL compiler that maps statically-typed Python subset to LLVM IR, providing C-equivalent capabilities with Python syntax
Project description
PythoC: Write C in Python
PythoC is a Python DSL compiler that compiles statically-typed Python to LLVM IR, providing C-equivalent runtime capabilities with Python syntax and compile-time metaprogramming.
Design Philosophy
Core principle: C-level runtime + Python-powered compile-time
- C-compatible runtime: Compiled code maps directly to native machine code with C-level control and performance
- Full access to low-level operations (pointers, manual memory, inline assembly)
- C calling conventions for seamless interoperability
- No runtime overhead beyond what C would have
- Compile-time = Python: Full Python power for metaprogramming, generics, and code generation
- Zero-cost abstractions: Python-level abstractions compile away completely
- Explicit typing: All types must be annotated (like C, unlike Python)
- Explicit control flow: No implicit control flow (no exceptions, no RAII, no destructors)
- Structs are plain data (no methods, no constructors)
- Manual resource management like C
- Optional safety features: Linear types and refinement types provide memory safety guarantees without introducing hidden control flow
- Prevent memory leaks, use-after-free, null pointer dereference, array out-of-bounds
- Completely optional - use only when needed
- No extra runtime overhead
- Convenient Python-C interoperability:
- Python can call PythoC compiled functions at runtime (via ctypes/cffi)
- PythoC can invoke Python code at compile-time for metaprogramming
Quick Start
Installation
pip install pythoc
Hello World
from pythoc import compile, i32
@compile
def add(x: i32, y: i32) -> i32:
return x + y
# Can compile to native code
@compile
def main() -> i32:
return add(10, 20)
# Or call the compiled dynamic library from Python directly
result = main()
Run Tests
# Run all tests
python test/run_all_tests.py
# Run specific test suites
python test/run_integration_tests.py
python test/run_examples.py
Example: Binary Tree Benchmark
This example demonstrates PythoC's direct mapping to C - compare with test/example/base_binary_tree_test.c:
from __future__ import annotations
from pythoc import i32, ptr, compile, nullptr, seq, sizeof
from pythoc.libc.stdlib import malloc, free
from pythoc.libc.stdio import printf
# C: typedef struct tn { struct tn* left; struct tn* right; } treeNode;
@compile
class TreeNode:
left: ptr[TreeNode]
right: ptr[TreeNode]
# C: treeNode* NewTreeNode(treeNode* left, treeNode* right)
@compile
def NewTreeNode(left: ptr[TreeNode], right: ptr[TreeNode]) -> ptr[TreeNode]:
new: ptr[TreeNode] = ptr[TreeNode](malloc(sizeof(TreeNode)))
new.left = left
new.right = right
return new
# C: long ItemCheck(treeNode* tree)
@compile
def ItemCheck(tree: ptr[TreeNode]) -> i32:
if tree.left == nullptr:
return 1
else:
return 1 + ItemCheck(tree.left) + ItemCheck(tree.right)
# C: treeNode* BottomUpTree(unsigned depth)
@compile
def BottomUpTree(depth: i32) -> ptr[TreeNode]:
if depth > 0:
return NewTreeNode(BottomUpTree(depth - 1), BottomUpTree(depth - 1))
else:
return NewTreeNode(nullptr, nullptr)
# C: void DeleteTree(treeNode* tree)
@compile
def DeleteTree(tree: ptr[TreeNode]):
if tree.left != nullptr:
DeleteTree(tree.left)
DeleteTree(tree.right)
free(tree)
C Parity: Full C Capabilities
Supported Features
PythoC provides complete C runtime capabilities:
Primitive types:
- Integers:
i8,i16,i32,i64,u8,u16,u32,u64 - Floats:
f16,f32,f64,bf16,f128 - Boolean:
bool
Composite types:
- Pointers:
ptr[T] - Arrays:
array[T, N]orarray[T, N, M, ...]for multi-dimensional - Structs:
@compile class,@struct class,struct[x: i32, y: i32](named) orstruct[i32, i32](unnamed) - Unions:
@union classorunion[T1, T2, ...] - Enums:
@enum class - Function pointers:
func[[arg_types], return_type]
Control flow:
if/else,while,forloopsbreak,continue,return- Pattern matching:
match/case(enhanced switch) - Scoped
gotoand labels for low-level control (similar to labeled continue/break, limited but safer)
Operations:
- Arithmetic:
+,-,*,/,%,// - Comparison:
==,!=,<,>,<=,>= - Logical:
and,or,not - Bitwise:
&,|,^,~,<<,>> - Pointer arithmetic and dereferencing
C Standard Library:
from pythoc.libc.stdio import printf, scanf
from pythoc.libc.stdlib import malloc, free, atoi
from pythoc.libc.string import memcpy, strlen
from pythoc.libc.math import sin, cos, pow
Not Yet Supported
Features deliberately excluded or pending:
- Fall-through
switch: Usematch/casewith explicit branches - Global variable initialization: Workaround with init functions and effect system
- Variable-length arrays (VLA): Use fixed-size arrays or malloc
- Flexible array members: Use separate size tracking
PythoC Language Core
Beyond C parity, PythoC adds modern type system features (all optional, minimal support):
Algebraic Data Types (ADT) and Pattern Matching
Provide Rust-like enums with payload for type-safe tagged unions:
from pythoc import enum, compile, i32
@enum
class Result:
Ok: i32
Err: i32
@compile
def handle_result(r: Result) -> i32:
match r:
case (Result.Ok, value):
return value
case (Result.Err, code):
return -code
Linear Types (Optional)
Prevent use-after-free and resource leaks without RAII or destructors:
from pythoc import compile, linear, consume, void, ptr, i8, i32, struct
from pythoc.libc.stdlib import malloc, free
# Allocator returns resource + linear proof
@compile
def lmalloc(size: i32) -> struct[ptr[i8], linear]:
return malloc(size), linear()
# Only the paired deallocator can consume the proof
@compile
def lfree(ptr: ptr[i8], prf: linear) -> void:
free(ptr)
consume(prf) # Proof consumed - resource released
@compile
def safe_usage() -> void:
mem, prf = lmalloc(100)
# ... use mem ...
lfree(mem, prf) # Must call lfree to consume prf
# Compile error if prf not consumed!
Motivation: Proof-carrying code pattern. The linear proof proves resource was allocated and must be deallocated. Compiler enforces pairing of alloc/free at compile-time.
Refinement Types (Optional)
Check once, use safely everywhere without runtime overhead:
from pythoc import compile, i32, bool, refined, refine, array, ptr, nullptr
# Non-null pointer - check once, use safely everywhere
@compile
def is_nonnull(p: ptr[i32]) -> bool:
return p != nullptr
NonNull = refined[is_nonnull]
@compile
def process_data(p: ptr[i32]) -> i32:
for ptr_checked in refine(p, is_nonnull):
# Type system knows ptr_checked is non-null
return access_ptr(ptr_checked)
else:
return -1 # Handle null case
@compile
def access_ptr(p: refined[is_nonnull]) -> i32:
return p[0]
# Array bounds - check once, access many arrays safely
@compile
def is_valid_index(idx: i32) -> bool:
return 0 <= idx < 10
@compile
def process_arrays(i: i32, arr1: array[i32, 10], arr2: array[i32, 10]) -> i32:
for idx in refine(i, is_valid_index):
# Type system remembers idx is valid
a = arr1[idx[0]] # Safe: no bounds check
b = arr2[idx[0]] # Safe: no bounds check
c = arr1[idx[0]] # Safe: reuse safely
return a + b + c
else:
return -1
Motivation: Check once, encode in type system, use safely everywhere. Typical examples: non-null pointers, valid array indices, non-zero divisors. The type system remembers the property, eliminating redundant runtime checks. Zero overhead for subsequent uses.
Python as Preprocessor
Use Python's full power at compile-time for metaprogramming:
Compile-Time Constants: Zero-Size Fields
from pythoc import compile, i32, pyconst, struct, typeof, ptr
def Vec(T, size_type):
return struct['size': size_type, 'data': ptr[T]]
def make_vec(T, size):
ret: Vec(T, typeof(size))
ret.size = size
ret.data = malloc(size * sizeof(T))
return ret
@compile
def test()-> i32:
static_vec = make_vec(i32, 100) # sizeof(static_vec) == sizeof(ptr[T])
len: i32 = 100
dynamic_vec = make_vec(i32, len) # sizeof(dynamic_vec) == sizeof(struct[i32, ptr[T])
Use case: One generic definition, two instantiations with different memory layouts - static saves 4 bytes per instance.
Generic Types via Python Functions
from pythoc import compile, struct, i32, f64
# Python function generates specialized types and functions
def make_point(T):
@struct(suffix=T) # suffix creates unique type name
class Point:
x: T
y: T
@compile(suffix=T) # suffix creates unique function name
def add_points(p1: Point, p2: Point) -> Point:
result: Point = Point()
result.x = p1.x + p2.x
result.y = p1.y + p2.y
return result
return Point, add_points
# Generate specialized versions at compile-time
Point_i32, add_i32 = make_point(i32)
Point_f64, add_f64 = make_point(f64)
@compile
def test() -> i32:
p1: Point_i32 = Point_i32()
p1.x = 10
p2: Point_i32 = Point_i32()
p2.x = 20
result: Point_i32 = add_i32(p1, p2)
return result.x
Code Generation at Compile-Time
# Python computes values and generates code before compilation
UNROLL_COUNT = 4
def make_unrolled_sum(n):
@compile
def sum_unrolled(arr: array[i32, n]) -> i32:
result: i32 = 0
# Python loop runs at compile-time, generates n add operations
for i in range(n):
result = result + arr[i]
return result
return sum_unrolled
# Generate specialized unrolled versions
sum_4 = make_unrolled_sum(4)
sum_8 = make_unrolled_sum(8)
Library: Polymorphism
PythoC provides compile-time and runtime polymorphism through the Poly library:
from pythoc import compile, i32, f64, ptr, i8
from pythoc.std.poly import Poly
# Define specialized implementations (must have same return type)
@compile
def add_i32(a: i32, b: i32) -> i32:
return a + b
@compile
def add_f64(a: f64, b: f64) -> i32:
return i32(a + b) # Convert to i32 for uniform return type
@compile
def add_mixed(a: i32, b: f64) -> i32:
return i32(f64(a) + b)
# Create polymorphic function - dispatches based on argument types
add = Poly(add_i32, add_f64, add_mixed)
# Static dispatch - type known at compile-time (zero overhead)
@compile
def test_static():
x = add(10, 20) # Calls add_i32
y = add(1.5, 2.5) # Calls add_f64
z = add(10, 2.5) # Calls add_mixed
# Extensible - add implementations dynamically
@compile
def add_str(a: ptr[i8], b: ptr[i8]) -> i32:
return 42
add.append(add_str)
# Runtime dispatch via enum boxing
@enum(i32)
class Number:
Int: i32
Float: f64
@compile
def test_runtime():
a: Number = Number(Number.Int, 42)
b: Number = Number(Number.Float, 3.14)
add(a, b) # Dispatches based on runtime enum tags
Key features:
- All implementations must have the same return type
- Compile-time dispatch when argument types are statically known
- Runtime dispatch via enum boxing for dynamic polymorphism
- Extensible - add new implementations via
append()
AST Transformations Features: Inline, Yield, Closures
These features share the same underlying implementation: AST-level code transformation and inlining.
Inline Functions
Zero-overhead abstraction - function body is expanded at call site:
from pythoc import compile, inline, i32
@inline
def clamp(value: i32, min_val: i32, max_val: i32) -> i32:
if value < min_val:
return min_val
elif value > max_val:
return max_val
return value
@compile
def process(x: i32) -> i32:
# clamp body is inlined here - no function call
result: i32 = clamp(x, 0, 100)
return result
Closures
Nested functions with variable capture - inlined automatically:
from pythoc import compile, i32
@compile
def make_adder(base: i32) -> i32:
offset: i32 = 10
# Closure captures base and offset
def add_both(x: i32) -> i32:
return x + base + offset
return add_both(5) # Closure body inlined at call site
Yield-based Iterators
Generator functions using yield - inlined automatically at call sites:
from pythoc import compile, i32
@compile
def fibonacci(limit: i32) -> i32:
"""Generator yielding Fibonacci numbers < limit"""
a: i32 = 0
b: i32 = 1
while a < limit:
yield a
new_a: i32 = b
new_b: i32 = a + b
a = new_a
b = new_b
@compile
def sum_fibonacci(n: i32) -> i32:
total: i32 = 0
for num in fibonacci(n): # fibonacci body inlined here
total = total + num
return total
Common kernel: All three features use the same AST transformation and inlining kernel, maintaining C-level performance with high-level abstractions.
Features in Development
These are actively being designed/implemented and may change:
- Effect system: More like a compile-time injection system. Make side effects and global state explicit and manageable. Can also be used to address the symbol conflict issue.
cimport: Import C headers and auto-generate bindings (types, constants, function signatures).defer: Scope-exit cleanup (e.g.,defer(free, p)), lowered to explicit control flow.
More Examples
An under development pure pythoc C header parser in pythoc/bindings to implement the cimport feature.
More examples can be found in the test/ directory.
- Full examples:
test/example/ - PythoC Integration tests:
test/integration/
License
MIT License - see LICENSE file for details
Contact: yfleiii@gmail.com
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file pythoc-0.3.1.tar.gz.
File metadata
- Download URL: pythoc-0.3.1.tar.gz
- Upload date:
- Size: 324.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a2af375e2d2f59e69c2506b9e9ab5a150edbc9f7b529632674c95bdd523bdca6
|
|
| MD5 |
288eba40f5786fbf730cfde75a599a05
|
|
| BLAKE2b-256 |
a8b5df2c51dafebf01a53506607a5a23c1c0bf36e59d46a2bce3c495f2df800c
|
Provenance
The following attestation bundles were made for pythoc-0.3.1.tar.gz:
Publisher:
publish.yml on 1flei/PythoC
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
pythoc-0.3.1.tar.gz -
Subject digest:
a2af375e2d2f59e69c2506b9e9ab5a150edbc9f7b529632674c95bdd523bdca6 - Sigstore transparency entry: 815768433
- Sigstore integration time:
-
Permalink:
1flei/PythoC@0585c6a9c75092762c031c9f6b9970fc3a87df10 -
Branch / Tag:
refs/tags/v0.3.1 - Owner: https://github.com/1flei
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@0585c6a9c75092762c031c9f6b9970fc3a87df10 -
Trigger Event:
push
-
Statement type:
File details
Details for the file pythoc-0.3.1-py3-none-any.whl.
File metadata
- Download URL: pythoc-0.3.1-py3-none-any.whl
- Upload date:
- Size: 380.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
84c38a671fc42e9dd29029fd8195a8a71534dbaa7f244b94074aa1fe21fe9d2a
|
|
| MD5 |
8bf8bbbc81737e91a27e3a003d61ed0c
|
|
| BLAKE2b-256 |
d15953caf7e66efc8ab190cca850748952dcd1b915ba6dbcd92665c7f535386a
|
Provenance
The following attestation bundles were made for pythoc-0.3.1-py3-none-any.whl:
Publisher:
publish.yml on 1flei/PythoC
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
pythoc-0.3.1-py3-none-any.whl -
Subject digest:
84c38a671fc42e9dd29029fd8195a8a71534dbaa7f244b94074aa1fe21fe9d2a - Sigstore transparency entry: 815768441
- Sigstore integration time:
-
Permalink:
1flei/PythoC@0585c6a9c75092762c031c9f6b9970fc3a87df10 -
Branch / Tag:
refs/tags/v0.3.1 - Owner: https://github.com/1flei
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@0585c6a9c75092762c031c9f6b9970fc3a87df10 -
Trigger Event:
push
-
Statement type: