A C compiler written in Python
ShivyC
A hobby C compiler created in Python.
ShivyC is a hobby C compiler written in Python that supports a subset of the C11 standard and generates reasonably efficient binaries, applying some optimizations along the way. ShivyC also produces helpful compile-time error messages.
This implementation of a trie is an example of what ShivyC can compile today. For a more comprehensive list of features, see the feature test directory.
Quickstart
x86-64 Linux
ShivyC requires only Python 3 to compile C code. Assembling and linking are done using the GNU binutils and glibc, which you almost certainly already have installed.
To install ShivyC:
pip3 install shivyc
To create, compile, and run an example program:
$ vim hello.c
$ cat hello.c
#include <stdio.h>
int main() {
    printf("hello, world!\n");
}
$ shivyc hello.c
$ ./out
hello, world!
To run the tests:
git clone https://github.com/ShivamSarodia/ShivyC.git
cd ShivyC
python3 -m unittest discover
Other Architectures
For the convenience of those not running Linux, the docker/ directory provides a Dockerfile that sets up an x86-64 Ubuntu Linux environment with everything necessary for ShivyC. To use this, run:
git clone https://github.com/ShivamSarodia/ShivyC.git
cd ShivyC
docker build -t shivyc docker/
docker/shell
This opens a shell in an environment with ShivyC installed and ready to use:
shivyc any_c_file.c # to compile a file
python3 -m unittest discover # to run tests
The Docker ShivyC executable will update live with any changes made in your local ShivyC directory.
Implementation Overview
Preprocessor
ShivyC currently has a very limited preprocessor that strips comments and expands #include directives. These features are split between lexer.py and preproc.py.
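As a rough illustration of those two preprocessing steps, here is a minimal sketch. It is not ShivyC's code: the function names `strip_comments` and `expand_includes` are hypothetical, and headers are supplied through an in-memory dictionary (an assumption made so the example runs without touching the filesystem).

```python
import re

# Matches lines of the form:  #include "name"   (quoted form only, for brevity)
INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')


def strip_comments(source):
    """Remove // and /* */ comments, leaving string literals untouched."""
    out = []
    i, n = 0, len(source)
    while i < n:
        two = source[i:i + 2]
        if two == "//":
            # Line comment: skip up to (but not including) the newline.
            while i < n and source[i] != "\n":
                i += 1
        elif two == "/*":
            end = source.find("*/", i + 2)
            if end == -1:
                raise SyntaxError("unterminated comment")
            out.append(" ")  # a block comment is replaced by one space
            i = end + 2
        elif source[i] == '"':
            # String literal: copy through the closing quote, honoring escapes.
            j = i + 1
            while j < n and source[j] != '"':
                j += 2 if source[j] == "\\" else 1
            out.append(source[i:j + 1])
            i = j + 1
        else:
            out.append(source[i])
            i += 1
    return "".join(out)


def expand_includes(source, headers):
    """Replace each #include "name" line with headers[name], recursively."""
    lines = []
    for line in source.split("\n"):
        match = INCLUDE_RE.match(line)
        if match:
            lines.append(expand_includes(headers[match.group(1)], headers))
        else:
            lines.append(line)
    return "\n".join(lines)
```

For example, `expand_includes('#include "a.h"\nint main() {}', {"a.h": "int f();"})` yields the header text followed by the `main` line. The sketch ignores character literals, angle-bracket includes, and circular includes, all of which a real preprocessor has to handle.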
Lexer
The ShivyC lexer is implemented primarily in lexer.py. Additionally, tokens.py contains definitions of the token classes used in the lexer and token_kinds.py contains instances of recognized keyword and symbol tokens.
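To make the split between token classes and token kinds concrete, here is a minimal lexer sketch. It assumes a tiny token set; the names `Token`, `KEYWORDS`, `TOKEN_PATTERNS`, and `tokenize` are hypothetical and do not reflect the actual contents of lexer.py, tokens.py, or token_kinds.py.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword set; a real C lexer recognizes many more.
KEYWORDS = {"int", "return"}

# Patterns are tried in order at the current position.
TOKEN_PATTERNS = [
    ("number", re.compile(r"\d+")),
    ("identifier", re.compile(r"[A-Za-z_]\w*")),
    ("symbol", re.compile(r"==|[{}()+\-*/;=]")),
]


@dataclass
class Token:
    kind: str
    text: str


def tokenize(source):
    """Split source text into a list of Token objects."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        for kind, pattern in TOKEN_PATTERNS:
            match = pattern.match(source, pos)
            if match:
                break
        else:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}")
        text = match.group()
        # Identifiers that are reserved words become keyword tokens.
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"
        tokens.append(Token(kind, text))
        pos = match.end()
    return tokens
```

Calling `tokenize("int x = 42;")` produces a keyword, an identifier, a symbol, a number, and a final symbol token, in that order.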
Parser
The ShivyC parser uses recursive descent techniques for all parsing. It is implemented in parser/*.py and creates a parse tree of nodes defined in tree/nodes.py and tree/expr_nodes.py.
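For readers unfamiliar with recursive descent, the sketch below parses arithmetic expressions with one function per grammar rule. It is a generic illustration under simplifying assumptions: tokens are plain strings, tree nodes are tuples, and the `Parser` class and `parse` function are hypothetical names rather than ShivyC's API.

```python
class Parser:
    """Recursive descent parser for the grammar:

    expr   := term ('+' term)*
    term   := factor ('*' factor)*
    factor := NUMBER | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Return the current token, or None at end of input."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, text):
        """Consume the given token or raise a syntax error."""
        if self.peek() != text:
            raise SyntaxError(f"expected {text!r}, got {self.peek()!r}")
        self.pos += 1

    def parse_expr(self):
        node = self.parse_term()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.parse_term())  # left-associative
        return node

    def parse_term(self):
        node = self.parse_factor()
        while self.peek() == "*":
            self.pos += 1
            node = ("*", node, self.parse_factor())
        return node

    def parse_factor(self):
        tok = self.peek()
        if tok == "(":
            self.pos += 1
            node = self.parse_expr()
            self.expect(")")
            return node
        if tok is not None and tok.isdigit():
            self.pos += 1
            return ("num", int(tok))
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(tokens):
    """Parse a full token list, rejecting trailing input."""
    parser = Parser(tokens)
    tree = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Because `parse_term` is called from `parse_expr`, multiplication binds tighter than addition: `parse(["1", "+", "2", "*", "3"])` nests the `*` node under the `+` node.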
IL generation
ShivyC traverses the parse tree to generate a flat custom IL (intermediate language). The commands for this IL are in il_cmds/*.py. Objects used for IL generation are in il_gen.py, but most of the IL generating code is in the make_code function of each tree node in tree/*.py.
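The idea of each tree node emitting its own IL can be sketched as follows. Only the method name `make_code` comes from the description above; the `ILCode`, `Number`, and `BinaryOp` classes, the tuple-shaped commands, and the temporary naming scheme are assumptions made for illustration.

```python
class ILCode:
    """Collects a flat list of IL commands and hands out temporaries."""

    def __init__(self):
        self.commands = []
        self._temp_count = 0

    def new_temp(self):
        self._temp_count += 1
        return f"t{self._temp_count}"

    def add(self, command):
        self.commands.append(command)


class Number:
    """Tree node for an integer literal."""

    def __init__(self, value):
        self.value = value

    def make_code(self, il_code):
        out = il_code.new_temp()
        il_code.add(("set", out, self.value))
        return out


class BinaryOp:
    """Tree node for a binary operation such as 'add' or 'mult'."""

    def __init__(self, op, left, right):
        self.op = op
        self.left = left
        self.right = right

    def make_code(self, il_code):
        # Emit code for both operands first, then combine their results.
        left = self.left.make_code(il_code)
        right = self.right.make_code(il_code)
        out = il_code.new_temp()
        il_code.add((self.op, out, left, right))
        return out


# Usage: flatten the tree for 1 + 2 * 3 into a linear command list.
il = ILCode()
tree = BinaryOp("add", Number(1), BinaryOp("mult", Number(2), Number(3)))
result = tree.make_code(il)
```

Each `make_code` call returns the name of the IL value holding its result, which is how a parent node refers to the output of its children without any nesting in the emitted commands.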
ASM generation
ShivyC sequentially reads the IL commands, converting each into Intel-syntax x86-64 assembly code. ShivyC performs register allocation using George and Appel’s iterated register coalescing algorithm (see References below). The general ASM generation functionality is in asm_gen.py, but much of the ASM generating code is in the make_asm function of each IL command in il_cmds/*.py.
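A simplified picture of per-command assembly generation is sketched below. Only the method name `make_asm` is taken from the description above; the `Set` and `Add` classes, the `spots` mapping, and `generate_asm` are hypothetical. Register assignment is passed in by hand here as a stand-in, so this does not implement iterated register coalescing.

```python
class Set:
    """IL command: output = constant value."""

    def __init__(self, output, value):
        self.output = output
        self.value = value

    def make_asm(self, spots):
        return [f"mov {spots[self.output]}, {self.value}"]


class Add:
    """IL command: output = arg1 + arg2."""

    def __init__(self, output, arg1, arg2):
        self.output = output
        self.arg1 = arg1
        self.arg2 = arg2

    def make_asm(self, spots):
        out = spots[self.output]
        first, second = spots[self.arg1], spots[self.arg2]
        # Addition is commutative, so if the output shares a register with
        # arg2, swap operands to avoid overwriting arg2 before it is read.
        if out == second:
            first, second = second, first
        lines = []
        if out != first:
            lines.append(f"mov {out}, {first}")
        lines.append(f"add {out}, {second}")
        return lines


def generate_asm(commands, spots):
    """Read IL commands in order and concatenate their assembly lines."""
    asm = []
    for command in commands:
        asm.extend(command.make_asm(spots))
    return asm


# Usage: 'c' reuses eax because 'a' is not needed after the addition.
commands = [Set("a", 1), Set("b", 2), Add("c", "a", "b")]
spots = {"a": "eax", "b": "ebx", "c": "eax"}
asm = generate_asm(commands, spots)
# asm == ["mov eax, 1", "mov ebx, 2", "add eax, ebx"]
```

When two IL values share a register, as `a` and `c` do here, the `mov` between them disappears. Eliminating such moves is the kind of saving a coalescing register allocator aims for.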
Contributing
ShivyC has so far been an entirely individual project, but pull requests are very welcome! Please add test(s) for all new functionality.
References
- ShivC - ShivyC is a rewrite from scratch of my old C compiler, ShivC, with much more emphasis on feature completeness and code quality. See the ShivC README for more details.
- C11 Specification - http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
- x86_64 ABI - http://web.archive.org/web/20160801075139/http://www.x86-64.org/documentation/abi.pdf
- Iterated Register Coalescing (George and Appel) - https://www.cs.purdue.edu/homes/hosking/502/george.pdf