Full Syntax Tree for Python to make writing refactoring code a realistic task
Introduction
============
Baron is a Full Syntax Tree (FST) library for Python. Unlike an
`AST <https://en.wikipedia.org/wiki/Abstract_syntax_tree>`__, which
drops some syntax information when it is built (such as empty
lines, comments and formatting), an FST keeps everything and guarantees that
fst\_to\_code(code\_to\_fst(source\_code)) == source\_code.
Installation
============
::

    pip install baron

Basic Usage
===========
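.. code:: python

    from baron import parse, dumps

    source_code_string = "a = 1\n"

    # parse() turns source code into an FST; dumps() turns it back into source
    fst = parse(source_code_string)
    assert dumps(fst) == source_code_string
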
Unless you want to do low-level things, **use
`RedBaron <https://github.com/Psycojoker/redbaron>`__ instead of using
Baron directly**. Think of Baron as the "bytecode of Python source code"
and RedBaron as a usable layer on top of it.
If you don't know what Baron is or don't understand yet why it might be
useful for you, read the `« Why is this important? »
section <#why-is-this-important>`__.
Documentation
=============
Baron documentation is available on `Read The
Docs <http://baron.readthedocs.org/en/latest/>`__.
Why is this important?
======================
The usefulness of an FST might not be obvious at first sight, so let's consider
a series of problems to illustrate it. Let's say that you want to write
a program that will:
- rename a variable in a source file... without clashing with things
  that are not a variable (example: stuff inside a string)
- inline a function/method
- extract a function/method from a series of lines of code
- split a class into several classes
- split a file into several modules
- convert your whole code base from one ORM to another
- perform custom refactoring operations not implemented by your IDE or rope
- implement the class browser of Smalltalk for Python (the whole one,
  where you can edit the code of the methods, not just show the code)
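
To make the first item concrete, here is a minimal sketch using only the standard library's ``tokenize`` module. It is not Baron's API: the helper ``rename_variable`` is a hypothetical name introduced for illustration, and it ignores scoping, attributes and keyword arguments. It only shows why working on tokens (or a tree) rather than raw text avoids touching strings and comments.

```python
import io
import tokenize


def rename_variable(source, old, new):
    """Rename NAME tokens equal to `old`, leaving strings and comments alone."""
    lines = source.splitlines(keepends=True)
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    # Remember the (row, column) of every identifier that matches `old`.
    hits = [
        tok.start
        for tok in tokens
        if tok.type == tokenize.NAME and tok.string == old
    ]
    # Patch from the end so that earlier column offsets stay valid.
    for row, col in reversed(hits):
        line = lines[row - 1]
        lines[row - 1] = line[:col] + new + line[col + len(old):]
    return "".join(lines)


source = 'count = 1  # count things\nprint("count is", count)\n'
renamed = rename_variable(source, "count", "total")
print(renamed)
```

The comment and the string literal still say ``count``, while both uses of the variable are renamed and the original spacing is untouched. Anything beyond this toy case (scopes, imports, attributes) is where a full tree becomes necessary.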
You will very likely end up with the awkward feeling of
writing clumsy, fragile code that is bound to break because you
didn't think about all the annoying special cases, and the formatting
keeps getting in your way. You may end up playing with
`ast.py <http://docs.python.org/2/library/ast.html>`__ until you realize
that it removes too much information to be suitable for those
situations. You will probably ditch the task as simply too complicated
and really not worth the effort. You are missing a good abstraction that
will take care of all of the code structure and formatting for you so
you can concentrate on your task.
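
The information loss mentioned above can be observed with the standard library alone. The sketch below assumes Python 3.9 or later (for ``ast.unparse``, which does not exist in the Python 2 ``ast`` module linked above); it regenerates source from an AST and shows that the result is equivalent code but not the original text.

```python
import ast

source = "x = 1   # the answer\n\n\ny = (x +\n     2)\n"

tree = ast.parse(source)
regenerated = ast.unparse(tree)

print(regenerated)

# The regenerated code parses to the same AST...
same_tree = ast.dump(ast.parse(regenerated)) == ast.dump(tree)
# ...but the comment, blank lines and line breaks of the original are gone.
same_text = regenerated == source
print(same_tree, same_text)
```

A refactoring tool built this way would rewrite the formatting of every file it touches, which is exactly the problem a full syntax tree is meant to avoid.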
The FST tries to be this abstraction. With it you can now work on a tree
which represents your code with its formatting. Moreover, since it is
the exact representation of your code, modifying it and converting it
back to a string will give you back your code only modified where you
have modified the tree.
Put another way, what I'm trying to achieve with Baron is a paradigm
change in which writing code that modifies code becomes a realistic task
that is worth the price (I'm not saying a simple task, but a realistic
one: it's still a complex task).
Other
-----
Having an FST (or at least a good abstraction built on it) also makes it
easier to do code generation and code analysis, although those two
operations are already quite feasible (using
`ast.py <http://docs.python.org/2/library/ast.html>`__ and a templating
engine, for example).
Some technical details
======================
Baron produces an FST in the form of JSON (and by JSON I mean Python
lists and dicts that can be dumped into JSON) for maximum
interoperability.
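
As a rough illustration of what "lists and dicts that can be dumped into JSON" means, the sketch below builds a small tree by hand and round-trips it through the ``json`` module. The node and key names are simplified assumptions made for this example, not Baron's exact output; use ``baron.parse`` to see the real structure.

```python
import json

# Hand-written, simplified stand-in for the FST of "a = 1".
# Node and key names here are illustrative assumptions only.
fst = [
    {
        "type": "assignment",
        "target": {"type": "name", "value": "a"},
        "first_formatting": [{"type": "space", "value": " "}],
        "second_formatting": [{"type": "space", "value": " "}],
        "value": {"type": "int", "value": "1"},
    }
]

# Only lists, dicts and strings are used, so the tree survives a JSON round trip.
serialized = json.dumps(fst, indent=2)
restored = json.loads(serialized)

print(serialized)
```

Because the tree is plain data, it can be stored, diffed, or handed to a tool written in another language without any Baron-specific classes.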
The Baron FST is quite similar to the Python AST, with some modifications to
make it more intuitive to humans, since the Python AST was designed for the
CPython interpreter.
Since playing directly with JSON is a bit raw, I'm going to build an
abstraction on top of it that will look like BeautifulSoup/jQuery.
State of the project
====================
Currently, Baron has been tested on the top 100 projects on PyPI and the FST
converts back exactly into the original source code. So it can be
considered quite stable, but it is far from being battle-tested.
Since the project is very young and no one is using it yet apart from my own
project, I'm open to changes to the FST nodes, but I will quickly become
conservative once it gets some adoption and will probably agree to
modify it only once or twice in the future, with clear indications on how
to migrate.
**Baron targets Python 2.[67]**. It has not been tested on Python 3 code
but should work for the most part (except for new grammar such as ``yield
from``, obviously). Baron itself **runs** under both Python 2 and Python 3.
Tests
=====
Run either ``py.test tests/`` or ``nosetests`` in the baron directory.
Community
=========
You can reach us on
`irc.freenode.net#baron <https://webchat.freenode.net/?channels=%23baron>`__.
Misc
====
`Old blog post announcing the
project <http://worlddomination.be/blog/2013/the-baron-project-part-1-what-and-why.html>`__.
It is no longer fully up to date.
Changelog
=========
0.6.1 (2015-01-31)
------------------
- fix: string nodes were greedily grouping the string tokens surrounding them
  (for string chains), which created an inconsistency in the way strings were
  grouped in general
- fix: better number parsing, though not everything is fixed yet
0.6 (2014-12-11)
----------------
- FST structure modification: def_argument_tuple is no more and all arguments
  now have a coherent structure:

  * def_argument node name attribute has been renamed to target, like in assign
  * target attribute now points to a dict, not to a string
  * old name -> string are now target -> name_node
  * def_argument_tuple is now a def_argument where target points to a tuple
  * this specific tuple will only have name, comma and tuple members (no more
    def_argument for name)

- new node: long. Previously int and long were merged, but that was causing problems
0.5 (2014-11-10)
----------------
- rename "funcdef" node to "def" node to be way more intuitive.
0.4 (2014-09-29)
----------------
- new rendering type in the nodes_rendering_order dictionary: string. This
  removes an ambiguity where a key could be pointing to a dict or a string, thus
  forcing third-party tools to guess.
0.3.1 (2014-09-04)
------------------
- setup.py wasn't working if wheel wasn't used because the CHANGELOG file
  wasn't included in the MANIFEST.in
0.3 (2014-08-21)
----------------
- path becomes a simple list and is easier to deal with
- bounding box allows you to know the leftmost and rightmost positions
  of a node, see https://baron.readthedocs.org/en/latest/#bounding-box
- redbaron is classified as supporting python3
  https://github.com/Psycojoker/baron/pull/51
- ensure that when a key is a string, its empty value is an empty string and
  not None, to avoid breaking libs that use introspection to guess the type of
  the key
- key renaming in the FST: "delimiteur" -> "delimiter"
- name_as_name and dotted_as_name nodes don't have the "as" key anymore as it
  was useless (it can be deduced from the state of the "target" key)
- dotted_name node doesn't exist anymore, its existence was unjustified. In
  import, from_import and decorator nodes, it has been changed from a key
  pointing to a dict (with only a list inside of it) to a simple list.
- dumps now accepts a strict boolean argument to check the validity of the FST
  on dumping, but this isn't really a public feature and its API will
  probably change in the future
- name_as_name and dotted_as_name empty value for target is now an empty string
  and not None since this is a string type key
- boundingbox now includes the newlines at the end of a node
- all raised exceptions inherit from a common base exception to ease try/except
  constructions
- Position's left and right functions become properties and thus
  attributes
- Position objects can be compared to other Position objects or any
  iterables
- make_position and make_bounding_box functions are deleted in favor of
  always using the corresponding class's constructor
0.2 (2014-06-11)
----------------
- Baron now provides documentation on https://baron.readthedocs.org
- feature: baron now runs in python3 (*but* doesn't implement the full python3
  grammar yet) by Pierre Penninckx https://github.com/ibizaman
- feature: drop the usage of ast.py to find print_function, this allows any
  version of python to parse any other version of python, also by Pierre
  Penninckx
- fix: rare bug where a comment ended up being confused with an indentation level
- 2 new helpers: show_file and show_node, see https://baron.readthedocs.org/en/latest/#show-file
  and https://baron.readthedocs.org/en/latest/#show-node
- new dictionary that provides the information on how to render an FST node:
  nodes_rendering_order, see https://baron.readthedocs.org/en/latest/#rendering-the-fst
- new utilities to find a node, see https://baron.readthedocs.org/en/latest/#locate-a-node
- new generic class that provides templates to work on the FST, see
  https://baron.readthedocs.org/en/latest/#rendering-the-fst
0.1.3 (2014-04-13)
------------------
- the syntactic-sugar notation for sets wasn't handled by the dumper
  (apparently no one uses it in the PyPI top 100)
0.1.2 (2014-04-08)
------------------
- baron.dumps now accepts a single FST node, it was only working with a list of
  FST nodes
- don't add an endl node at the end if not present in the input string
- de-uniformise call_arguments and function_arguments nodes, this was just
  creating more problems than anything else
- fix https://github.com/Psycojoker/redbaron/issues/4
- fix the fact that baron couldn't parse "{1,}" (while "{1}" was working)
0.1.1 (2014-03-23)
------------------
- It appears that I don't know how to write MANIFEST.in correctly
0.1 (2014-03-22)
----------------
- Init