Python module to handle bytecode
bytearound is a module for assembling and disassembling CPython 2.7.11 bytecode. It provides a representation of bytecode that is easier to modify, create, and inspect than CPython's internal representation, along with functions for converting between this representation and CPython code objects.
An example of how to create code:
    from bytearound import ByteAround, Instruction, ops

    ba = ByteAround([
        ops.LOAD_CONST('Hello World!'),
        ops.PRINT_ITEM(),
        ops.PRINT_NEWLINE(),
        ops.LOAD_CONST(None),
        ops.RETURN_VALUE(),
    ])
    exec(ba.to_code())
And a simple modification:
    from bytearound import ByteAround

    def f():
        print 'Hello World!'

    ba = ByteAround.from_code(f.func_code)
    for instr in ba:
        if instr.oparg == 'Hello World!':
            instr.oparg = 'Goodbye World!'
    f.func_code = ba.to_code()
    f()
I designed and wrote bytearound to ensure that co == ByteAround.from_code(co).to_code() always holds; that is, converting a Python code object to the bytearound representation and back should give an identical code object. Ensuring that this invariant holds makes it easier to test the code for correctness. The function debug.check() exists to check this invariant.
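When a round trip fails, a plain == on two code objects only says that they differ, not where. The following is a minimal sketch of a field-by-field comparison that reports the differing attributes and recurses into nested code objects. It is not bytearound's debug.check(); the names CODE_FIELDS, diff_code, and check_roundtrip are hypothetical and exist only for illustration. It uses only the standard library and attribute names present on CPython code objects.

```python
import types

# Attributes compared directly. co_lnotab is left out here because newer
# CPython releases deprecate it; on 2.7 it could be added to this tuple.
CODE_FIELDS = (
    'co_argcount', 'co_nlocals', 'co_stacksize', 'co_flags', 'co_code',
    'co_names', 'co_varnames', 'co_freevars', 'co_cellvars',
    'co_filename', 'co_name', 'co_firstlineno',
)


def diff_code(a, b, ignore=(), path='<code>'):
    """Return a list of 'path.field' strings where code objects a and b differ.

    Hypothetical helper: `ignore` names fields to skip, e.g. to tolerate a
    known difference.
    """
    diffs = []
    for field in CODE_FIELDS:
        if field in ignore:
            continue
        if getattr(a, field) != getattr(b, field):
            diffs.append('%s.%s' % (path, field))

    # co_consts can hold nested code objects (function bodies, lambdas),
    # so compare it element by element and recurse where needed.
    if 'co_consts' not in ignore:
        if len(a.co_consts) != len(b.co_consts):
            diffs.append(path + '.co_consts')
        else:
            for i, (x, y) in enumerate(zip(a.co_consts, b.co_consts)):
                sub = '%s.co_consts[%d]' % (path, i)
                if isinstance(x, types.CodeType) and isinstance(y, types.CodeType):
                    diffs.extend(diff_code(x, y, ignore, sub))
                elif type(x) is not type(y) or x != y:
                    # The type check keeps 1, 1.0 and True distinct.
                    diffs.append(sub)
    return diffs


def check_roundtrip(co, roundtrip, ignore=()):
    """Apply `roundtrip` to `co` and report every field that changed."""
    return diff_code(co, roundtrip(co), ignore)
```

With bytearound installed, the round trip described above could be passed in as `lambda co: ByteAround.from_code(co).to_code()`; an empty list then means the invariant held for that code object.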
Unfortunately, there are a number of quirks in the way that CPython generates code objects that turn out to be hard to replicate. To replicate some of these, I added a pessimize= argument to ByteAround.to_code that attempts to faithfully replicate CPython even when not doing so would be a little more efficient, and I created a custom comparison function that ignores a few other known issues. However, it may not turn out to be possible to remove all minor differences using these approaches. Known issues include:
bytearound has been tested only on Python 2.7.11. Previous releases in the 2.7 series should mostly work, but some changes have been made during the series that impact code objects (e.g. issue 21523).
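Because bytecode details vary between releases, a caller may want to detect an untested interpreter before assembling code. The following is a minimal sketch using only the standard library; TESTED_VERSION and check_interpreter are hypothetical names, not part of bytearound.

```python
import platform
import sys
import warnings

# The only release this README reports testing against.
TESTED_VERSION = (2, 7, 11)


def check_interpreter(version=None, implementation=None):
    """Return True on the tested interpreter; warn and return False otherwise.

    Both arguments default to the running interpreter and can be overridden
    to check some other version.
    """
    if version is None:
        version = sys.version_info[:3]
    version = tuple(version)
    if implementation is None:
        implementation = platform.python_implementation()

    if implementation == 'CPython' and version == TESTED_VERSION:
        return True

    warnings.warn('bytecode handling is untested on %s %s' % (
        implementation, '.'.join(str(part) for part in version)))
    return False
```

Warning rather than raising lets code keep running on nearby 2.7 releases, which should mostly work, while still flagging the mismatch.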