Python cross-version byte-code interpreter

x-python

This is a CPython bytecode interpreter written in Python.

You can use this to:

  • Learn about how the internals of CPython work, since this models them

  • Experiment with additional opcodes, or ways to change the run-time environment

  • Use as a sandboxed environment for trying pieces of execution

  • Have one Python program that runs multiple versions of Python bytecode.

  • Use in a dynamic fuzzer or in concolic execution for analysis

I find the ability to run Python bytecode as far back as 2.4 from Python 3.9 pretty neat. (Support for even more versions could easily be added.)

I also find the sandboxed environment in a debugger interesting. (Note: currently environments are not sandboxed that well, but I am working towards that.)

Since there is a separate execution stack and traceback stack, inside a debugger you can try things out in the middle of a debug session without affecting the real execution. On the other hand, if a sequence of executions works out, it is possible (under certain circumstances) to copy this back into CPython’s execution stack.

Going the other way, I have hooked trepan3k into this interpreter, so you have a pdb/gdb-like debugger that can also step bytecode instructions.

Another use is to experiment with faster ways to support trace callbacks such as those used in a debugger. In particular, I added an instruction to support fast breakpoints and breakpointing on a particular instruction that doesn’t happen to be on a line boundary. I believe this could and should be ported back to CPython, and there would be benefit. (Python 3.8 supports the ability to save additional frame information, which is where the original opcode is stored. It just needs the BRKPT opcode.)

Although this is far in the future, suppose you want to add a race detector? It might be easier to prototype it in Python here. (This assumes the interpreter supports threading well; I suspect it doesn’t.)

Another unexplored avenue implied above is mixing interpretation and direct CPython execution. In fact, because of bugs, this happens right now, but it will be turned into a feature. You may want some functions or classes not to run under a slow interpreter, while others you do want to run under the interpreter.

Examples:

Want to know which instructions get run when you write some simple code? Try this:

$ xpython -vc "x, y = 2, 3; x **= y"
INFO:xpython.vm:L. 1   @  0: LOAD_CONST (2, 3)
INFO:xpython.vm:       @  2: UNPACK_SEQUENCE 2
INFO:xpython.vm:       @  4: STORE_NAME (2) x
INFO:xpython.vm:       @  6: STORE_NAME (3) y
INFO:xpython.vm:L. 1   @  8: LOAD_NAME x
INFO:xpython.vm:       @ 10: LOAD_NAME y
INFO:xpython.vm:       @ 12: INPLACE_POWER (2, 3)
INFO:xpython.vm:       @ 14: STORE_NAME (8) x
INFO:xpython.vm:       @ 16: LOAD_CONST None
INFO:xpython.vm:       @ 18: RETURN_VALUE (None)

Option -c is the same as Python’s flag (program passed in as a string), and -v is also analogous to Python’s flag. Here, it shows the bytecode instructions run.

Note that the dynamic trace above gives a little more than what you’d see from a static disassembler like Python’s dis module. In particular, the STORE_NAME instructions show the value that is stored, e.g. “2” at instruction offset 4 into name x. Similarly, INPLACE_POWER shows the operands, 2 and 3, which is how the value 8 is derived as the operand of the next instruction, STORE_NAME.
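For comparison, here is a minimal sketch of the static side, using only the standard dis module on whatever CPython you happen to be running. The exact opcode names and offsets depend on that Python version (newer releases may not emit INPLACE_POWER, for example), so treat the listing as illustrative rather than as a match for the trace above.

```python
import dis

source = "x, y = 2, 3; x **= y"
code = compile(source, "<string>", "exec")

# Static view: opcode names and symbolic arguments only, no runtime values.
listing = dis.Bytecode(code).dis()
print(listing)

# The values 2, 3 and 8 only exist once the code actually runs.
namespace = {}
exec(code, namespace)
print(namespace["x"], namespace["y"])
```

The static listing names x and y as store targets, but it cannot say that 8 ends up in x; that information only shows up in a dynamic trace.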

Want more, like the execution stack and block stack as well? Add another v:

$ xpython -vvc "x, y = 2, 3; x **= y"

DEBUG:xpython.vm:make_frame: code=<code object <module> at 0x7f8018507db0, file "<string x, y = 2, 3; x **= y>", line 1>, callargs={}, f_globals=(<class 'dict'>, 140188140947488), f_locals=(<class 'NoneType'>, 93856967704000)
DEBUG:xpython.vm:<Frame at 0x7f80184c1e50: '<string x, y = 2, 3; x **= y>':1 @-1>
DEBUG:xpython.vm:  frame.stack: []
DEBUG:xpython.vm:  blocks     : []
INFO:xpython.vm:L. 1   @  0: LOAD_CONST (2, 3) <module> in <string x, y = 2, 3; x **= y>:1
DEBUG:xpython.vm:  frame.stack: [(2, 3)]
DEBUG:xpython.vm:  blocks     : []
INFO:xpython.vm:       @  2: UNPACK_SEQUENCE 2 <module> in <string x, y = 2, 3; x **= y>:1
DEBUG:xpython.vm:  frame.stack: [3, 2]
DEBUG:xpython.vm:  blocks     : []
INFO:xpython.vm:       @  4: STORE_NAME (2) x <module> in <string x, y = 2, 3; x **= y>:1
DEBUG:xpython.vm:  frame.stack: [3]
DEBUG:xpython.vm:  blocks     : []
INFO:xpython.vm:       @  6: STORE_NAME (3) y <module> in <string x, y = 2, 3; x **= y>:1
DEBUG:xpython.vm:  frame.stack: []
DEBUG:xpython.vm:  blocks     : []
INFO:xpython.vm:L. 1   @  8: LOAD_NAME x <module> in <string x, y = 2, 3; x **= y>:1
DEBUG:xpython.vm:  frame.stack: [2]
DEBUG:xpython.vm:  blocks     : []
INFO:xpython.vm:       @ 10: LOAD_NAME y <module> in <string x, y = 2, 3; x **= y>:1
DEBUG:xpython.vm:  frame.stack: [2, 3]
DEBUG:xpython.vm:  blocks     : []
INFO:xpython.vm:       @ 12: INPLACE_POWER (2, 3)  <module> in <string x, y = 2, 3; x **= y>:1
DEBUG:xpython.vm:  frame.stack: [8]
DEBUG:xpython.vm:  blocks     : []
INFO:xpython.vm:       @ 14: STORE_NAME (8) x <module> in <string x, y = 2, 3; x **= y>:1
DEBUG:xpython.vm:  frame.stack: []
DEBUG:xpython.vm:  blocks     : []
INFO:xpython.vm:       @ 16: LOAD_CONST None <module> in <string x, y = 2, 3; x **= y>:1
DEBUG:xpython.vm:  frame.stack: [None]
DEBUG:xpython.vm:  blocks     : []
INFO:xpython.vm:       @ 18: RETURN_VALUE (None)  <module> in <string x, y = 2, 3; x **= y>:1
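To make the frame.stack lines above easier to follow, here is a toy stack machine that replays the same ten instructions. This is only a sketch for illustration: run_toy and its (opname, argument) program format are hypothetical and are not x-python’s implementation, which also handles frames, block stacks, and per-version opcode tables.

```python
def run_toy(program):
    """Run (opname, argument) pairs; return (result, names, stack snapshots)."""
    stack, names, snapshots = [], {}, []
    for opname, arg in program:
        if opname == "LOAD_CONST":
            stack.append(arg)
        elif opname == "UNPACK_SEQUENCE":
            # Push in reverse so the first item ends up on top of the stack.
            stack.extend(reversed(stack.pop()))
        elif opname == "STORE_NAME":
            names[arg] = stack.pop()
        elif opname == "LOAD_NAME":
            stack.append(names[arg])
        elif opname == "INPLACE_POWER":
            right = stack.pop()
            left = stack.pop()
            left **= right
            stack.append(left)
        elif opname == "RETURN_VALUE":
            return stack.pop(), names, snapshots
        else:
            raise ValueError("unsupported opcode: " + opname)
        # Record the stack after each instruction, like the DEBUG lines above.
        snapshots.append(list(stack))
    raise ValueError("program ended without RETURN_VALUE")


program = [
    ("LOAD_CONST", (2, 3)),
    ("UNPACK_SEQUENCE", 2),
    ("STORE_NAME", "x"),
    ("STORE_NAME", "y"),
    ("LOAD_NAME", "x"),
    ("LOAD_NAME", "y"),
    ("INPLACE_POWER", None),
    ("STORE_NAME", "x"),
    ("LOAD_CONST", None),
    ("RETURN_VALUE", None),
]
result, names, snapshots = run_toy(program)
for snapshot in snapshots:
    print("frame.stack:", snapshot)
```

The printed snapshots follow the same sequence as the frame.stack lines in the trace, from [(2, 3)] down to [None].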

Want to see this colorized in a terminal? Use this via trepan-xpy -x.

Suppose you have Python 2.4 bytecode (or some other bytecode) for this, but you are running Python 3.7?

$ xpython -v test/examples/assign-2.4.pyc
INFO:xpython.vm:L. 1   @  0: LOAD_CONST (2, 3)
INFO:xpython.vm:       @  3: UNPACK_SEQUENCE 2
INFO:xpython.vm:       @  6: STORE_NAME (2) x
INFO:xpython.vm:       @  9: STORE_NAME (3) y
INFO:xpython.vm:L. 2   @ 12: LOAD_NAME x
INFO:xpython.vm:       @ 15: LOAD_NAME y
INFO:xpython.vm:       @ 18: INPLACE_POWER (2, 3)
INFO:xpython.vm:       @ 19: STORE_NAME (8) x
INFO:xpython.vm:       @ 22: LOAD_CONST None
INFO:xpython.vm:       @ 25: RETURN_VALUE (None)

Not much has changed here, other than the fact that from Python 3.6 on, instructions are two bytes each instead of one or three bytes.
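You can check the fixed two-byte layout on the Python you are running with a few lines of standard-library code. This is a small sketch assuming CPython 3.6 or later; on an older interpreter the offsets would step by 1 or 3 instead, as in the 2.4 trace above.

```python
import dis

code = compile("x, y = 2, 3; x **= y", "<string>", "exec")
offsets = [instr.offset for instr in dis.get_instructions(code)]

# With "wordcode", every instruction starts on an even byte offset and
# the raw bytecode string has an even length.
is_wordcode = all(offset % 2 == 0 for offset in offsets)
print(offsets, len(code.co_code), is_wordcode)
```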

The above examples show straight-line code, so you see all of the instructions. But don’t confuse this with a disassembler like pydisasm from xdis. The example below, with conditional branching, makes this clearer:

$ xpython -vc "x = 6 if __name__ != '__main__' else 10"
INFO:xpython.vm:L. 1   @  0: LOAD_NAME __name__
INFO:xpython.vm:       @  2: LOAD_CONST __main__
INFO:xpython.vm:       @  4: COMPARE_OP ('__main__', '__main__') !=
INFO:xpython.vm:       @  6: POP_JUMP_IF_FALSE 12
                                               ^^ Note jump below
INFO:xpython.vm:       @ 12: LOAD_CONST 10
INFO:xpython.vm:       @ 14: STORE_NAME (10) x
INFO:xpython.vm:       @ 16: LOAD_CONST None
INFO:xpython.vm:       @ 18: RETURN_VALUE (None)
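To see the other half of the contrast, the sketch below asks the standard dis module for the static instruction list of the same source. It assumes a recent CPython; opcode names differ between versions, so the code only looks at the arguments of LOAD_* instructions.

```python
import dis

source = "x = 6 if __name__ != '__main__' else 10"
code = compile(source, "<string>", "exec")

# Every load the compiler emitted, whether or not it will ever execute.
loaded = [
    instr.argval
    for instr in dis.get_instructions(code)
    if instr.opname.startswith("LOAD_")
]
both_branches_listed = 6 in loaded and 10 in loaded

# Running the code takes only one branch, as in the trace above.
namespace = {"__name__": "__main__"}
exec(code, namespace)
print(both_branches_listed, namespace["x"])
```

A disassembler lists the load of 6 even though, with __name__ set to '__main__', that instruction is never executed; the trace shows only the path actually taken.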

Want even more status and control? See trepan-xpy.

Status:

Currently bytecode from Python versions 3.9 - 3.2, and 2.7 - 2.4 is supported. The most recent versions of Python don’t have all opcodes implemented. This is only one of many interests I have, so support may be shoddy. I use funding to help me direct where my attention goes in fixing problems, which are vast in this project.

Byterun, on which this is based, is awesome. But it cheats in subtle ways.

Want to write a very small interpreter using CPython?

# get code somehow
exec(code)

This cheats in kind of a gross way, but this kind of cheating goes on in Byterun in a more subtle way. Just as the example above relies on the built-in function exec to do all of the work, Byterun relies on various similar built-in functions to support opcode interpretation. In fact, if the code you were interpreting was the above, Byterun would use the built-in exec function to run the code inside the exec call, so none of the bytecode that gets run inside code would be seen for interpretation.

Also, built-in functions like exec, and other built-in modules have an effect in the interpreter namespace. So the two namespaces then get intermingled.

One example of this that has been noted is for import. See https://github.com/nedbat/byterun/issues/26. But there are other cases as well. While we haven’t addressed the import issue mentioned in issue 26, we have addressed similar kinds of issues like this.

Some built-in functions and the inspect module require built-in types like cell, traceback, or frame objects, and they can’t use the corresponding interpreter classes. Here is an example of this in Byterun: class __init__ functions don’t get traced into, because the built-in function __build_class__ is relied on. And __build_class__ needs a native function, not an interpreter-traceable function. See https://github.com/nedbat/byterun/pull/20.
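The restriction on __build_class__ is easy to observe directly. In the sketch below, InterpretedFunction is a hypothetical stand-in for an interpreter-level function object (it is not a class from Byterun or x-python); on CPython, the built-in refuses it because it is not a native function object.

```python
import builtins


class InterpretedFunction:
    """Hypothetical callable standing in for an interpreter's own function type."""

    def __call__(self):
        return None


try:
    # CPython checks that the class body is a real function object.
    builtins.__build_class__(InterpretedFunction(), "Demo")
    rejected = False
except TypeError as exc:
    rejected = True
    print(exc)
```

This is why an interpreter that wants to trace class bodies cannot simply hand its own function objects to the built-in.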

Also, Byterun is loose in accepting bytecode opcodes that are invalid for a particular Python version but may be valid for another. I suppose this is okay since you don’t expect invalid opcodes to appear in valid bytecode. They can, however, accidentally or erroneously appear in code that has been obtained via some sort of extraction process, when the extraction process isn’t accurate.

In contrast to Byterun, x-python is more stringent about what opcodes it accepts.

Byterun needs the kind of overhaul we have here to be able to scale to support bytecode for more Pythons, and to be able to run bytecode across different versions of Python. Specifically, you can’t rely on Python’s dis module if you expect to run a bytecode other than the bytecode that the interpreter is running, or run newer “wordcode” bytecode on a “byte”-oriented bytecode interpreter, or vice versa.

In contrast, in x-python there is a clear distinction between the version being interpreted and the version of Python that is running. There is tighter control of opcodes, and an opcode’s implementation is kept for each Python version. So we’ll warn early when something is invalid. You can run bytecode back to Python 2.4 using Python 3.9 (largely), which is amazing given that 3.9’s native bytecode is 2 bytes per instruction while 2.4’s is 1 or 3 bytes per instruction.
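Keeping the two versions apart starts with knowing which Python produced a given bytecode file. As a rough illustration using only the standard library, a .pyc file begins with a 4-byte magic number that identifies the compiler version. The file names below are made up for the example, and this is not how x-python itself loads files.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    source_path = os.path.join(workdir, "assign.py")
    pyc_path = os.path.join(workdir, "assign.pyc")
    with open(source_path, "w") as handle:
        handle.write("x, y = 2, 3\nx **= y\n")
    # Compile with the running interpreter, then read back the header.
    py_compile.compile(source_path, cfile=pyc_path, doraise=True)
    with open(pyc_path, "rb") as handle:
        magic = handle.read(4)

# True here, because the file was just written by this same interpreter.
same_version = magic == importlib.util.MAGIC_NUMBER
print(magic, same_version)
```

For a file such as test/examples/assign-2.4.pyc the comparison would be false, which is the cue that the running interpreter’s own opcode tables cannot be used to decode it.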

The “largely” part is, as mentioned above, because the interpreter has always made use of Python builtins and libraries, and for the most part these haven’t changed very much. Often, since many of the underlying builtins are the same, the interpreter can (and does) make use of interpreter internals. For example, built-in functions like range() are supported this way.

So interpreting bytecode from a newer Python release than the one the running Python interpreter uses is often doable too. Even though Python 2.7 doesn’t support keyword-only arguments or format strings, it can still interpret bytecode created from using these constructs.

That’s possible here because these specific features are more syntactic sugar than extensions to the runtime. For example, format strings basically map down to using the format() function, which is available on 2.7.
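A small sketch of that equivalence, run on a Python that has format strings: the f-string and the explicit format() call produce the same text. The opcodes the compiler emits for the f-string vary by Python version, so this shows only the behavior an interpreter needs to reproduce.

```python
value = 3.14159

# The format spec after the colon is what gets passed to format().
via_fstring = f"{value:>10.2f}"
via_format = format(value, ">10.2f")

print(repr(via_fstring), via_fstring == via_format)
```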

But new features like asynchronous I/O and concurrency primitives are not in the older versions. So those need to be simulated, and that too is a possibility if there is interest or support.

You can run many of the tests that Python uses to test itself, and I do! And most of those work. Right now this program works best on Python up to 3.4, when life in Python was much simpler. It runs over 300 of the tests in Python’s own test suite without problems. For Python 3.6 the number drops down to about 237; Python 3.9 is worse still.

History

This is a fork of Byterun, which is a pure-Python implementation of a Python bytecode execution virtual machine. Ned Batchelder started it (based on work from Paul Swartz) to get a better understanding of bytecodes so he could fix branch coverage bugs in coverage.py.
