
A general-purpose JIT for CPython.


DIO-JIT: General-purpose Python JIT


Important:

  1. DIO-JIT now works for Python >= 3.8. We heavily rely on the LOAD_METHOD bytecode instruction.
  2. DIO-JIT is not production-ready. A large number of specialisation rules are still required to make DIO-JIT batteries-included.
  3. This document is mainly for prospective developers. Users do not need to write any specialisation rules; the only APIs they need to learn are @jit.jit and jit.jit_spec_call.
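
To see the instruction that point 1 refers to, the standard dis module can list the opcodes CPython emits for a method call. This is only an illustrative sketch: the function name append_once is made up for the example, and the exact opcode depends on the interpreter version (CPython 3.8 to 3.11 are expected to emit LOAD_METHOD, while CPython 3.12 and later fold it into LOAD_ATTR).

```python
import dis


def append_once(xs, x):
    # A plain method call; this is the kind of call site that
    # compiles to LOAD_METHOD on CPython 3.8 to 3.11.
    xs.append(x)


# Collect the opcode names the compiler emitted for the function body.
opnames = [ins.opname for ins in dis.get_instructions(append_once)]
print(opnames)
```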

Benchmark

Item             PY38     JIT PY38   PY39     JIT PY39
BF               265.74   134.23     244.50   140.34
append3          23.94    10.70      22.29    11.21
DNA READ         16.96    14.82      15.03    14.38
fib(15)          11.63    1.54       10.41    1.51
hypot(str, str)  6.19     3.87       6.53     4.29
selectsort       46.95    33.88      38.71    29.49
trans            24.22    7.79       23.23    7.71

The benchmark item "DNA READ" does not show a significant performance gain. This is because "DNA READ" heavily uses bytearray and bytes, whose specialised C-APIs are not exposed. In this case, although the JIT can infer the types, we have to fall back to CPython's default behaviour, and may even do worse: after all, the interpreter can access internals that we cannot.

Install Instructions

Step 1: Install Julia as an in-process native code compiler for DIO-JIT

There are several options for you to install Julia:

  • scoop (Windows)

  • julialang.org (recommended for Windows users)

  • jill.py:

    pip install jill && jill install 1.6 --upstream Official

  • jill (Mac and Linux only!):

    bash -ci "$(curl -fsSL https://raw.githubusercontent.com/abelsiqueira/jill/master/jill.sh)"

Step 2: Install DIO.jl in Julia

Type julia and open the REPL, then

julia>
# press ]
pkg> add https://github.com/thautwarm/DIO.jl
# press backspace
julia> using DIO # precompile

Step 3: Install Python Package

pip install git+https://github.com/thautwarm/diojit

How to fetch the latest DIO-JIT (if you have already installed DIO)?

pip install -U diojit
julia -e "using Pkg; Pkg.update(string(:DIO));using DIO"

Usage from the Python side is quite similar to Numba's.

import diojit
from math import sqrt
@diojit.jit(fixed_references=["sqrt", "str", "int", "isinstance"])
def hypot(x, y):
    if isinstance(x, str):
        x = int(x)

    if isinstance(y, str):
        y = int(y)

    return sqrt(x ** 2 + y ** 2)

specialized_hypot = diojit.jit_spec_call(hypot, diojit.oftype(int), diojit.oftype(int))
specialized_hypot(1, 2) # 30% faster than CPython

DIO-JIT is a method JIT driven by abstract interpretation and call-site specialisation. Abstract interpretation is done by the compiler (jit.absint.abs). You can register new specialisation rules, and find examples of existing ones, in jit.absint.prescr.
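
As a rough illustration of what call-site specialisation means, the following pure-Python toy picks a specialised implementation per tuple of argument types and falls back to the generic function otherwise. It is a sketch under stated assumptions, not how DIO-JIT is implemented: ToySpecialiser, hypot_generic and hypot_int_int are hypothetical names, and the "specialised" version is written by hand rather than produced by abstract interpretation.

```python
from math import sqrt


class ToySpecialiser:
    """Hypothetical helper: caches one callable per tuple of argument types."""

    def __init__(self, generic, rules):
        self.generic = generic  # fallback used when no rule matches
        self.rules = rules      # maps a tuple of types to a specialised callable
        self.cache = {}

    def spec_call(self, *arg_types):
        # Resolve the implementation once per distinct type signature.
        if arg_types not in self.cache:
            self.cache[arg_types] = self.rules.get(arg_types, self.generic)
        return self.cache[arg_types]


def hypot_generic(x, y):
    # Generic version: must check the argument types on every call.
    if isinstance(x, str):
        x = int(x)
    if isinstance(y, str):
        y = int(y)
    return sqrt(x ** 2 + y ** 2)


def hypot_int_int(x, y):
    # Hand-written stand-in for a specialised version: the isinstance
    # branches are gone because both arguments are known to be int.
    return sqrt(x * x + y * y)


toy = ToySpecialiser(hypot_generic, {(int, int): hypot_int_int})
fast = toy.spec_call(int, int)      # a rule exists: specialised version
fallback = toy.spec_call(str, str)  # no rule: generic version

print(fast(3, 4), fallback("3", "4"))
```

The point of the sketch is only the shape of the idea: type information known at the call site removes work that the generic code has to repeat on every call.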

We're able to optimise anything!

Contribution Example: Add a specialisation rule for list.append

  1. Python Side:
import diojit as jit
import timeit
jit.create_shape(list, oop=True)
@jit.register(list, attr="append")
def list_append_analysis(self: jit.Judge, *args: jit.AbsVal):
    if len(args) != 2:
        # rollback to CPython's default code
        return NotImplemented
    lst, elt = args

    return jit.CallSpec(
        instance=None,  # return value is not static
        e_call=jit.S(jit.intrinsic("PyList_Append"))(lst, elt),
        possibly_return_types=tuple({jit.S(type(None))}),
    )
  2. Julia Side:

Step 2 can also be done from the Python side (for users other than DIO-JIT developers):

import diojit as jit
# import from the real package name: "jit" is only a local alias here
from diojit.runtime.julia_rt import jl_eval
jl_implemented_intrinsic = b"""
function PyList_Append(lst::Ptr, elt::PyPtr)
    if ccall(PyAPI.PyList_Append, Cint, (PyPtr, PyPtr), lst, elt) == -1
        return Py_NULL
    end
    nothing # automatically maps to a Python None
end
# register Py_NULL as the value that signals an exception for this intrinsic
DIO.DIO_ExceptCode(::typeof(PyList_Append)) = Py_NULL
"""
jl_eval(jl_implemented_intrinsic)

You immediately get a speed-up of roughly 100%:

@jit.jit
def append3(xs, x):
    xs.append(x)
    xs.append(x)
    xs.append(x)

jit_append3 = jit.jit_spec_call(append3, jit.oftype(list), jit.Top) # 'Top' means 'Any'
xs = [1]
jit_append3(xs, 3)

print("test jit_append3, [1] append 3 for 3 times:", xs)
# test jit_append3, [1] append 3 for 3 times: [1, 3, 3, 3]

xs = []
%timeit append3(xs, 1)
# 293 ns ± 26.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

xs = []
%timeit jit_append3(xs, 1)
# 142 ns ± 14.9 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Why Julia?

We don't want to maintain a C compiler, and calling gcc or others will introduce cross-process IO, which is slow. We prefer compiling JITed code with LLVM, and Julia is quite a killer tool for this use case.

Current Limitations

  1. Support for *varargs and **kwargs is not ready: we could support them immediately, but only with a very small JIT performance gain, so considering backward compatibility we have decided not to do so yet.

  2. Exception handling is not yet supported inside JIT functions.

    Why?

    We haven't implemented the translation from exception handling bytecode to untyped DIO IR (jit.absint.abs.In_Stmt).

    Will support?

    Yes.

    In fact, a call site in any JIT function can already raise an exception. The JIT function does not handle it; instead, the exception is propagated up to the root call, which is a pure Python call.

    Exception handling will be supported once we put effort into translating CPython's exception-handling bytecode into untyped DIO IR (jit.absint.abs.In_Stmt).

    P.S.: This will be finished together with support for for loops.

  3. Support for for loops is missing.

    Why?

    Firstly, in CPython, a for loop relies on exception handling, which is not supported yet.

    Secondly, we're considering a fast path for for loops, possibly by proposing a __citer__ protocol for faster iteration in JIT functions, which requires communication with the Python developers.

    Will support?

    Yes.

    This will be finished together with support for exception handling (a faster for loop might come later).
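
The dependence of for loops on exception handling can be seen by writing a loop out by hand. The sketch below is only the Python-level equivalent (the interpreter handles the end of iteration internally); the function names are made up for the example.

```python
def sum_with_for(items):
    total = 0
    for item in items:
        total += item
    return total


def sum_desugared(items):
    # Roughly what the for loop above means: iteration ends when
    # next() raises StopIteration, so exception handling is involved.
    total = 0
    iterator = iter(items)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break
        total += item
    return total


print(sum_with_for([1, 2, 3]), sum_desugared([1, 2, 3]))
```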

  4. Closure support is missing.

    Why?

    In imperative languages, closures use cell structures to achieve mutable free/cell variables.

    However, writable cells are hard to optimise in a dynamic language.

    We recommend using types.MethodType to create immutable closures, which can be highly optimised by DIO-JIT (in the near future).

    import types
    def f(freevars, z):
        x, y = freevars
        return x + y + z

    def hof(x, y):
        return types.MethodType(f, (x, y))

    Will support?

    Still yes. However, don't expect much about the performance gain for Python's vanilla closures.
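
To make the contrast concrete, here is a small sketch comparing a vanilla closure, whose cell can be rewritten after creation, with the types.MethodType pattern, where the captured values live in an immutable tuple. The names make_counter and add3 are illustrative only.

```python
import types


def make_counter():
    count = 0

    def bump():
        nonlocal count  # writes to the shared, mutable cell
        count += 1
        return count

    return bump


def add3(freevars, z):
    x, y = freevars
    return x + y + z


def hof(x, y):
    # Bind an immutable tuple as the first argument instead of using cells.
    return types.MethodType(add3, (x, y))


bump = make_counter()
bump()
bump()
# The cell content changed after the closure was created.
cell_value = bump.__closure__[0].cell_contents

bound = hof(1, 2)
result = bound(3)          # calls add3((1, 2), 3)
captured = bound.__self__  # the captured tuple, fixed at creation time

print(cell_value, result, captured)
```

Because the captured tuple cannot change, a compiler can treat its contents as constants for a given bound method, which is not safe for a writable cell.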

  5. Is specifying fixed global references (@diojit.jit(fixed_references=['isinstance', 'str', ...])) too annoying?

    Sorry, you have to. We are thinking about an automatic JIT that covers all existing CPython code, but the biggest impediment is volatile global variables.

    Possibility?

    Recently we found that CPython's newly (:)) added feature Dict.ma_version_tag might be used to automatically notify JITed functions to re-compile when the global references change.

    More research is required.
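
The problem with volatile globals can be sketched with a simple guard. Note that this toy does not use ma_version_tag (which is not accessible from pure Python); it swaps in a per-call identity check instead, and every name in it (namespace, with_guard, parse_fast, parse_generic) is hypothetical.

```python
# A plain dict standing in for a module's global namespace.
namespace = {"convert": int}


def parse_generic(text):
    # Looks the reference up on every call, so it always sees rebinding.
    return namespace["convert"](text)


def parse_fast(text):
    # Stand-in for compiled code that assumed "convert" is int.
    return int(text)


def with_guard(fast, generic, ns, name):
    expected = ns[name]  # the reference assumed at "compile" time

    def call(*args):
        # Use the fast path only while the assumption still holds.
        if ns.get(name) is expected:
            return fast(*args)
        return generic(*args)

    return call


parse = with_guard(parse_fast, parse_generic, namespace, "convert")
before = parse("7")           # assumption holds: fast path
namespace["convert"] = float  # the "global" is rebound
after = parse("7")            # assumption broken: generic path

print(before, after)
```

Declaring fixed_references removes the need for such a check on every call, which is why it is currently required.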

Contributions

  1. Add more prescribed specialisation rules at jit.absint.prescr.
  2. TODO

Benchmarks

Check the benchmarks directory.
