Pickora 🐰

A small compiler that can convert Python scripts to pickle bytecode.

Requirements

  • Python 3.8+

No third-party modules are required.

Quick Start

Installation

Using pip:

$ pip install pickora

From source:

$ git clone https://github.com/splitline/Pickora.git
$ cd Pickora
$ python setup.py install

Basic Usage

Compile from a string:

$ pickora -c 'from builtins import print; print("Hello, world!")' -o output.pkl
$ python -m pickle output.pkl # load the pickle bytecode
Hello, world!
None

Compile from a file:

$ echo 'from builtins import print; print("Hello, world!")' > hello.py
$ pickora hello.py # output compiled pickle bytecode to stdout directly
b'\x80\x04\x95(\x00\x00\x00\x00\x00\x00\x00\x8c\x08builtins\x8c\x05print\x93\x94\x94h\x01\x8c\rHello, world!\x85R.'
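
The byte string printed above can also be loaded from Python with the standard pickle module. This is a minimal sketch: `payload` is the output copied from the example, and `captured` and `result` are illustrative names. Loading a pickle executes it, so only load bytecode you trust.

```python
import contextlib
import io
import pickle

# Byte string copied from the `pickora hello.py` output above.
payload = (
    b'\x80\x04\x95(\x00\x00\x00\x00\x00\x00\x00\x8c\x08builtins\x8c\x05print'
    b'\x93\x94\x94h\x01\x8c\rHello, world!\x85R.'
)

# Loading the pickle calls print("Hello, world!"); capture stdout to inspect it.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    result = pickle.loads(payload)

print(captured.getvalue(), end="")  # text written by the pickled print() call
print(result)                       # print() returns None, so the pickle's value is None
```

This mirrors the `python -m pickle output.pkl` session: the first line comes from the pickled call, the second is the unpickled value.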

Usage

usage: pickora [-h] [-c CODE] [-p PROTOCOL] [-e] [-O] [-o OUTPUT] [-d] [-r]
               [-f {repr,raw,hex,base64,none}]
               [source]

A toy compiler that can convert Python scripts into pickle bytecode.

positional arguments:
  source                source code file

optional arguments:
  -h, --help            show this help message and exit
  -c CODE, --code CODE  source code string
  -p PROTOCOL, --protocol PROTOCOL
                        pickle protocol
  -e, --extended        enable extended syntax (trigger find_class)
  -O, --optimize        optimize pickle bytecode (with pickletools.optimize)
  -o OUTPUT, --output OUTPUT
                        output file
  -d, --disassemble     disassemble pickle bytecode
  -r, --run             run (load) pickle bytecode immediately
  -f {repr,raw,hex,base64,none}, --format {repr,raw,hex,base64,none}
                        output format, none means no output

Basic usage: `pickora samples/hello.py` or `pickora --code 'print("Hello, world!")' --extended`
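
As a rough illustration of what -O does, the sketch below applies pickletools.optimize (the function named in the help text) to the "Hello, world!" bytecode from Quick Start and lists its opcodes with pickletools.genops. How -d formats its output is not shown here; the opcode listing is only an assumption about what a disassembly would contain.

```python
import pickletools

# Bytecode copied from the Quick Start example.
payload = (
    b'\x80\x04\x95(\x00\x00\x00\x00\x00\x00\x00\x8c\x08builtins\x8c\x05print'
    b'\x93\x94\x94h\x01\x8c\rHello, world!\x85R.'
)

# genops only walks the opcode stream; it does not execute anything.
names = [opcode.name for opcode, arg, pos in pickletools.genops(payload)]
print(names)

# optimize() drops memo entries that are never read back.
optimized = pickletools.optimize(payload)
print(len(payload), len(optimized))
```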

Supported Syntax

Basic Syntax (achieved using only pickle opcodes)

  • Basic types: int, float, bytes, string, dict, list, set, tuple, bool, None
  • Assignment: val = dict_['x'] = obj.attr = 'meow'
  • Augmented assignment: x += 1
  • Named assignment: (x := 1337)
  • Unpacking: a, b, c = 1, 2, 3
  • Function call: f(arg1, arg2)
    • Keyword arguments are not supported.
  • Import
    • from module import things (directly using the STACK_GLOBAL opcode)
  • Macros (see below for more details)
    • STACK_GLOBAL
    • GLOBAL
    • INST
    • OBJ
    • NEWOBJ
    • NEWOBJ_EX
    • BUILD

Extended Syntax (enabled by -e / --extended option)

Note: All extended syntax is implemented by importing other built-in modules, so enabling this option will trigger find_class when the pickle bytecode is loaded.

  • Attributes: obj.attr (using builtins.getattr only when you need to "load" an attribute)
  • Operators (using operator module)
    • Binary operators: +, -, *, / etc.
    • Unary operators: not, ~, +val, -val
    • Compare: 0 < 3 > 2 == 2 > 1 (using builtins.all for chained comparisons)
    • Subscript: list_[1:3], dict_['key'] (using builtins.slice for slice)
    • Boolean operators (using builtins.next, builtins.filter)
      • and: using operator.not_
      • or: using operator.truth
      • (a or b or c) -> next(filter(truth, (a, b, c)), c)
      • (a and b and c) -> next(filter(not_, (a, b, c)), c)
  • Import
    • import module (using importlib.import_module)
  • Lambda
    • lambda x,y=1: x+y
    • Using types.CodeType and types.FunctionType
    • [Known bug] If any global variables are changed after the lambda definition, the lambda function won't see those changes.
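
The boolean operator rewrites listed above can be tried in plain Python. This is a sketch of the equivalence only, not Pickora's actual code generation; `or_chain` and `and_chain` are hypothetical helper names. Note that every operand is evaluated before filtering, so there is no short-circuiting.

```python
from operator import not_, truth


def or_chain(*values):
    """(a or b or c) -> next(filter(truth, (a, b, c)), c)"""
    # First truthy operand, or the last operand if none is truthy.
    return next(filter(truth, values), values[-1])


def and_chain(*values):
    """(a and b and c) -> next(filter(not_, (a, b, c)), c)"""
    # First falsy operand, or the last operand if all are truthy.
    return next(filter(not_, values), values[-1])


print(or_chain(0, "", "fallback"))  # fallback
print(and_chain(1, 0, 2))           # 0
```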

Macros

Four macros are documented here: STACK_GLOBAL, GLOBAL, INST and BUILD. OBJ, NEWOBJ and NEWOBJ_EX are listed under Supported Syntax but are not described in this section.

STACK_GLOBAL(modname: Any, name: Any)

Example:

function_name = input("> ") # > system
func = STACK_GLOBAL('os', function_name) # <built-in function system>
func("date") # Tue Jan 13 33:33:37 UTC 2077

Behaviour:

  1. PUSH modname
  2. PUSH name
  3. STACK_GLOBAL
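
The three steps can be reproduced by hand with the opcode constants from the standard pickle module. This is a sketch of the resulting bytecode, not Pickora's compiler output; `short_str` is a hypothetical helper, and math.sqrt stands in for os.system to keep the example harmless.

```python
import math
import pickle


def short_str(text):
    """Encode text as a SHORT_BINUNICODE push (UTF-8 length must be below 256)."""
    data = text.encode("utf-8")
    return pickle.SHORT_BINUNICODE + bytes([len(data)]) + data


payload = (
    pickle.PROTO + b"\x04"   # STACK_GLOBAL requires protocol 4
    + short_str("math")      # 1. PUSH modname
    + short_str("sqrt")      # 2. PUSH name
    + pickle.STACK_GLOBAL    # 3. STACK_GLOBAL
    + pickle.STOP
)

func = pickle.loads(payload)
print(func is math.sqrt)  # True
print(func(16.0))         # 4.0
```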

GLOBAL(modname: str, name: str)

Example:

func = GLOBAL("os", "system") # <built-in function system>
func("date") # Tue Jan 13 33:33:37 UTC 2077

Behaviour:

Simply write this piece of bytecode: f"c{modname}\n{name}\n"
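
A minimal sketch of that bytecode fragment, terminated with STOP so it can be loaded on its own. math.sqrt is used here in place of os.system to keep the example harmless.

```python
import math
import pickle

modname, name = "math", "sqrt"

# The GLOBAL opcode is the letter "c" followed by two newline-terminated names.
payload = f"c{modname}\n{name}\n".encode("ascii") + pickle.STOP

func = pickle.loads(payload)
print(func is math.sqrt)  # True
```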

INST(modname: str, name: str, args: tuple[Any])

Example:

command = input("cmd> ") # cmd> date
INST("os", "system", (command,)) # Tue Jan 13 33:33:37 UTC 2077

Behaviour:

  1. PUSH a MARK
  2. PUSH args in order
  3. Run this piece of bytecode: f'i{modname}\n{name}\n'
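
The same steps written out by hand, as a sketch rather than Pickora's actual output. It calls math.sqrt(16) instead of os.system, and pushes the argument with the text-mode INT opcode; how Pickora itself encodes arguments is not specified here.

```python
import pickle

modname, name = "math", "sqrt"

payload = (
    pickle.MARK                                   # 1. PUSH a MARK
    + pickle.INT + b"16\n"                        # 2. PUSH args in order (just one here)
    + f"i{modname}\n{name}\n".encode("ascii")     # 3. INST: call modname.name(*args)
    + pickle.STOP
)

result = pickle.loads(payload)
print(result)  # 4.0
```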

BUILD(inst: Any, state: Any, slotstate: Any)

state is for inst.__setstate__(state) and slotstate is for setting attributes.

Example:

from collections import _collections_abc
BUILD(_collections_abc, None, {'__all__': ['ChainMap', 'Counter', 'OrderedDict']})

Behaviour:

  1. PUSH inst
  2. PUSH the tuple (state, slotstate)
  3. BUILD
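
To see the effect of BUILD without touching a real module, the sketch below hand-assembles the three steps for a throwaway class. `Box`, `DemoUnpickler` and the module name "demo" are hypothetical; the custom find_class exists only so the bytecode can refer to a class defined in this snippet. Because `Box` has no __setstate__, the unpickler falls back to updating `__dict__` from state and calling setattr for each slotstate entry.

```python
import io
import pickle


class Box:
    """Plain class without __setstate__."""


class DemoUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only the demo class; refuse everything else.
        if (module, name) == ("demo", "Box"):
            return Box
        raise pickle.UnpicklingError(f"{module}.{name} is not allowed")


def text_dict(key, value):
    """Bytecode that pushes the one-item dict {key: value} (str key, int value)."""
    return (
        pickle.EMPTY_DICT
        + pickle.UNICODE + key.encode("ascii") + b"\n"
        + pickle.INT + str(value).encode("ascii") + b"\n"
        + pickle.SETITEM
    )


payload = (
    pickle.GLOBAL + b"demo\nBox\n"           # push the class
    + pickle.EMPTY_TUPLE + pickle.REDUCE     # 1. PUSH inst (calls Box())
    + text_dict("a", 1)                      #    state     = {'a': 1}
    + text_dict("b", 2)                      #    slotstate = {'b': 2}
    + pickle.TUPLE2                          # 2. PUSH the tuple (state, slotstate)
    + pickle.BUILD                           # 3. BUILD
    + pickle.STOP
)

box = DemoUnpickler(io.BytesIO(payload)).load()
print(box.a, box.b)  # 1 2
```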

FAQ

What is pickle?

RTFM.

Why?

It's cool.

Is it useful?

No, not at all, it's definitely useless.

So, is this garbage?

Yep, it's cool garbage.

Will it support syntax like if / while / for?

No. All pickle can do is define variables and call functions, so this kind of syntax cannot be expressed.

But if you want to do things like:

ans = input("Yes/No: ")
if ans == 'Yes':
  print("Great!")
elif ans == 'No':
  exit()

It's still achievable! You can rewrite your code like this:

from functools import partial
condition = {'Yes': partial(print, 'Great!'), 'No': exit}
ans = input("Yes/No: ")
condition.get(ans, repr)()

ta-da!

For loops, you can try map / starmap / reduce, etc.

And yes, you are right, it's functional programming time!
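
As a small illustration of the loop rewrite, here is a plain Python sketch that replaces two for loops with starmap and reduce. It uses only imports, assignments and positional function calls, but it has not been checked against Pickora itself; the variable names are illustrative.

```python
from functools import reduce
from itertools import starmap
from operator import add, mul

numbers = (1, 2, 3)

# Instead of: for x in numbers: squares.append(x * x)
squares = list(starmap(mul, zip(numbers, numbers)))

# Instead of: total = 0; for x in squares: total += x
total = reduce(add, squares, 0)

print(squares)  # [1, 4, 9]
print(total)    # 14
```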
