
p2j: Convert Python scripts to Jupyter notebook with minimal intervention


p2j - Python to Jupyter Notebook Parser

Convert your Python source code to a Jupyter notebook with zero intervention.

Here is an example.

# Evaluate the model
model.evaluate()

# Run the model for a while.
# Then we hide the model.
run()
hide()

print(type(data))

# This is considered as a paragraph too
# It has 2 lines of comments

# The data that we are interested in is made of 8x8 images of digits.
# Let's have a look at the first 4 images, which is of course
# stored in the `images` attribute of the dataset.  
images = list(zip(mnist.images))

which translates to the following:

[Image: example]

Here's another example of a Python script and the Jupyter notebook it converts to.

The purpose of this package is to let you run code in a Jupyter notebook without copying each paragraph of the code into its own cell. It's also useful if you want to run your code in Google Colab. The parser isn't perfect, but you should be pleased with what you get.


Installation

PyPI

pip install p2j

Python's setup.py

python setup.py install

or

pip install git+https://github.com/remykarem/python2jupyter#egg=p2j

Converting

There are 3 main ways you can get your Jupyter notebook:

Converting a Python script

p2j train.py

and you will get a train.ipynb Jupyter notebook.

Converting a script from the Internet (you need to have curl)

Specify the target filename with a -t.

p2j <(curl https://raw.githubusercontent.com/keras-team/keras/master/examples/mnist_cnn.py) -t myfile.ipynb

Converting an in-line Python script

p2j <(printf '# boilerplate code\nimport os\n') -t myfile2.ipynb

Note:

To run examples from this repository, first clone this repo

git clone https://github.com/raibosome/python2jupyter.git

and after you cd into the project, run

p2j examples/example.py

The p2j/examples/example.py is a Keras tutorial on building an autoencoder for the MNIST dataset, found here.

Command line usage

To see the command line usage, run p2j -h and you will get something like this:

usage: p2j [-h] [-r] [-t TARGET_FILENAME] [-o] source_filename

Convert a Python script to Jupyter notebook

positional arguments:
  source_filename       Python script to parse

optional arguments:
  -h, --help            show this help message and exit
  -r, --reverse         To convert Jupyter to Python script
  -t TARGET_FILENAME, --target_filename TARGET_FILENAME
                        Target filename of Jupyter notebook. If not specified,
                        it will use the filename of the Python script and
                        append .ipynb
  -o, --overwrite       Flag whether to overwrite existing target file.
                        Defaults to false
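The help text above could be produced by an argparse declaration along the following lines. This is a sketch reconstructed from the usage message, not p2j's actual source; the function names build_parser and default_target are hypothetical, and the default target name assumes the extension is swapped as in the train.py to train.ipynb example.

```python
import argparse
import os


def build_parser():
    """Declare a command line interface matching the usage message."""
    parser = argparse.ArgumentParser(
        prog="p2j", description="Convert a Python script to Jupyter notebook")
    parser.add_argument("source_filename", help="Python script to parse")
    parser.add_argument("-r", "--reverse", action="store_true",
                        help="To convert Jupyter to Python script")
    parser.add_argument("-t", "--target_filename",
                        help="Target filename of Jupyter notebook")
    parser.add_argument("-o", "--overwrite", action="store_true",
                        help="Flag whether to overwrite existing target file")
    return parser


def default_target(source_filename):
    """Assumed default: replace the script's extension with .ipynb."""
    return os.path.splitext(source_filename)[0] + ".ipynb"


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(["train.py", "-o"])
target = args.target_filename or default_target(args.source_filename)
```

With these arguments, target falls back to the default name because no -t was given.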

Requirements

  • Python >= 3.6

No third party libraries are used.

Tests

Tested on macOS 10.14.3 with Python 3.6.

Code format

There is no specific format that you must follow, but the parser generally assumes that your code is organised into paragraphs. Well-documented scripts written this way also make good test inputs.

How it works

Jupyter notebooks are just JSON files, like the one below. The Python script is read line by line, and a dictionary of key-value pairs is built along the way using a set of rules. Finally, this dictionary is dumped as a JSON file with the file extension .ipynb.

{
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# Import standard functions"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "import os"
            ]
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 2
}
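A notebook like the one above can be assembled as a plain dictionary and serialised with the standard json module. The following is a minimal sketch, not p2j's actual implementation; make_cell and make_notebook are hypothetical helper names. It assumes the nbformat 4 convention that only code cells carry execution_count and outputs.

```python
import json


def make_cell(cell_type, lines):
    """Build one notebook cell from a list of source lines."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": lines}
    if cell_type == "code":
        # Code cells additionally carry an execution counter and outputs.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


def make_notebook(cells):
    """Wrap the cells in the top-level notebook structure."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 2}


notebook = make_notebook([
    make_cell("markdown", ["# Import standard functions"]),
    make_cell("code", ["import os"]),
])

# Serialise to a JSON string; None becomes null in the output.
notebook_json = json.dumps(notebook, indent=4)
```

Writing notebook_json to a file whose name ends in .ipynb gives a file that Jupyter can open.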

There are 4 basic rules (and exceptions) that I follow to parse the Python script.

1. Code or comment

Firstly, any line that starts with a # is marked as a comment and becomes part of a markdown cell in the Jupyter notebook. Any line that does not start with this character is considered code and goes into a code cell. There are, of course, exceptions.

This is a comment

# Train for 4 epochs

and this is code

model.train(4)
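This first rule can be sketched in a few lines. The function name classify is hypothetical and not taken from p2j's source, and the exceptions are ignored here.

```python
def classify(line):
    """Return 'markdown' for a comment line and 'code' for anything else."""
    return "markdown" if line.startswith("#") else "code"


comment_kind = classify("# Train for 4 epochs")
code_kind = classify("model.train(4)")
```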

2. Blocks of code and comment

Secondly, code and comments can occur in blocks. A block of comments is several consecutive lines that start with #. Similarly, several consecutive lines of code that do not start with # are considered a block of code. This rule is important because we want to ensure that a block of code or comments stays in one cell.

This is a block of comment

# Load the model and
# train for 4 epochs and
# lastly we save the model

and this is a block of code

model.load()
model.train(4)
model.save()
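One way to express this grouping is with itertools.groupby, which collects consecutive items that share a key. This is a sketch under the assumption that a line's kind is decided only by its first character; group_blocks is a hypothetical name, not part of p2j.

```python
from itertools import groupby


def group_blocks(lines):
    """Group consecutive lines of the same kind into (kind, lines) blocks."""
    def kind(line):
        return "markdown" if line.startswith("#") else "code"

    return [(k, list(group)) for k, group in groupby(lines, key=kind)]


lines = [
    "# Load the model and",
    "# train for 4 epochs and",
    "# lastly we save the model",
    "model.load()",
    "model.train(4)",
    "model.save()",
]
blocks = group_blocks(lines)
```

Here blocks holds two entries: one markdown block with the three comment lines, followed by one code block with the three statements.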

3. Paragraph

Thirdly, I assume that everyone writes their script in paragraphs, where each paragraph represents an idea. A paragraph can contain code, comments, or both.

The following are 5 examples of paragraphs.

# Evaluate the model
model.evaluate()

# Run the model for a while.
# Then we hide the model.
run()
hide()

print(type(data))

# This is considered as a paragraph too
# It has 2 lines of comments

# The data that we are interested in is made of 8x8 images of digits.
# Let's have a look at the first 4 images, which is of course
# stored in the `images` attribute of the dataset.  
images = list(zip(mnist.images))

which translates to the following:

[Image: example]
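Splitting a script into paragraphs can be sketched as follows. The sketch assumes that paragraphs are separated by one or more blank lines, as in the examples above; split_paragraphs is a hypothetical name and not p2j's actual function.

```python
def split_paragraphs(text):
    """Split a script into paragraphs, assuming blank lines separate them."""
    paragraphs, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(current)
            current = []
    if current:
        paragraphs.append(current)
    return paragraphs


script = """# Evaluate the model
model.evaluate()

# Run the model for a while.
# Then we hide the model.
run()
hide()

print(type(data))
"""
paragraphs = split_paragraphs(script)
```

Each paragraph can then be split further into its comment block and code block.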

4. Indentation

Any line of code or comment that is indented by a multiple of 4 spaces is considered code, and will stay in the same code cell as the previous non-empty line. This ensures that function and class definitions, loops and multi-line code stay in one cell.
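The effect of this rule can be sketched on a list of paragraphs: a paragraph whose first line is indented is joined onto the previous one, so a function body that contains blank lines stays in one cell. The names is_indented and merge_indented are hypothetical, and the sketch assumes every paragraph is a non-empty list of non-blank lines.

```python
def is_indented(line):
    """True if the line starts with a non-zero multiple of 4 spaces."""
    indent = len(line) - len(line.lstrip(" "))
    return indent > 0 and indent % 4 == 0


def merge_indented(paragraphs):
    """Join a paragraph onto the previous one when it starts indented."""
    merged = []
    for paragraph in paragraphs:
        if merged and is_indented(paragraph[0]):
            # Keep the blank line that separated the two paragraphs.
            merged[-1].extend([""] + paragraph)
        else:
            merged.append(list(paragraph))
    return merged


paragraphs = [
    ["def train(model):", "    model.load()"],
    ["    # Train for 4 epochs", "    model.train(4)"],
    ["model.save()"],
]
cells = merge_indented(paragraphs)
```

The indented comment ends up inside the same cell as the function definition instead of becoming a markdown cell.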

5. Exceptions

Now we handle the exceptions to the above-mentioned rules.

  • Docstrings become markdown cells, but only if they are not indented.

  • Lines that begin with #pylint or # pylint are Pylint directives and are kept in code cells.

  • A shebang, e.g. #!/usr/bin/env python3, is kept in a code cell.

  • Encoding declarations like # -*- coding: utf-8 -*- are also kept in code cells.
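The comment-like exceptions can be sketched as a list of patterns that are checked before the general comment rule. The patterns and the name classify are assumptions made for illustration, not p2j's actual code, and the docstring exception is left out because it needs multi-line state.

```python
import re

# Hypothetical patterns for lines that start with # but should stay code.
CODE_COMMENT_PATTERNS = [
    re.compile(r"^#\s?pylint"),      # "#pylint" or "# pylint" directives
    re.compile(r"^#!"),              # shebang, e.g. #!/usr/bin/env python3
    re.compile(r"^# -\*- coding"),   # encoding declaration
]


def classify(line):
    """Return 'code' or 'markdown', applying the exceptions first."""
    if any(pattern.match(line) for pattern in CODE_COMMENT_PATTERNS):
        return "code"
    return "markdown" if line.startswith("#") else "code"


kinds = [classify(line) for line in [
    "#!/usr/bin/env python3",
    "# -*- coding: utf-8 -*-",
    "# pylint: disable=invalid-name",
    "# Train for 4 epochs",
]]
```

Only the last line is an ordinary comment, so it is the only one classified as markdown.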

Feedback and pull requests

If you do like this, star me maybe? Pull requests are very much encouraged! Slide into my DM with suggestions too!
