Skip to main content

Converter from Jupyter notebook format to Org format (Emacs text editor), without requirements for any libraries.

Project description

Table of Contents

  1. j2o
  2. Command line usage
  3. review of format for ipynb
  4. code development

j2o

Converter from Jupyter to Org file format without any dependencies.

I don't want to install Jupyter core and nbconver or pandoc with 164 dependencies just to be able to convert simple JSON format, that is why I just wrote coverter from scratch.

Tested for nbformat: 4.2.

TODO: make reverse convrter.

[https://pypi.org/project/j2o/]

Command line usage

usage: j2o myfile.ipynb [-w] [-j myfile.ipynb] [-o myfile.org]

Convert a Jupyter notebook to Org file (Emacs) and vice versa

positional arguments:
  jupfile_              Jupyter file

options:
  -h, --help            show this help message and exit
  -j JUPFILE, --jupfile JUPFILE
                        Jupyter file
  -o ORGFILE, --orgfile ORGFILE
                        Target filename of Org file. If not specified, it will
                        use the filename of the Jupyter file and append .ipynb
  -w, --overwrite       Flag whether to overwrite existing target file.

Other useful projects

p2j https://pypi.org/project/p2j/ https://github.com/remykarem/python2jupyter

https://github.com/jkitchin/ox-ipynb

review of format for ipynb

JSON

{
  cells: [
    cell_type: "code/markdown",
    source: ["\n","\n",""],
    outputs: [{
      text: ["\n", "\n"],
      data: {
        image/png: "base64....",
        text/plain: "image description"}
      }
    ]
  ],
  metadata: {
    kernelspec: {
      language: "python"
    }
  }
}

code development

import sys
import json
import base64
import os

source_filename = 'tests/draw-samples.ipynb'
dir_target = './autoimgs'
org_babel_min_lines_for_block_output = 10 # ob-core.el org-babel-min-lines-for-block-output

PRINT = lambda *x: print("".join(x))
# f = open("out.org", "w")
# PRINT = lambda *x: f.write("".join(x) + '\n') # global
# with open("dates.txt", "w") as f:


try:
    with open(source_filename, "r", encoding="utf-8") as infile:
        myfile = json.load(infile)
except FileNotFoundError:
    print("Source file not found. Specify a valid source file.")
    sys.exit(1)

if not os.path.exists(dir_target):
    os.makedirs(dir_target)

for i, cell in enumerate(myfile["cells"]):
    # -- collect source
    source_lines = cell["source"]
    # -- prepare headers
    header = "#+begin_src python :results output :exports both :session s1"
    tail = "#+end_src"

    # -- collect outputs
    outputs = []
    if "outputs" in cell:
        for j, output in enumerate(cell["outputs"]):
            o = {"text": None, "file_path": None, "data_descr": None}
            # -- test
            if "text" in output:
                outputs_text = output["text"]
                o["text"] = outputs_text
            # -- data
            if "data" in output and "image/png" in output["data"]:
                # - 1) save image 2) insert link to output text 3) format source block header with link
                # - decode image and remember link to file
                b64img = base64.b64decode(output["data"]["image/png"])
                fpath = os.path.join(dir_target, f'{i}_{j}.png')
                o["file_path"] = fpath
                # - save to file
                with open(fpath, 'wb') as b64imgfile:
                    b64imgfile.write(b64img)
                # - add description for link
                if "text/plain" in output["data"]:
                    o["data_descr"] = output["data"]["text/plain"]
                # - change header for image
                if "graphics" not in header: # add only first image to header
                    header = f"#+begin_src python :results file graphics :file {fpath} :exports both :session s1"
            outputs.append(o)

    # -- print source
    if cell["cell_type"] == "markdown":
        source_lines = [s.replace("<br>", "") for s in source_lines]
        PRINT(source_lines[0].replace("#", "*"))
        if len(source_lines) > 1:
            PRINT("".join(source_lines[1:]))
        # PRINT('# asd')
    else: #== "code":
        PRINT(header)
        PRINT("".join(source_lines))
        PRINT(tail)
        PRINT()

    # -- print outputs - text and data
    for k, o in enumerate(outputs):
        # -- test
        # o = {"text": None, "data_file": None, "data_descr": None}
        if o["text"] is not None:
            if len(o["text"]) <= org_babel_min_lines_for_block_output:
                PRINT("#+RESULTS:" + (f"{i}_{k}" if k > 0 else "")) # add index for several RESULT
                PRINT("".join([": " + t for t in o["text"]])) # .startswith()
                PRINT()
            else:
                PRINT("#+RESULTS:" + (f"{i}_{k}" if k > 0 else ""))
                PRINT("#+begin_example")
                for t in o["text"]:
                    if t[0] == '*' or t.startswith("#+"):
                        PRINT("," + t)
                    else:
                        PRINT(t)
                PRINT("#+end_example")
                PRINT()
        if o["file_path"] is not None:
            # if RESULT is ferst we don't add name to it
            if o["text"] is not None and k == 0:
                PRINT("#+RESULTS:" + (f"{i}_{k}" if k > 0 else ""))
            else:
                PRINT("#+RESULTS:" + (f"{i}_{k}" if k > 0 else "")) # add index for several RESULT
            # - PRINT link
            # desc = "" if o["data_descr"] is None else "[" + "".join(o["data_descr"]) + "]"
            desc = "" if o["data_descr"] is None else "".join(o["data_descr"])
            PRINT("[[file:" + o["file_path"] +  "]] " + desc)
            PRINT()
f.close()

#+begin_src python :results output :exports both :session s1
import h5py
import matplotlib.pyplot as plt
import numpy as np
#+end_src

* Чтение файла
#+begin_src python :results output :exports both :session s1
with h5py.File('train/2021-01-train.hdf5', mode='r') as dataset:
    print(list(dataset.keys())[:10])
#+end_src

#+RESULTS:
: ['1609459200', '1609459800', '1609460400', '1609461000', '1609461600', '1609462200', '1609462800', '1609463400', '1609464000', '1609464600']


#+begin_src python :results output :exports both :session s1
with h5py.File('train/2021-01-train.hdf5', mode='r') as dataset:
    print(list(dataset['1609459200'].keys()))
#+end_src

#+RESULTS:
: ['events', 'intensity', 'radial_velocity', 'reflectivity']


#+begin_src python :results output :exports both :session s1
with h5py.File('train/2021-01-train.hdf5', mode='r') as dataset:
    print(f"events shape: {dataset['1609459200']['events'].shape}")
    print(f"intensity shape: {dataset['1609459200']['intensity'].shape}")
    print(f"radial_velocity shape: {dataset['1609459200']['radial_velocity'].shape}")
    print(f"reflectivity shape: {dataset['1609459200']['reflectivity'].shape}")
#+end_src

#+RESULTS:
: events shape: (252, 252)
: intensity shape: (252, 252)
: radial_velocity shape: (10, 252, 252)
: reflectivity shape: (10, 252, 252)


* Визуализация
#+begin_src python :results output :exports both :session s1
events = []
intensity = []
radial_velocity = []
reflectivity = []

with h5py.File('train/2021-01-train.hdf5', mode='r') as dataset:
    timestamps = sorted(dataset.keys())[:6]
    for timestamp in timestamps:
        events.append(np.array(dataset[timestamp]['events']))
        intensity.append(np.array(dataset[timestamp]['intensity']))
        radial_velocity.append(np.array(dataset[timestamp]['radial_velocity']))
        reflectivity.append(np.array(dataset[timestamp]['reflectivity']))

events = np.array(events)
intensity = np.array(intensity)
radial_velocity = np.array(radial_velocity)
reflectivity = np.array(reflectivity)

events[events == -2e6] = -2
events[events == -1e6] = -1
intensity[intensity == -2e6] = -2
intensity[intensity == -1e6] = -1
radial_velocity[radial_velocity == -2e6] = -2
radial_velocity[radial_velocity == -1e6] = -1
reflectivity[reflectivity == -2e6] = -2
reflectivity[reflectivity == -1e6] = -1
#+end_src

** Погодные события
#+begin_src python :results file graphics :file ./autoimgs/8_0.png :exports both :session s1
_, axs = plt.subplots(1, len(events), figsize=(20, 2))
for index in range(len(events)):
    axs[index].imshow(events[index])
    axs[index].set_title(timestamps[index])
#+end_src

#+RESULTS:
[[file:./autoimgs/8_0.png]] <Figure size 1440x144 with 6 Axes>

** Интенсивность осадков
#+begin_src python :results file graphics :file ./autoimgs/10_0.png :exports both :session s1
_, axs = plt.subplots(1, len(intensity), figsize=(20, 2))
for index in range(len(intensity)):
    axs[index].imshow(intensity[index])
    axs[index].set_title(timestamps[index])
#+end_src

#+RESULTS:
[[file:./autoimgs/10_0.png]] <Figure size 1440x144 with 6 Axes>

** Радиальная скорость по высотам
#+begin_src python :results file graphics :file ./autoimgs/12_0.png :exports both :session s1
_, axs = plt.subplots(10, len(radial_velocity), figsize=(20, 20))
for index in range(len(radial_velocity)):
    for row in range(10):
        if index == 0:
            axs[row, index].set_ylabel(f'{row + 1} км')
        axs[row, index].imshow(radial_velocity[index, row])
    axs[0, index].set_title(timestamps[index])
#+end_src

#+RESULTS:
[[file:./autoimgs/12_0.png]] <Figure size 1440x1440 with 60 Axes>

** Отражаемость по высотам
#+begin_src python :results file graphics :file ./autoimgs/14_0.png :exports both :session s1
_, axs = plt.subplots(10, len(reflectivity), figsize=(20, 20))
for index in range(len(reflectivity)):
    for row in range(10):
        if index == 0:
            axs[row, index].set_ylabel(f'{row + 1} км')
        axs[row, index].imshow(reflectivity[index, row])
    axs[0, index].set_title(timestamps[index])
#+end_src

#+RESULTS:
[[file:./autoimgs/14_0.png]] <Figure size 1440x1440 with 60 Axes>

Output:

#+begin_src python :results output :exports both :session s1
import h5py
import matplotlib.pyplot as plt
import numpy as np
#+end_src

* Чтение файла
#+begin_src python :results output :exports both :session s1
with h5py.File('train/2021-01-train.hdf5', mode='r') as dataset:
    print(list(dataset.keys())[:10])
#+end_src

#+RESULTS:
: ['1609459200', '1609459800', '1609460400', '1609461000', '1609461600', '1609462200', '1609462800', '1609463400', '1609464000', '1609464600']


#+begin_src python :results output :exports both :session s1
with h5py.File('train/2021-01-train.hdf5', mode='r') as dataset:
    print(list(dataset['1609459200'].keys()))
#+end_src

#+RESULTS:
: ['events', 'intensity', 'radial_velocity', 'reflectivity']


#+begin_src python :results output :exports both :session s1
with h5py.File('train/2021-01-train.hdf5', mode='r') as dataset:
    print(f"events shape: {dataset['1609459200']['events'].shape}")
    print(f"intensity shape: {dataset['1609459200']['intensity'].shape}")
    print(f"radial_velocity shape: {dataset['1609459200']['radial_velocity'].shape}")
    print(f"reflectivity shape: {dataset['1609459200']['reflectivity'].shape}")
#+end_src

#+RESULTS:
: events shape: (252, 252)
: intensity shape: (252, 252)
: radial_velocity shape: (10, 252, 252)
: reflectivity shape: (10, 252, 252)


* Визуализация
#+begin_src python :results output :exports both :session s1
events = []
intensity = []
radial_velocity = []
reflectivity = []

with h5py.File('train/2021-01-train.hdf5', mode='r') as dataset:
    timestamps = sorted(dataset.keys())[:6]
    for timestamp in timestamps:
        events.append(np.array(dataset[timestamp]['events']))
        intensity.append(np.array(dataset[timestamp]['intensity']))
        radial_velocity.append(np.array(dataset[timestamp]['radial_velocity']))
        reflectivity.append(np.array(dataset[timestamp]['reflectivity']))

events = np.array(events)
intensity = np.array(intensity)
radial_velocity = np.array(radial_velocity)
reflectivity = np.array(reflectivity)

events[events == -2e6] = -2
events[events == -1e6] = -1
intensity[intensity == -2e6] = -2
intensity[intensity == -1e6] = -1
radial_velocity[radial_velocity == -2e6] = -2
radial_velocity[radial_velocity == -1e6] = -1
reflectivity[reflectivity == -2e6] = -2
reflectivity[reflectivity == -1e6] = -1
#+end_src

** Погодные события
#+begin_src python :results file graphics :file ./autoimgs/8_0.png :exports both :session s1
_, axs = plt.subplots(1, len(events), figsize=(20, 2))
for index in range(len(events)):
    axs[index].imshow(events[index])
    axs[index].set_title(timestamps[index])
#+end_src

#+RESULTS:
[[file:./autoimgs/8_0.png]] <Figure size 1440x144 with 6 Axes>

** Интенсивность осадков
#+begin_src python :results file graphics :file ./autoimgs/10_0.png :exports both :session s1
_, axs = plt.subplots(1, len(intensity), figsize=(20, 2))
for index in range(len(intensity)):
    axs[index].imshow(intensity[index])
    axs[index].set_title(timestamps[index])
#+end_src

#+RESULTS:
[[file:./autoimgs/10_0.png]] <Figure size 1440x144 with 6 Axes>

** Радиальная скорость по высотам
#+begin_src python :results file graphics :file ./autoimgs/12_0.png :exports both :session s1
_, axs = plt.subplots(10, len(radial_velocity), figsize=(20, 20))
for index in range(len(radial_velocity)):
    for row in range(10):
        if index == 0:
            axs[row, index].set_ylabel(f'{row + 1} км')
        axs[row, index].imshow(radial_velocity[index, row])
    axs[0, index].set_title(timestamps[index])
#+end_src

#+RESULTS:
[[file:./autoimgs/12_0.png]] <Figure size 1440x1440 with 60 Axes>

** Отражаемость по высотам
#+begin_src python :results file graphics :file ./autoimgs/14_0.png :exports both :session s1
_, axs = plt.subplots(10, len(reflectivity), figsize=(20, 20))
for index in range(len(reflectivity)):
    for row in range(10):
        if index == 0:
            axs[row, index].set_ylabel(f'{row + 1} км')
        axs[row, index].imshow(reflectivity[index, row])
    axs[0, index].set_title(timestamps[index])
#+end_src

#+RESULTS:
[[file:./autoimgs/14_0.png]] <Figure size 1440x1440 with 60 Axes>

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

j2o-0.0.3.tar.gz (615.1 kB view hashes)

Uploaded Source

Built Distribution

j2o-0.0.3-py3-none-any.whl (7.5 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page