Tools to extract content from IPython (Jupyter) notebooks

sheet

To install: pip install sheet

Examples

Getting the filepath of the current notebook

filepath = get_path_of_current_notebook()
# Example: '~/my_notebook_folder/some_notebook.ipynb'

Access to cells of a notebook given its filepath

from sheet.contents import get_ipynb_cells, get_ipynb_cells_source
filepath = '~/my_notebook_folder/some_notebook.ipynb'

cells = get_ipynb_cells(filepath)
# Parenthesize the left side: `assert a, b == c` would treat `b == c` as the failure message
assert (type(cells), type(cells[0])) == (list, dict)

cells = get_ipynb_cells_source(filepath)
assert (type(cells), type(cells[0])) == (list, str)

from sheet.contents import get_ipynb_cells_full_text
notebook_text = get_ipynb_cells_full_text(filepath)
print(notebook_text)
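
For readers curious about what these helpers operate on: an .ipynb file is a JSON document whose top-level "cells" list holds one dict per cell, each with a "cell_type" and a "source" (either a string or a list of line strings). The sketch below reads that structure with the standard library only. The names read_cells and read_cells_source are hypothetical stand-ins, not sheet's own implementation, and the sketch assumes a notebook in nbformat 4.

```python
import json
import os


def read_cells(filepath):
    """Return the list of cell dicts stored in a notebook file (hypothetical helper)."""
    # Expand a leading '~' so paths like '~/folder/nb.ipynb' resolve to the home directory.
    with open(os.path.expanduser(filepath), encoding='utf-8') as f:
        notebook = json.load(f)
    return notebook.get('cells', [])


def read_cells_source(filepath):
    """Return the source text of each cell as one string per cell (hypothetical helper)."""
    sources = []
    for cell in read_cells(filepath):
        source = cell.get('source', '')
        # The notebook format allows 'source' to be a string or a list of line strings.
        sources.append(source if isinstance(source, str) else ''.join(source))
    return sources
```

Joining the sources with newlines would give something comparable to the full-text output printed above, though the exact separator sheet uses is not stated here.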

Search the cells of a single notebook

Index and search the cells of a notebook

from sheet import SingleNotebookSearch

# If no notebook filepath (src) is given, the "current notebook" is used
search = SingleNotebookSearch(src=None)

result_indices = search('lines iterize')
print(result_indices)
print("\n---- Contents of first hit ----")
print(search[result_indices[0]])  # print the contents of the first result

[70 225 226 198 199 196 200 201 193 197]

---- Contents of first hit ----
process_wf = Line(
    partial(fixed_step_chunker, chk_size=DFLT_CHK_SIZE),
    iterize(process_chk)
)
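
How sheet builds its index is not described here. As a rough illustration of what "index and search the cells" can mean, the sketch below ranks cells by TF-IDF cosine similarity to the query, using only the standard library. The class name CellSearchSketch and all of its internals are hypothetical; the actual SingleNotebookSearch may use a different model entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric/underscore tokens."""
    return re.findall(r'[a-z0-9_]+', text.lower())


class CellSearchSketch:
    """Rank cells by TF-IDF cosine similarity to a query (illustrative only)."""

    def __init__(self, cells):
        self.cells = list(cells)
        counts = [Counter(tokenize(cell)) for cell in self.cells]
        n_cells = len(self.cells)
        # Document frequency: the number of cells each token appears in.
        doc_freq = Counter(token for c in counts for token in c)
        # Smoothed inverse document frequency, positive for every indexed token.
        self.idf = {t: math.log((1 + n_cells) / (1 + df)) + 1 for t, df in doc_freq.items()}
        self.vectors = [self._vectorize(c) for c in counts]

    def _vectorize(self, counts):
        # Tokens that were not seen at indexing time are dropped.
        vec = {t: n * self.idf[t] for t, n in counts.items() if t in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def __call__(self, query, n_results=10):
        """Return indices of the best-matching cells, most similar first."""
        q = self._vectorize(Counter(tokenize(query)))
        scores = [sum(w * vec.get(t, 0.0) for t, w in q.items()) for vec in self.vectors]
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return ranked[:n_results]

    def __getitem__(self, index):
        return self.cells[index]


cells = ['import os', 'process_wf = Line(chunker, iterize(process_chk))', 'print("done")']
sketch = CellSearchSketch(cells)
hits = sketch('line iterize')
print(sketch[hits[0]])
```

Like the example above, calling the object returns cell indices and indexing it returns cell contents. Unlike a real search model, this sketch does no stemming, so 'lines' would not match 'Line'.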

Search the contents of the notebooks under a directory

from sheet.contents import SearchNotebooks

# max_levels=0 as used here; pass max_levels=None for a fully recursive search
search = SearchNotebooks('~/my_notebooks_folder', max_levels=0)
search('bayesian')

array(['Spyn 01 - Potentials.ipynb',
       'Bayes 01 - Potentials-Only explanation.ipynb', 'taped.ipynb',
       'separation of concerns - how py2store does it.ipynb',
       'equate.ipynb', 'peruse.ipynb',
       'hum, taped, lined -- feeding audio to a pipeline.ipynb',
       'owner.ipynb', 'best of 2020.ipynb',
       'Bayes 02 - Potentials - And drug data example.ipynb'],
      dtype=object)
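
The max_levels argument in the call above suggests a depth-limited walk of the folder. A minimal sketch of that idea follows, assuming that max_levels=0 means "look only in the given folder" and max_levels=None means "recurse without limit". That reading, the function name iter_notebook_paths, and the skipping of .ipynb_checkpoints folders are all assumptions, not taken from sheet's source.

```python
import os


def iter_notebook_paths(root, max_levels=0):
    """Yield paths of .ipynb files under root, descending at most max_levels folders.

    max_levels=0 looks only in root itself; max_levels=None recurses without limit.
    (Assumed semantics, for illustration only.)
    """
    root = os.path.expanduser(root)
    root_depth = root.rstrip(os.sep).count(os.sep)
    for dirpath, dirnames, filenames in os.walk(root):
        level = dirpath.rstrip(os.sep).count(os.sep) - root_depth
        if max_levels is not None and level >= max_levels:
            dirnames[:] = []  # prune in place: os.walk will not descend further
        # Skip Jupyter's autosave folders, which hold copies of notebooks.
        dirnames[:] = [d for d in dirnames if d != '.ipynb_checkpoints']
        for name in sorted(filenames):
            if name.endswith('.ipynb'):
                yield os.path.join(dirpath, name)
```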

We now have a list of notebooks that match our query. These are the notebooks with the highest average alignment to the query, not just keyword matches. But which cells in particular have the highest relevance?

We can examine a notebook at the cell level with a notebook cell searcher. (Note: you can combine both to build a cell-level searcher that works from the folder level.)

ss = search.search_notebook('Spyn 01 - Potentials.ipynb')
ss('bayesian')

array([['Spyn 01 - Potentials.ipynb', 6],
       ['Spyn 01 - Potentials.ipynb', 2],
       ['Spyn 01 - Potentials.ipynb', 71],
       ['Spyn 01 - Potentials.ipynb', 88],
       ['Spyn 01 - Potentials.ipynb', 91],
       ['Spyn 01 - Potentials.ipynb', 84],
       ['Spyn 01 - Potentials.ipynb', 85],
       ['Spyn 01 - Potentials.ipynb', 86],
       ['Spyn 01 - Potentials.ipynb', 87],
       ['Spyn 01 - Potentials.ipynb', 82]], dtype=object)
ss['Spyn 01 - Potentials.ipynb', 6]

'# Potentials - A key data structure to Discrete Bayesian Inference'

ss['Spyn 01 - Potentials.ipynb', 87]

'### Making a few potentials from pts data'
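
The note about combining both levels can be sketched as a small helper that takes the top notebooks for a query, then the top cells within each. It relies only on the behavior shown in the examples above: calling the folder-level searcher returns notebook names, search_notebook returns a cell searcher, and that searcher's results are (name, cell index) pairs that can be used as keys. The function name and the n_notebooks and n_cells parameters are hypothetical.

```python
def search_cells_across_notebooks(search, query, n_notebooks=3, n_cells=3):
    """Return (notebook_name, cell_index, cell_text) for the top cells of the top notebooks.

    `search` is assumed to behave like the SearchNotebooks instance shown above.
    """
    hits = []
    for notebook_name in list(search(query))[:n_notebooks]:
        cell_search = search.search_notebook(notebook_name)
        for name, cell_index in list(cell_search(query))[:n_cells]:
            # Indexing the cell searcher with (name, cell_index) gives the cell's text.
            hits.append((name, cell_index, cell_search[name, cell_index]))
    return hits
```

With the searcher from the example above, search_cells_across_notebooks(search, 'bayesian') would collect up to nine cells, three from each of the three best-matching notebooks.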
