Convert notebooks to modular code

nbmodular

Convert poorly modularized data science notebooks into fully modular notebooks and/or Python modules.

Roadmap

  • Convert cell code into functions:
    • Inputs are the variables detected in the current cell that are also detected in previous cells. This approach assumes that newly created variables have unique names across the notebook. However, even if a cell defines a new variable that reuses an existing name, the resulting function is still correct.
    • Outputs are, for now, all the variables detected in the current cell that are also detected in later cells.
  • Filter out outputs:
    • Variables detected in the current cell and in previous cells might not be needed as outputs of the current cell if the cell doesn’t modify them. Ways to detect potential modifications:
      • AST:
        • The variable is unmodified if it appears only on the right-hand side of assignments or in the conditions of if statements.
        • The same holds if it appears only as an argument to functions known not to modify it, such as print.
      • Comparing variable values before and after running the cell:
        • Suitable for small variables, where a deep copy is not computationally expensive.
      • Using a type checker:
        • Declare the variable as Final and run mypy or another type checker to see whether the code modifies it.
    • Provide hints:
      • Variables that come from other cells might not be needed as outputs. The remaining ones most likely are.
      • Variables that are modified are clearly needed as outputs.
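
The input and output rule from the first roadmap item can be sketched with the standard ast module. This is only an illustration under stated assumptions, not nbmodular's actual implementation: the helper names detected_names, cell_signature and cell_to_function are hypothetical, cells are given as plain source strings, and built-in names such as print are ignored so they do not become arguments.

```python
import ast
import builtins
import textwrap

BUILTIN_NAMES = set(dir(builtins))


def detected_names(source: str) -> set:
    """Collect every variable name that appears in a cell, read or written."""
    tree = ast.parse(source)
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return names - BUILTIN_NAMES  # assumption: built-ins are never inputs/outputs


def cell_signature(cells: list, index: int) -> tuple:
    """Return (inputs, outputs) for cells[index] following the roadmap rule."""
    current = detected_names(cells[index])
    previous = set().union(*(detected_names(c) for c in cells[:index]))
    posterior = set().union(*(detected_names(c) for c in cells[index + 1:]))
    # Inputs: seen here and earlier. Outputs: seen here and later.
    return sorted(current & previous), sorted(current & posterior)


def cell_to_function(cells: list, index: int, name: str) -> str:
    """Wrap the cell source in a function definition (hypothetical helper)."""
    inputs, outputs = cell_signature(cells, index)
    body = textwrap.indent(cells[index], "    ")
    ret = f"    return {', '.join(outputs)}\n" if outputs else ""
    return f"def {name}({', '.join(inputs)}):\n{body}\n{ret}"


cells = [
    "a = 1\nb = 2",
    "c = a + b",
    "print(c)",
]
print(cell_to_function(cells, 1, "add_values"))
# def add_values(a, b):
#     c = a + b
#     return c
```

Imported module aliases and helper functions defined in earlier cells would also be picked up as inputs by this sketch; a fuller version would need to treat them separately.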

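Two of the modification checks listed in the roadmap, the AST heuristic and the before/after comparison, might look roughly like the sketch below. It is an assumption-laden illustration rather than the package's code: possibly_modified, modified_at_runtime and the SAFE_CALLS set are hypothetical names, the AST check is deliberately conservative (any method call on the variable, or any call to an unknown function that receives it, counts as a possible modification), and the runtime check assumes the values support plain equality comparison.

```python
import ast
import copy

# Assumption: functions we trust not to mutate their arguments.
SAFE_CALLS = {"print", "len", "repr", "str"}


def _root_name(node):
    """Return the base variable of x.attr / x[i] chains, or None."""
    while isinstance(node, (ast.Attribute, ast.Subscript)):
        node = node.value
    return node.id if isinstance(node, ast.Name) else None


def _mentions(node, name: str) -> bool:
    """True if the expression refers to the variable anywhere."""
    return any(isinstance(n, ast.Name) and n.id == name for n in ast.walk(node))


def possibly_modified(source: str, name: str) -> bool:
    """Conservative static check: could this cell modify the variable?"""
    for node in ast.walk(ast.parse(source)):
        # Rebinding or deleting the name: x = ..., x += ..., del x, for x in ...
        if isinstance(node, ast.Name) and node.id == name:
            if not isinstance(node.ctx, ast.Load):
                return True
        # In-place writes through the name: x[0] = ..., x.attr = ...
        if isinstance(node, (ast.Attribute, ast.Subscript)):
            if not isinstance(node.ctx, ast.Load) and _root_name(node) == name:
                return True
        if isinstance(node, ast.Call):
            func = node.func
            # Method call on the variable, e.g. x.append(1)
            if isinstance(func, ast.Attribute) and _root_name(func) == name:
                return True
            # Passed to a function that is not known to be safe
            callee = func.id if isinstance(func, ast.Name) else None
            if callee not in SAFE_CALLS:
                args = list(node.args) + [kw.value for kw in node.keywords]
                if any(_mentions(arg, name) for arg in args):
                    return True
    return False


def modified_at_runtime(source: str, namespace: dict, names: list) -> set:
    """Run the cell and report which variables changed (deep-copy comparison)."""
    before = {n: copy.deepcopy(namespace[n]) for n in names}
    exec(source, namespace)
    return {n for n in names if n not in namespace or namespace[n] != before[n]}
```

For a cell such as `x.append(y)`, the static check flags both x and y, because y is passed to a call it cannot vouch for, while the runtime comparison reports only x. This is why the two approaches are worth combining, at the cost of a deep copy per variable.
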
Install

pip install nbmodular


