Skip to main content

A tool designed to clear all outputs from a Jupyter Notebook using nbconvert’s ClearOutputPreprocessor, preparing the notebook for sharing or version control.

Project description

Swamauri Logo

PyPI - Downloads Hits PyPI - Python Version PyPI - License PyPI - swarmauri_tool_jupyterclearoutput


Swarmauri Tool Jupyterclearoutput

JupyterClearOutputTool is a component designed for removing outputs from cells in a Jupyter Notebook. This ensures the notebook remains uncluttered, making it ideal for sharing and version control. It preserves the cell code and metadata, resets the execution counts, and logs the operation for auditing purposes, returning a cleaned notebook data structure.

Installation

Install this package via PyPI:

pip install swarmauri_tool_jupyterclearoutput

This package requires Python 3.10 or newer. By installing swarmauri_tool_jupyterclearoutput, all additional dependencies (such as nbconvert, swarmauri_core, and swarmauri_base) will be installed automatically.

Usage

After installation, import and instantiate JupyterClearOutputTool to clear cell outputs from an in-memory notebook. You can load your notebook into a Python dictionary (for example, using json.load on a .ipynb file) and pass that dictionary to the tool.

Example usage:


from swarmauri_tool_jupyterclearoutput import JupyterClearOutputTool

# Suppose 'notebook_data' is a dictionary representing a Jupyter Notebook (e.g., loaded from a .ipynb file)
notebook_data = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["Hello World\n"]}
            ],
            "source": ["print('Hello World')"]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# This is a markdown cell"]
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5
}

tool = JupyterClearOutputTool()
clean_notebook = tool(notebook_data)

At this point, 'clean_notebook' contains the same notebook but with outputs cleared.


You can then save the modified resulting dictionary back to a .ipynb file. This ensures the notebook is shared without potentially lengthy or sensitive outputs included.

Dependencies

This package relies on: • Python 3.10 or higher
• swarmauri_core
• swarmauri_base
• nbconvert

These dependencies are automatically managed by the package installer. No manual installation steps beyond "pip install swarmauri_tool_jupyterclearoutput" are required.

Example Code Implementation

Below is the fully functional implementation for the core tool code:


"""
JupyterClearOutputTool.py

This module defines the JupyterClearOutputTool, a component that removes all outputs from a
Jupyter notebook while preserving cell code and metadata. It handles notebooks of varying
sizes and versions efficiently, logs the clear operation for auditing, and returns a clean
NotebookNode for further use.
"""

import logging
from typing import List, Dict, Any, Literal
from pydantic import Field
from swarmauri_standard.tools.Parameter import Parameter
from swarmauri_base.tools.ToolBase import ToolBase
from swarmauri_core.ComponentBase import ComponentBase

logger = logging.getLogger(__name__)


@ComponentBase.register_type(ToolBase, "JupyterClearOutputTool")
class JupyterClearOutputTool(ToolBase):
    """
    JupyterClearOutputTool is a tool that removes the outputs from code cells in a Jupyter notebook.
    It preserves the cell code and metadata, ensures compatibility with various notebook versions,
    and returns a cleaned notebook data structure for further use.

    Attributes:
        version (str): The version of the JupyterClearOutputTool.
        parameters (List[Parameter]): A list of parameters required for clearing notebook outputs.
        name (str): The name of the tool.
        description (str): A brief description of the tool's functionality.
        type (Literal["JupyterClearOutputTool"]): The type identifier for this tool.
    """
    version: str = "1.0.0"
    parameters: List[Parameter] = Field(
        default_factory=lambda: [
            Parameter(
                name="notebook_data",
                type="object",
                description="A dictionary that represents the Jupyter Notebook to clear outputs from.",
                required=True,
            ),
        ]
    )
    name: str = "JupyterClearOutputTool"
    description: str = "Removes outputs from a Jupyter notebook while preserving code and metadata."
    type: Literal["JupyterClearOutputTool"] = "JupyterClearOutputTool"

    def __call__(self, notebook_data: Dict[str, Any]) -> Dict[str, Any]:
        """
        Removes all outputs from the provided Jupyter notebook data structure. Preserves
        cell code and metadata, and resets the execution counts. Logs the operation for auditing
        and returns the cleaned notebook.

        Args:
            notebook_data (Dict[str, Any]): A dictionary representing the Jupyter Notebook.

        Returns:
            Dict[str, Any]: The cleaned Jupyter Notebook dictionary with all cell outputs removed.

        Example:
            >>> tool = JupyterClearOutputTool()
            >>> clean_notebook = tool(notebook_data)
        """
        cells_cleared = 0

        # Iterate over all cells in the notebook and remove their outputs if they are code cells.
        for cell in notebook_data.get("cells", []):
            if cell.get("cell_type") == "code":
                if "outputs" in cell:
                    cell["outputs"] = []
                cell["execution_count"] = None
                cells_cleared += 1

        # Log the number of cells cleared for auditing.
        logger.info("Cleared outputs from %d cells in the notebook.", cells_cleared)

        # Return the cleaned notebook data structure.
        return notebook_data

License

This project is licensed under the Apache-2.0 License. For additional details, refer to the LICENSE file (if available).

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

File details

Details for the file swarmauri_tool_jupyterclearoutput-0.7.4.dev20.tar.gz.

File metadata

File hashes

Hashes for swarmauri_tool_jupyterclearoutput-0.7.4.dev20.tar.gz
Algorithm Hash digest
SHA256 f93b988843e016f07eb37aa4939ec67469d5cc34ab58199c0f4ec2ba63d92414
MD5 4f3464a4241dcfe02073ee7e35aaf5c7
BLAKE2b-256 ad6becca559497113f3489a53a3d44c92b17a203a8aec827582cfe4e54e813eb

See more details on using hashes here.

File details

Details for the file swarmauri_tool_jupyterclearoutput-0.7.4.dev20-py3-none-any.whl.

File metadata

File hashes

Hashes for swarmauri_tool_jupyterclearoutput-0.7.4.dev20-py3-none-any.whl
Algorithm Hash digest
SHA256 56c478682886e7b0d92b2be57f2326b626fdea86a024430c79a9d622da066ecc
MD5 2af382c16c1474facc5cb92b752b7c53
BLAKE2b-256 07d7812c18e869f4dfa5e01b3d0c7779ea718c4a0752ae4530454dc788f01cba

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page