Export Jupyter notebooks into narrated, cell-labeled text.

Neurobyte

The Notebook-to-LLM Code Optimizer.

Neurobyte prepares your Jupyter notebooks for maximum context-window efficiency and model comprehension. It is the bridge between your data science work and your AI coding assistant.

Features

  • Context Optimization: Extracts only relevant code and structure, stripping high-noise metadata.
  • LLM-Native Formats: Outputs XML, JSON, or narrated text, formats suited to Claude, GPT-4, and flexible indexing.
  • Smart Redaction: Automatically sanitizes API keys, secrets, and sensitive table references before they leave your machine.
  • Token Efficiency: Estimates token usage (via tiktoken) so you can check what fits in your context window.
  • Structured Outlines: Generates a high-level summary and table of contents to help models understand the "big picture" before reading code.
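
To make the "high-noise metadata" mentioned under Context Optimization concrete: an .ipynb file is JSON, and each cell carries outputs, execution counts, and per-cell metadata alongside its source. The sketch below keeps only the cell type and source. It uses the standard library only and is an illustration of the idea, not Neurobyte's actual implementation; the function name strip_notebook is hypothetical.

```python
import json


def strip_notebook(nb_json: str) -> list[dict]:
    """Keep only cell type and source from an nbformat v4 notebook (hypothetical helper)."""
    notebook = json.loads(nb_json)
    slim = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of line strings.
        if isinstance(source, list):
            source = "".join(source)
        slim.append({"type": cell.get("cell_type", "code"), "source": source})
    return slim


# A tiny in-memory notebook with the usual noise: outputs, counts, metadata.
raw = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Load data"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {"collapsed": False},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["ok\n"]}],
            "source": ["import pandas as pd\n", "df = pd.read_csv('data.csv')"],
        },
    ],
})

cells = strip_notebook(raw)
for cell in cells:
    print(cell["type"], "->", repr(cell["source"]))
```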

Installation

pip install neurobyte

Quick Start

# Optimize for Claude/Anthropic (XML)
neurobyte export notebook.ipynb --xml

# Optimize for GPT-4 (Text)
neurobyte export notebook.ipynb

Git Integration

Neurobyte provides a pre-commit hook to automatically export notebooks to text for better diffs.

Add this to your .pre-commit-config.yaml:

-   repo: https://github.com/EllordParis/neurobyte
    rev: main
    hooks:
    -   id: neurobyte-export

Development

This project uses make for common development tasks.

# Install dependencies
make install

# Run tests (Pytest)
make test

# Linting & Formatting (Ruff, Black, Mypy)
make lint
make format

Optimization Strategies

Choosing the Right Format

  • XML (--xml): Best for Claude 3 (Opus/Sonnet) and newer models. The structural tags help the model tell code, markdown, and metadata apart, which can reduce hallucinations.
  • JSON (--json): Ideal for RAG pipelines or agentic workflows. Lets you index specific cells or function definitions into a vector database.
  • Text (Default): Best for GPT-4 copy-paste or smaller context windows where token overhead must be minimal.
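
The trade-off between the three formats is easier to see side by side. The sketch below renders the same two cells as labeled text, JSON, and XML using only the standard library. The tag names, keys, and label layout are assumptions made for illustration; Neurobyte's real output schema may differ.

```python
import json
import xml.etree.ElementTree as ET

# Two already-extracted cells (hypothetical intermediate representation).
cells = [
    {"type": "markdown", "source": "# Feature engineering"},
    {"type": "code", "source": "df['ratio'] = df['a'] / df['b']"},
]


def to_text(cells):
    """Lowest overhead: one label line per cell, then the source."""
    return "\n\n".join(
        f"[Cell {i} | {c['type']}]\n{c['source']}"
        for i, c in enumerate(cells, start=1)
    )


def to_json(cells):
    """Machine-friendly: each cell becomes a record that can be indexed on its own."""
    return json.dumps([{"index": i, **c} for i, c in enumerate(cells, start=1)], indent=2)


def to_xml(cells):
    """Explicit structure: tags separate cell boundaries from cell content."""
    root = ET.Element("notebook")
    for i, c in enumerate(cells, start=1):
        element = ET.SubElement(root, "cell", index=str(i), type=c["type"])
        element.text = c["source"]
    return ET.tostring(root, encoding="unicode")


for render in (to_text, to_json, to_xml):
    output = render(cells)
    print(f"{render.__name__}: {len(output)} characters")
```

Comparing the character counts gives a rough feel for how much overhead each wrapper adds on top of the same source.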

Reducing Context Noise

  1. Redact Aggressively: Use --redact-pattern to strip noisy IDs or non-secret internal references that might confuse the model.
  2. Filter Custom Cells: If your notebook has 50 cells but you only need help with the feature engineering code, use --cells 10-25.

Reference & Advanced Capabilities

CLI Commands

# Basic export (Text format)
neurobyte export notebook.ipynb

# XML output (Best for Claude/Anthropic models)
neurobyte export notebook.ipynb --xml

# JSON output (Best for RAG/Indexing)
neurobyte export notebook.ipynb --json

# Check token usage before exporting
neurobyte export notebook.ipynb --estimate-tokens

# Custom output path
neurobyte export notebook.ipynb -o context.xml --xml

# Include markdown cells in output
neurobyte export notebook.ipynb --include-markdown

# Custom redaction pattern
neurobyte export notebook.ipynb --redact-pattern 'client_id=\d+'

# Export specific cell range
neurobyte export notebook.ipynb --cells 1-5

# Disable redaction
neurobyte export notebook.ipynb --no-redact
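
The idea behind --estimate-tokens can be approximated in a few lines: count tokens, then compare against a budget. The sketch below is not Neurobyte's implementation. It accepts any tokenizer's encode function and otherwise falls back to a rough four-characters-per-token heuristic, which is an assumption that varies by model and content. The function names are hypothetical.

```python
import math


def estimate_tokens(text, encode=None):
    """Count tokens with a supplied encoder, else use a ~4 chars/token heuristic."""
    if encode is not None:
        return len(encode(text))
    # Rough fallback only; real tokenizers differ by model and by content.
    return math.ceil(len(text) / 4)


def fits_in_context(text, context_window, reserved_for_reply=0, encode=None):
    """True if the text leaves at least reserved_for_reply tokens free."""
    return estimate_tokens(text, encode) <= context_window - reserved_for_reply


exported = "import pandas as pd\ndf = pd.read_csv('data.csv')\n" * 200
print(estimate_tokens(exported))
print(fits_in_context(exported, context_window=8000, reserved_for_reply=1000))
```

With tiktoken installed, passing tiktoken.get_encoding("cl100k_base").encode as encode replaces the heuristic with a real tokenizer count.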

Python API

import neurobyte as nb
from neurobyte.export import ExportOptions

# Basic export
nb.export_notebook("notebook.ipynb", "export.txt")

# With options
opts = ExportOptions(
    output_format="json",  # or "xml", "txt"
    include_markdown=True,
    redact_secrets=True,
    extra_redact_patterns=["client_id=\\d+"],
    cell_indices=[1, 2, 3],
)
nb.export_notebook("notebook.ipynb", "export.json", options=opts)

# Export from live session (requires IPython)
nb.export_here("session.txt")
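
For a repository with many notebooks, the two-argument call shown above can be wrapped in a small loop. The sketch below takes the export function as a parameter so it runs without Neurobyte installed; export_tree is a hypothetical helper, not part of the package, and writing each export next to its notebook is just one possible layout.

```python
from pathlib import Path


def export_tree(root, export_fn, suffix=".txt"):
    """Call export_fn(notebook_path, output_path) for every notebook under root."""
    exported = []
    for notebook_path in sorted(Path(root).rglob("*.ipynb")):
        # Skip Jupyter's autosave copies.
        if ".ipynb_checkpoints" in notebook_path.parts:
            continue
        output_path = notebook_path.with_suffix(suffix)
        export_fn(str(notebook_path), str(output_path))
        exported.append(output_path)
    return exported


def dry_run(notebook_path, output_path):
    """Stand-in export function that only reports what would be written."""
    print(f"{notebook_path} -> {output_path}")


export_tree(".", dry_run)
```

To perform real exports, pass nb.export_notebook in place of dry_run.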


Source Distribution

neurobyte-0.2.2.tar.gz (18.7 kB)

Uploaded Source

File details

Details for the file neurobyte-0.2.2.tar.gz.

File metadata

  • Download URL: neurobyte-0.2.2.tar.gz
  • Upload date:
  • Size: 18.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.7

File hashes

Hashes for neurobyte-0.2.2.tar.gz
Algorithm Hash digest
SHA256 0da9d7c514cea3a56bf833cfad48edf6a7a7c013a3107b76f0b54e7609b65a0f
MD5 de95008e211c77f15b8d55a04609eebc
BLAKE2b-256 f07fb843b989fec81d1fc82eaf57b85ea04d0fbd3bc0044f1029ebe02457a237
