A JupyterLab extension with an AI-powered data analysis agent sidebar Panel.

These details have not been verified by PyPI

Project links

Project description

jupyterlab-data-agent

一个 JupyterLab 4 扩展插件，提供侧边栏 AI 对话面板，通过大模型（OpenAI/DeepSeek 兼容接口）驱动，支持 数据分析和信贷风险建模 场景下的多步智能体操作。

核心特性

特性	说明
ReAct 智能体	多步推理循环（最多 25 轮），模型可反复调用工具、观察结果、修正错误
安全内核隔离	Python 代码在独立 Jupyter Kernel 中执行，与 JupyterLab 主进程完全隔离
风险评估工具集	内置 IV/KS/Gini/WoE/PSI/Vintage 等信贷风控分析函数
资源感知	加载 CSV 前自动评估内存占用，超大文件建议分块/采样策略
环境感知	自动扫描内核中已安装的数据科学包，优先使用已有工具
操作确认	`rm`/`pip install`/`git push` 等危险操作弹出确认对话框
一键中断	运行中随时点击 Stop 按钮中止智能体思考
Markdown 渲染	支持表格、标题、列表、代码块、分割线、粗斜体等完整格式

架构概览

┌──────────────────────────────────────────────────┐
│               JupyterLab Frontend                 │
│  ┌───────────────┐  ┌─────────────────────────┐  │
│  │  Data Agent   │  │  Settings Dialog        │  │
│  │  Side Panel   │  │  (API Key / Model)      │  │
│  │  (React/TSX)  │  └─────────────────────────┘  │
│  └───────┬───────┘                                │
│          │ SSE (Server-Sent Events)               │
├──────────┼────────────────────────────────────────┤
│          ▼              Jupyter Server            │
│  ┌───────────────────────────────────────────┐   │
│  │  StreamingChatHandler (/chat-stream)      │   │
│  │  ConfirmHandler      (/confirm)           │   │
│  └───────────────┬───────────────────────────┘   │
│                  ▼                                │
│  ┌───────────────────────────────────────────┐   │
│  │         DataAnalysisAgent                  │   │
│  │  ReAct loop:                               │   │
│  │    Model ⇄ execute_code / execute_shell    │   │
│  │           assess_dataset                   │   │
│  │           inspect_environment              │   │
│  └──────┬────────────────────┬───────────────┘   │
│         ▼                    ▼                    │
│  ┌──────────────┐   ┌───────────────────┐       │
│  │ Isolated     │   │ ShellExecutor     │       │
│  │ Kernel       │   │ (subprocess)      │       │
│  │ (jupyter_cli │   └───────────────────┘       │
│  │  ent)        │                                │
│  └──────────────┘                                │
│         │                                         │
│         ▼                                         │
│  ┌──────────────────────────────────────┐        │
│  │   jupyterlab_data_agent.skills       │        │
│  │   ├── risk.py     (IV/KS/WoE/...)    │        │
│  │   └── resource.py (内存评估/环境检查) │        │
│  └──────────────────────────────────────┘        │
└──────────────────────────────────────────────────┘

工程结构

jupyterlab-data-agent/
│
├── src/                              # TypeScript/React 前端
│   ├── index.ts                      # JupyterLab 插件入口，注册侧边栏
│   └── chatPanel.tsx                 # 聊天面板主体：消息渲染/SSE/确认/中断
│                                     #   含 Markdown 解析器、ToolCallDisplay、
│                                     #   ToolResult、SettingsDialogBody 等组件
│
├── style/                            # CSS 样式入口（webpack 打包入口）
│   ├── index.css                     # 样式入口
│   ├── index.js                      # JS 入口 (import base.css)
│   └── base.css                      # 全部 UI 样式（聊天气泡、工具卡片、
│                                     #   表格、代码块、输入区、设置表单等）
│
├── schema/
│   └── plugin.json                   # 用户设置 Schema（baseURL/apiKey/model/temperature）
│
├── scripts/
│   └── build-labextension.cjs        # 本地构建脚本，自动查找 JupyterLab staging 路径
│
├── jupyterlab_data_agent/            # Python 后端包
│   ├── __init__.py                   # Jupyter Server 扩展注册
│   ├── handlers.py                   # 请求处理：StreamingChatHandler / ConfirmHandler
│   │                                 #   含 JSONC 设置解析、Agent 构建
│   ├── agent.py                      # ReAct 智能体核心：工具定义、危险检测、
│   │                                 #   Streaming 事件生成、确认协议
│   └── skills/                       # 预置分析工具集（在隔离内核中导入使用）
│       ├── __init__.py
│       ├── risk.py                   # 信贷风控：IV, KS, Gini, WoE, PSI, Vintage, Scorecard
│       └── resource.py               # 资源评估：内存检测、环境检查、安全加载
│
├── jupyter-config/
│   └── server-config/
│       └── jupyterlab_data_agent.json # Server extension 自动启用配置
│
├── package.json                      # 前端依赖（@jupyterlab/*, react 等）
├── pyproject.toml                    # Python 打包 & 依赖（openai, jupyter_client）
├── tsconfig.json                     # TypeScript 编译配置
└── install.json                      # JupyterLab 扩展发现配置

关键文件职能

文件	职能
`src/index.ts`	注册插件，创建侧边栏 Widget，注册命令/快捷键
`src/chatPanel.tsx`	聊天 UI 全逻辑：SSE 流读取、消息渲染、Markdown 解析、ToolCall/ToolResult 组件、确认对话框、Stop 按钮
`style/base.css`	全局 UI 样式：聊天气泡、代码块、表格、工具卡片、输入区、Grip 拖拽手柄
`agent.py`	智能体核心：ReAct 循环、工具路由、危险操作检测、确认协议（asyncio.Event）
`handlers.py`	HTTP 端点：`/chat-stream`（SSE）、`/confirm`（安全确认）、`/refresh-settings`
`skills/risk.py`	10 个信贷风控函数（IV/KS/Gini/WoE/PSI/Vintage/Scorecard）
`skills/resource.py`	内存评估、环境检查、CSV 安全加载

安装

前提条件： Python ≥ 3.10，JupyterLab ≥ 4.0，Node.js ≥ 16

# 1. 激活目标 conda/venv 环境（可选）
conda activate data-agent

# 2. 安装依赖 + 构建 + 注册扩展
pip install -e .

# 3. （可选）禁用旧的残留扩展以避免启动警告
jupyter server extension disable jupyter_server_fileid
jupyter server extension disable jupyter_server_ydoc
jupyter server extension disable nbclassic

# 4. 启动
jupyter lab

pip install -e . 会自动完成：

jlpm install → 安装前端依赖
jlpm build:lib → TypeScript 编译
jlpm build:labextension:dev → Webpack 打包
注册前端 LabExtension + 后端 Server Extension

使用说明

1. 打开面板

启动 JupyterLab 后，左侧侧边栏自动出现 Data Agent 面板。如果没有，点击菜单 View → Toggle Data Agent Panel。

2. 配置 API

点击面板顶部 Settings 按钮，填写：

API Base URL：接口地址（默认 https://api.openai.com/v1，DeepSeek 用户填写 https://api.deepseek.com/v1）
API Key：你的 API 密钥
Model：模型名称（如 gpt-4o-mini、deepseek-chat）
Temperature：0~2，越高越随机

点击 Save，按钮变为 "Saved ✓" 表示保存成功。

3. 发送问题

在底部输入框输入问题，按 Enter 发送（Shift+Enter 换行）。右下角拖拽手柄可调整输入框高度。

示例问题

探索 data.csv 数据集，显示行列数、数据类型、缺失值和基本统计量

对 application.csv 构建信用评分卡模型，目标变量是 default_flag，
先计算所有特征的 IV 值，再用 WOE 做分箱

检查当前环境有哪些 Python 包可用，然后用现有工具对 loan.csv 做 KS 和 Gini 分析

4. 智能体工作流程

智能体会自动执行以下步骤：

环境检查 → 扫描已安装的数据科学包
数据集评估 → 检查 CSV 文件大小、估算内存占用
逐步执行 → 加载数据 → 探索分析 → 建模 → 可视化 → 生成报告
错误恢复 → 命令失败时自动分析原因并重试

5. 操作确认

当智能体尝试执行危险操作（pip install、rm -rf、git push 等）时，会弹出确认对话框，显示具体操作内容。点击 Confirm 继续或 Deny 跳过。

6. 中断执行

运行过程中，输入框右侧出现红色 Stop 按钮，点击可立即中断当前智能体的思考和执行。

配置

用户设置通过 JupyterLab Settings 系统持久化，存储在：

~/.jupyter/lab/user-settings/jupyterlab-data-agent/plugin.jupyterlab-settings

设置项

键	类型	默认值	说明
`baseURL`	string	`https://api.openai.com/v1`	OpenAI 兼容 API 地址
`apiKey`	string	`""`	API 密钥
`model`	string	`gpt-4o-mini`	模型名称
`temperature`	number	`0.7`	采样温度 (0~2)

兼容的模型提供商

OpenAI：gpt-4o, gpt-4o-mini, gpt-4-turbo 等
DeepSeek：deepseek-chat, deepseek-reasoner
其他：任何 OpenAI 兼容 API（如 Azure OpenAI、本地 vLLM 等）

内置工具与 Skills

智能体工具

工具	参数	说明
`inspect_environment`	无	扫描内核中已安装的数据科学包（25+ 常见包）
`assess_dataset`	`filepath`	评估 CSV 大小、估算内存、给出加载建议
`execute_code`	`code`	在隔离内核中执行 Python 代码
`execute_shell`	`command`, `working_dir?`, `timeout?`	执行 Shell 命令

信贷风控 Skills

在隔离内核中通过 from jupyterlab_data_agent.skills import ... 直接调用：

函数	说明	典型输出
`calculate_iv(df, target, feature)`	单特征 Information Value	分箱表含 WoE + IV
`calculate_iv_all(df, target, features)`	全特征 IV 排序	DataFrame 按 IV 降序
`calculate_ks(df, target, score_col)`	KS 统计量	0~1 标量
`calculate_gini(df, target, score_col)`	Gini 系数 (2*AUC-1)	-1~1 标量
`woe_binning(df, feature, target)`	Weight of Evidence 分箱	分箱表含 WoE
`calculate_psi(expected, actual)`	群体稳定性指数	<0.1 无变化, >0.25 显著偏移
`vintage_analysis(df, origin, perf)`	Vintage 账龄分析	账龄 × 进件队列矩阵
`scorecard_points(iv_table)`	WoE 转评分卡分数	因子+偏移+每箱分数
`default_rate_summary(df, target, groups)`	分组违约率	分组统计表

资源 Skills

函数	说明
`assess_dataset(filepath)`	完整评估：文件大小 → 内存估算 → 推荐策略
`load_csv_safely(filepath)`	智能加载：小文件直读，大文件自动分块
`estimate_csv_memory(filepath)`	估算 CSV 加载后内存占用 (GB)
`get_system_memory_gb()`	获取系统可用内存
`check_file_size(filepath)`	获取文件大小信息
`inspect_environment()`	扫描已安装数据科学包

安全机制

危险操作检测

内置 20+ 危险模式匹配规则，包括：

rm -rf, pip install/uninstall, conda install/uninstall,
apt-get install, npm install -g, git push --force,
DROP TABLE/DATABASE, DELETE FROM, TRUNCATE,
chmod 777, shutdown, reboot, dd if=, mkfs.,
fork bomb 模式

确认流程

智能体规划执行 → 检测危险操作 → 暂停执行
  → 发送 confirm_required SSE 事件
  → 前端弹出确认对话框（显示具体操作内容）
  → 用户点击 Confirm / Deny
  → 结果回传后端 → 继续或跳过

超时 120 秒自动拒绝。

内核隔离

代码执行在独立的 Jupyter Kernel 中，与 JupyterLab Server 主进程完全隔离
内核通过 jupyter_client.KernelManager 创建，运行在独立进程
异步执行通过线程池，不阻塞 Tornado 事件循环

Markdown 渲染

聊天面板支持以下 Markdown 格式的自动渲染：

元素	语法	效果
标题	`# H1` `## H2` `### H3`	多级标题
表格	`	A
粗体	`text`	粗体
斜体	`text`	斜体
代码块	```python ... ```	带语言标签的代码块
行内代码	`code`	灰底等宽字体
无序列表	`- item` `* item`	带缩进列表
有序列表	`1. item`	数字列表
分割线	`---`	水平分割线

开发调试

# 监听模式：自动重编译
jlpm watch

# 手动构建
jlpm build

# 仅构建 TypeScript
node node_modules/typescript/bin/tsc --sourceMap

# 仅构建 LabExtension webpack
node scripts/build-labextension.cjs --development

# 重新安装到 JupyterLab
pip install -e .

# 启动（终端可见 [data-agent] 调试日志）
jupyter lab

# 查看设置文件
cat ~/.jupyter/lab/user-settings/jupyterlab-data-agent/plugin.jupyterlab-settings

调试日志

后端在 stderr 输出 [data-agent] 前缀的调试信息：

Starting isolated kernel... — 内核启动
Loaded settings: [...] — 设置加载结果
Building agent: ... has_api_key=True/False — Agent 初始化
Client disconnected — 用户点击 Stop
异常完整 traceback

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.4.0

Jun 9, 2026

This version

0.3.0

Jun 7, 2026

0.2.0

Jun 6, 2026

0.1.0

Jun 5, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

jupyterlab_data_agent-0.3.0.tar.gz (151.0 kB view details)

Uploaded Jun 7, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

jupyterlab_data_agent-0.3.0-py3-none-any.whl (87.6 kB view details)

Uploaded Jun 7, 2026 Python 3

File details

Details for the file jupyterlab_data_agent-0.3.0.tar.gz.

File metadata

Download URL: jupyterlab_data_agent-0.3.0.tar.gz
Upload date: Jun 7, 2026
Size: 151.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for jupyterlab_data_agent-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`1cfe85291a728ac8dd1be3c2ef4c63fc08ff0464cadd0820e73a272f84da271f`
MD5	`acc016ecf53634026df2f0ef2cd777d2`
BLAKE2b-256	`1b5434ad974f1857b2c14718923e458e9d7d551055a8dfc520d2cad6812c2d68`

See more details on using hashes here.

File details

Details for the file jupyterlab_data_agent-0.3.0-py3-none-any.whl.

File metadata

Download URL: jupyterlab_data_agent-0.3.0-py3-none-any.whl
Upload date: Jun 7, 2026
Size: 87.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for jupyterlab_data_agent-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b2d9f864a3aaf93e88c1eafce40dbd6cc59458b46086a64113ce46d973cd149d`
MD5	`16473d6787b164592e6a763da3920e4e`
BLAKE2b-256	`4b57c3fafc4579a3c887a0eb5ca917736787134ea9b5520b7b41c79ccac12866`

See more details on using hashes here.

jupyterlab-data-agent 0.3.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

jupyterlab-data-agent

目录

核心特性

架构概览

工程结构

关键文件职能

安装

使用说明

1. 打开面板

2. 配置 API

3. 发送问题

示例问题

4. 智能体工作流程

5. 操作确认

6. 中断执行

配置

设置项

兼容的模型提供商

内置工具与 Skills

智能体工具

信贷风控 Skills

资源 Skills

安全机制

危险操作检测

确认流程

内核隔离

Markdown 渲染

开发调试

调试日志

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes