AI Agent with dynamic planning and persistent Jupyter kernel execution for data analysis
DSAgent
An AI-powered autonomous agent for data science with persistent Jupyter kernel execution, session management, and a conversational interface.
Features
- Conversational Interface: Interactive chat with persistent context and sessions
- Dynamic Planning: Agent creates and follows plans with step tracking
- Persistent Execution: Code runs in a Jupyter kernel with variable persistence across messages
- Session Management: Save and resume conversations with full kernel state
- Multi-Provider LLM: Supports OpenAI, Anthropic, Google, Ollama via LiteLLM
- MCP Tools: Connect to external tools (web search, databases, etc.) via Model Context Protocol
- Human-in-the-Loop: Configurable checkpoints for plan and code approval
- Notebook Generation: Automatically generates clean, runnable Jupyter notebooks
- Agent Skills: Extensible skill system for specialized tasks (EDA, ML, etc.)
Installation
pip install datascience-agent
With optional features:
pip install "datascience-agent[api]" # FastAPI server support
pip install "datascience-agent[mcp]" # MCP tools support
For development:
git clone https://github.com/nmlemus/dsagent
cd dsagent
uv sync --all-extras
Docker
Configuration uses the same environment variables as the CLI and server (see Configuration). The container listens on PORT (default 8000).
# Run API server on port 8080 (the default is 8000)
docker run -d -p 8080:8080 \
-e PORT=8080 \
-e DSAGENT_DEFAULT_MODEL=gpt-4o \
-e OPENAI_API_KEY=sk-your-key \
nmlemus/dsagent:latest
# Run interactive CLI
docker run -it \
-e OPENAI_API_KEY=sk-your-key \
-v "$(pwd)/workspace:/workspace" \
nmlemus/dsagent:latest \
dsagent chat
# One-shot task
docker run --rm \
-e OPENAI_API_KEY=sk-your-key \
-v "$(pwd)/workspace:/workspace" \
nmlemus/dsagent:latest \
dsagent run "Analyze data/sales.csv" --data ./data/sales.csv
For Docker deployment details, see docs/DOCKER.md and docs/guide/docker.md.
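In the API server command above, the published port and the PORT variable have to agree, otherwise the mapped host port points at nothing. The sketch below builds that command from a single port value so the two cannot drift apart. `docker_serve_command` is a hypothetical helper written for this illustration, not something DSAgent ships; the image name and variable names are the ones used above.

```python
import shlex
from typing import Dict, List


def docker_serve_command(
    port: int,
    env: Dict[str, str],
    image: str = "nmlemus/dsagent:latest",
) -> List[str]:
    """Build a `docker run` argv that publishes the same port the container listens on."""
    args = ["docker", "run", "-d", "-p", f"{port}:{port}", "-e", f"PORT={port}"]
    for name, value in env.items():
        # Each extra setting becomes its own -e NAME=value pair.
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


cmd = docker_serve_command(8080, {"DSAGENT_DEFAULT_MODEL": "gpt-4o"})
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting problems with values such as API keys.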
Quick Start
1. Setup (First Time)
Run the setup wizard to configure your LLM provider:
dsagent init
This will:
- Ask for your LLM provider (OpenAI, Anthropic, Google, local, etc.)
- Store your API key securely in ~/.dsagent/.env
- Automatically select a default model based on provider:
  - OpenAI → gpt-4o
  - Anthropic → claude-sonnet-4-5
  - Google → gemini/gemini-2.5-flash
  - Local → ollama/llama3
- Optionally configure MCP tools (web search, etc.)
To use a different model, set DSAGENT_DEFAULT_MODEL or LLM_MODEL in ~/.dsagent/.env, or use the --model flag:
dsagent --model gpt-4o-mini
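The text above names three ways to choose a model but does not say which one wins when several are set. The sketch below shows one plausible resolution order: the --model flag first, then DSAGENT_DEFAULT_MODEL, then LLM_MODEL, then the provider default. That order is an assumption for illustration, and `resolve_model` and `PROVIDER_DEFAULTS` are hypothetical names, not part of DSAgent.

```python
from typing import Mapping, Optional

# Provider defaults as listed for `dsagent init` above.
PROVIDER_DEFAULTS = {
    "openai": "gpt-4o",
    "anthropic": "claude-sonnet-4-5",
    "google": "gemini/gemini-2.5-flash",
    "local": "ollama/llama3",
}


def resolve_model(
    cli_model: Optional[str],
    env: Mapping[str, str],
    provider: str = "openai",
) -> str:
    """Pick a model name. The precedence used here is assumed, not documented."""
    candidates = (
        cli_model,                         # value of --model, if given
        env.get("DSAGENT_DEFAULT_MODEL"),
        env.get("LLM_MODEL"),
    )
    for candidate in candidates:
        if candidate:
            return candidate
    # Nothing set explicitly: fall back to the provider default.
    return PROVIDER_DEFAULTS[provider]
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the logic easy to test.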
2. Start Chatting
dsagent
This starts an interactive session where you can:
- Chat naturally with the agent
- Execute Python code with persistent variables
- Analyze data files
- Generate visualizations
- Resume previous sessions
3. One-Shot Tasks
For batch processing or scripts:
dsagent run "Analyze sales trends" --data ./sales.csv
CLI Commands
| Command | Description |
|---|---|
| dsagent | Start interactive chat (default) |
| dsagent chat | Same as above, with explicit options |
| dsagent run "task" | Execute a one-shot task |
| dsagent serve | Run REST + WebSocket API server |
| dsagent init | Setup wizard for configuration |
| dsagent skills list | List installed skills |
| dsagent skills install <source> | Install a skill from GitHub or path |
| dsagent skills remove <name> | Remove a skill |
| dsagent skills info <name> | Show skill details |
| dsagent mcp list | List configured MCP servers |
| dsagent mcp add <template> | Add an MCP server from template |
| dsagent mcp remove <name> | Remove an MCP server |
Examples
# Interactive chat with specific model
dsagent --model claude-sonnet-4-5
# One-shot analysis
dsagent run "Find patterns in this data" --data ./dataset.csv
# Resume a previous session
dsagent --session abc123
# With MCP tools (web search)
dsagent --mcp-config ~/.dsagent/mcp.yaml
# Human-in-the-loop mode
dsagent --hitl plan
For complete CLI documentation, see docs/CLI.md.
Python API
DSAgent provides two agents for different use cases:
ConversationalAgent (Interactive)
For building chat interfaces and interactive applications:
from dsagent import ConversationalAgent, ConversationalAgentConfig
config = ConversationalAgentConfig(model="gpt-4o")
agent = ConversationalAgent(config)
agent.start()
# Chat with persistent context
response = agent.chat("Load the iris dataset")
print(response.content)
response = agent.chat("Train a classifier on it")
print(response.content) # Has access to previous variables
agent.shutdown()
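In the example above, shutdown() is skipped if a chat() call raises, which could leave the kernel running. One way to guard against that is a small context manager around the start/shutdown pair. This is a sketch only: `running` and `StubAgent` are hypothetical names, and the stub stands in for ConversationalAgent so the block runs without an API key. Whether ConversationalAgent supports `with` natively is not stated here.

```python
from contextlib import contextmanager


@contextmanager
def running(agent):
    """Start an agent and guarantee shutdown(), even if the body raises."""
    agent.start()
    try:
        yield agent
    finally:
        agent.shutdown()


class StubAgent:
    """Stand-in exposing the start/chat/shutdown method names used above."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def chat(self, message):
        self.events.append(f"chat:{message}")
        return message.upper()

    def shutdown(self):
        self.events.append("shutdown")


stub = StubAgent()
try:
    with running(stub) as agent:
        agent.chat("Load the iris dataset")
        raise RuntimeError("simulated failure mid-session")
except RuntimeError:
    pass  # the error still propagates; shutdown() has already run

print(stub.events)
```

With a real agent, the same wrapper would be used as `with running(ConversationalAgent(config)) as agent:`.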
PlannerAgent (Batch)
For one-shot tasks and automated pipelines:
from dsagent import PlannerAgent
with PlannerAgent(model="gpt-4o", data="./data.csv") as agent:
    result = agent.run("Analyze this dataset and create visualizations")
    print(result.answer)
    print(f"Notebook: {result.notebook_path}")
For complete API documentation, see docs/PYTHON_API.md.
Supported Models
DSAgent uses LiteLLM to support 100+ LLM providers:
| Provider | Models | API Key |
|---|---|---|
| OpenAI | gpt-4o, o1, o3-mini | OPENAI_API_KEY |
| Anthropic | claude-sonnet-4-5, claude-opus-4 | ANTHROPIC_API_KEY |
| Google | gemini-2.5-pro, gemini-2.5-flash | GOOGLE_API_KEY |
| DeepSeek | deepseek/deepseek-r1 | DEEPSEEK_API_KEY |
| Ollama | ollama/llama3.2 | None (local) |
For detailed model setup, see docs/MODELS.md.
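A missing API key usually only shows up as an error on the first request, so it can help to check the environment before starting a session. The sketch below encodes the provider table as prefix rules and reports which variable is unset. Matching models by name prefix is an assumption made for this illustration, and `required_key` and `missing_key` are hypothetical helpers, not DSAgent or LiteLLM functions.

```python
import os
from typing import Mapping, Optional

# (model-name prefix, required environment variable) pairs from the table above.
KEY_BY_PREFIX = (
    ("gpt-", "OPENAI_API_KEY"),
    ("o1", "OPENAI_API_KEY"),
    ("o3", "OPENAI_API_KEY"),
    ("claude-", "ANTHROPIC_API_KEY"),
    ("gemini", "GOOGLE_API_KEY"),
    ("deepseek/", "DEEPSEEK_API_KEY"),
    ("ollama/", None),  # local models need no key
)


def required_key(model: str) -> Optional[str]:
    """Return the variable a model needs, or None for local models."""
    for prefix, variable in KEY_BY_PREFIX:
        if model.startswith(prefix):
            return variable
    raise ValueError(f"no rule for model {model!r}")


def missing_key(model: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the unset variable, or None if nothing is missing."""
    variable = required_key(model)
    if variable is None or env.get(variable):
        return None
    return variable
```

For example, `missing_key("claude-sonnet-4-5")` returns "ANTHROPIC_API_KEY" when that variable is not set, and None once it is.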
MCP Tools
Connect to external tools via the Model Context Protocol:
# Add web search capability
dsagent mcp add brave-search
# Use it in chat
dsagent --mcp-config ~/.dsagent/mcp.yaml
Available templates: brave-search, filesystem, github, memory, fetch, bigquery
For MCP configuration details, see docs/MCP.md.
Session Management
Sessions persist your conversation history and kernel state:
# List sessions
dsagent chat
> /sessions
# Resume a session
dsagent --session <session-id>
# Export session to notebook
> /export myanalysis.ipynb
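An exported session is an ordinary .ipynb file, which is JSON on disk. In nbformat 4 the top-level "cells" list holds one object per cell, with a "cell_type" and a "source" that is either a string or a list of line strings. The sketch below pulls the code cells out using only the standard library. `code_cells` is a hypothetical helper, and the sample notebook is written by hand so the block runs without a real export.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def code_cells(notebook_path: Path) -> List[str]:
    """Return the source of every code cell in an nbformat-4 notebook."""
    notebook = json.loads(notebook_path.read_text(encoding="utf-8"))
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # nbformat allows either one string or a list of line strings.
            sources.append(source if isinstance(source, str) else "".join(source))
    return sources


# Tiny hand-written notebook standing in for an exported session.
sample = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Analysis"]},
        {
            "cell_type": "code",
            "metadata": {},
            "outputs": [],
            "execution_count": None,
            "source": ["import pandas as pd\n", "df = pd.DataFrame()"],
        },
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "myanalysis.ipynb"
    path.write_text(json.dumps(sample), encoding="utf-8")
    cells = code_cells(path)

print(cells)
```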
Output Structure
Each run creates organized output:
workspace/
└── runs/{run_id}/
    ├── data/        # Input data (copied)
    ├── notebooks/   # Generated Jupyter notebooks
    ├── artifacts/   # Charts, models, exports
    └── logs/        # Execution logs
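Scripts that post-process results need to locate these directories for a given run. The sketch below recreates the layout with pathlib and returns the four paths by name. `make_run_dirs` and the run id "example-run" are hypothetical, chosen for this illustration; DSAgent creates the real directories itself.

```python
import tempfile
from pathlib import Path
from typing import Dict

SUBDIRS = ("data", "notebooks", "artifacts", "logs")


def make_run_dirs(workspace: Path, run_id: str) -> Dict[str, Path]:
    """Create workspace/runs/<run_id>/ with the four subdirectories shown above."""
    run_root = workspace / "runs" / run_id
    paths = {name: run_root / name for name in SUBDIRS}
    for path in paths.values():
        path.mkdir(parents=True, exist_ok=True)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    paths = make_run_dirs(Path(tmp) / "workspace", "example-run")
    run_root = Path(tmp) / "workspace" / "runs" / "example-run"
    created = sorted(p.name for p in run_root.iterdir())

print(created)
```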
Included Libraries
DSAgent comes with essential data science libraries pre-installed:
| Category | Libraries |
|---|---|
| Core | numpy, pandas, scipy |
| DataFrames | polars, pyarrow |
| Visualization | matplotlib, seaborn, plotly |
| Machine Learning | scikit-learn, xgboost, lightgbm, pycaret |
| Feature Selection | boruta |
| Statistics | statsmodels |
Documentation
- CLI Reference - All commands: chat, run, serve, init, mcp, skills
- Configuration - Environment variables and .env
- HTTP API - REST and WebSocket API reference
- Python API - ConversationalAgent and PlannerAgent
- Model Configuration - LLM provider setup
- MCP Tools - External tools integration
- Agent Skills - Extensible skill system
- Docker Guide - Container deployment
License
MIT