
ReAct Data Science Agent - AI-powered data analysis assistant

Project description

Together Open Data Scientist

An AI-powered data analysis assistant that follows the ReAct (Reasoning + Acting) framework to perform comprehensive data science tasks. The agent can execute Python code either locally via Docker or in the cloud using Together Code Interpreter (TCI).
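The ReAct loop itself is easy to sketch. The following toy is illustrative only, not the package's internals: `model` and `execute` are stand-ins for the LLM call and the sandboxed interpreter (Docker or TCI).

```python
# Minimal illustration of a ReAct (Reasoning + Acting) loop.
# `model` decides the next step from the history; `execute` runs code
# and returns an observation that is fed back into the history.

def react_loop(task, model, execute, max_iterations=10):
    """Alternate between reasoning (model) and acting (code execution)."""
    history = [f"Task: {task}"]
    for _ in range(max_iterations):
        step = model(history)           # reason: decide what to do next
        if step["type"] == "answer":    # the model considers the task done
            return step["content"]
        observation = execute(step["content"])  # act: run the proposed code
        history.append(f"Action: {step['content']}")
        history.append(f"Observation: {observation}")
    return None  # iteration budget exhausted without a final answer
```

The real agent replaces the stand-ins with a Together-hosted model and one of the two executors described below.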

⚠️ Experimental Software Notice

This is an experimental tool powered by large language models. Please be aware of the following limitations:

  • AI-Generated Code: All analysis and code is generated by AI and may contain errors, bugs, or suboptimal approaches
  • No Guarantee of Accuracy: Results should be carefully reviewed and validated before making important decisions
  • Learning Tool: Best suited for exploration, learning, and initial analysis rather than production use
  • Human Oversight Required: Always verify outputs, especially for critical business or research applications
  • Evolving Technology: Capabilities and reliability may vary as the underlying models are updated

🚀 Quick Start

Install Together Open Data Scientist from PyPI:

pip install open-data-scientist

Run Together Open Data Scientist from the command line using TCI:

# export together api key
export TOGETHER_API_KEY="your-api-key-here"

# run the agent
open-data-scientist --executor tci --write-report

📖 Example Output

Our Open Data Scientist can perform comprehensive data analysis and generate detailed reports. Below is an example of a complete analysis report for molecular solubility prediction.

[Report example: "Solubility Prediction Report" with analysis results]

🤖 Install from Source

Prerequisites

  • Python 3.12 or higher
  • uv - Fast Python package manager
  • Together AI API key (get one at together.ai)
  • Docker and Docker Compose (for local execution mode)

Installation

Clone the repository and enter its directory:

git clone <repository-url>  # replace with the project's repository URL
cd open-data-scientist

Install the package:

# Install uv (faster alternative to pip)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create and activate virtual environment
uv venv --python=3.12
source .venv/bin/activate
uv pip install -e .

Set up your API key:

export TOGETHER_API_KEY="your-api-key-here"

Docker Mode Setup (optional; needed only when using Docker for code execution)

⚠️ Important: Docker mode has session-isolation limitations and security considerations for local development:

  1. Session isolation: While user variables are isolated between sessions, module modifications and global state changes affect all sessions.
  2. Host directory access: The container has read-write access to specific host directories.
  3. Best for: Single-user local development and data analysis workflows.

For detailed technical information, security warnings, and setup instructions, see the Interpreter README.
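The limited session isolation can be reproduced with a small self-contained sketch (this is an illustration of the failure mode, not the interpreter's actual code): two `exec` namespaces keep their own variables but share any imported module object.

```python
import types

# Two "sessions" as separate exec namespaces that share one module
# object -- mirroring a single interpreter process serving every session.
shared = types.ModuleType("shared")
shared.limit = 10

session_a = {"shared": shared}
session_b = {"shared": shared}

# User variables stay isolated: x exists only in session A.
exec("x = 1", session_a)
assert "x" not in session_b

# Module state does not: session A's change is visible to session B.
exec("shared.limit = 99", session_a)
assert session_b["shared"].limit == 99
```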

  1. Launch the Docker services:

    cd interpreter
    docker-compose up --build -d
    
  2. Stop services:

    docker-compose down
    

Usage

  1. Command Line Interface (CLI): The easiest way to get started is the command-line interface:
# Basic usage with local Docker execution
open-data-scientist

# Use cloud execution with TCI
open-data-scientist --executor tci

# Specify a custom model and more iterations
open-data-scientist --model "deepseek-ai/DeepSeek-V3" --iterations 15

# Use specific data directory
open-data-scientist --data-dir /path/to/your/data

# Combine options
open-data-scientist --executor tci --model "meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo" --iterations 20 --data-dir ./my_data

CLI Options

| Option | Short | Description | Default |
|---|---|---|---|
| --model | -m | Language model to use | deepseek-ai/DeepSeek-V3 |
| --iterations | -i | Maximum reasoning iterations | 20 |
| --executor | -e | Execution mode: tci or internal | internal |
| --data-dir | -d | Data directory to upload | Current directory (with confirmation) |
| --session-id | -s | Reuse existing session ID | Auto-generated |
| --help | -h | Show help message | - |
  2. Python API: For programmatic usage, you can also use the Python API directly:
from open_data_scientist.codeagent import ReActDataScienceAgent

# Cloud execution with TCI
agent = ReActDataScienceAgent(
    executor="tci",
    data_dir="path/to/your/data",  # Optional: auto-upload files
    max_iterations=10
)

# Local execution with Docker
agent = ReActDataScienceAgent(
    executor="internal", 
    data_dir="path/to/your/data",  # Optional: auto-upload files
    max_iterations=10
)

result = agent.run("Explore the uploaded CSV files and create summary statistics")

🎯 Execution Modes

The ReAct agent supports two execution modes for running Python code:

| Feature | TCI (Together Code Interpreter) | Docker/Internal |
|---|---|---|
| Execution location | ☁️ Cloud-based (Together AI) | 🏠 Local Docker container |
| Setup required | API key only | Docker + docker-compose |
| File handling | ☁️ Files uploaded to cloud | 🏠 Files stay local |
| Session persistence | ✅ Managed by Together | ✅ Local session management |
| Session isolation | ✅ Independent isolated sessions | ⚠️ Limited isolation (see below) |
| Concurrent usage | ✅ Multiple users/processes safely | ⚠️ File conflicts possible |
| Dependencies | Pre-installed environment | Custom Docker environment |
| Plot saving | ✅ Can save created plots to disk | ❌ Plots not saved to disk |
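One way to choose a mode programmatically is to check the environment; `pick_executor` below is an illustrative heuristic written for this comparison, not part of the CLI or the package:

```python
def pick_executor(env: dict, prefer_local: bool = False) -> str:
    """Heuristic sketch: use the cloud executor ("tci") when a Together
    API key is configured and cloud upload is acceptable; otherwise fall
    back to the local Docker executor ("internal")."""
    if prefer_local or "TOGETHER_API_KEY" not in env:
        return "internal"
    return "tci"

# Example: pick_executor(dict(os.environ)) -> "tci" or "internal"
```

Keep in mind the privacy warning below: even with a key configured, `"tci"` is only appropriate when uploading your data to the cloud is acceptable.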

⚠️ Important Privacy Warning

TCI Mode: Using TCI will upload your files to Together AI's cloud servers. Only use this mode if you're comfortable with your data being processed in the cloud.

Docker Mode: All code execution and file processing happens locally in your Docker container. For detailed technical information, security warnings, and setup instructions, see the Interpreter README.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

open_data_scientist-0.1.0a2.tar.gz (11.3 MB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

open_data_scientist-0.1.0a2-py3-none-any.whl (21.4 kB)

Uploaded Python 3

File details

Details for the file open_data_scientist-0.1.0a2.tar.gz.

File metadata

  • Download URL: open_data_scientist-0.1.0a2.tar.gz
  • Upload date:
  • Size: 11.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for open_data_scientist-0.1.0a2.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | cca4d763e09e5f0acf5d4f88905a956832a78bc1745fe047f4df05ee6e7f6b73 |
| MD5 | 71620571a968c4e3b5797502673dfba5 |
| BLAKE2b-256 | 26d3f3e4814d056e6b942e1a6243aa6573e59244677aa1771339023a759a3ebd |

See more details on using hashes here.

File details

Details for the file open_data_scientist-0.1.0a2-py3-none-any.whl.

File metadata

File hashes

Hashes for open_data_scientist-0.1.0a2-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | a08923ff3125591e215008e410772588d7de024b9e7c6c25a04a3318c72c3799 |
| MD5 | e87fce7967c449d7ea927c51351148f9 |
| BLAKE2b-256 | 8ef0d754c8ddc1f2c48f1c41c820327f227022382dc5803333b89d5efc88a560 |

See more details on using hashes here.
