
Build and run an AI-powered data science team.


AI Data Science Team + AI Pipeline Studio

AI Data Science Team

AI Data Science Team is a Python library of specialized agents for common data science workflows, plus a flagship app: AI Pipeline Studio. The Studio turns your work into a visual, reproducible pipeline, while the AI team handles data loading, cleaning, visualization, and modeling.

Status: Beta. Breaking changes may occur until 0.1.0.

Please ⭐ us on GitHub (it takes 2 seconds and means a lot).

AI Pipeline Studio (Flagship App)

AI Pipeline Studio is the main example of the AI Data Science Team in action.


Highlights:

  • Pipeline-first workspace: Visual Editor, Table, Chart, EDA, Code, Model, Predictions, MLflow
  • Manual + AI steps with lineage and reproducible scripts
  • Multi-dataset handling and merge workflows
  • Project saves: metadata-only or full-data
  • Storage footprint controls and rehydrate workflows
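The idea of lineage with reproducible scripts can be illustrated in a few lines of plain Python. This is a conceptual sketch only, not the Studio's implementation: the `Step` class, its fields, and the `lineage` and `to_script` helpers are all hypothetical names chosen for illustration. Each step records the code it ran and the steps it depends on, so a script for any result can be rebuilt by walking its parents first.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """Hypothetical pipeline step: its origin, its code, and its parent steps."""
    step_id: str
    kind: str  # "manual" or "ai"
    code: str  # Python source that produces this step's output
    parents: list = field(default_factory=list)


def lineage(steps, target_id):
    """Return the step ids needed to rebuild target_id, parents before children."""
    by_id = {s.step_id: s for s in steps}
    ordered, seen = [], set()

    def visit(step_id):
        if step_id in seen:
            return
        seen.add(step_id)
        for parent in by_id[step_id].parents:
            visit(parent)
        ordered.append(step_id)

    visit(target_id)
    return ordered


def to_script(steps, target_id):
    """Concatenate the code of every step in lineage order."""
    by_id = {s.step_id: s for s in steps}
    return "\n".join(by_id[i].code for i in lineage(steps, target_id))


# Illustrative steps; the code strings are stored, not executed.
steps = [
    Step("load", "manual", "import pandas as pd\ndf = pd.read_csv('data.csv')"),
    Step("clean", "ai", "df = df.dropna()", parents=["load"]),
    Step("chart", "ai", "df.plot()", parents=["clean"]),
    Step("other", "manual", "x = 1"),
]
```

Calling `to_script(steps, "chart")` returns only the code for `load`, `clean`, and `chart`, leaving out the unrelated `other` step. A step with several entries in `parents` would model a merge of multiple datasets.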

Run it:

streamlit run apps/ai-pipeline-studio-app/app.py

Full app docs: apps/ai-pipeline-studio-app/README.md

Quickstart

Requirements

  • Python 3.10+
  • OpenAI API key (or Ollama for local models)
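Both requirements can be checked before launching the app. The following is a minimal sketch using only the standard library. The helper name `check_requirements` is hypothetical and not part of the library; `OPENAI_API_KEY` is the environment variable the OpenAI client reads by default.

```python
import os
import sys


def check_requirements(version_info=None, env=None):
    """Return a list of problems; an empty list means the basics are in place."""
    version_info = sys.version_info if version_info is None else version_info
    env = os.environ if env is None else env
    problems = []
    # Compare only (major, minor) against the documented minimum.
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10+ is required")
    # A missing key is only a problem when using OpenAI rather than Ollama.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set (fine if you use Ollama)")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```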

Install the app and library

Clone the repo, then install in editable mode from the repository root:

pip install -e .

Run the AI Pipeline Studio app

streamlit run apps/ai-pipeline-studio-app/app.py

Library Overview

The repository includes both the AI Pipeline Studio app and the underlying AI Data Science Team library. The library provides agent building blocks and multi-agent workflows for:

  • Data loading and inspection
  • Cleaning, wrangling, and feature engineering
  • Visualization and EDA
  • Modeling and evaluation (H2O + MLflow tools)
  • SQL database interaction

Agents (Snapshot)

Agent examples live in examples/. Notable agents:

  • Data Loader Tools Agent
  • Data Wrangling Agent
  • Data Cleaning Agent
  • Data Visualization Agent
  • EDA Tools Agent
  • Feature Engineering Agent
  • SQL Database Agent
  • H2O ML Agent
  • MLflow Tools Agent
  • Multi-agent workflows (e.g., Pandas Data Analyst, SQL Data Analyst)
  • Supervisor Agent (oversees other agents)
  • Custom tools for data science tasks

Apps

See all apps in apps/. Notable apps:

  • AI Pipeline Studio: apps/ai-pipeline-studio-app/
  • EDA Explorer App: apps/exploratory-copilot-app/
  • Pandas Data Analyst App: apps/pandas-data-analyst-app/

Use OpenAI

from langchain_openai import ChatOpenAI

# The API key is read from the OPENAI_API_KEY environment variable.
llm = ChatOpenAI(
    model_name="gpt-4.1-mini",
)

Use Ollama (Local LLM)

Start the Ollama server and pull a model (shell):

ollama serve
ollama pull llama3.1:8b

Then create the chat model in Python:

from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="llama3.1:8b",
)
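The two setups above can be combined so a script falls back to the local model when no OpenAI key is configured. This is a sketch under stated assumptions: `pick_backend` and `make_llm` are hypothetical helper names, not part of the library, and the fallback rule (key present means OpenAI) is one possible policy. The model names are the ones used in the examples above.

```python
import os


def pick_backend(env=None):
    """Return "openai" when an API key is present, otherwise "ollama"."""
    env = os.environ if env is None else env
    return "openai" if env.get("OPENAI_API_KEY") else "ollama"


def make_llm(backend):
    """Build the chat model for the chosen backend.

    Imports are deferred so only the package for the selected
    backend needs to be installed.
    """
    if backend == "openai":
        from langchain_openai import ChatOpenAI
        return ChatOpenAI(model="gpt-4.1-mini")
    if backend == "ollama":
        from langchain_ollama import ChatOllama
        return ChatOllama(model="llama3.1:8b")
    raise ValueError(f"unknown backend: {backend!r}")
```

A typical call would be `llm = make_llm(pick_backend())`, with the Ollama server already running if the local backend is selected.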

Next-Gen AI Agentic Workshop

Want to learn how to build AI agents and AI apps for real data science workflows? Join my next‑gen AI workshop: https://learn.business-science.io/ai-register


