AI-powered SQL Agent for data engineering (Compiled Version)

Project description

Apache 2.0 License | Website | Document | Quick Start | Release Note | Join our Slack

🎯 Overview

Datus is an open-source data engineering agent that builds evolvable context for your data system.

Data engineering needs a shift from "building tables and pipelines" to "delivering scoped, domain-aware agents for analysts and business users."

Datus Architecture

  • Datus-CLI: An AI-powered command-line interface for data engineers—think "Claude Code for data engineers." Write SQL, build subagents, and construct context interactively.
  • Datus-Chat: A web chatbot providing multi-turn conversations with built-in feedback mechanisms (upvotes, issue reports, success stories) for data analysts.
  • Datus-API: APIs for other agents or applications that need stable, accurate data services.

🚀 Key Features

🧩 Contextual Data Engineering

Automatically builds a living semantic map of your company’s data — combining metadata, metrics, SQL history, and external knowledge — so engineers and analysts collaborate through context instead of raw SQL.

💬 Agentic Chat

A Claude-Code-like CLI for data engineers.
Chat with your data, recall tables or metrics instantly, and run agentic actions — all in one terminal.

🧠 Subagents for Every Domain

Turn data domains into domain-aware chatbots.
Each subagent encapsulates the right context, tools, and rules — making data access accurate, reusable, and safe.

🔁 Continuous Learning Loop

Every query and every piece of feedback improves the model.
Datus learns from success stories and user corrections, improving its reasoning accuracy over time.


🧰 Installation

Requirements: Python >= 3.12

pip install datus-agent==0.2.1

datus-agent init

For detailed installation instructions, see the Quickstart Guide.
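
Because the package requires Python 3.12 or newer, checking the interpreter before running pip can save a failed install. The sketch below is not part of Datus; `MIN_PYTHON` and `meets_requirement` are illustrative names introduced here.

```python
import sys

# Minimum interpreter version stated in the requirements above.
MIN_PYTHON = (3, 12)


def meets_requirement(version=None):
    """Return True if a (major, minor, ...) version satisfies MIN_PYTHON.

    Defaults to the version of the running interpreter.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    if meets_requirement():
        print("Python version OK for datus-agent")
    else:
        found = ".".join(map(str, sys.version_info[:3]))
        print(f"Python >= 3.12 required, found {found}")
```

Run it with the same interpreter you plan to use for `pip install`, since a machine may have several Python versions installed.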

🧭 User Journey

1️⃣ Initial Exploration

A Data Engineer (DE) starts by chatting with the database using /chat. They ask simple questions, test joins, and refine prompts using @table or @file. Each round of feedback (e.g., "Join table1 and table2 by PK") helps the model improve accuracy.

datus-cli --namespace demo
/Check the top 10 bank by assets lost @Table duckdb-demo.main.bank_failures

Learn more: CLI Introduction

2️⃣ Building Context

The DE imports SQL history and generates summaries or semantic models:

/gen_semantic_model xxx @subject

They edit or refine models in @subject, combining AI-generated drafts with human corrections. Now, /chat can reason using both SQL history and semantic context.

Learn more: Knowledge Base Introduction

3️⃣ Creating a Subagent

When the context matures, the DE defines a domain-specific chatbot (Subagent):

.subagent add mychatbot

They describe its purpose, add rules, choose tools, and limit scope (e.g., 5 tables). Each subagent becomes a reusable, scoped assistant for a specific business area.
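
To make the idea of a scoped subagent concrete, here is a minimal sketch of what such a definition might hold: a purpose, rules, tools, and a bounded list of tables. This is a hypothetical data structure for illustration only. It is not the Datus configuration schema, and `SubagentSpec`, `max_tables`, `validate`, and `allows` are all assumed names.

```python
from dataclasses import dataclass, field


@dataclass
class SubagentSpec:
    """Hypothetical description of a scoped subagent (not the Datus schema)."""

    name: str
    purpose: str
    rules: list[str] = field(default_factory=list)
    tools: list[str] = field(default_factory=list)
    tables: list[str] = field(default_factory=list)
    max_tables: int = 5  # scope limit, mirroring the "5 tables" example above

    def validate(self) -> None:
        """Raise ValueError if the spec lists more tables than its limit."""
        if len(self.tables) > self.max_tables:
            raise ValueError(
                f"{self.name}: {len(self.tables)} tables exceeds "
                f"the limit of {self.max_tables}"
            )

    def allows(self, table: str) -> bool:
        """Return True only for tables inside the subagent's scope."""
        return table in self.tables


spec = SubagentSpec(
    name="mychatbot",
    purpose="Answer questions about bank failures",
    tables=["duckdb-demo.main.bank_failures"],
)
spec.validate()
```

Keeping the table list explicit is what makes the assistant "scoped": a request touching a table outside the list can be rejected before any SQL is generated.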

Learn more: Subagent Introduction

4️⃣ Delivering to Analysts

The Subagent is deployed to a web interface: http://localhost:8501/?subagent=mychatbot
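
The URL above selects the subagent through a `subagent` query parameter. A small helper, sketched below, builds such links for sharing with analysts. Only the URL pattern comes from the example above; the helper name `subagent_url` and the `BASE_URL` constant are assumptions for illustration.

```python
from urllib.parse import urlencode

# Local address taken from the example above; adjust for a real deployment.
BASE_URL = "http://localhost:8501/"


def subagent_url(name, base_url=BASE_URL):
    """Build a web chatbot link that selects the named subagent."""
    return f"{base_url}?{urlencode({'subagent': name})}"


print(subagent_url("mychatbot"))  # http://localhost:8501/?subagent=mychatbot
```

Using `urlencode` rather than plain string concatenation keeps the link valid if a subagent name contains spaces or other reserved characters.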

Analysts chat directly, upvote correct answers, or report issues for feedback. Results can be saved via !export.

Learn more: Web Chatbot Introduction

5️⃣ Refinement & Iteration

Feedback from analysts loops back to improve the subagent: engineers fix SQL, add rules, and update context. Over time, the chatbot becomes more accurate, self-evolving, and domain-aware.

For detailed guidance, please follow our tutorial.


Download files

Source Distribution

datus_agent-0.2.2rc4.tar.gz (3.4 MB)

Uploaded Source

Built Distribution

datus_agent-0.2.2rc4-py3-none-any.whl (3.5 MB)

Uploaded Python 3

File details

Details for the file datus_agent-0.2.2rc4.tar.gz.

File metadata

  • Download URL: datus_agent-0.2.2rc4.tar.gz
  • Size: 3.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for datus_agent-0.2.2rc4.tar.gz
Algorithm Hash digest
SHA256 66c5cd5fb27e039f087d4ec7194120268234dd3bc69b10c82466f31db8453b68
MD5 13f18391117345a20b2927ad45788a49
BLAKE2b-256 46524df8dde316451b1c02f0d79a7e522c16ce8539319dd68fecbb6387c21e58

File details

Details for the file datus_agent-0.2.2rc4-py3-none-any.whl.

File metadata

  • Download URL: datus_agent-0.2.2rc4-py3-none-any.whl
  • Size: 3.5 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for datus_agent-0.2.2rc4-py3-none-any.whl
Algorithm Hash digest
SHA256 680a11f164ffdf21699cb5a0e7b4b0e9783c55d8b1d6c403be10776fb9f63646
MD5 44f25d4d2d628f7bba698096b6c9a882
BLAKE2b-256 a4c9c98ef01c4612fbf5f58f3933c378ca37be60f1c7e2aecd88a05a791f66e9
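
The SHA256 digests listed above can be checked against a downloaded file before installing it. The following is a minimal sketch using only the Python standard library. The digests are copied from the listings above, while the helper names `sha256_of` and `verify` are illustrative and not part of any Datus or PyPI tooling.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file hash listings above.
EXPECTED_SHA256 = {
    "datus_agent-0.2.2rc4.tar.gz":
        "66c5cd5fb27e039f087d4ec7194120268234dd3bc69b10c82466f31db8453b68",
    "datus_agent-0.2.2rc4-py3-none-any.whl":
        "680a11f164ffdf21699cb5a0e7b4b0e9783c55d8b1d6c403be10776fb9f63646",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest matches the one recorded for its name."""
    path = Path(path)
    want = expected.get(path.name)
    if want is None:
        raise KeyError(f"no known digest for {path.name}")
    return sha256_of(path) == want
```

For example, `verify("datus_agent-0.2.2rc4.tar.gz")` returns True only if the file in the current directory matches the published digest. A mismatch suggests a corrupted or altered download.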
