
databao-agent: NL queries for data


Databao Agent

Talk to your data in plain English.
Ask questions → Get answers (Text, SQL, and interactive visual insights).



🏆 Ranked #1 in the DBT track of the Spider 2.0 Text2SQL benchmark


What is Databao Agent?

Databao Agent is an open-source AI agent that lets you query your data sources using natural language.

Simply ask:

  • "Show me all German shows"
  • "Plot revenue by month"
  • "Which customers churned last quarter?"

Get back tables, charts, and explanations — no SQL or code needed.


Why choose Databao Agent?

Feature                 What it means for you
Interactive outputs     Tables you can sort/filter and charts you can zoom/hover (Vega-Lite)
Simple, Pythonic API    thread.ask("question").df() just works
Python-native           Fits perfectly into existing data science and exploratory workflows
Natural language        Ask questions about your data just like asking a colleague
Broad DB support        PostgreSQL, MySQL, SQLite, DuckDB... anything SQLAlchemy supports
Auto-generated charts   Get Vega-Lite visualizations without writing plotting code
Local first             Use Ollama or LM Studio, so your data never leaves your machine
Cloud LLM ready         Built-in support for OpenAI, Anthropic, and OpenAI-compatible APIs
Conversational          Maintains context for follow-up questions and iterative analysis

Installation

pip install databao-agent

Supported data sources

  • Pandas DataFrame
  • PostgreSQL
  • MySQL
  • SQLite
  • DuckDB

For PostgreSQL, MySQL, and SQLite, pass a SQLAlchemy Engine to add_db(). For DuckDB, pass a DuckDBPyConnection.
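
If no database server is at hand, a throwaway SQLite file is one way to try the SQLAlchemy route. The sketch below uses only the standard library to create the file and build its SQLAlchemy URL. The shows table, its rows, and the helper names create_demo_db and sqlite_url are made up for illustration and are not part of Databao.

```python
import sqlite3
from pathlib import Path


def create_demo_db(path):
    """Create a small SQLite file with a hypothetical `shows` table."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS shows (title TEXT, country TEXT)"
            )
            conn.execute("DELETE FROM shows")  # keep reruns idempotent
            conn.executemany(
                "INSERT INTO shows VALUES (?, ?)",
                [
                    ("Dark", "Germany"),
                    ("Babylon Berlin", "Germany"),
                    ("Lupin", "France"),
                ],
            )
    finally:
        conn.close()
    return Path(path)


def sqlite_url(path):
    """Return a SQLAlchemy URL pointing at an absolute SQLite file path."""
    return f"sqlite:///{Path(path).resolve().as_posix()}"


# With SQLAlchemy installed, the URL can then be turned into an Engine:
#   from sqlalchemy import create_engine
#   engine = create_engine(sqlite_url("demo.db"))
```

The resulting Engine is what add_db() expects for SQLite. For DuckDB, duckdb.connect() returns the DuckDBPyConnection mentioned above.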

Quickstart

1. Create a database connection (SQLAlchemy)

import os
from sqlalchemy import create_engine

user = os.environ.get("DATABASE_USER")
password = os.environ.get("DATABASE_PASSWORD")
host = os.environ.get("DATABASE_HOST")
database = os.environ.get("DATABASE_NAME")

engine = create_engine(
   f"postgresql://{user}:{password}@{host}/{database}"
)
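
The f-string above works for simple credentials, but a password containing characters such as "@" or "/" would corrupt the URL, and an unset variable would silently become the string "None". A more defensive variant might look like the following sketch. The helper name postgres_url is hypothetical; quote_plus comes from the standard library.

```python
import os
from urllib.parse import quote_plus

REQUIRED = ("DATABASE_USER", "DATABASE_PASSWORD", "DATABASE_HOST", "DATABASE_NAME")


def postgres_url(env=os.environ):
    """Build a PostgreSQL URL from DATABASE_* variables, failing fast on gaps."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    user, password, host, database = (env[name] for name in REQUIRED)
    # Percent-encode the credentials so "@", "/" and ":" inside them are not
    # mistaken for URL delimiters.
    return f"postgresql://{quote_plus(user)}:{quote_plus(password)}@{host}/{database}"


# Usage, assuming SQLAlchemy is installed:
#   from sqlalchemy import create_engine
#   engine = create_engine(postgres_url())
```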

2. Create a Databao agent and register sources

import databao.agent as bao

# Option A - Local: install and run any compatible local LLM
# For a list of compatible models, see "Local Models" below
# llm_config = bao.LLMConfig(name="ollama:gpt-oss:20b", temperature=0)

# Option B - Cloud (requires an API key, e.g. OPENAI_API_KEY)
llm_config = bao.LLMConfig(name="gpt-4o-mini", temperature=0)

# Add your database to the agent
domain = bao.domain()
domain.add_db(engine)

agent = bao.agent(domain, name="demo", llm_config=llm_config)

3. Ask questions and materialize results

# Start a conversational thread
thread = agent.thread()

# Ask a question and get a DataFrame
df = thread.ask("list all german shows").df()
print(df.head())

# Get a textual answer
print(thread.text())

# Generate a visualization (Vega-Lite under the hood)
plot = thread.plot("bar chart of shows by country")
print(plot.code)  # access generated plot code if needed
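
For readers new to Vega-Lite: a chart is described by a JSON specification that names the data, the mark type, and the field encodings. The hand-written spec below shows roughly what a bar chart of shows by country looks like in that format. The data values are made up, and the spec Databao actually generates may differ.

```python
import json

# Made-up aggregate data, standing in for a query result.
rows = [
    {"country": "Germany", "shows": 2},
    {"country": "France", "shows": 1},
]

# A minimal Vega-Lite bar chart: one bar per country, bar height = show count.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": rows},
    "mark": "bar",
    "encoding": {
        "x": {"field": "country", "type": "nominal"},
        "y": {"field": "shows", "type": "quantitative"},
    },
}

print(json.dumps(spec, indent=2))
```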

Environment variables

Set your API keys as environment variables:

Variable            Description
OPENAI_API_KEY      Required for OpenAI models or OpenAI-compatible APIs
ANTHROPIC_API_KEY   Required for Anthropic models

Optional for local/OpenAI-compatible servers:

Variable          Description
OPENAI_BASE_URL   Custom endpoint (aka api_base_url in code)
OLLAMA_HOST       Ollama server address (e.g., 127.0.0.1:11434)

Optional for tracing:

Variable            Description
LANGSMITH_TRACING   Set to true to enable LangSmith tracing (default: false)
LANGCHAIN_PROJECT   LangSmith project name for organizing traces
LANGCHAIN_API_KEY   API key from smith.langchain.com
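
A missing key usually surfaces only on the first LLM call, so a small preflight check at the top of a script or notebook can save a round trip. The sketch below uses the variable names from the tables above. The grouping by provider and the helper names are illustrative assumptions, not something Databao exposes.

```python
import os

# Variable names are taken from the tables above; the provider labels are
# hypothetical and used only by this sketch.
REQUIRED_BY_PROVIDER = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
}
OPTIONAL_VARS = (
    "OPENAI_BASE_URL",
    "OLLAMA_HOST",
    "LANGSMITH_TRACING",
    "LANGCHAIN_PROJECT",
    "LANGCHAIN_API_KEY",
)


def missing_keys(provider, env=os.environ):
    """Return the required variables for `provider` that are unset or empty."""
    return [name for name in REQUIRED_BY_PROVIDER[provider] if not env.get(name)]


def configured_optional(env=os.environ):
    """Return the optional variables that are currently set."""
    return [name for name in OPTIONAL_VARS if env.get(name)]


if __name__ == "__main__":
    print("missing for openai:", missing_keys("openai"))
    print("optional settings in use:", configured_optional())
```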

Local Models

Databao Agent works with local LLMs, so your data never leaves your machine.

Ollama

  1. Install Ollama for your OS and make sure it’s running

  2. Use a bao.LLMConfig with a name of the form "ollama:<model_name>":

    llm_config = bao.LLMConfig(name="ollama:gpt-oss:20b", temperature=0)
    

    If the model is not available locally, it is downloaded automatically. You can also run ollama pull <model_name> to download it manually.
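
Note that Ollama model tags themselves contain a colon (gpt-oss:20b), so in "ollama:gpt-oss:20b" only the first colon separates the prefix from the model. The sketch below illustrates that reading and shows how an OLLAMA_HOST value maps to a base URL. Both helpers are hypothetical and are not Databao's own parsing code.

```python
import os


def split_model_name(name):
    """Split "prefix:model" at the first colon; names without a colon have no prefix."""
    prefix, sep, model = name.partition(":")
    if not sep:
        return None, name  # e.g. "gpt-4o-mini"
    return prefix, model


def ollama_base_url(env=os.environ):
    """Turn OLLAMA_HOST (host:port or a full URL) into a base URL."""
    host = env.get("OLLAMA_HOST", "127.0.0.1:11434")
    if host.startswith(("http://", "https://")):
        return host
    return f"http://{host}"


prefix, model = split_model_name("ollama:gpt-oss:20b")
print(f"ollama pull {model}")  # the tag to pass when downloading manually
```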

OpenAI-compatible servers

You can use any OpenAI-compatible server by setting api_base_url in the bao.LLMConfig.

For an example, see examples/configs/qwen3-8b-oai.yaml.

Compatible servers:

Alternatives

How does Databao agent compare to other agentic data tools?

Tool       Open source   Local LLMs         SQL + DataFrames   Multiple sources      Interactive output
Databao    ✅            Native Ollama      ✅ Both            ✅ Multiple sources   ✅ Tables + charts
PandasAI   ✅            Ollama/LM Studio   ✅ Both            ❌ One source         ❌ Static
Chat2DB    ✅            Custom LLM         SQL only           ❌ One DB             ✅ Dashboards
Vanna      ✅            Ollama             SQL only           ❌ One DB             ✅ Plotly

Development

Installation (using uv)

Clone this repo and run:

# Install dependencies
uv sync

# Optionally include example extras (notebooks, dotenv)
uv sync --extra examples

We recommend using the same version of uv as GitHub Actions:

uv self update 0.9.5

Makefile targets

# Lint and static checks (pre-commit on all files)
make check

# Run tests (loads .env if present)
make test

Direct commands

uv run pytest -v
uv run pre-commit run --all-files

Tests

The test suite uses pytest. Some tests require API keys and are marked with @pytest.mark.apikey.

# Run all tests
uv run pytest -v

# Run only tests that do NOT require API keys
uv run pytest -v -m "not apikey"
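
As an alternative to passing -m "not apikey" by hand, a conftest.py hook can skip those tests automatically when no key is configured. The following is a sketch of that idea, not the project's actual conftest.py. The helper has_any_api_key and the choice of variables are assumptions, while pytest_collection_modifyitems and pytest.mark.skip are standard pytest features.

```python
import os

API_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def has_any_api_key(env=os.environ):
    """Return True if at least one of the known API key variables is set."""
    return any(env.get(name) for name in API_KEY_VARS)


def pytest_collection_modifyitems(config, items):
    """pytest hook: skip tests marked `apikey` when no API key is available."""
    if has_any_api_key():
        return
    import pytest  # imported lazily so the helper above works without pytest

    skip = pytest.mark.skip(reason="no API key set")
    for item in items:
        if "apikey" in item.keywords:
            item.add_marker(skip)
```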

Contributing

We love contributions! Here’s how you can help:

  • Star this repo — it helps others find us!
  • 🐛 Found a bug? Open an issue
  • 💡 Have an idea? We’re all ears — create a feature request
  • 👍 Upvote issues you care about — helps us prioritize
  • 🔧 Submit a PR
  • 📝 Improve docs — typos, examples, tutorials — everything helps!

New to open source? No worries! We’re friendly and happy to help you get started.

License

Apache 2.0 — use it however you want. See the LICENSE file for details.


Like Databao? Give us a ⭐! It helps more people discover the project.


