Skip to main content

🦕 The agent-first database

Project description

Dinobase

🦕 Dinobase

The agent-first database.

Connect your business data. Let AI agents query across all of it.

PyPI - Version License

Docs · Getting Started · Sources


Ask an AI agent: "Which customers that churned last quarter had declining usage AND open support tickets?"

It can't answer — or it gets it wrong. Agents calling per-source tools have no way to JOIN across APIs, no semantic context to interpret field values, and return paginated JSON that fills context windows before producing an answer.

Dinobase gives agents a unified SQL interface with semantic context across all your sources. In benchmarks across 11 LLMs: 91% accuracy vs 35%, 3x faster, 16x cheaper per correct answer.

Quick start

pip install dinobase

1. Connect your data

dinobase add stripe --api-key sk_test_...
dinobase add hubspot --api-key pat-...
dinobase add linear --api-key lin_api_...
dinobase sync

# Or parquet files (no sync needed)
dinobase add parquet --path ./data/events/ --name analytics

# Or databases
dinobase add postgres --connection-string postgresql://...

2. Pick your agent interface

MCP server — for Claude Desktop, Cursor, any MCP client

dinobase install claude-desktop   # Claude Desktop
dinobase install cursor           # Cursor

Writes the correct config file for your platform automatically.

CLI — for Claude Code, Aider, any agent that runs shell

dinobase info
dinobase describe stripe.customers --pretty
dinobase query "SELECT * FROM ..." --pretty

All commands output JSON by default.

3. (Optional) Enable the semantic layer

export ANTHROPIC_API_KEY=sk-ant-...

After every sync, Dinobase automatically runs a Claude agent in the background to annotate your data — table descriptions, column docs, PII flags, and relationship graphs. Agents can then describe any table and get full semantic context: what each column means, which fields are PII, and how to join across tables.

dinobase describe stripe.subscriptions --pretty
# stripe.subscriptions (1,420 rows)
# Description: Active and historical customer subscriptions
#
#   customer_id  VARCHAR  -- References customers.id
#   status       VARCHAR  -- Values: active, past_due, canceled, trialing
#   ...
# Related tables:
#   stripe.customers  (customer_id → id, many_to_one)

Set DINOBASE_AUTO_ANNOTATE=false to disable. See Semantic Layer docs.

4. Ask your agent a cross-source question

"Which companies have closed-won deals over $100K but their subscription is past due?"

The agent writes the SQL, Dinobase executes it across your sources, and the answer comes back in seconds.

Connectors

101 sources across every category. Run dinobase sources --pretty to list all.

Category Sources
CRM & Sales Salesforce, HubSpot, Pipedrive, Attio, Close, Copper
Billing & Payments Stripe, Paddle, Chargebee, Recurly, Lemon Squeezy
Support & Success Zendesk, Intercom, Freshdesk, HelpScout, Customer.io, Vitally, Gainsight
Developer Tools GitHub, GitLab, Jira, Bitbucket, Sentry, Linear
Communication Slack, Discord, Twilio, SendGrid, Mailchimp, Front
E-commerce Shopify, WooCommerce, BigCommerce, Square
Marketing & Analytics Google Analytics, Google Ads, Facebook Ads, HubSpot Marketing, Mixpanel, PostHog, Segment, Plausible, Matomo, Bing Webmaster
HR & Recruiting Personio, BambooHR, Greenhouse, Lever, Workable, Gusto, Deel
Project Management Asana, ClickUp, Monday, Trello, Todoist
Databases Postgres, MySQL, MariaDB, SQL Server, Oracle, SQLite, Snowflake, BigQuery, Redshift, ClickHouse, CockroachDB, Databricks, Trino, Presto, DuckDB, MongoDB
Streaming Kafka, Kinesis
Cloud Storage S3, GCS, Azure Blob, SFTP
Finance QuickBooks, Xero, Brex, Mercury
Productivity Notion, Airtable, Google Sheets
Infrastructure Datadog, New Relic, PagerDuty, OpsGenie, Statuspage, Cloudflare, Vercel, Netlify
Content & CMS Strapi, Contentful, Sanity, WordPress
Design Figma
Video Mux
Files Parquet, CSV (local or S3 — read at query time, no sync needed)

Benchmark

We tested Dinobase SQL against per-source MCP tools across 11 LLMs on 15 RevOps questions (same models, same data, same questions):

Metric Dinobase (SQL) Per-Source MCP
Accuracy 91% 35%
Avg latency 34s 106s
Cost per correct answer $0.027 $0.445

56pp more accurate, 3x faster, 16x cheaper per correct answer — across every model tested.

See benchmarks/ for full results, per-model breakdown, and methodology.

How it works

                    Agent (Claude, GPT, etc.)
                              |
                    +---------+---------+
                    |                   |
               MCP Server             CLI
               (tool calls)       (bash commands)
                    |                   |
                    +---------+---------+
                              |
                        Query Engine
                        (DuckDB SQL)
                              |
                 +------------+------------+
                 |            |            |
            crm.*      billing.*    analytics.*
           (synced)     (synced)    (parquet views)

Each source becomes a schema. Cross-source joins work via shared columns like email. Data stays in parquet — DuckDB is the query engine and metadata store.

Source type How it works Data location
API sources dlt syncs to parquet ~/.dinobase/data/ or cloud storage
File sources DuckDB reads directly via views Your storage — nothing copied

Cloud storage

Store data in S3, GCS, or Azure instead of local disk:

dinobase init --storage s3://my-bucket/dinobase/

Or via environment variable (ideal for containers):

export DINOBASE_STORAGE_URL=s3://my-bucket/dinobase/

Supports Amazon S3, Google Cloud Storage, Azure Blob Storage, and S3-compatible services (MinIO, R2). See Cloud Storage Backend for setup.

Integrations

OpenClaw

openclaw skills install dinobase

Auto-installs Dinobase and teaches your agent to query data via SQL.

Vercel AI SDK

const dinobase = await createMCPClient({
  transport: new Experimental_StdioMCPTransport({
    command: 'dinobase', args: ['serve'],
  }),
});

Native MCP integration. Zero adapter code.

CrewAI

from integrations.crewai.tools import all_tools

agent = Agent(role="Analyst", tools=all_tools)

Python tools wrapping Dinobase's query engine.

LangChain / LangGraph

from integrations.langchain.toolkit import DinobaseToolkit

agent = create_react_agent(model, tools=DinobaseToolkit().get_tools())

LangChain toolkit with LangGraph agent support.

Pydantic AI

from integrations.pydantic_ai.tools import dinobase_agent, DinobaseDeps

result = dinobase_agent.run_sync(question, deps=DinobaseDeps())

Type-safe toolset with dependency injection.

LlamaIndex

from integrations.llamaindex.tool_spec import DinobaseToolSpec

agent = ReActAgent.from_tools(DinobaseToolSpec().to_tool_list(), llm=llm)

BaseToolSpec for ReAct agents.

Mastra

const mcp = new MCPClient({
  id: "dinobase",
  servers: { dinobase: { command: "dinobase", args: ["serve"] } },
});
const agent = new Agent({ tools: await mcp.listTools() });

Native MCP support. Zero adapter code.

Documentation

Development

git clone https://github.com/DinobaseHQ/dinobase
cd dinobase
pip install -e ".[dev]"
pytest

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dinobase-0.2.0.tar.gz (6.6 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dinobase-0.2.0-py3-none-any.whl (6.6 kB view details)

Uploaded Python 3

File details

Details for the file dinobase-0.2.0.tar.gz.

File metadata

  • Download URL: dinobase-0.2.0.tar.gz
  • Upload date:
  • Size: 6.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dinobase-0.2.0.tar.gz
Algorithm Hash digest
SHA256 9d9f11f2b64cb8815c0c90fa67a7b1fde1aec4d231e19cd13567e42209c80e1d
MD5 5e09b4f9ed6115aeefe10d0ad331fe30
BLAKE2b-256 de9c23c9b2b3c31a6d5d45167247bdd42082b73bab52460fbcbafa6098f83bba

See more details on using hashes here.

Provenance

The following attestation bundles were made for dinobase-0.2.0.tar.gz:

Publisher: release.yml on DinobaseHQ/dinobase

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dinobase-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: dinobase-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 6.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dinobase-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e2d12db8a1e62bf0ff2c36ebd5082205650e39f2d498e3e0fa69688d65179c90
MD5 65200c4352b7f5b2ae23d73cf62abcf7
BLAKE2b-256 f8b9d25cdd67cfc27af1f702c3b6bb37835b08d67a7a60015464cbf538b75e2a

See more details on using hashes here.

Provenance

The following attestation bundles were made for dinobase-0.2.0-py3-none-any.whl:

Publisher: release.yml on DinobaseHQ/dinobase

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page