🦕 Dinobase
The agent-first database.
Connect your business data. Let AI agents query across all of it.
Ask an AI agent: "Which customers that churned last quarter had declining usage AND open support tickets?"
It can't answer, or it gets it wrong. Agents that call per-source tools have no way to JOIN across APIs and no semantic context to interpret field values, and the paginated JSON those tools return fills the context window before the agent can produce an answer.
Dinobase gives agents a unified SQL interface with semantic context across all your sources. In benchmarks across 11 LLMs: 91% accuracy vs 35%, 3x faster, 16x cheaper per correct answer.
Quick start
pip install dinobase
1. Connect your data
dinobase add stripe --api-key sk_test_...
dinobase add hubspot --api-key pat-...
dinobase add linear --api-key lin_api_...
dinobase sync
# Or parquet files (no sync needed)
dinobase add parquet --path ./data/events/ --name analytics
# Or databases
dinobase add postgres --connection-string postgresql://...
2. Pick your agent interface
MCP server: for Claude Desktop, Cursor, or any MCP client
dinobase install claude-desktop # Claude Desktop
dinobase install cursor # Cursor
Writes the correct config file for your platform automatically.
CLI: for Claude Code, Aider, or any agent that runs shell commands
dinobase info
dinobase describe stripe.customers --pretty
dinobase query "SELECT * FROM ..." --pretty
All commands output JSON by default.
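Because the CLI prints JSON by default, an agent harness can call it as a subprocess and parse stdout directly. The following is a minimal sketch, not part of Dinobase itself: the helper names `run_dinobase` and `query` and the injectable `runner` parameter are hypothetical, and the shape of the returned JSON is not documented here, so the parsed value is passed through untouched.

```python
import json
import subprocess
from typing import Any, Callable, Sequence


def run_dinobase(args: Sequence[str], runner: Callable[..., Any] = subprocess.run) -> Any:
    """Run a dinobase subcommand and parse its stdout as JSON.

    `runner` is a hypothetical injection point so the wrapper can be
    exercised without the real CLI installed.
    """
    completed = runner(
        ["dinobase", *args],  # no --pretty flag: rely on the default JSON output
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)


def query(sql: str, runner: Callable[..., Any] = subprocess.run) -> Any:
    """Send one SQL string through `dinobase query` and return the parsed JSON."""
    return run_dinobase(["query", sql], runner=runner)
```

With Dinobase installed and at least one source connected, `run_dinobase(["info"])` would return whatever JSON document `dinobase info` prints.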
3. (Optional) Enable the semantic layer
export ANTHROPIC_API_KEY=sk-ant-...
After every sync, Dinobase automatically runs a Claude agent in the background to annotate your data — table descriptions, column docs, PII flags, and relationship graphs. Agents can then describe any table and get full semantic context: what each column means, which fields are PII, and how to join across tables.
dinobase describe stripe.subscriptions --pretty
# stripe.subscriptions (1,420 rows)
# Description: Active and historical customer subscriptions
#
# customer_id VARCHAR -- References customers.id
# status VARCHAR -- Values: active, past_due, canceled, trialing
# ...
# Related tables:
# stripe.customers (customer_id → id, many_to_one)
Set DINOBASE_AUTO_ANNOTATE=false to disable. See Semantic Layer docs.
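The relationship entry in the describe output above carries enough information to build a join condition. The sketch below shows one way an agent-side helper might use it. The `Relationship` dataclass and `join_clause` function are hypothetical illustrations, not Dinobase APIs; only the table names, column names, and cardinality come from the example output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Hypothetical container for one 'Related tables' entry."""
    from_table: str
    from_column: str
    to_table: str
    to_column: str
    cardinality: str


def join_clause(rel: Relationship) -> str:
    """Render the SQL JOIN implied by a relationship annotation."""
    return (
        f"JOIN {rel.to_table} "
        f"ON {rel.from_table}.{rel.from_column} = {rel.to_table}.{rel.to_column}"
    )


# Values taken from the describe output for stripe.subscriptions.
SUBSCRIPTION_TO_CUSTOMER = Relationship(
    from_table="stripe.subscriptions",
    from_column="customer_id",
    to_table="stripe.customers",
    to_column="id",
    cardinality="many_to_one",
)
```

Calling `join_clause(SUBSCRIPTION_TO_CUSTOMER)` yields a clause that can be appended to a `SELECT ... FROM stripe.subscriptions` query.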
4. Ask your agent a cross-source question
"Which companies have closed-won deals over $100K but their subscription is past due?"
The agent writes the SQL and Dinobase executes it across your sources. In the benchmark below, answers took 34 seconds on average.
Connectors
101 sources across every category. Run dinobase sources --pretty to list all.
| Category | Sources |
|---|---|
| CRM & Sales | Salesforce, HubSpot, Pipedrive, Attio, Close, Copper |
| Billing & Payments | Stripe, Paddle, Chargebee, Recurly, Lemon Squeezy |
| Support & Success | Zendesk, Intercom, Freshdesk, HelpScout, Customer.io, Vitally, Gainsight |
| Developer Tools | GitHub, GitLab, Jira, Bitbucket, Sentry, Linear |
| Communication | Slack, Discord, Twilio, SendGrid, Mailchimp, Front |
| E-commerce | Shopify, WooCommerce, BigCommerce, Square |
| Marketing & Analytics | Google Analytics, Google Ads, Facebook Ads, HubSpot Marketing, Mixpanel, PostHog, Segment, Plausible, Matomo, Bing Webmaster |
| HR & Recruiting | Personio, BambooHR, Greenhouse, Lever, Workable, Gusto, Deel |
| Project Management | Asana, ClickUp, Monday, Trello, Todoist |
| Databases | Postgres, MySQL, MariaDB, SQL Server, Oracle, SQLite, Snowflake, BigQuery, Redshift, ClickHouse, CockroachDB, Databricks, Trino, Presto, DuckDB, MongoDB |
| Streaming | Kafka, Kinesis |
| Cloud Storage | S3, GCS, Azure Blob, SFTP |
| Finance | QuickBooks, Xero, Brex, Mercury |
| Productivity | Notion, Airtable, Google Sheets |
| Infrastructure | Datadog, New Relic, PagerDuty, OpsGenie, Statuspage, Cloudflare, Vercel, Netlify |
| Content & CMS | Strapi, Contentful, Sanity, WordPress |
| Design | Figma |
| Video | Mux |
| Files | Parquet, CSV (local or S3 — read at query time, no sync needed) |
Benchmark
We tested Dinobase SQL against per-source MCP tools across 11 LLMs on 15 RevOps questions (same models, same data, same questions):
| Metric | Dinobase (SQL) | Per-Source MCP |
|---|---|---|
| Accuracy | 91% | 35% |
| Avg latency | 34s | 106s |
| Cost per correct answer | $0.027 | $0.445 |
56 percentage points more accurate, 3x faster, and 16x cheaper per correct answer, across every model tested.
See benchmarks/ for full results, per-model breakdown, and methodology.
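For readers who want to check the headline figures, the short sketch below recomputes them from the three table rows. The dictionary keys and the `headline` function are illustrative names only; the numbers are copied from the table.

```python
# Values copied from the benchmark table.
DINOBASE = {"accuracy": 0.91, "latency_s": 34, "cost_per_correct": 0.027}
PER_SOURCE_MCP = {"accuracy": 0.35, "latency_s": 106, "cost_per_correct": 0.445}


def headline(ours: dict, theirs: dict) -> dict:
    """Derive the summary ratios from two rows of benchmark metrics."""
    return {
        # Difference in accuracy, expressed in percentage points.
        "accuracy_gain_pp": round((ours["accuracy"] - theirs["accuracy"]) * 100),
        # How many times lower the average latency is.
        "speedup": theirs["latency_s"] / ours["latency_s"],
        # How many times lower the cost per correct answer is.
        "cost_ratio": theirs["cost_per_correct"] / ours["cost_per_correct"],
    }


summary = headline(DINOBASE, PER_SOURCE_MCP)
print(summary["accuracy_gain_pp"])      # 56
print(round(summary["speedup"], 1))     # 3.1
print(round(summary["cost_ratio"], 1))  # 16.5
```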
How it works
         Agent (Claude, GPT, etc.)
                     |
           +---------+---------+
           |                   |
       MCP Server             CLI
      (tool calls)      (bash commands)
           |                   |
           +---------+---------+
                     |
               Query Engine
               (DuckDB SQL)
                     |
        +------------+------------+
        |            |            |
      crm.*      billing.*   analytics.*
    (synced)     (synced)  (parquet views)
Each source becomes a schema. Cross-source joins work via shared columns like email. Data stays in parquet — DuckDB is the query engine and metadata store.
| Source type | How it works | Data location |
|---|---|---|
| API sources | dlt syncs to parquet | ~/.dinobase/data/ or cloud storage |
| File sources | DuckDB reads directly via views | Your storage — nothing copied |
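The schema-per-source layout and the join on a shared `email` column can be illustrated without Dinobase installed. The sketch below uses SQLite from the Python standard library as a stand-in for DuckDB, attaching one in-memory database per source so that tables are addressed as `crm.*` and `billing.*`. The table names, columns, and rows are invented for illustration and do not reflect the real synced schemas.

```python
import sqlite3


def build_demo_connection() -> sqlite3.Connection:
    """Create two schemas, one per hypothetical source, with toy data."""
    conn = sqlite3.connect(":memory:")
    # Each attached database plays the role of one source's schema.
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")
    conn.execute(
        "CREATE TABLE crm.deals (email TEXT, company TEXT, stage TEXT, amount REAL)"
    )
    conn.execute("CREATE TABLE billing.subscriptions (email TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO crm.deals VALUES (?, ?, ?, ?)",
        [
            ("ana@acme.example", "Acme", "closed_won", 150000.0),
            ("bo@globex.example", "Globex", "closed_won", 250000.0),
            ("cy@initech.example", "Initech", "closed_won", 40000.0),
            ("di@umbrella.example", "Umbrella", "open", 300000.0),
        ],
    )
    conn.executemany(
        "INSERT INTO billing.subscriptions VALUES (?, ?)",
        [
            ("ana@acme.example", "past_due"),
            ("bo@globex.example", "active"),
            ("cy@initech.example", "past_due"),
            ("di@umbrella.example", "past_due"),
        ],
    )
    conn.commit()
    return conn


# Closed-won deals over $100K whose subscription is past due,
# joined across the two schemas on the shared email column.
CROSS_SOURCE_SQL = """
SELECT d.company, d.amount, s.status
FROM crm.deals AS d
JOIN billing.subscriptions AS s ON s.email = d.email
WHERE d.stage = 'closed_won'
  AND d.amount > 100000
  AND s.status = 'past_due'
ORDER BY d.company
"""


def past_due_big_deals(conn: sqlite3.Connection) -> list:
    """Run the cross-schema join and return all matching rows."""
    return conn.execute(CROSS_SOURCE_SQL).fetchall()


print(past_due_big_deals(build_demo_connection()))
# [('Acme', 150000.0, 'past_due')]
```

The same query shape should carry over to DuckDB SQL, since it uses only schema-qualified table names and a plain inner join.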
Cloud storage
Store data in S3, GCS, or Azure instead of local disk:
dinobase init --storage s3://my-bucket/dinobase/
Or via environment variable (ideal for containers):
export DINOBASE_STORAGE_URL=s3://my-bucket/dinobase/
Supports Amazon S3, Google Cloud Storage, Azure Blob Storage, and S3-compatible services (MinIO, R2). See Cloud Storage Backend for setup.
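As an illustration of the environment-variable route, the sketch below reads `DINOBASE_STORAGE_URL` and splits a bucket URL into its parts. It is not Dinobase's own resolution logic. The helper names are hypothetical, and falling back to `~/.dinobase/data/` when the variable is unset is an assumption based on the default data location listed earlier.

```python
import os
from typing import Mapping, NamedTuple
from urllib.parse import urlsplit

# Assumed local default, taken from the data location table.
LOCAL_DEFAULT = "~/.dinobase/data/"


class StorageLocation(NamedTuple):
    scheme: str
    bucket: str
    prefix: str


def parse_storage_url(url: str) -> StorageLocation:
    """Split a URL such as s3://my-bucket/dinobase/ into scheme, bucket, prefix."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not a bucket URL: {url!r}")
    return StorageLocation(parts.scheme, parts.netloc, parts.path.lstrip("/"))


def configured_storage(env: Mapping[str, str] = os.environ) -> str:
    """Return the storage URL from the environment, or the assumed local default."""
    return env.get("DINOBASE_STORAGE_URL", LOCAL_DEFAULT)


print(parse_storage_url("s3://my-bucket/dinobase/"))
# StorageLocation(scheme='s3', bucket='my-bucket', prefix='dinobase/')
```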
Integrations
OpenClaw
openclaw skills install dinobase
Auto-installs Dinobase and teaches your agent to query data via SQL.
Vercel AI SDK
const dinobase = await createMCPClient({
  transport: new Experimental_StdioMCPTransport({
    command: 'dinobase', args: ['serve'],
  }),
});
Native MCP integration. Zero adapter code.
CrewAI
from integrations.crewai.tools import all_tools
agent = Agent(role="Analyst", tools=all_tools)
Python tools wrapping Dinobase's query engine.
LangChain / LangGraph
from integrations.langchain.toolkit import DinobaseToolkit
agent = create_react_agent(model, tools=DinobaseToolkit().get_tools())
LangChain toolkit with LangGraph agent support.
Pydantic AI
from integrations.pydantic_ai.tools import dinobase_agent, DinobaseDeps
result = dinobase_agent.run_sync(question, deps=DinobaseDeps())
Type-safe toolset with dependency injection.
LlamaIndex
from integrations.llamaindex.tool_spec import DinobaseToolSpec
agent = ReActAgent.from_tools(DinobaseToolSpec().to_tool_list(), llm=llm)
BaseToolSpec for ReAct agents.
Mastra
const mcp = new MCPClient({
  id: "dinobase",
  servers: { dinobase: { command: "dinobase", args: ["serve"] } },
});
const agent = new Agent({ tools: await mcp.listTools() });
Native MCP support. Zero adapter code.
Documentation
- Getting Started — Install, connect, query in 5 minutes
- Connecting Sources — Credentials, naming, sync intervals
- Querying Data — Cross-source joins, aggregations, DuckDB SQL
- Mutations — Write data back to sources with preview/confirm flow
- MCP Integration — Agent setup for Claude Desktop, Cursor
- OpenClaw — OpenClaw skill setup
- Vercel AI SDK — MCP integration for Next.js apps
- CrewAI — Python tools for CrewAI agents
- LangChain / LangGraph — Toolkit with LangGraph agent support
- Pydantic AI — Type-safe toolset with dependency injection
- LlamaIndex — BaseToolSpec for ReAct agents
- Mastra — Native MCP integration for TypeScript agents
- Syncing & Scheduling — Daemon mode, per-source intervals, concurrent sync
- Cloud Storage Backend — Store data in S3, GCS, or Azure
- Schema Annotations — How agents understand the data
- CLI Reference — All commands and flags
- MCP Tools Reference — All 7 agent tools
- Architecture — DuckDB, dlt, MCP, module structure
Development
git clone https://github.com/DinobaseHQ/dinobase
cd dinobase
pip install -e ".[dev]"
pytest
License
MIT