🦕 The agent-first database

🦕 Dinobase

The agent-first database.

Connect your business data. Let AI agents query across all of it.

Ask an AI agent: "Which customers that churned last quarter had declining usage AND open support tickets?"

It can't answer, or it gets it wrong. Agents calling per-source tools have no way to JOIN across APIs and no semantic context to interpret field values, and the paginated JSON they get back fills the context window before they can produce an answer.

Dinobase gives agents a unified SQL interface with semantic context across all your sources. Agents can read across every source in a single SQL query and write data back (reverse ETL) via SQL mutations that go through a preview/confirm flow. In benchmarks across 11 LLMs, Dinobase reached 91% accuracy versus 35% for per-source MCP tools, and was 3x faster and 16x cheaper per correct answer.

Quick start

# recommended — installs everything automatically
curl -fsSL https://dinobase.ai/install.sh | bash

# or with uv
uv tool install dinobase

# or with pip
pip install dinobase

1. Connect your data

dinobase add stripe --api-key sk_test_...
dinobase add hubspot --api-key pat-...
dinobase add linear --api-key lin_api_...
dinobase sync

# Or parquet files (no sync needed)
dinobase add parquet --path ./data/events/ --name analytics

# Or databases
dinobase add postgres --connection-string postgresql://...

2. Pick your agent interface

CLI — for Claude Code, Cursor, Codex, Aider, any agent that runs shell

dinobase install claude-code   # Claude Code (~/.claude/CLAUDE.md)
dinobase install cursor        # Cursor (./AGENTS.md)
dinobase install codex         # Codex (~/.codex/AGENTS.md)

Each command writes usage instructions to the tool's instructions file. Agents then run dinobase info, dinobase describe, and dinobase query directly.

MCP server — for Claude Desktop, any MCP client

dinobase install claude-desktop   # Claude Desktop (writes config automatically)
dinobase serve                    # any other MCP client

dinobase serve starts the MCP server on stdio. Run dinobase mcp-config <client> to get the JSON snippet to paste into your client's config.
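As a rough illustration of what such a snippet tends to look like, the sketch below builds a config entry for a stdio MCP server. It assumes the client reads the `mcpServers` layout used by Claude Desktop; the function name `mcp_config_snippet` is hypothetical, and the actual output of dinobase mcp-config may differ.

```python
import json


def mcp_config_snippet(server_name: str = "dinobase") -> str:
    """Return a JSON snippet that registers a stdio MCP server.

    Assumption: the client uses the `mcpServers` layout (as Claude Desktop does).
    This is not the verified output of `dinobase mcp-config`.
    """
    config = {
        "mcpServers": {
            server_name: {
                "command": "dinobase",  # assumes the binary is on PATH
                "args": ["serve"],      # `dinobase serve` starts the MCP server on stdio
            }
        }
    }
    return json.dumps(config, indent=2)


print(mcp_config_snippet())
```

Prefer the snippet printed by dinobase mcp-config <client> over a hand-written one, since it reflects what your installed version expects.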

3. Ask your agent a cross-source question

"Which companies have closed-won deals over $100K but their subscription is past due?"

The agent writes the SQL, Dinobase executes it across your sources, and the answer comes back in seconds.

4. Write data back (reverse ETL)

Agents can also mutate source data via SQL. Every mutation goes through a preview/confirm flow — nothing executes until confirmed.

dinobase query "UPDATE stripe.customers SET name = 'Acme Inc' WHERE id = 'cus_123'"
# Returns a preview: 1 row affected, will call Stripe API

dinobase confirm <mutation_id>
# ✓ Stripe API called (1/1 succeeded)
# ✓ Data updated
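The preview/confirm pattern can be sketched in a few lines. This is a minimal illustration, not Dinobase's implementation: the class `MutationGate` and its methods are hypothetical, and a local SQLite table stands in for the Stripe API call that a real confirm would make.

```python
import sqlite3
import uuid


class MutationGate:
    """Hold mutations as pending until they are explicitly confirmed (illustrative only)."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.pending = {}  # mutation_id -> (sql, params)

    def preview(self, table, column, value, where_column, where_value):
        # Count the rows the mutation would touch, without changing anything.
        # Identifiers are interpolated directly, so this sketch assumes trusted names.
        count = self.conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE {where_column} = ?", (where_value,)
        ).fetchone()[0]
        mutation_id = uuid.uuid4().hex[:8]
        self.pending[mutation_id] = (
            f"UPDATE {table} SET {column} = ? WHERE {where_column} = ?",
            (value, where_value),
        )
        return {"mutation_id": mutation_id, "rows_affected": count}

    def confirm(self, mutation_id):
        # Raises KeyError for an unknown or already confirmed mutation.
        sql, params = self.pending.pop(mutation_id)
        with self.conn:  # commit on success, roll back on error
            cursor = self.conn.execute(sql, params)
        return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES ('cus_123', 'Acme')")

gate = MutationGate(conn)
plan = gate.preview("customers", "name", "Acme Inc", "id", "cus_123")
name_before = conn.execute("SELECT name FROM customers").fetchone()[0]  # unchanged so far
applied = gate.confirm(plan["mutation_id"])
name_after = conn.execute("SELECT name FROM customers").fetchone()[0]
print(plan["rows_affected"], name_before, "->", name_after)
```

The key property is that preview only reads: the write is stored under an ID and runs once, when that ID is confirmed.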

5. (Optional) Enable the semantic layer

export ANTHROPIC_API_KEY=sk-ant-...

With the key set, Dinobase automatically runs a Claude agent in the background after every sync to annotate your data with table descriptions, column docs, PII flags, and relationship graphs. Agents can then describe any table and get full semantic context.

dinobase describe stripe.subscriptions --pretty
# stripe.subscriptions (1,420 rows)
# Description: Active and historical customer subscriptions
#
#   customer_id  VARCHAR  -- References customers.id
#   status       VARCHAR  -- Values: active, past_due, canceled, trialing
#   ...
# Related tables:
#   stripe.customers  (customer_id → id, many_to_one)

Set DINOBASE_AUTO_ANNOTATE=false to disable. See Semantic Layer docs.

Benchmark

We tested Dinobase SQL against per-source MCP tools across 11 LLMs on 15 RevOps questions (same models, same data, same questions):

Metric	Dinobase (SQL)	Per-Source MCP
Accuracy	91%	35%
Avg latency	34s	106s
Cost per correct answer	$0.027	$0.445

56 percentage points more accurate, roughly 3x faster, and about 16x cheaper per correct answer, across every model tested.
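The headline ratios follow directly from the table. The short check below recomputes them from the three reported figures; the variable names are only illustrative.

```python
# Figures copied from the benchmark table above.
dinobase = {"accuracy": 0.91, "latency_s": 34, "cost_per_correct": 0.027}
per_source = {"accuracy": 0.35, "latency_s": 106, "cost_per_correct": 0.445}

# Accuracy gap in percentage points, then latency and cost as ratios.
accuracy_gap_pp = round((dinobase["accuracy"] - per_source["accuracy"]) * 100)
speedup = per_source["latency_s"] / dinobase["latency_s"]
cost_ratio = per_source["cost_per_correct"] / dinobase["cost_per_correct"]

print(f"{accuracy_gap_pp}pp, {speedup:.1f}x faster, {cost_ratio:.1f}x cheaper")
# prints: 56pp, 3.1x faster, 16.5x cheaper
```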

See benchmarks/ for full results, per-model breakdown, and methodology.

Connectors

101 sources across every category. Run dinobase sources --available --pretty to list all.

Category	Sources
CRM & Sales	Salesforce, HubSpot, Pipedrive, Attio, Close, Copper
Billing & Payments	Stripe, Paddle, Chargebee, Recurly, Lemon Squeezy
Support & Success	Zendesk, Intercom, Freshdesk, HelpScout, Customer.io, Vitally, Gainsight
Developer Tools	GitHub, GitLab, Jira, Bitbucket, Sentry, Linear
Communication	Slack, Discord, Twilio, SendGrid, Mailchimp, Front
E-commerce	Shopify, WooCommerce, BigCommerce, Square
Marketing & Analytics	Google Analytics, Google Ads, Facebook Ads, HubSpot Marketing, Mixpanel, PostHog, Segment, Plausible, Matomo, Bing Webmaster
HR & Recruiting	Personio, BambooHR, Greenhouse, Lever, Workable, Gusto, Deel
Project Management	Asana, ClickUp, Monday, Trello, Todoist
Databases	Postgres, MySQL, MariaDB, SQL Server, Oracle, SQLite, Snowflake, BigQuery, Redshift, ClickHouse, CockroachDB, Databricks, Trino, Presto, DuckDB, MongoDB
Streaming	Kafka, Kinesis
Cloud Storage	S3, GCS, Azure Blob, SFTP
Finance	QuickBooks, Xero, Brex, Mercury
Productivity	Notion, Airtable, Google Sheets
Infrastructure	Datadog, New Relic, PagerDuty, OpsGenie, Statuspage, Cloudflare, Vercel, Netlify
Content & CMS	Strapi, Contentful, Sanity, WordPress
Design & Video	Figma, Mux
Files	Parquet, CSV (local or S3 — read at query time, no sync needed)

How it works

                Agent (Claude, GPT, etc.)
                          |
                +---------+---------+
                |                   |
           MCP Server             CLI
           (tool calls)       (bash commands)
                |                   |
                +---------+---------+
                          |
                    Query Engine
                    (DuckDB SQL)
                          |
             +------------+------------+
             |            |            |
        crm.*      billing.*    analytics.*
       (synced)     (synced)    (parquet views)

Each source becomes a schema. Cross-source joins work via shared columns like email. Data stays in parquet — DuckDB is the query engine and metadata store.
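To make the schema-per-source idea concrete, here is a small sketch of a cross-source join on a shared email column. It uses Python's built-in sqlite3 with attached in-memory databases as a stand-in for DuckDB, and the table names, columns, and rows are all invented for illustration.

```python
import sqlite3

# One connection, one attached database per "source" so tables are addressed
# as crm.* and billing.* (SQLite stands in for DuckDB here).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

# Hypothetical tables and rows, invented for this example.
conn.execute("CREATE TABLE crm.deals (email TEXT, company TEXT, amount REAL, stage TEXT)")
conn.execute("CREATE TABLE billing.subscriptions (email TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO crm.deals VALUES (?, ?, ?, ?)",
    [
        ("ops@acme.test", "Acme", 150000, "closed_won"),
        ("it@globex.test", "Globex", 250000, "closed_won"),
        ("cfo@initech.test", "Initech", 40000, "closed_won"),
    ],
)
conn.executemany(
    "INSERT INTO billing.subscriptions VALUES (?, ?)",
    [
        ("ops@acme.test", "past_due"),
        ("it@globex.test", "active"),
        ("cfo@initech.test", "past_due"),
    ],
)

# Closed-won deals over $100K whose subscription is past due, joined on email.
rows = conn.execute(
    """
    SELECT d.company, d.amount
    FROM crm.deals AS d
    JOIN billing.subscriptions AS s ON s.email = d.email
    WHERE d.stage = 'closed_won' AND d.amount > 100000 AND s.status = 'past_due'
    ORDER BY d.company
    """
).fetchall()
print(rows)
```

Only Acme satisfies all three conditions: Globex has an active subscription and Initech's deal is under the threshold.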

API sources sync to parquet in ~/.dinobase/data/ (or cloud storage). File sources are read directly via DuckDB views — nothing is copied.
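As a sketch of how a file source might be exposed without copying data, the helper below generates DuckDB statements that define a view over parquet files using DuckDB's read_parquet function. The helper name `parquet_view_ddl`, the `events` view name, and the glob path are assumptions for illustration; the statements Dinobase actually issues are not shown in this README.

```python
def parquet_view_ddl(schema: str, table: str, glob_path: str) -> list[str]:
    """Build DuckDB DDL for a view that reads parquet files at query time.

    Hypothetical helper: it only returns SQL strings and does not run them.
    """
    escaped = glob_path.replace("'", "''")  # escape single quotes in the path literal
    return [
        f'CREATE SCHEMA IF NOT EXISTS "{schema}"',
        f'CREATE OR REPLACE VIEW "{schema}"."{table}" AS '
        f"SELECT * FROM read_parquet('{escaped}')",
    ]


statements = parquet_view_ddl("analytics", "events", "./data/events/*.parquet")
for statement in statements:
    print(statement + ";")
```

Because the view is just a stored SELECT, the files are scanned when a query runs, which matches the "nothing is copied" behavior described above.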

Cloud storage

Store data in S3, GCS, or Azure instead of local disk:

dinobase init --storage s3://my-bucket/dinobase/
# or
export DINOBASE_STORAGE_URL=s3://my-bucket/dinobase/

Supports Amazon S3, Google Cloud Storage, Azure Blob Storage, and S3-compatible services (MinIO, R2). See Cloud Storage docs.

Integrations

Works with every major agent framework: CrewAI · LangChain / LangGraph · LlamaIndex · Pydantic AI · Vercel AI SDK · Mastra · OpenClaw

Documentation

Getting Started — Install, connect, and query in 5 minutes
Connecting Sources — Credentials, naming, sync intervals
Querying Data — Cross-source joins, aggregations, DuckDB SQL
Reverse ETL (Mutations) — Write data back to source APIs
MCP Integration — Agent setup for Claude Desktop, Cursor
Cloud Storage Backend — Store data in S3, GCS, or Azure
Schema Annotations — How agents understand the data
CLI Reference — All commands and flags
Architecture — DuckDB, dlt, MCP, module structure

Development

git clone https://github.com/DinobaseHQ/dinobase
cd dinobase
# editable install with development dependencies
pip install -e ".[dev]"
pytest

License

MIT Expat

Project details

Release history

Version	Released
0.2.9	Apr 30, 2026
0.2.8	Apr 30, 2026
0.2.7	Apr 29, 2026
0.2.6	Apr 15, 2026
0.2.5.1	Apr 7, 2026
0.2.5	Apr 6, 2026 (this version)
0.2.4	Apr 5, 2026
0.2.3	Apr 5, 2026
0.2.2	Apr 5, 2026
0.2.1	Apr 1, 2026
0.2.0	Apr 1, 2026

Download files

Source Distribution

dinobase-0.2.5.tar.gz (22.8 MB)

Uploaded Apr 6, 2026 Source

Built Distribution

dinobase-0.2.5-py3-none-any.whl (259.3 kB)

Uploaded Apr 6, 2026 Python 3

File details

Details for the file dinobase-0.2.5.tar.gz.

File metadata

Download URL: dinobase-0.2.5.tar.gz
Upload date: Apr 6, 2026
Size: 22.8 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dinobase-0.2.5.tar.gz
Algorithm	Hash digest
SHA256	`d5ac8deaf4070524c3bd06e3dc7e06f2c346dd0ecc5d7479d080cdacc42b243d`
MD5	`84a7a5bd27d6c34986bc66d5d1c6aeb4`
BLAKE2b-256	`2c1b0b3b99036d65e32e07b4b6908e23eca90f54c212e7654a0bbd551a0cea45`

Provenance

The following attestation bundles were made for dinobase-0.2.5.tar.gz:

Publisher: release.yml on DinobaseHQ/dinobase

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dinobase-0.2.5.tar.gz
- Subject digest: d5ac8deaf4070524c3bd06e3dc7e06f2c346dd0ecc5d7479d080cdacc42b243d
- Sigstore transparency entry: 1244505796
- Sigstore integration time: Apr 6, 2026
Source repository:
- Permalink: DinobaseHQ/dinobase@cea2e43caca9a70ad46ff1c4653ec9375760681e
- Branch / Tag: refs/tags/v0.2.5
- Owner: https://github.com/DinobaseHQ
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@cea2e43caca9a70ad46ff1c4653ec9375760681e
- Trigger Event: push

File details

Details for the file dinobase-0.2.5-py3-none-any.whl.

File metadata

Download URL: dinobase-0.2.5-py3-none-any.whl
Upload date: Apr 6, 2026
Size: 259.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dinobase-0.2.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0541bf72a152471d9a739973316d5461657c5d48e3c364e7d0c1db85faa13581`
MD5	`3654d11b7a4c17cd40056426ebd96a8b`
BLAKE2b-256	`83c0953d386282e991f34734cf4d9b16dca1dfee6d754ee2491032a3ce5ed9b1`

Provenance

The following attestation bundles were made for dinobase-0.2.5-py3-none-any.whl:

Publisher: release.yml on DinobaseHQ/dinobase

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dinobase-0.2.5-py3-none-any.whl
- Subject digest: 0541bf72a152471d9a739973316d5461657c5d48e3c364e7d0c1db85faa13581
- Sigstore transparency entry: 1244505853
- Sigstore integration time: Apr 6, 2026
Source repository:
- Permalink: DinobaseHQ/dinobase@cea2e43caca9a70ad46ff1c4653ec9375760681e
- Branch / Tag: refs/tags/v0.2.5
- Owner: https://github.com/DinobaseHQ
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@cea2e43caca9a70ad46ff1c4653ec9375760681e
- Trigger Event: push
