Singer tap for extracting data from Dune Analytics API

tap-dune

This is a Singer tap that produces JSON-formatted data following the Singer spec.

This tap:

- Pulls data from the Dune Analytics API
- Extracts data from specified Dune queries
- Produces Singer-formatted data following the Singer spec
- Supports incremental replication using query parameters
- Automatically infers the schema from query results
- Advertises configurable primary keys for correct upsert/dedup behavior in targets

Installation

```
pip install tap-dune
```

Configuration

Accepted Config Options

A full list of supported settings and capabilities is available by running:

```
tap-dune --about
```

Config File Setup

Copy the example config file:
```
cp config.json.example config.json
```
Edit config.json with your settings:

```
{
    "api_key": "YOUR_DUNE_API_KEY",
    "query_id": "YOUR_QUERY_ID",
    "performance": "medium",
    "query_parameters": [
        {
            "key": "date_from",
            "value": "2025-08-01",
            "type": "date",
            "replication_key": true,
            "replication_key_field": "day"
        }
    ]
}
```

Configuration Fields

| Field | Required | Description |
|-------|----------|-------------|
| `api_key` | Yes | Your Dune Analytics API key |
| `query_id` | Yes | The ID of the Dune query to execute |
| `performance` | No | Query execution performance tier: `medium` (10 credits) or `large` (20 credits). Defaults to `medium` |
| `query_parameters` | No | Array of parameters to pass to your Dune query |
| `schema` | No | JSON Schema definition of your query's output fields. If omitted, the schema is inferred from the query results |
| `primary_keys` | No | Array of field names that uniquely identify each record. Targets use these for upsert/dedup |
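
The rules in the table above can be checked before running the tap. The following is a minimal sketch, not part of tap-dune: `validate_config` is a hypothetical helper, and it assumes only what the table states (two required fields, two performance tiers, two array-valued options).

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("api_key", "query_id")
PERFORMANCE_TIERS = ("medium", "large")  # the tiers listed in the table


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a tap-dune style config dict.

    Hypothetical helper for illustration; the tap does its own validation.
    """
    problems = []
    for field in REQUIRED_FIELDS:
        if not config.get(field):
            problems.append(f"missing required field: {field}")
    tier = config.get("performance", "medium")  # documented default
    if tier not in PERFORMANCE_TIERS:
        problems.append(f"unknown performance tier: {tier!r}")
    for name in ("query_parameters", "primary_keys"):
        if name in config and not isinstance(config[name], list):
            problems.append(f"{name} must be an array")
    return problems


example = {
    "api_key": "YOUR_DUNE_API_KEY",
    "query_id": "YOUR_QUERY_ID",
    "performance": "small",  # deliberately invalid
}
print(validate_config(example))
```

An empty list means the config passes these basic checks; it says nothing about whether the API key or query ID is accepted by Dune.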

Query Parameters

Each query parameter object can have:

- `key`: Parameter name in your Dune query
- `value`: Parameter value
- `replication_key`: Set to `true` for the parameter that should be used for incremental replication
- `replication_key_field`: The field in the query results to use for tracking replication state (required if `replication_key` is `true`)
- `type`: The data type of the parameter value. Can be one of:
  - string (default)
  - integer
  - number
  - date
  - date-time
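
The constraints described above can be expressed as a small check. This is a sketch under stated assumptions, not the tap's own validation: `check_query_parameters` is a hypothetical helper, and the rule that at most one parameter may carry `replication_key` is an assumption drawn from the singular wording above.

```python
from typing import Any, Dict, List

ALLOWED_TYPES = {"string", "integer", "number", "date", "date-time"}


def check_query_parameters(params: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems in a query_parameters array (hypothetical helper)."""
    problems = []
    # Assumption: only one parameter should drive incremental replication.
    if sum(1 for p in params if p.get("replication_key")) > 1:
        problems.append("more than one parameter is marked replication_key")
    for p in params:
        key = p.get("key", "<missing key>")
        if "key" not in p or "value" not in p:
            problems.append(f"{key}: 'key' and 'value' are both needed")
        if p.get("type", "string") not in ALLOWED_TYPES:
            problems.append(f"{key}: unsupported type {p['type']!r}")
        if p.get("replication_key") and not p.get("replication_key_field"):
            problems.append(f"{key}: replication_key_field is required")
    return problems


params = [
    {"key": "start_date", "value": "2025-08-01", "type": "date", "replication_key": True}
]
print(check_query_parameters(params))
```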

Schema Configuration

The schema can be:

- Automatically inferred from query results (recommended)
- Explicitly defined in the config file

When automatically inferring the schema:

- The tap executes the query once to get sample data
- Data types are detected based on the values in the results
- Special formats like dates and timestamps are automatically recognized
- Null values are handled by looking at other rows to determine the correct type
- If a type cannot be determined, it defaults to string
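
The inference rules above can be sketched in a few lines. This is an illustration only, assuming rows arrive as a list of dicts and that dates look like ISO 8601 strings; `infer_type` and `infer_schema` are hypothetical names, and the tap's actual detection logic may differ.

```python
import re
from typing import Any, Dict, List

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
DATETIME_RE = re.compile(r"^\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}")


def infer_type(value: Any) -> Dict[str, Any]:
    """Map one non-null Python value to a JSON Schema fragment."""
    # bool is tested before int because bool is a subclass of int in Python
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, dict):
        return {"type": "object"}
    if isinstance(value, list):
        return {"type": "array"}
    if isinstance(value, str):
        if DATE_RE.match(value):
            return {"type": "string", "format": "date"}
        if DATETIME_RE.match(value):
            return {"type": "string", "format": "date-time"}
    return {"type": "string"}


def infer_schema(rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Infer a schema from sample rows, skipping nulls until a value appears."""
    properties: Dict[str, Any] = {}
    for row in rows:
        for name, value in row.items():
            if name in properties or value is None:
                continue  # keep the first non-null type seen for each column
            properties[name] = infer_type(value)
    # Columns that were null in every row fall back to string.
    for row in rows:
        for name in row:
            properties.setdefault(name, {"type": "string"})
    return {"type": "object", "properties": properties}


sample_rows = [
    {"day": "2025-08-01", "volume": None, "tx_count": 3, "note": None},
    {"day": "2025-08-02", "volume": 1.5, "tx_count": 4, "note": None},
]
print(infer_schema(sample_rows))
```

Here `volume` is null in the first row and typed from the second, while `note` is null everywhere and ends up as a string.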

If you need to explicitly define the schema, each field should specify:

- `type`: The data type ('string', 'number', 'integer', 'boolean', 'object', 'array')
- `format` (optional): Special format for string fields (e.g., 'date', 'date-time')

When using incremental replication, the schema configuration is particularly important for the replication key field:

- The field's type in the schema determines how values are compared for incremental replication
- You can specify any type that supports ordering (string, number, integer)
- For date/time fields, you can add the appropriate format ('date' or 'date-time')
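
To see why the declared type matters for comparison, consider the sketch below. It assumes a field schema of the form `{"type": ..., "format": ...}` and uses hypothetical helpers `to_comparable` and `is_newer`; it illustrates the idea and is not the tap's comparison code.

```python
from datetime import date, datetime
from typing import Any, Dict


def to_comparable(value: Any, field_schema: Dict[str, Any]) -> Any:
    """Convert a raw value into something that orders correctly for its type."""
    kind = field_schema.get("type", "string")
    fmt = field_schema.get("format")
    if kind == "integer":
        return int(value)
    if kind == "number":
        return float(value)
    if fmt == "date":
        return date.fromisoformat(value)
    if fmt == "date-time":
        # Older Python versions reject a trailing "Z", so normalize it first.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    return str(value)


def is_newer(candidate: Any, bookmark: Any, field_schema: Dict[str, Any]) -> bool:
    """True if candidate sorts after bookmark under the field's declared type."""
    return to_comparable(candidate, field_schema) > to_comparable(bookmark, field_schema)


# The same pair of values orders differently depending on the declared type.
print(is_newer("10", "9", {"type": "integer"}))
print(is_newer("10", "9", {"type": "string"}))
```

Compared as integers, 10 is newer than 9; compared as strings, "10" sorts before "9", which would stall a block-number bookmark.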

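Examples of query parameter configurations with different replication key types. Each snippet sets replication_key to true, so a working configuration must also set replication_key_field (described under Query Parameters), which these snippets leave out.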

Date-based replication (most common):

```
{
    "api_key": "YOUR_DUNE_API_KEY",
    "query_id": "YOUR_QUERY_ID",
    "primary_keys": ["date", "source"],
    "query_parameters": [
        {
            "key": "start_date",
            "value": "2025-08-01",
            "type": "date",
            "replication_key": true
        }
    ]
}
```

Numeric replication (e.g., for block numbers):

```
{
    "api_key": "YOUR_DUNE_API_KEY",
    "query_id": "YOUR_QUERY_ID",
    "query_parameters": [
        {
            "key": "min_block",
            "value": "1000000",
            "type": "integer",
            "replication_key": true
        }
    ]
}
```

Timestamp replication:

```
{
    "api_key": "YOUR_DUNE_API_KEY",
    "query_id": "YOUR_QUERY_ID",
    "query_parameters": [
        {
            "key": "start_time",
            "value": "2025-08-01T00:00:00Z",
            "type": "date-time",
            "replication_key": true
        }
    ]
}
```

Source Authentication and Authorization

- Visit Dune Analytics
- Create an account and obtain an API key
- Add the API key to your config file
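
If you prefer not to keep the key in a committed file, one option is to assemble the config from the environment at run time. The sketch below is an assumption-laden illustration: the variable name `DUNE_API_KEY` and the helper `build_config` are hypothetical and are not defined by tap-dune.

```python
import os
from typing import Dict, Mapping


def build_config(query_id: str, env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build a minimal config dict, reading the API key from the environment."""
    api_key = env.get("DUNE_API_KEY")  # hypothetical variable name
    if not api_key:
        raise RuntimeError("DUNE_API_KEY is not set")
    return {"api_key": api_key, "query_id": query_id}


# Passing a mapping explicitly keeps the example independent of the real environment.
print(build_config("YOUR_QUERY_ID", env={"DUNE_API_KEY": "YOUR_DUNE_API_KEY"}))
```

The resulting dict can be written out with `json.dump` to produce the `config.json` the tap reads.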

Usage

Basic Usage

Generate a catalog file:

```
tap-dune --config config.json --discover > catalog.json
```

Run the tap:

```
tap-dune --config config.json --catalog catalog.json
```
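
The tap writes one JSON message per line to standard output, and the Singer spec defines SCHEMA, RECORD, and STATE message types. A quick way to sanity-check a run is to count them. This is a minimal sketch: `summarize_singer_output` is a hypothetical helper and the stream name `dune_query` in the sample is made up for illustration.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_singer_output(lines: Iterable[str]) -> Counter:
    """Count Singer messages by their "type" field."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between messages
        message = json.loads(line)
        counts[message["type"]] += 1
    return counts


# Hand-written sample output; the stream name is an assumption.
sample = [
    '{"type": "SCHEMA", "stream": "dune_query", "schema": {"properties": {}}, "key_properties": []}',
    '{"type": "RECORD", "stream": "dune_query", "record": {"day": "2025-08-01"}}',
    '{"type": "RECORD", "stream": "dune_query", "record": {"day": "2025-08-02"}}',
    '{"type": "STATE", "value": {}}',
]
print(summarize_singer_output(sample))
```

In practice the same function could be fed `sys.stdin` with the tap's output piped in.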

Incremental Replication

To use incremental replication:

- Mark one of your query parameters with `"replication_key": true`
- Ensure the parameter value is in a format that can be ordered (e.g., dates, timestamps, numbers)
- The tap will track the last value processed and resume from there in subsequent runs

When using incremental replication, you need to configure:

- The query parameter that will be used for filtering (`replication_key: true`)
- The field in the query results that will be used for state tracking (`replication_key_field`)
- The data type of the parameter (`type`)

For example, suppose your query takes a `date_from` parameter for filtering and returns records with a `day` field containing dates, and you want to use that `day` field to track progress.

Your configuration would look like:

```
{
    "query_parameters": [
        {
            "key": "date_from",
            "value": "2025-08-01",
            "type": "date",
            "replication_key": true,
            "replication_key_field": "day"
        }
    ]
}
```

The tap will:

- Use `date_from` to filter the query results
- Track the `day` field values from the results
- Use those values to set `date_from` in subsequent runs
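
The three steps above amount to carrying the largest value of the tracked field into the next run. The sketch below illustrates that idea with a hypothetical helper, `next_parameter_value`; it assumes values are ISO 8601 date strings (which sort correctly as text) and does not reflect how the tap stores its state.

```python
from typing import Any, Dict, Iterable, Optional


def next_parameter_value(
    records: Iterable[Dict[str, Any]],
    replication_key_field: str,
    current_value: Optional[str],
) -> Optional[str]:
    """Return the largest tracked value seen, or current_value if no record has one."""
    latest = current_value
    for record in records:
        value = record.get(replication_key_field)
        if value is not None and (latest is None or value > latest):
            latest = value
    return latest


records = [
    {"day": "2025-08-02", "volume": 10},
    {"day": "2025-08-03", "volume": 12},
    {"day": "2025-08-01", "volume": 7},
]
# The result would become the date_from value for the following run.
print(next_parameter_value(records, "day", "2025-08-01"))
```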

The parameter type can be:

- date or date-time for date-based parameters
- integer or number for numeric parameters
- string (default) for text parameters

Pipeline Usage

You can easily run tap-dune in a pipeline using Meltano or any other Singer-compatible tool.

Example with target-jsonl:

```
tap-dune --config config.json --catalog catalog.json | target-jsonl
```

When loading to a database target that performs upserts (e.g., Snowflake):

- Set `primary_keys` in the tap config to the fields that uniquely identify a row in your query output (e.g., `["date", "source"]`).
- Ensure your loader configuration (e.g., PipelineWise or Meltano target) uses the same primary keys for merge/upsert.
- For append-only behavior, leave `primary_keys` empty and configure your loader for pure inserts.
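
The difference between upsert and append-only loading can be shown with plain Python data. This is a conceptual sketch of what a target does with primary keys, using a hypothetical `upsert` function; real targets implement this in the destination database.

```python
from typing import Any, Dict, List


def upsert(
    existing: List[Dict[str, Any]],
    incoming: List[Dict[str, Any]],
    primary_keys: List[str],
) -> List[Dict[str, Any]]:
    """Merge incoming rows into existing ones, replacing rows with matching keys."""
    if not primary_keys:
        return existing + incoming  # append-only: every row is kept
    merged = {tuple(row[k] for k in primary_keys): row for row in existing}
    for row in incoming:
        merged[tuple(row[k] for k in primary_keys)] = row  # later row wins
    return list(merged.values())


existing = [{"date": "2025-08-01", "source": "a", "volume": 1}]
incoming = [
    {"date": "2025-08-01", "source": "a", "volume": 2},
    {"date": "2025-08-02", "source": "a", "volume": 3},
]
print(upsert(existing, incoming, ["date", "source"]))
print(len(upsert(existing, incoming, [])))
```

With keys set, the re-extracted row for 2025-08-01 replaces the old one; with no keys, both copies are kept.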

Development

Initialize your Development Environment

```
# Clone the repository
git clone https://github.com/blueprint-data/tap-dune.git
cd tap-dune

# Install Poetry
pipx install poetry

# Install dependencies
poetry install
```

Development Workflow

This project follows Semantic Versioning and uses Conventional Commits for automatic versioning.

Create a feature branch:

```
git checkout -b feat/your-feature
# or
git checkout -b fix/your-bugfix
```

Make your changes and commit using conventional commits:
```
# For new features
git commit -m "feat: add new feature X"

# For bug fixes
git commit -m "fix: resolve issue with Y"

# For breaking changes
git commit -m "feat: redesign API

BREAKING CHANGE: This changes the API interface"
```

Commit types:

- feat: A new feature (minor version bump)
- fix: A bug fix (patch version bump)
- docs: Documentation only changes
- style: Changes that don't affect the code's meaning
- refactor: Code change that neither fixes a bug nor adds a feature
- perf: Code change that improves performance
- test: Adding missing tests
- chore: Changes to the build process or auxiliary tools
- BREAKING CHANGE: A commit footer rather than a type; marks any change that breaks backward compatibility (major version bump)

Run tests:
```
poetry run pytest
```

Create a pull request to main

Release Process

Create a release branch from main:

```
git checkout main
git pull
git checkout -b release
```

Push the branch:
```
git push -u origin release
```

The release workflow will automatically:

- Analyze commits since the last release
- Determine the next version number based on commit types:
  - fix: → patch version (1.0.0 → 1.0.1)
  - feat: → minor version (1.0.0 → 1.1.0)
  - BREAKING CHANGE: → major version (1.0.0 → 2.0.0)
- Update CHANGELOG.md
- Create a git tag with the new version
- Create a GitHub release
- Build and publish to PyPI

Note: Only commits following the Conventional Commits format will trigger version updates.
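
The version rules listed above can be sketched as a small function. This is only an illustration of the mapping from commit messages to version bumps: `bump_for` and `next_version` are hypothetical names, and the actual release workflow relies on its own tooling.

```python
import re
from typing import Iterable, Optional

LEVELS = {"patch": 1, "minor": 2, "major": 3}


def bump_for(message: str) -> Optional[str]:
    """Return the bump level implied by one commit message, or None."""
    if "BREAKING CHANGE:" in message:
        return "major"
    header = message.splitlines()[0] if message else ""
    if re.match(r"^feat(\([^)]*\))?:", header):
        return "minor"
    if re.match(r"^fix(\([^)]*\))?:", header):
        return "patch"
    return None  # docs, chore, etc. do not change the version


def next_version(current: str, messages: Iterable[str]) -> str:
    """Apply the highest bump implied by the given commit messages."""
    bumps = [b for b in map(bump_for, messages) if b]
    if not bumps:
        return current
    level = max(bumps, key=LEVELS.get)
    major, minor, patch = (int(part) for part in current.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(next_version("1.0.0", ["fix: resolve issue with Y", "feat: add new feature X"]))
```
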

After a successful release:

- Create a PR from the release branch to main
- This PR will contain all the version updates (CHANGELOG.md, version number)
- Merge to keep main up to date with the latest release
- Note: Only blueprint-data team members can merge to main

Clean up:

```
git checkout main
git pull
git branch -d release
```

Repository Permissions

This repository follows these security practices:

- Only blueprint-data team members can merge to main
- All PRs require at least one review
- All tests must pass before merging
- Branch protection rules prevent bypassing these requirements

Testing

```
poetry run pytest
```

SDK Dev Guide

See the dev guide for more instructions on how to use the SDK to develop your own taps and targets.

Project details

Release history

- 1.0.0 (Aug 8, 2025): this version
- 0.4.0 (Aug 8, 2025)
- 0.3.0 (Aug 7, 2025)
- 0.2.0 (Aug 7, 2025)
- 0.1.0 (Aug 7, 2025)

Download files

Source Distribution

tap_dune-0.4.0.tar.gz (14.5 kB)

Uploaded Aug 8, 2025 Source

Built Distribution

tap_dune-0.4.0-py3-none-any.whl (12.9 kB)

Uploaded Aug 8, 2025 Python 3

File details

Details for the file tap_dune-0.4.0.tar.gz.

File metadata

Download URL: tap_dune-0.4.0.tar.gz
Upload date: Aug 8, 2025
Size: 14.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.2 CPython/3.8.16 Darwin/24.6.0

File hashes

Hashes for tap_dune-0.4.0.tar.gz
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | `8956ff7d4fb600a2a30ff7f8762fa2cb498bf5b43c7df66a916afba94e48594c` |
| MD5 | `8490cfcb4bcbd763c86d5b51fe829e29` |
| BLAKE2b-256 | `c7815b26cd60b34632daafa6dfd6231a156da0da29b2646055b0e6d794984238` |

File details

Details for the file tap_dune-0.4.0-py3-none-any.whl.

File metadata

Download URL: tap_dune-0.4.0-py3-none-any.whl
Upload date: Aug 8, 2025
Size: 12.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.2 CPython/3.8.16 Darwin/24.6.0

File hashes

Hashes for tap_dune-0.4.0-py3-none-any.whl
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | `b8d69d3747817d02335054b29c60e507dc097e8b75544e9d7325a7b8fc9e788d` |
| MD5 | `29bf9574ba169e21c601ea700d818357` |
| BLAKE2b-256 | `cef164cf9af2ba3450b1cd246c9a5773dcc72c6642f98d606472e62d515c80d9` |
