Singer tap for extracting data from Dune Analytics API

tap-dune

This is a Singer tap that produces JSON-formatted data following the Singer spec.

This tap:

  • Pulls data from the Dune Analytics API
  • Extracts the results of the Dune queries you specify
  • Emits Singer-formatted messages
  • Supports incremental replication driven by a query parameter
  • Automatically infers the schema from query results

Installation

pip install tap-dune

Configuration

Accepted Config Options

A full list of supported settings and capabilities is available by running:

tap-dune --about

Config File Setup

  1. Copy the example config file:

    cp config.json.example config.json
    
  2. Edit config.json with your settings:

{
    "api_key": "YOUR_DUNE_API_KEY",
    "query_id": "YOUR_QUERY_ID",
    "performance": "medium",
    "query_parameters": [
        {
            "key": "date_from",
            "value": "2025-08-01",
            "replication_key": true
        }
    ]
}

Configuration Fields

Field            | Required | Description
api_key          | Yes      | Your Dune Analytics API key
query_id         | Yes      | The ID of the Dune query to execute
performance      | No       | Query execution performance tier: 'medium' (10 credits) or 'large' (20 credits). Defaults to 'medium'
query_parameters | No       | Array of parameters to pass to your Dune query
schema           | No       | Optional: JSON Schema definition of your query's output fields. If not provided, schema will be inferred from query results

Query Parameters

Each query parameter object can have:

  • key: Parameter name in your Dune query
  • value: Parameter value
  • replication_key: Set to true for the parameter that should be used for incremental replication
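The parameter objects described above can be checked before running the tap. The following is a minimal sketch, not part of tap-dune: the helper name `find_replication_parameter` and the second parameter (`network`) are hypothetical, and the rule that at most one parameter may carry `replication_key` is an assumption drawn from the wording "the parameter".

```python
import json


def find_replication_parameter(config):
    """Return the query parameter flagged as the replication key, or None."""
    params = config.get("query_parameters", [])
    for param in params:
        # Every parameter object needs at least a key and a value.
        if "key" not in param or "value" not in param:
            raise ValueError(f"query parameter missing 'key' or 'value': {param!r}")
    flagged = [p for p in params if p.get("replication_key") is True]
    if len(flagged) > 1:
        # Assumption: only one parameter should drive incremental replication.
        raise ValueError("more than one parameter has replication_key set to true")
    return flagged[0] if flagged else None


# Same shape as config.json; the "network" parameter is a made-up example.
config = json.loads("""
{
    "api_key": "YOUR_DUNE_API_KEY",
    "query_id": "YOUR_QUERY_ID",
    "query_parameters": [
        {"key": "date_from", "value": "2025-08-01", "replication_key": true},
        {"key": "network", "value": "ethereum"}
    ]
}
""")

replication_param = find_replication_parameter(config)
print(replication_param["key"])
```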

Schema Configuration

The schema can be:

  1. Automatically inferred from query results (recommended)
  2. Explicitly defined in the config file

When automatically inferring the schema:

  • The tap will execute the query once to get sample data
  • Data types are detected based on the values in the results
  • Special formats like dates and timestamps are automatically recognized
  • Null values are handled by looking at other rows to determine the correct type
  • If a type cannot be determined, it defaults to string
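The inference rules listed above can be illustrated with a short sketch. This is not tap-dune's actual implementation: the function names are hypothetical, the sample rows are invented, and the date detection relies on `date.fromisoformat` and `datetime.fromisoformat`, which may accept a narrower set of formats than the tap recognizes.

```python
from datetime import date, datetime


def _all_parse(parser, values):
    """Return True if every value is accepted by the given ISO parser."""
    try:
        for value in values:
            parser(value)
    except ValueError:
        return False
    return True


def infer_json_type(values):
    """Infer a JSON Schema fragment from one column's sample values."""
    # Nulls are skipped so that other rows decide the type.
    non_null = [v for v in values if v is not None]
    if not non_null:
        return {"type": "string"}  # nothing to go on: default to string
    if all(isinstance(v, bool) for v in non_null):
        return {"type": "boolean"}
    # bool is a subclass of int in Python, so exclude it explicitly.
    if all(isinstance(v, int) and not isinstance(v, bool) for v in non_null):
        return {"type": "integer"}
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null):
        return {"type": "number"}
    if all(isinstance(v, dict) for v in non_null):
        return {"type": "object"}
    if all(isinstance(v, list) for v in non_null):
        return {"type": "array"}
    if all(isinstance(v, str) for v in non_null):
        if _all_parse(date.fromisoformat, non_null):
            return {"type": "string", "format": "date"}
        if _all_parse(datetime.fromisoformat, non_null):
            return {"type": "string", "format": "date-time"}
    # Mixed or unrecognized values fall back to string.
    return {"type": "string"}


def infer_schema(rows):
    """Build a schema with one property per column seen in the sample rows."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return {"properties": {name: infer_json_type(vals) for name, vals in columns.items()}}


# Invented sample rows, for illustration only.
rows = [
    {"day": "2025-08-01", "network": "ethereum", "total_mana": None, "total_usd": 10},
    {"day": "2025-08-02", "network": "polygon", "total_mana": 12.5, "total_usd": 20},
]
schema = infer_schema(rows)
print(schema["properties"]["total_mana"])
```

Here `total_mana` is null in the first row, so its type comes from the second row.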

If you need to explicitly define the schema, each field should specify:

  • type: The data type ('string', 'number', 'integer', 'boolean', 'object', 'array')
  • format (optional): Special format for string fields (e.g., 'date', 'date-time')

Example of explicit schema configuration:

{
    "schema": {
        "properties": {
            "day": {"type": "string", "format": "date"},
            "network": {"type": "string"},
            "total_mana": {"type": "number"},
            "total_usd": {"type": "number"}
        }
    }
}

Source Authentication and Authorization

  1. Visit Dune Analytics
  2. Create an account and obtain an API key
  3. Add the API key to your config file

Usage

Basic Usage

  1. Generate a catalog file:

    tap-dune --config config.json --discover > catalog.json
    
  2. Run the tap:

    tap-dune --config config.json --catalog catalog.json
    

Incremental Replication

To use incremental replication:

  1. Mark one of your query parameters with "replication_key": true
  2. Ensure the parameter value is in a format that can be ordered (e.g., dates, timestamps)
  3. The tap will track the last value processed and resume from there in subsequent runs
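The resume behaviour in step 3 can be sketched as follows. This is an illustration under stated assumptions, not the tap's code: the helper names `apply_bookmark` and `advance_bookmark` are hypothetical, the bookmark is assumed to be a plain string, and values are assumed to be ISO-8601 dates or timestamps so that string comparison matches chronological order.

```python
import copy


def apply_bookmark(config, bookmark):
    """Return a copy of config whose replication-key parameter starts at the bookmark."""
    updated = copy.deepcopy(config)
    if bookmark is None:
        return updated  # first run: use the value from config.json
    for param in updated.get("query_parameters", []):
        if param.get("replication_key") is True:
            # Assumption: never move the start point earlier than the configured value.
            param["value"] = max(param["value"], bookmark)
    return updated


def advance_bookmark(bookmark, seen_values):
    """Return the greatest value seen so far, including the previous bookmark."""
    candidates = [v for v in seen_values if v is not None]
    if bookmark is not None:
        candidates.append(bookmark)
    return max(candidates) if candidates else None


config = {
    "query_parameters": [
        {"key": "date_from", "value": "2025-08-01", "replication_key": True}
    ]
}

resumed = apply_bookmark(config, "2025-08-15")
print(resumed["query_parameters"][0]["value"])
```

This also shows why step 2 asks for an orderable format: the comparison only works when later values sort after earlier ones.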

Pipeline Usage

You can easily run tap-dune in a pipeline using Meltano or any other Singer-compatible tool.

Example with target-jsonl:

tap-dune --config config.json --catalog catalog.json | target-jsonl
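To inspect what flows through the pipe, the tap's output can be read as Singer messages, one JSON object per line with a `type` of SCHEMA, RECORD, or STATE. The sketch below counts messages by type and keeps the last STATE value. The function name, the stream name `dune_query`, and the sample lines are made up for illustration and are not real tap-dune output.

```python
import json
from collections import Counter


def summarize_singer_messages(lines):
    """Count Singer messages by type and return the last STATE value seen."""
    counts = Counter()
    last_state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between messages
        message = json.loads(line)
        counts[message["type"]] += 1
        if message["type"] == "STATE":
            last_state = message["value"]
    return counts, last_state


# Hand-written sample messages in the Singer format.
sample_lines = [
    '{"type": "SCHEMA", "stream": "dune_query", "schema": {"properties": {"day": {"type": "string"}}}, "key_properties": []}',
    '{"type": "RECORD", "stream": "dune_query", "record": {"day": "2025-08-01"}}',
    '{"type": "RECORD", "stream": "dune_query", "record": {"day": "2025-08-02"}}',
    '{"type": "STATE", "value": {"bookmark": "2025-08-02"}}',
]

counts, last_state = summarize_singer_messages(sample_lines)
print(dict(counts), last_state)
```

Because the function accepts any iterable of lines, passing `sys.stdin` in a small script would let it sit at the end of the pipe in place of a target for a quick check.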

Development

Initialize your Development Environment

# Clone the repository
git clone https://github.com/blueprint-data/tap-dune.git
cd tap-dune

# Install Poetry
pipx install poetry

# Install dependencies
poetry install

Development Workflow

This project follows Semantic Versioning and uses Conventional Commits for automatic versioning.

  1. Create a feature branch:

    git checkout -b feat/your-feature
    # or
    git checkout -b fix/your-bugfix
    
  2. Make your changes and commit using conventional commits:

    # For new features
    git commit -m "feat: add new feature X"
    
    # For bug fixes
    git commit -m "fix: resolve issue with Y"
    
    # For breaking changes
    git commit -m "feat: redesign API
    
    BREAKING CHANGE: This changes the API interface"
    

    Commit types:

    • feat: A new feature (minor version bump)
    • fix: A bug fix (patch version bump)
    • docs: Documentation only changes
    • style: Changes that don't affect the code's meaning
    • refactor: Code change that neither fixes a bug nor adds a feature
    • perf: Code change that improves performance
    • test: Adding missing tests
    • chore: Changes to the build process or auxiliary tools
    • BREAKING CHANGE: Not a commit type but a footer in the commit body; marks any change that breaks backward compatibility (major version bump)
  3. Run tests:

    poetry run pytest
    
  4. Create a pull request to main

Release Process

  1. Create a release branch from main:

    git checkout main
    git pull
    git checkout -b release
    
  2. Push the branch:

    git push -u origin release
    
  3. The release workflow will automatically:

    • Analyze commits since last release
    • Determine the next version number based on commit types:
      • fix: → patch version (1.0.0 → 1.0.1)
      • feat: → minor version (1.0.0 → 1.1.0)
      • BREAKING CHANGE: → major version (1.0.0 → 2.0.0)
    • Update CHANGELOG.md
    • Create a git tag with the new version
    • Create a GitHub release
    • Build and publish to PyPI

    Note: Only commits following the Conventional Commits format will trigger version updates.

  4. After successful release:

    • Create a PR from the release branch to main
    • This PR will contain all the version updates (CHANGELOG.md, version number)
    • Merge to keep main up-to-date with the latest release
    • Note: Only blueprint-data team members can merge to main
  5. Clean up:

    git checkout main
    git pull
    git branch -d release
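The version-bump rules from step 3 of the release process can be sketched as below. This is a simplified illustration, not the release workflow's actual logic: the function names are hypothetical, and only `fix:`, `feat:`, and a `BREAKING CHANGE:` footer are recognized.

```python
import re


def bump_level(commit_messages):
    """Return 'major', 'minor', 'patch', or None for a list of commit messages."""
    level = None
    for message in commit_messages:
        if "BREAKING CHANGE:" in message:
            return "major"  # a breaking change outranks everything else
        header = message.splitlines()[0] if message else ""
        if re.match(r"feat(\(.+\))?:", header):
            level = "minor"
        elif re.match(r"fix(\(.+\))?:", header) and level is None:
            level = "patch"
    return level


def next_version(current, commit_messages):
    """Apply the bump implied by the commits to a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in current.split("."))
    level = bump_level(commit_messages)
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current  # no qualifying commits: version unchanged


print(next_version("1.0.0", ["fix: resolve issue with Y"]))    # 1.0.1
print(next_version("1.0.0", ["feat: add new feature X"]))      # 1.1.0
print(next_version("1.0.0", ["docs: update README"]))          # 1.0.0
```

The last call shows the effect of the note in step 3: commits that do not map to a bump leave the version as it is.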
    

Repository Permissions

This repository follows these security practices:

  • Only blueprint-data team members can merge to main
  • All PRs require at least one review
  • All tests must pass before merging
  • Branch protection rules prevent bypassing these requirements

Testing

poetry run pytest

SDK Dev Guide

See the dev guide for more instructions on how to use the SDK to develop your own taps and targets.

