A dbt adapter that automatically optimizes BigQuery job costs using the Rabbit API

dbt-rabbit-bigquery

Automatically optimize your BigQuery costs in dbt without changing a single line of SQL.

The dbt-rabbit-bigquery adapter is a drop-in replacement for dbt-bigquery that intelligently routes your queries to the most cost-effective BigQuery resources using the Rabbit optimization platform. Save up to 60% on BigQuery costs while maintaining full compatibility with your existing dbt projects.

🎯 Why Use This Adapter?

The Problem

BigQuery offers multiple pricing options (on-demand, flat-rate slots, reservations), but choosing the right option for each query is complex and time-consuming. Most teams end up doing one of the following:

  • Overpay by using on-demand pricing for everything
  • Underutilize expensive slot commitments
  • Spend engineering time manually optimizing queries

The Solution

This adapter automatically analyzes each query and assigns it to the optimal BigQuery pricing model, ensuring you always get the best performance at the lowest cost—without any code changes.

Key Benefits

  • Zero Code Changes: Drop-in replacement for dbt-bigquery
  • 💰 Automatic Cost Optimization: Save up to 60% on BigQuery costs
  • 🚀 No Performance Impact: Sub-second API overhead
  • 🛡️ Production Ready: Graceful fallback if optimization fails
  • 📊 Full Transparency: Detailed logging and cost tracking

📦 Installation

Important: You must install the version that matches your dbt-bigquery version.

Step 1: Check Your dbt-bigquery Version

pip show dbt-bigquery
# Look for: Version: 1.8.3 (or 1.9.2, 1.10.3, etc.)

Step 2: Install the Matching Adapter Version

The adapter version format is {base_version}.{dbt-bigquery_version}. Install the version that matches your dbt-bigquery:

# For dbt-bigquery 1.8.3
pip install dbt-rabbit-bigquery==1.1.0.1.8.3

# For dbt-bigquery 1.9.2
pip install dbt-rabbit-bigquery==1.1.0.1.9.2

# For dbt-bigquery 1.10.3
pip install dbt-rabbit-bigquery==1.1.0.1.10.3

Installation in Requirements Files

requirements.txt:

dbt-bigquery==1.8.3
dbt-rabbit-bigquery==1.1.0.1.8.3

pyproject.toml:

[project]
dependencies = [
    "dbt-bigquery==1.8.3",
    "dbt-rabbit-bigquery==1.1.0.1.8.3",
]

Poetry (pyproject.toml):

[tool.poetry.dependencies]
dbt-bigquery = "1.8.3"
dbt-rabbit-bigquery = "1.1.0.1.8.3"

Supported Versions

| dbt-bigquery | dbt-rabbit-bigquery | Status |
|---|---|---|
| 1.8.3 | 1.1.0.1.8.3 | ✅ Supported |
| 1.9.2 | 1.1.0.1.9.2 | ✅ Supported |
| 1.10.3 | 1.1.0.1.10.3 | ✅ Supported |

Note: Always use the exact version match. The adapter version must match your dbt-bigquery version for compatibility.

Why This Versioning?

The adapter uses a hybrid versioning approach:

  • PyPI version (1.1.0.1.8.3): Encodes dbt-bigquery compatibility for publishing
  • dbt version (1.1.0): Valid semantic version for dbt's internal validation

This allows the adapter to support multiple dbt-bigquery versions while satisfying both PyPI and dbt requirements.
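
The version scheme described above can be sketched in a few lines of Python. This is only an illustration of the `{base_version}.{dbt-bigquery_version}` format; the helper names `adapter_pin` and `split_adapter_version` are hypothetical and not part of the adapter, and the default base version `1.1.0` is simply the one used in this README's examples.

```python
from typing import Tuple


def adapter_pin(dbt_bigquery_version: str, base_version: str = "1.1.0") -> str:
    """Build the pip requirement string for a given dbt-bigquery version."""
    return f"dbt-rabbit-bigquery=={base_version}.{dbt_bigquery_version}"


def split_adapter_version(pypi_version: str) -> Tuple[str, str]:
    """Split a six-part PyPI version into (base version, dbt-bigquery version)."""
    parts = pypi_version.split(".")
    if len(parts) != 6 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected six numeric parts, got {pypi_version!r}")
    return ".".join(parts[:3]), ".".join(parts[3:])


print(adapter_pin("1.9.2"))                   # dbt-rabbit-bigquery==1.1.0.1.9.2
print(split_adapter_version("1.1.0.1.10.3"))  # ('1.1.0', '1.10.3')
```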


🚀 Quick Start

1. Get Your Rabbit API Key

Sign up for Rabbit and get your API key: https://followrabbit.ai

Contact: success@followrabbit.ai

2. Update Your profiles.yml

Change your profile type from bigquery to rabbitbigquery:

my_project:
  target: dev
  outputs:
    dev:
      type: rabbitbigquery  # Changed from 'bigquery'
      method: service-account
      project: my-gcp-project
      dataset: my_dataset
      threads: 4
      keyfile: /path/to/service-account.json
      location: US

      # Rabbit configuration (3 lines added)
      rabbit_api_key: "{{ env_var('RABBIT_API_KEY') }}"
      rabbit_default_pricing_mode: on_demand
      rabbit_reservation_ids: "project:us.my-reservation"

3. Set Environment Variables

export RABBIT_API_KEY="your-api-key-here"

4. Run dbt as usual

dbt run

That's it! All your queries are now automatically optimized. 🎉


📖 Configuration

Complete Configuration Example

my_project:
  target: prod
  outputs:
    prod:
      # Standard BigQuery configuration (unchanged)
      type: rabbitbigquery
      method: service-account
      project: my-gcp-project
      dataset: analytics
      threads: 8
      keyfile: /path/to/service-account.json
      location: US

      # Rabbit optimization configuration
      rabbit_api_key: "{{ env_var('RABBIT_API_KEY') }}"
      rabbit_default_pricing_mode: on_demand  # or 'slot_based'

      # Multiple reservations (comma-separated or list format)
      rabbit_reservation_ids: "project:us.res1,project:eu.res2"
      # Or as a list:
      # rabbit_reservation_ids:
      #   - "project:us.res1"
      #   - "project:eu.res2"

      # Optional: Custom Rabbit API URL (for enterprise)
      rabbit_base_url: https://api.followrabbit.ai

      # Optional: Disable optimization temporarily
      rabbit_enabled: true

Configuration Parameters

Required Parameters

| Parameter | Description | Example |
|---|---|---|
| rabbit_api_key | Your Rabbit API key | "rb_1234..." |
| rabbit_default_pricing_mode | Default pricing model | "on_demand" or "slot_based" |
| rabbit_reservation_ids | BigQuery reservation IDs | "project:us.res1,project:eu.res2" |

Optional Parameters

| Parameter | Default | Description |
|---|---|---|
| rabbit_base_url | Production API | Custom API endpoint for enterprise |
| rabbit_enabled | true | Enable/disable optimization |
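
As a rough pre-flight check, the required parameters listed above can be verified on a target's settings before running dbt. The sketch below works on a plain dictionary (for example, one output block loaded from profiles.yml); the function `check_rabbit_settings` is hypothetical and is not something the adapter provides.

```python
from typing import Any, Dict, List

# Names and allowed values taken from the parameter tables in this README.
REQUIRED_KEYS = (
    "rabbit_api_key",
    "rabbit_default_pricing_mode",
    "rabbit_reservation_ids",
)
PRICING_MODES = ("on_demand", "slot_based")


def check_rabbit_settings(target: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key in REQUIRED_KEYS:
        if not target.get(key):
            problems.append(f"missing required setting: {key}")
    mode = target.get("rabbit_default_pricing_mode")
    if mode and mode not in PRICING_MODES:
        problems.append(f"unknown pricing mode: {mode}")
    return problems


example_target = {
    "type": "rabbitbigquery",
    "rabbit_api_key": "rb_1234...",
    "rabbit_default_pricing_mode": "on_demand",
    "rabbit_reservation_ids": "project:us.res1,project:eu.res2",
}
print(check_rabbit_settings(example_target))  # []
```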

Reservation ID Format

Reservation IDs should follow the BigQuery format:

project-id:location.reservation-name

Examples:

  • my-project:us-central1.reservation1
  • my-project:us.default-reservation
  • my-project:europe-west1.batch-processing
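
To make the format concrete, here is a minimal parsing sketch. It accepts either the comma-separated string or the list form shown in the configuration example. The regular expression is a deliberately loose assumption (it only checks for the `project:location.name` shape, not BigQuery's full naming rules), and `parse_reservation_ids` is a hypothetical helper, not the adapter's own parser.

```python
import re
from typing import List, NamedTuple, Union


class Reservation(NamedTuple):
    project: str
    location: str
    name: str


# Loose shape check: project-id ":" location "." reservation-name
_PATTERN = re.compile(
    r"^(?P<project>[^:.\s]+):(?P<location>[^:.\s]+)\.(?P<name>[^:.\s]+)$"
)


def parse_reservation_ids(value: Union[str, List[str]]) -> List[Reservation]:
    """Parse reservation IDs given as a comma-separated string or a list."""
    items = value.split(",") if isinstance(value, str) else list(value)
    parsed = []
    for item in items:
        match = _PATTERN.match(item.strip())
        if match is None:
            raise ValueError(f"not in project:location.name form: {item!r}")
        parsed.append(Reservation(**match.groupdict()))
    return parsed


for reservation in parse_reservation_ids(
    "my-project:us-central1.reservation1, my-project:us.default-reservation"
):
    print(reservation.project, reservation.location, reservation.name)
```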

💡 How It Works

graph LR
    A[dbt SQL Model] --> B[Rabbit Adapter]
    B --> C{Analyze Query}
    C --> D[Rabbit API]
    D --> E{Optimize}
    E --> F[Assign Optimal Reservation]
    F --> G[BigQuery]
    G --> H[Results]

  1. Intercept: The adapter captures each BigQuery job configuration
  2. Analyze: Sends metadata to Rabbit API (query, project, reservations)
  3. Optimize: Rabbit analyzes query characteristics and assigns optimal pricing
  4. Execute: Job runs on BigQuery with optimized configuration
  5. Track: View savings and performance in Rabbit dashboard

What Gets Sent to Rabbit?

  • SQL query text
  • Job configuration (not your data)
  • Available reservation options

What Rabbit Returns:

  • Optimized job configuration
  • Assigned reservation
  • Expected cost savings
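
The flow above, together with the graceful fallback listed under Key Benefits, can be pictured roughly as follows. This is a conceptual sketch only, not the adapter's source: `run_job`, `optimize`, and `execute` are hypothetical stand-ins for the adapter internals, the Rabbit API call, and the BigQuery job submission, and the dictionary keys are made up for illustration.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("rabbit_sketch")
JobConfig = Dict[str, Any]


def run_job(
    job_config: JobConfig,
    optimize: Callable[[JobConfig], JobConfig],
    execute: Callable[[JobConfig], Any],
) -> Any:
    """Try to optimize the job configuration; fall back to the original on error."""
    try:
        config_to_run = optimize(job_config)
    except Exception as exc:  # any optimizer failure triggers the fallback
        logger.warning("Optimization failed, using original configuration: %s", exc)
        config_to_run = job_config
    return execute(config_to_run)


# Stand-in callables so the sketch runs without any network access.
def fake_optimize(config: JobConfig) -> JobConfig:
    return {**config, "reservation": "project:us.res1"}


def failing_optimize(config: JobConfig) -> JobConfig:
    raise TimeoutError("API unreachable")


def fake_execute(config: JobConfig) -> JobConfig:
    return config


original = {"query": "SELECT 1"}
print(run_job(original, fake_optimize, fake_execute))
print(run_job(original, failing_optimize, fake_execute))
```

In the second call the optimizer raises, so the job runs with the unchanged configuration and a warning is logged.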

📊 Monitoring & Verification

View Logs

Console Output

dbt run --debug

Log File

grep "RabbitBigQuery" logs/dbt.log

Example Log Output

INFO: Rabbit optimization enabled | Default pricing mode: on_demand | Reservations: ['project:us.res1']
INFO: Optimized Job executed successfully
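
If you prefer to filter the log from Python (for example inside a CI step), a small helper along these lines does the same job as grep. The function name `rabbit_log_lines` is hypothetical, and the case-insensitive match on "rabbit" is an assumption about what the relevant lines contain.

```python
from pathlib import Path
from typing import Iterable, List


def rabbit_log_lines(lines: Iterable[str], needle: str = "rabbit") -> List[str]:
    """Return the lines that mention the needle, ignoring case."""
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


log_path = Path("logs/dbt.log")  # default dbt log location used in this README
if log_path.exists():
    with log_path.open(encoding="utf-8") as handle:
        for line in rabbit_log_lines(handle):
            print(line)
```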

Verify in Rabbit Dashboard

The easiest way to verify optimization and see cost savings:

  1. Log in to your Rabbit dashboard
  2. View optimized jobs in real-time
  3. See cost savings per query
  4. Track monthly savings trends

❓ FAQ & Common Concerns

Security & Privacy

Q: Does Rabbit have access to my data? A: No. Rabbit only receives job metadata (SQL queries and configuration), not your actual data. Your data never leaves BigQuery.

Q: How is my API key stored? A: We recommend using environment variables ({{ env_var('RABBIT_API_KEY') }}) to keep keys out of version control. The adapter never logs API keys.

Q: Is this SOC 2 compliant? A: Yes. Rabbit is SOC 2 Type II certified. Contact us for compliance documentation.

Performance

Q: What's the performance overhead? A: Typical API latency is 100-500 ms per query. For long-running queries (>10 seconds), this is negligible (<5% overhead). For very short queries, the relative share is larger, but the absolute delay stays under a second.
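
The relative overhead is simply API latency divided by query runtime. The short calculation below uses the upper end of the typical latency range quoted above (500 ms); the runtimes are arbitrary illustrative values.

```python
def overhead_fraction(api_latency_s: float, query_runtime_s: float) -> float:
    """API latency expressed as a fraction of the query's own runtime."""
    if query_runtime_s <= 0:
        raise ValueError("query runtime must be positive")
    return api_latency_s / query_runtime_s


for runtime_s in (1, 10, 60):
    share = overhead_fraction(0.5, runtime_s)
    print(f"{runtime_s:>2} s query, 500 ms latency: {share:.1%} relative overhead")
```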

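Q: Can I disable optimization for specific models? A: Optimization is switched on or off per profile target or per run. Set rabbit_enabled: false in your profile, or set the DBT_RABBIT_ENABLED=false environment variable for specific runs (for example, together with dbt's --select to limit the run to certain models).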

Q: Does this affect dbt's multi-threading? A: No. dbt's threading works exactly as before. Optimization happens independently per thread.

Cost & ROI

Q: How much does Rabbit cost? A: Pricing is based on BigQuery spend or queries processed. Most customers save 5-10x more than the Rabbit fee. Contact success@followrabbit.ai for pricing.

Q: What if optimization makes things more expensive? A: Rabbit's algorithm is designed to always reduce costs. If optimization fails or would increase costs, it falls back to your original configuration.

Q: Can I see cost savings before committing? A: Yes. Use the Rabbit dashboard to see potential savings based on your query patterns. We also offer free trials.

Reliability

Q: What happens if Rabbit API is down? A: The adapter falls back to your original configuration gracefully. Your dbt jobs continue running normally with a warning logged.

Q: Will this break my existing dbt project? A: No. This is a drop-in replacement for dbt-bigquery. All standard dbt functionality works identically.

Q: Can I roll back quickly? A: Yes. Simply change type: rabbitbigquery back to type: bigquery in your profiles.yml. No code changes needed.

Integration

Q: Does this work with dbt Cloud? A: Currently, this adapter is designed for dbt Core. Contact us for dbt Cloud integration options.

Q: Can I use this with other dbt packages? A: Yes. This adapter is fully compatible with all dbt packages and features.

Q: Does this work with Airflow/Dagster/Prefect? A: Yes. Any orchestration tool that runs dbt Core will work seamlessly.


🔧 Advanced Usage

Disable Optimization for Specific Runs

# Via environment variable
DBT_RABBIT_ENABLED=false dbt run

# Or in profiles.yml
rabbit_enabled: false
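
When dbt is launched from a Python script or an orchestrator task, the same environment variable can be set on the child process. This is a sketch: `dbt_env` is a hypothetical helper, and the value "true" for re-enabling is an assumption, since only `DBT_RABBIT_ENABLED=false` is shown above.

```python
import os
from typing import Dict, Mapping, Optional


def dbt_env(
    rabbit_enabled: bool, base_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Copy the environment and set DBT_RABBIT_ENABLED for a single dbt run."""
    env = dict(os.environ if base_env is None else base_env)
    env["DBT_RABBIT_ENABLED"] = "true" if rabbit_enabled else "false"
    return env


env = dbt_env(rabbit_enabled=False)
print(env["DBT_RABBIT_ENABLED"])  # false

# With dbt installed and on PATH, the run itself would look like:
#   import subprocess
#   subprocess.run(["dbt", "run"], env=env, check=True)
```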

Multiple Environments

my_project:
  target: prod
  outputs:
    dev:
      type: bigquery  # Standard adapter in dev
      # ... config ...

    prod:
      type: rabbitbigquery  # Optimize in production
      # ... config ...
      rabbit_api_key: "{{ env_var('RABBIT_API_KEY') }}"
      rabbit_default_pricing_mode: on_demand
      rabbit_reservation_ids: "project:us.my-reservation"

Regional Reservations

Use different reservations per region:

rabbit_reservation_ids:
  - "my-project:us-central1.us-reservation"
  - "my-project:europe-west1.eu-reservation"
  - "my-project:asia-east1.asia-reservation"

Rabbit automatically selects the optimal reservation based on data location.
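
How Rabbit chooses among these is not documented here, but the location is already encoded in each reservation ID, so it is easy to see which reservations are candidates for a given region. The sketch below only groups IDs by their location part; `group_by_location` is a hypothetical helper and does not reproduce Rabbit's selection logic.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_location(reservation_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Group project:location.name reservation IDs by their location part."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for reservation_id in reservation_ids:
        _project, _, rest = reservation_id.partition(":")
        location, _, _name = rest.partition(".")
        groups[location].append(reservation_id)
    return dict(groups)


reservations = [
    "my-project:us-central1.us-reservation",
    "my-project:europe-west1.eu-reservation",
    "my-project:asia-east1.asia-reservation",
]
for location, ids in group_by_location(reservations).items():
    print(location, ids)
```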

Debug Mode

Enable detailed logging:

dbt run --debug 2>&1 | tee dbt-debug.log
grep "RabbitBigQuery" dbt-debug.log

🧪 Examples

See the examples/ directory for:

  • Complete dbt project setup
  • Multi-environment configurations
  • CI/CD integration examples
  • Custom optimization scenarios

Quick test:

cd examples/
./setup.sh
dbt run --select test_simple_query

📊 Compatibility

Component Version
dbt-core ≥1.5.0
dbt-bigquery 1.8.3, 1.9.2, 1.10.3 (see Installation for exact version matching)
Python ≥3.8
BigQuery API v2

Version Matching: You must install the dbt-rabbit-bigquery version that matches your dbt-bigquery version. See the Installation section above for details.


🐛 Troubleshooting

Adapter Not Found

Error: No module named 'dbt.adapters.rabbitbigquery'

Solution:

# First, check your dbt-bigquery version
pip show dbt-bigquery

# Install the matching adapter version (replace with your dbt-bigquery version)
pip install dbt-rabbit-bigquery==1.1.0.1.8.3  # For dbt-bigquery 1.8.3
# Or: pip install dbt-rabbit-bigquery==1.1.0.1.9.2  # For dbt-bigquery 1.9.2
# Or: pip install dbt-rabbit-bigquery==1.1.0.1.10.3  # For dbt-bigquery 1.10.3

dbt debug  # Verify installation

Version Mismatch or Validation Error

Error: "1.1.0.1.8.3" is not a valid semantic version (or a similar version validation error)

Solution:

  • Ensure you've installed the correct adapter version matching your dbt-bigquery version
  • Verify versions match:
    pip show dbt-bigquery dbt-rabbit-bigquery
    
  • The adapter version format is {base}.{dbt-bigquery-version} (e.g., 1.1.0.1.8.3 for dbt-bigquery==1.8.3)
  • If versions don't match, reinstall with the correct version:
    pip install --upgrade dbt-rabbit-bigquery==1.1.0.1.8.3  # Use your dbt-bigquery version
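
The same comparison can be scripted with the standard library's `importlib.metadata`. This is a sketch under the assumption that the adapter's PyPI version always ends with the three-part dbt-bigquery version, as described in the Installation section; `installed` and `versions_match` are hypothetical helper names.

```python
from importlib import metadata
from typing import Optional


def installed(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def versions_match(dbt_bigquery_version: str, adapter_version: str) -> bool:
    """True if the adapter version's last three parts equal the dbt-bigquery version."""
    return adapter_version.split(".")[3:] == dbt_bigquery_version.split(".")


bigquery_version = installed("dbt-bigquery")
adapter_version = installed("dbt-rabbit-bigquery")
if bigquery_version is None or adapter_version is None:
    print("dbt-bigquery or dbt-rabbit-bigquery is not installed")
elif versions_match(bigquery_version, adapter_version):
    print(f"OK: {adapter_version} matches dbt-bigquery {bigquery_version}")
else:
    print(f"Mismatch: {adapter_version} vs dbt-bigquery {bigquery_version}")
```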
    

API Key Issues

Error: Rabbit optimization failed: Invalid API key

Solution:

  1. Verify environment variable is set: echo $RABBIT_API_KEY
  2. Check profiles.yml uses correct env_var syntax: "{{ env_var('RABBIT_API_KEY') }}"
  3. Ensure no trailing spaces or special characters
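
These checks can also be run from Python before invoking dbt. The sketch below inspects the environment variable for the issues listed above; `api_key_problems` is a hypothetical helper, and treating whitespace and non-printable characters as the "special characters" to look for is an assumption. It does not contact the Rabbit API, so it cannot tell whether the key itself is valid.

```python
import os
from typing import List, Optional


def api_key_problems(value: Optional[str]) -> List[str]:
    """Return a list of formatting problems found in an API key value."""
    if value is None:
        return ["RABBIT_API_KEY is not set"]
    problems = []
    if not value:
        problems.append("RABBIT_API_KEY is empty")
    if value != value.strip():
        problems.append("leading or trailing whitespace")
    if any(not ch.isprintable() for ch in value):
        problems.append("non-printable characters")
    return problems


for problem in api_key_problems(os.environ.get("RABBIT_API_KEY")):
    print("Problem:", problem)
```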

Optimization Not Working

Symptoms: No cost savings, Rabbit dashboard shows no activity

Debug steps:

# 1. Enable debug logging
dbt run --debug --select your_model

# 2. Check for Rabbit log entries
grep -i rabbit logs/dbt.log

# 3. Verify configuration
dbt debug

# 4. Test API connectivity
python3 -c "from rabbit_bq_job_optimizer import RabbitBQJobOptimizer; \
  client = RabbitBQJobOptimizer(api_key='YOUR_KEY'); print('✓ Connected')"

Performance Issues

If you experience unusual slowness:

  1. Check Rabbit API status: status.followrabbit.ai
  2. Temporarily disable: rabbit_enabled: false
  3. Contact support with job IDs: success@followrabbit.ai

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for:

  • Development setup
  • Code style guidelines
  • Testing requirements
  • Pull request process

Quick Start for Contributors

# Clone and setup
git clone https://github.com/your-username/dbt-rabbit-bigquery.git
cd dbt-rabbit-bigquery
python3 -m venv venv
source venv/bin/activate

# Install with dev dependencies
pip install -e ".[dev]"

# Set up pre-commit hooks (runs checks automatically before each commit)
pre-commit install

# Run all checks manually
pre-commit run --all-files

Pre-commit hooks automatically run:

  • Black - Code formatting (config: pyproject.toml)
  • Flake8 - Linting (config: .flake8)
  • MyPy - Type checking (config: pyproject.toml)
  • Pydocstyle - Docstring validation (config: pyproject.toml)
  • File checks - Trailing whitespace, EOF, YAML validation

Note: All linting and formatting tools use centralized configuration files that are shared between pre-commit hooks and CI/CD pipelines, ensuring consistency.


📄 License

Apache License 2.0 - see LICENSE for details.


🗺️ Roadmap

  • Support for dbt Cloud
  • Additional optimization strategies (query rewriting, caching)
  • Real-time cost dashboards in dbt docs
  • Integration with Snowflake and Redshift
  • Auto-detection of optimal pricing models

⭐ Show Your Support

If this adapter saves you money, give us a star! ⭐

It helps others discover cost optimization for their dbt projects.


🙏 Acknowledgments

Built with ♥️ by the Rabbit team.
