A dbt adapter that automatically optimizes BigQuery job costs using the Rabbit API
dbt-rabbit-bigquery
Automatically optimize your BigQuery costs in dbt without changing a single line of SQL.
The dbt-rabbit-bigquery adapter is a drop-in replacement for dbt-bigquery that intelligently routes your queries to the most cost-effective BigQuery resources using the Rabbit optimization platform. Save up to 60% on BigQuery costs while maintaining full compatibility with your existing dbt projects.
🎯 Why Use This Adapter?
The Problem
BigQuery offers multiple pricing options (on-demand, flat-rate slots, reservations), but choosing the right option for each query is complex and time-consuming. Most teams either:
- Overpay by using on-demand pricing for everything
- Underutilize expensive slot commitments
- Spend engineering time manually optimizing queries
The Solution
This adapter automatically analyzes each query and assigns it to the optimal BigQuery pricing model, ensuring you always get the best performance at the lowest cost—without any code changes.
Key Benefits
- ✅ Zero Code Changes: Drop-in replacement for dbt-bigquery
- 💰 Automatic Cost Optimization: Save up to 60% on BigQuery costs
- 🚀 No Performance Impact: Sub-second API overhead
- 🛡️ Production Ready: Graceful fallback if optimization fails
- 📊 Full Transparency: Detailed logging and cost tracking
📦 Installation
Important: You must install the version that matches your dbt-bigquery version.
Step 1: Check Your dbt-bigquery Version
pip show dbt-bigquery
# Look for: Version: 1.8.3 (or 1.9.2, 1.10.3, etc.)
Step 2: Install the Matching Adapter Version
The adapter version format is {base_version}.{dbt-bigquery_version}. Install the version that matches your dbt-bigquery:
# For dbt-bigquery 1.8.3
pip install dbt-rabbit-bigquery==1.1.0.1.8.3
# For dbt-bigquery 1.9.2
pip install dbt-rabbit-bigquery==1.1.0.1.9.2
# For dbt-bigquery 1.10.3
pip install dbt-rabbit-bigquery==1.1.0.1.10.3
Installation in Requirements Files
requirements.txt:
dbt-bigquery==1.8.3
dbt-rabbit-bigquery==1.1.0.1.8.3
pyproject.toml:
[project]
dependencies = [
"dbt-bigquery==1.8.3",
"dbt-rabbit-bigquery==1.1.0.1.8.3",
]
Poetry (pyproject.toml):
[tool.poetry.dependencies]
dbt-bigquery = "1.8.3"
dbt-rabbit-bigquery = "1.1.0.1.8.3"
Supported Versions
| dbt-bigquery | dbt-rabbit-bigquery | Status |
|---|---|---|
| 1.8.3 | 1.1.0.1.8.3 | ✅ Supported |
| 1.9.2 | 1.1.0.1.9.2 | ✅ Supported |
| 1.10.3 | 1.1.0.1.10.3 | ✅ Supported |
Note: Always use the exact version match. The adapter version must match your dbt-bigquery version for compatibility.
Why This Versioning?
The adapter uses a hybrid versioning approach:
- PyPI version (1.1.0.1.8.3): Encodes dbt-bigquery compatibility for publishing
- dbt version (1.1.0): Valid semantic version for dbt's internal validation
This allows the adapter to support multiple dbt-bigquery versions while satisfying both PyPI and dbt requirements.
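To make the two-part scheme concrete, here is a minimal Python sketch that splits a PyPI version string into its two halves. The helper name `split_adapter_version` is hypothetical and not part of the adapter, and the sketch assumes the base version always has exactly three dot-separated components, as in every example above.

```python
from typing import NamedTuple


class AdapterVersion(NamedTuple):
    base: str          # the adapter's own semantic version, e.g. "1.1.0"
    dbt_bigquery: str  # the dbt-bigquery version it targets, e.g. "1.8.3"


def split_adapter_version(pypi_version: str) -> AdapterVersion:
    """Split '1.1.0.1.8.3' into base '1.1.0' and dbt-bigquery '1.8.3'.

    Assumption: the base version has exactly three numeric components.
    """
    parts = pypi_version.split(".")
    if len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unexpected adapter version: {pypi_version!r}")
    return AdapterVersion(".".join(parts[:3]), ".".join(parts[3:]))


print(split_adapter_version("1.1.0.1.10.3"))
# AdapterVersion(base='1.1.0', dbt_bigquery='1.10.3')
```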
🚀 Quick Start
1. Get Your Rabbit API Key
Sign up for Rabbit and get your API key: https://followrabbit.ai
Contact: success@followrabbit.ai
2. Update Your profiles.yml
Change your profile type from bigquery to rabbitbigquery:
my_project:
  target: dev
  outputs:
    dev:
      type: rabbitbigquery # Changed from 'bigquery'
      method: service-account
      project: my-gcp-project
      dataset: my_dataset
      threads: 4
      keyfile: /path/to/service-account.json
      location: US
      # Rabbit configuration (3 lines added)
      rabbit_api_key: "{{ env_var('RABBIT_API_KEY') }}"
      rabbit_default_pricing_mode: on_demand
      rabbit_reservation_ids: "project:us.my-reservation"
3. Set Environment Variables
export RABBIT_API_KEY="your-api-key-here"
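Before running dbt, it can help to confirm the variable looks the way the profile expects. The following is a small preflight sketch in Python, not something shipped with the adapter: the function name `api_key_problems` is hypothetical, and the checks (missing, empty, surrounding whitespace, stray quotes, non-printable characters) are only an assumption about what commonly goes wrong with exported keys.

```python
import os
from typing import List, Mapping, Optional


def api_key_problems(
    env: Optional[Mapping[str, str]] = None,
    name: str = "RABBIT_API_KEY",
) -> List[str]:
    """Return human-readable problems with the API key variable (empty list = OK)."""
    env = os.environ if env is None else env
    value = env.get(name)
    if value is None:
        return [f"{name} is not set"]

    problems = []
    if not value:
        problems.append(f"{name} is empty")
    if value != value.strip():
        problems.append(f"{name} has leading or trailing whitespace")
    if any(quote in value for quote in ("'", '"')):
        problems.append(f"{name} contains quote characters")
    if not value.isprintable():
        problems.append(f"{name} contains non-printable characters")
    return problems


# Checks the current process environment; prints nothing when the key looks fine.
for problem in api_key_problems():
    print("WARNING:", problem)
```

Passing a plain dictionary instead of the real environment makes the check easy to exercise in tests.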
4. Run dbt as usual
dbt run
That's it! All your queries are now automatically optimized. 🎉
📖 Configuration
Complete Configuration Example
my_project:
  target: prod
  outputs:
    prod:
      # Standard BigQuery configuration (unchanged)
      type: rabbitbigquery
      method: service-account
      project: my-gcp-project
      dataset: analytics
      threads: 8
      keyfile: /path/to/service-account.json
      location: US
      # Rabbit optimization configuration
      rabbit_api_key: "{{ env_var('RABBIT_API_KEY') }}"
      rabbit_default_pricing_mode: on_demand # or 'slot_based'
      # Multiple reservations (comma-separated or list format)
      rabbit_reservation_ids: "project:us.res1,project:eu.res2"
      # Or as a list:
      # rabbit_reservation_ids:
      #   - "project:us.res1"
      #   - "project:eu.res2"
      # Optional: Custom Rabbit API URL (for enterprise)
      rabbit_base_url: https://api.followrabbit.ai
      # Optional: Disable optimization temporarily
      rabbit_enabled: true
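Since rabbit_reservation_ids accepts either a comma-separated string or a list, both forms presumably end up as the same list of IDs. The sketch below shows one way such normalization could work; `normalize_reservation_ids` is a hypothetical helper written for illustration and is not taken from the adapter's source.

```python
from typing import List, Optional, Union


def normalize_reservation_ids(value: Optional[Union[str, List[str]]]) -> List[str]:
    """Turn a comma-separated string or a list into a clean list of IDs."""
    if value is None:
        return []
    # A string is split on commas; a list is used as given.
    items = value.split(",") if isinstance(value, str) else value
    # Drop surrounding whitespace and skip empty entries (e.g. a trailing comma).
    return [item.strip() for item in items if item.strip()]


print(normalize_reservation_ids("project:us.res1,project:eu.res2"))
print(normalize_reservation_ids(["project:us.res1", "project:eu.res2"]))
# Both calls print ['project:us.res1', 'project:eu.res2']
```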
Configuration Parameters
Required Parameters
| Parameter | Description | Example |
|---|---|---|
| rabbit_api_key | Your Rabbit API key | "rb_1234..." |
| rabbit_default_pricing_mode | Default pricing model | "on_demand" or "slot_based" |
| rabbit_reservation_ids | BigQuery reservation IDs | "project:us.res1,project:eu.res2" |
Optional Parameters
| Parameter | Default | Description |
|---|---|---|
| rabbit_base_url | Production API | Custom API endpoint for enterprise |
| rabbit_enabled | true | Enable/disable optimization |
Reservation ID Format
Reservation IDs should follow the BigQuery format:
project-id:location.reservation-name
Examples:
- my-project:us-central1.reservation1
- my-project:us.default-reservation
- my-project:europe-west1.batch-processing
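A reservation ID in this format can be checked before it goes into profiles.yml. The following Python sketch parses the project-id:location.reservation-name shape; the `ReservationId` class and `parse_reservation_id` function are hypothetical names, and the sketch assumes that neither the location nor the reservation name contains a dot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReservationId:
    project: str
    location: str
    name: str


def parse_reservation_id(raw: str) -> ReservationId:
    """Parse 'project-id:location.reservation-name' into its three parts."""
    # Split on the last ':' so everything before it is treated as the project.
    project, colon, rest = raw.strip().rpartition(":")
    # The first '.' after the colon separates location from reservation name.
    location, dot, name = rest.partition(".")
    if not (colon and dot and project and location and name):
        raise ValueError(f"expected project-id:location.reservation-name, got {raw!r}")
    return ReservationId(project, location, name)


print(parse_reservation_id("my-project:europe-west1.batch-processing"))
```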
💡 How It Works
graph LR
A[dbt SQL Model] --> B[Rabbit Adapter]
B --> C{Analyze Query}
C --> D[Rabbit API]
D --> E{Optimize}
E --> F[Assign Optimal Reservation]
F --> G[BigQuery]
G --> H[Results]
- Intercept: The adapter captures each BigQuery job configuration
- Analyze: Sends metadata to Rabbit API (query, project, reservations)
- Optimize: Rabbit analyzes query characteristics and assigns optimal pricing
- Execute: Job runs on BigQuery with optimized configuration
- Track: View savings and performance in Rabbit dashboard
What Gets Sent to Rabbit?
- SQL query text
- Job configuration (not your data)
- Available reservation options
What Rabbit Returns:
- Optimized job configuration
- Assigned reservation
- Expected cost savings
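The intercept, optimize, and execute steps, together with the graceful fallback listed under Key Benefits, can be sketched roughly as follows. This is an illustration of the pattern only: the `optimize` callable is a hypothetical stand-in for the call to the Rabbit API, the dictionary-shaped job configuration is an assumption, and none of this is the adapter's actual code.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("rabbit_fallback_sketch")

JobConfig = Dict[str, Any]


def optimize_or_fallback(
    job_config: JobConfig,
    optimize: Callable[[JobConfig], JobConfig],
) -> JobConfig:
    """Return the optimized config, or the original one if optimization fails."""
    try:
        # Hand over a shallow copy so the optimizer cannot alter the original keys.
        return optimize(dict(job_config))
    except Exception as exc:  # broad on purpose: any failure triggers the fallback
        logger.warning("Optimization failed, using original configuration: %s", exc)
        return job_config


def fake_optimizer(config: JobConfig) -> JobConfig:
    """Hypothetical stand-in for the Rabbit API call: assigns a reservation."""
    return {**config, "reservation": "my-project:us.default-reservation"}


def broken_optimizer(config: JobConfig) -> JobConfig:
    """Simulates an unreachable API."""
    raise ConnectionError("API unreachable")


original = {"query": "SELECT 1", "project": "my-gcp-project"}
print(optimize_or_fallback(original, fake_optimizer))    # config with a reservation
print(optimize_or_fallback(original, broken_optimizer))  # original config, warning logged
```

The point of the pattern is that a failure in the optimization step degrades to the unoptimized job rather than failing the dbt run.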
📊 Monitoring & Verification
View Logs
Console Output
dbt run --debug
Log File
cat logs/dbt.log | grep "RabbitBigQuery"
Example Log Output
INFO: Rabbit optimization enabled | Default pricing mode: on_demand | Reservations: ['project:us.res1']
INFO: Optimized Job executed successfully
Verify in Rabbit Dashboard
The easiest way to verify optimization and see cost savings:
- Log in to your Rabbit dashboard
- View optimized jobs in real-time
- See cost savings per query
- Track monthly savings trends
❓ FAQ & Common Concerns
Security & Privacy
Q: Does Rabbit have access to my data?
A: No. Rabbit only receives job metadata (SQL queries and configuration), not your actual data. Your data never leaves BigQuery.
Q: How is my API key stored?
A: We recommend using environment variables ({{ env_var('RABBIT_API_KEY') }}) to keep keys out of version control. The adapter never logs API keys.
Q: Is this SOC 2 compliant?
A: Yes. Rabbit is SOC 2 Type II certified. Contact us for compliance documentation.
Performance
Q: What's the performance overhead?
A: Typical API latency is 100-500ms per query. For long-running queries (>10 seconds), this is negligible (<5% overhead). For very short queries, the relative overhead is larger, but the added delay is still typically under half a second.
Q: Can I disable optimization for specific models?
A: Yes. Set rabbit_enabled: false in your profile or use environment variables for specific runs.
Q: Does this affect dbt's multi-threading?
A: No. dbt's threading works exactly as before. Optimization happens independently per thread.
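The overhead figures quoted in the first answer of this section follow from simple division, which the short Python sketch below reproduces. The function name is made up for illustration, and the query runtimes are arbitrary examples rather than measurements.

```python
def relative_overhead(api_latency_s: float, query_runtime_s: float) -> float:
    """Added API latency as a fraction of the query's own runtime."""
    if query_runtime_s <= 0:
        raise ValueError("query runtime must be positive")
    return api_latency_s / query_runtime_s


# Upper end of the quoted 100-500 ms range against a few query runtimes.
for runtime_s in (1.0, 10.0, 60.0):
    share = relative_overhead(0.5, runtime_s)
    print(f"{runtime_s:>4.0f} s query: {share:.1%} overhead")
```

A 500 ms call adds 5% to a 10-second query but 50% to a 1-second one, so the relative cost matters mostly for projects dominated by very short queries.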
Cost & ROI
Q: How much does Rabbit cost?
A: Pricing is based on BigQuery spend or queries processed. Most customers save 5-10x more than the Rabbit fee. Contact success@followrabbit.ai for pricing.
Q: What if optimization makes things more expensive?
A: Rabbit's algorithm is designed to always reduce costs. If optimization fails or would increase costs, it falls back to your original configuration.
Q: Can I see cost savings before committing?
A: Yes. Use the Rabbit dashboard to see potential savings based on your query patterns. We also offer free trials.
Reliability
Q: What happens if the Rabbit API is down?
A: The adapter falls back to your original configuration gracefully. Your dbt jobs continue running normally with a warning logged.
Q: Will this break my existing dbt project?
A: No. This is a drop-in replacement for dbt-bigquery. All standard dbt functionality works identically.
Q: Can I roll back quickly?
A: Yes. Simply change type: rabbitbigquery back to type: bigquery in your profiles.yml. No code changes needed.
Integration
Q: Does this work with dbt Cloud?
A: Currently, this adapter is designed for dbt Core. Contact us for dbt Cloud integration options.
Q: Can I use this with other dbt packages?
A: Yes. This adapter is fully compatible with all dbt packages and features.
Q: Does this work with Airflow/Dagster/Prefect?
A: Yes. Any orchestration tool that runs dbt Core will work seamlessly.
🔧 Advanced Usage
Disable Optimization for Specific Runs
# Via environment variable
DBT_RABBIT_ENABLED=false dbt run
# Or in profiles.yml
rabbit_enabled: false
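When dbt is launched from a Python-based orchestrator, the same per-run toggle can be passed through the child process environment. The sketch below only builds the command and environment and does not start dbt; the helper name `dbt_invocation` is hypothetical, and the value "true" for the enabled case is an assumption, since only DBT_RABBIT_ENABLED=false appears above.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple


def dbt_invocation(
    enabled: bool,
    base_env: Optional[Mapping[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Build the command and environment for one dbt run."""
    # Copy the environment so the parent process is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["DBT_RABBIT_ENABLED"] = "true" if enabled else "false"  # "true" is assumed
    return ["dbt", "run"], env


cmd, env = dbt_invocation(enabled=False)
print(cmd, env["DBT_RABBIT_ENABLED"])
# To actually launch dbt:
#   import subprocess
#   subprocess.run(cmd, env=env, check=True)
```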
Multiple Environments
my_project:
  target: prod
  outputs:
    dev:
      type: bigquery # Standard adapter in dev
      # ... config ...
    prod:
      type: rabbitbigquery # Optimize in production
      # ... config ...
      rabbit_api_key: "{{ env_var('RABBIT_API_KEY') }}"
Regional Reservations
Use different reservations per region:
rabbit_reservation_ids:
- "my-project:us-central1.us-reservation"
- "my-project:europe-west1.eu-reservation"
- "my-project:asia-east1.asia-reservation"
Rabbit automatically selects the optimal reservation based on data location.
Debug Mode
Enable detailed logging:
dbt run --debug 2>&1 | tee dbt-debug.log
grep "RabbitBigQuery" dbt-debug.log
🧪 Examples
See the examples/ directory for:
- Complete dbt project setup
- Multi-environment configurations
- CI/CD integration examples
- Custom optimization scenarios
Quick test:
cd examples/
./setup.sh
dbt run --select test_simple_query
📊 Compatibility
| Component | Version |
|---|---|
| dbt-core | ≥1.5.0 |
| dbt-bigquery | 1.8.3, 1.9.2, 1.10.3 (see Installation for exact version matching) |
| Python | ≥3.8 |
| BigQuery API | v2 |
Version Matching: You must install the dbt-rabbit-bigquery version that matches your dbt-bigquery version. See the Installation section above for details.
🐛 Troubleshooting
Adapter Not Found
Error: No module named 'dbt.adapters.rabbitbigquery'
Solution:
# First, check your dbt-bigquery version
pip show dbt-bigquery
# Install the matching adapter version (replace with your dbt-bigquery version)
pip install dbt-rabbit-bigquery==1.1.0.1.8.3 # For dbt-bigquery 1.8.3
# Or: pip install dbt-rabbit-bigquery==1.1.0.1.9.2 # For dbt-bigquery 1.9.2
# Or: pip install dbt-rabbit-bigquery==1.1.0.1.10.3 # For dbt-bigquery 1.10.3
dbt debug # Verify installation
Version Mismatch or Validation Error
Error: "1.1.0.1.8.3" is not a valid semantic version or version validation errors
Solution:
- Ensure you've installed the correct adapter version matching your dbt-bigquery version
- Verify versions match:
pip show dbt-bigquery dbt-rabbit-bigquery
- The adapter version format is {base}.{dbt-bigquery-version} (e.g., 1.1.0.1.8.3 for dbt-bigquery==1.8.3)
- If versions don't match, reinstall with the correct version:
pip install --upgrade dbt-rabbit-bigquery==1.1.0.1.8.3 # Use your dbt-bigquery version
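The same comparison can be scripted, for example as a CI check. This Python sketch reads the installed versions with the standard library's importlib.metadata and tests whether the adapter version ends with the dbt-bigquery version; the helper names are hypothetical, and the suffix test is an assumption derived from the {base}.{dbt-bigquery-version} format rather than the adapter's own validation logic.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def versions_match(adapter_version: str, bigquery_version: str) -> bool:
    """True when the adapter version ends with '.<dbt-bigquery version>'."""
    return adapter_version.endswith("." + bigquery_version)


adapter = installed("dbt-rabbit-bigquery")
bigquery = installed("dbt-bigquery")
if adapter is None or bigquery is None:
    print("One or both packages are not installed")
elif versions_match(adapter, bigquery):
    print(f"OK: {adapter} matches dbt-bigquery {bigquery}")
else:
    print(f"Mismatch: {adapter} vs dbt-bigquery {bigquery}")
```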
API Key Issues
Error: Rabbit optimization failed: Invalid API key
Solution:
- Verify environment variable is set:
echo $RABBIT_API_KEY
- Check profiles.yml uses correct env_var syntax:
"{{ env_var('RABBIT_API_KEY') }}"
- Ensure no trailing spaces or special characters
Optimization Not Working
Symptoms: No cost savings, Rabbit dashboard shows no activity
Debug steps:
# 1. Enable debug logging
dbt run --debug --select your_model
# 2. Check for Rabbit log entries
cat logs/dbt.log | grep -i rabbit
# 3. Verify configuration
dbt debug
# 4. Test API connectivity
python3 -c "from rabbit_bq_job_optimizer import RabbitBQJobOptimizer; \
client = RabbitBQJobOptimizer(api_key='YOUR_KEY'); print('✓ Connected')"
Performance Issues
If you experience unusual slowness:
- Check Rabbit API status: status.followrabbit.ai
- Temporarily disable:
rabbit_enabled: false
- Contact support with job IDs: success@followrabbit.ai
🤝 Contributing
We welcome contributions! See CONTRIBUTING.md for:
- Development setup
- Code style guidelines
- Testing requirements
- Pull request process
Quick Start for Contributors
# Clone and setup
git clone https://github.com/your-username/dbt-rabbit-bigquery.git
cd dbt-rabbit-bigquery
python3 -m venv venv
source venv/bin/activate
# Install with dev dependencies
pip install -e ".[dev]"
# Set up pre-commit hooks (runs checks automatically before each commit)
pre-commit install
# Run all checks manually
pre-commit run --all-files
Pre-commit hooks automatically run:
- Black - Code formatting (config: pyproject.toml)
- Flake8 - Linting (config: .flake8)
- MyPy - Type checking (config: pyproject.toml)
- Pydocstyle - Docstring validation (config: pyproject.toml)
- File checks - Trailing whitespace, EOF, YAML validation
Note: All linting and formatting tools use centralized configuration files that are shared between pre-commit hooks and CI/CD pipelines, ensuring consistency.
📚 Additional Resources
- Rabbit Documentation: docs.followrabbit.ai
- dbt Documentation: docs.getdbt.com
- BigQuery Reservations: cloud.google.com/bigquery/docs/reservations
- Blog: Optimizing BigQuery Costs in dbt: [Coming Soon]
📞 Support
- Email: success@followrabbit.ai
- Issues: GitHub Issues
- Documentation: This README + examples/
- Security Issues: security@followrabbit.ai
📄 License
Apache License 2.0 - see LICENSE for details.
🗺️ Roadmap
- Support for dbt Cloud
- Additional optimization strategies (query rewriting, caching)
- Real-time cost dashboards in dbt docs
- Integration with Snowflake and Redshift
- Auto-detection of optimal pricing models
⭐ Show Your Support
If this adapter saves you money, give us a star! ⭐
It helps others discover cost optimization for their dbt projects.
🙏 Acknowledgments
Built with ♥️ by the Rabbit team. Powered by:
- dbt - The best data transformation tool
- BigQuery - Google's data warehouse
- rabbit-bq-job-optimizer - Core optimization library