Metabase query API with any URL and easy to filter.
Project description
Metabase query
This package will help Data workers get data from Metabase questions more easily and effectively. It only focuses on the Card Query API, Dataset API and does not include other Metabase APIs.
It is allowing you to execute queries using URLs or SQL directly within your Python code. Whether you are working with saved questions, SQL queries, or need to filter large datasets across multiple queries, this package offers a streamlined and flexible approach to retrieving data from your Metabase. With options for handling retries, connection limits, and custom filters, it’s built to handle complex querying needs with ease.
Features
- Get question data in any data format provided by Metabase (JSON, CSV, XLSX).
- Input question URL and Metabase Session. No need to provide parameters payload.
- JSON results have the same column sort order as the browser.
- Automatically check if Metabase session is available.
- Easy to filter data with a simple dict.
- Allow retry if an error occurs due to server slowdown.
- Allow entering multiple filter values in bulk.
- Support both saved questions and unsaved questions.
- Support SQL query.
Installation
pip install --upgrade metabase-query
Quick start example
Create Metabase object
Simple syntax:
from metabase_query import Metabase
mb = Metabase(metabase_session='YourMetabaseSession')
Full options syntax:
mb = Metabase(metabase_session='YourMetabaseSession', retry_errors=None, retry_attempts=3, limit_per_host=5, timeout=600, verbose=True, domain=None)
metabase_session
: Your Metabase Session.retry_errors
:None
to retry with any error, a list of errors to retry with these errors only, contain matching. Default isNone
.retry_attempts
: 0 will not retry. Default is3
.limit_per_host
: The limit of connections per host. Default is5
.timeout
: Timeout in seconds for each connection. Default is600
.verbose
: Print log or not. Default isTrue
.domain
: Not required for queries with URL, SQL queries is required. Default isNone
.
Query with any URL
Example URLs:
url = 'https://your-domain.com/question/123456-example?created_at=past3months' # Saved question
url = 'https://your-domain.com/question#eW91cl9xdWVyeQ==...' # Table or unsaved SQL query
Don't need to find parameters from payload, just paste the URL from the browser.
data = mb.query(url=url, format='json')
format
: Support JSON, CSV, XLSX formats. Default is'json'
.
Add filters easily with a dict.
filter = {
'order_id': [123456, 456789, 789012], # Unlimited values in list, WOW!
'status': 'Completed'
}
data = mb.query(url=url, filter=filter, filter_chunk_size=5000)
filter
: One dict for a list of dicts.filter_chunk_size
: If you have a bulk value filter, the package will splits your values into chunks to send the requests, and then concat the results into a single data.
One URL with multiple filters.
filters = [
{'created_at': '2024-08-01~2024-08-05'},
{'created_at': '2024-08-06~2024-08-10'},
{'created_at': '2024-08-11~2024-08-15'}
]
results = mb.query(url=url, filter=filters)
# We support a function to combine JSON and CSV data. Use it if you see data sets are the same columns.
from metabase_query.utils import combine_results
data = combine_results(results=[r['data'] for r in results], format='json')
Many URLs.
urls = [
'https://your-domain.com/question/123456-example?created_at=2024-08-01~2024-08-05',
'https://your-domain.com/question/123456-example?created_at=2024-08-06~2024-08-10',
'https://your-domain.com/question/123456-example?created_at=2024-08-11~2024-08-15'
]
results = mb.query(url=urls)
A URL list with a filter list.
urls = [
'https://your-domain.com/question/123456-example',
'https://your-domain.com/question/123456-example',
'https://your-domain.com/question/123456-example'
]
filters = [
{'created_at': '2024-08-01~2024-08-05'},
{'created_at': '2024-08-06~2024-08-10'},
{'created_at': '2024-08-11~2024-08-15'}
]
results = mb.query(url=urls, filter=filters)
SQL query
sql = '''
SELECT * FROM your_table LIMIT 1000
'''
database = '1-presto'
data = mb.sql_query(sql=sql, database=database, format='json')
sql
: One SQL query or a list of SQL queries.database
: One database ID or a list or database IDs follow SQL list. Look at the database slug on the browser.
Many SQL queries.
sql_1 = '''
SELECT * FROM your_table WHERE created_at BETWEEN DATE '2024-08-01' AND '2024-08-05'
'''
sql_2 = '''
SELECT * FROM your_table WHERE created_at BETWEEN DATE '2024-08-06' AND '2024-08-10'
'''
sqls = [sql_1, sql_2]
results = mb.sql_query(sql=sqls, database=database)
Goood luck!
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
File details
Details for the file metabase_query-1.0.1-py3-none-any.whl
.
File metadata
- Download URL: metabase_query-1.0.1-py3-none-any.whl
- Upload date:
- Size: 14.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.11.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | a26aacc7c264225ee46a3ae6f3b5b8a6bf305ea19aaa0751aa7ec16e78367e21 |
|
MD5 | 91b00ab4e6c6a0e7acdff29316b8f6bd |
|
BLAKE2b-256 | 82349318b49cb6e312c1f2b049e7089aaa3d48061598f8f55ee20429278e548b |