Metabase query API with any URL and easy to filter.
Project description
Metabase query
metabase-query is a Python package designed to simplify data retrieval from Metabase, specifically focusing on the Card Query API and Dataset API. This package allows data professionals to execute queries using URLs or SQL directly within their Python code, facilitating streamlined access to Metabase data.
Key features
- Flexible Data Retrieval: Retrieve data in any format supported by Metabase, including JSON, CSV, and XLSX.
- Simple Integration: Execute queries by simply inputting the question URL and Metabase session—no need to manually provide parameters.
- Consistent Results: JSON results maintain the same column order as displayed in the Metabase UI.
- Session Management: Automatically checks the availability of the Metabase session.
- Effortless Filtering: Easily apply filters to your queries using simple dictionaries.
- Error Handling: Supports automatic retries in case of server errors or slowdowns.
- Bulk Filter Support: Allows entering multiple filter values in a single request.
- Query Versatility: Supports both saved and unsaved questions, as well as SQL queries.
- Utilize asynchronous libraries: Send multiple requests concurrently to retrieve data efficiently.
Installation
To install the package, use the following pip command:
pip install --upgrade metabase-query
Usage
Basic Example
from metabase_query import Metabase
# Initialize the MetabaseQuery object
mb = Metabase(metabase_session='YourMetabaseSession')
# Query data using a Metabase question URL
url = 'https://your-domain.com/question/123456-example?created_at=past3months'
data = mb.query(url=url, format='json')
Table URL and Unsaved question URL
url = 'https://your-domain.com/question#eyJkYXRhc2V0X3F1ZXJ5Ijp7ImRhdGFiYXNlIjo2LCJxdWVyeSI6eyJzb3VyY2UtdGFibGUiOjQ4MzV9LCJ0eXBlIjoicXVlcnkifSwiZGlzcGxheSI6InRhYmxlIiwidmlzdWFsaXphdGlvbl9zZXR0aW5ncyI6e319'
data = mb.query(url=url, format='csv')
# Example saving data to a CSV file.
with open('data.csv', 'rb') as f:
f.write(data)
Advanced Settings
mb = Metabase(metabase_session='YourMetabaseSession', retry_errors=None, retry_attempts=3, limit_per_host=5, timeout=600, verbose=True, domain=None)
metabase_session
: Your Metabase Session.retry_errors
: Set toNone
to retry on any error, or provide a list of specific errors to retry only for those. Default isNone
.retry_attempts
: The number of retry attempts in case of an error. Default is3
; set to0
to disable retries.limit_per_host
: The maximum number of connections allowed per host. Default is5
.timeout
: The timeout duration in seconds for each connection. Default is600
.verbose
: Whether to print logs. Default isTrue
.domain
: Not required for URL-based queries, but mandatory for SQL queries. Default isNone
.
Working with Filters
Simple Filter
It will combine both filter in URL and filter dictionary. Priority filter dictionary if it exists on URL.
filter = {
'order_id': [123456, 456789, 789012], # UNLIMITED values!!!
'status': 'Completed'
}
data = mb.query(url=url, filter=filter, filter_chunk_size=5000)
filter
: A single dictionary or a list of dictionaries representing the filters.filter_chunk_size
: For bulk filter values, the package will divide the values into manageable chunks for processing, then combine the results into a single dataset.
Single URL with Multiple Filters
filters = [
{'created_at': '2024-08-01~2024-08-05'},
{'created_at': '2024-08-06~2024-08-10'},
{'created_at': '2024-08-11~2024-08-15'}
]
results = mb.query(url=url, filter=filters)
# Combine results if the datasets have the same columns
from metabase_query.utils import combine_results
data = combine_results(results=[r['data'] for r in results], format='json')
Multiple URLs
It can use with a filter dictionary if needed.
urls = [
'https://your-domain.com/question/123456-example?created_at=2024-08-01~2024-08-05',
'https://your-domain.com/question/123456-example?created_at=2024-08-06~2024-08-10',
'https://your-domain.com/question/123456-example?created_at=2024-08-11~2024-08-15'
]
results = mb.query(url=urls)
URL List with Filter List
urls = [
'https://your-domain.com/question/123456-example', # 1
'https://your-domain.com/question/123456-example', # 2
'https://your-domain.com/question/123456-example' # 3
]
filters = [
{'created_at': '2024-08-01~2024-08-05'}, # For URL 1
{'created_at': '2024-08-06~2024-08-10'}, # For URL 2
{'created_at': '2024-08-11~2024-08-15'} # For URL 3
]
results = mb.query(url=urls, filter=filters)
Executing SQL Queries
sql = '''
SELECT * FROM your_table LIMIT 1000
'''
database = '1-presto'
data = mb.sql(sql=sql, database=database, format='json')
sql
: A single SQL query or a list of SQL queries.database
: A single database ID or a list of database IDs corresponding to the SQL queries. Refer to the database slug in the browser for details.
Multiple SQL Queries
sql_1 = '''
SELECT * FROM your_table WHERE created_at BETWEEN DATE '2024-08-01' AND '2024-08-05'
'''
sql_2 = '''
SELECT * FROM your_table WHERE created_at BETWEEN DATE '2024-08-06' AND '2024-08-10'
'''
sqls = [sql_1, sql_2]
results = mb.sql(sql=sqls, database=database)
Contributing
Contributions are welcome! Please refer to the issues page for ways you can help.
Good luck with your data queries!
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
File details
Details for the file metabase_query-1.0.3-py3-none-any.whl
.
File metadata
- Download URL: metabase_query-1.0.3-py3-none-any.whl
- Upload date:
- Size: 15.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.11.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 2f99babdfed45c9631ebc5b9e81208745a276da0b0e2d1c3ece5fe9ced1b96e5 |
|
MD5 | a66976007fa960ff28d363f24814133c |
|
BLAKE2b-256 | 669edd593399037ab0b648aa1df2a0700dd5828a3c58af2b5c494558d705a775 |