Metabase query API with any URL and easy to filter.

Project description

Metabase query

This package will help Data workers get data from Metabase questions more easily and effectively. It only focuses on the Card Query API, Dataset API and does not include other Metabase APIs.

It is allowing you to execute queries using URLs or SQL directly within your Python code. Whether you are working with saved questions, SQL queries, or need to filter large datasets across multiple queries, this package offers a streamlined and flexible approach to retrieving data from your Metabase. With options for handling retries, connection limits, and custom filters, it’s built to handle complex querying needs with ease.

Features

Get question data in any data format provided by Metabase (JSON, CSV, XLSX).
Input question URL and Metabase Session. No need to provide parameters payload.
JSON results have the same column sort order as the browser.
Automatically check if Metabase session is available.
Easy to filter data with a simple dict.
Allow retry if an error occurs due to server slowdown.
Allow entering multiple filter values in bulk.
Support both saved questions and unsaved questions.
Support SQL query.

Installation

pip install --upgrade metabase-query

Quick start example

Create Metabase object

Simple syntax:

from metabase_query import Metabase

mb = Metabase(metabase_session='YourMetabaseSession')

Full options syntax:

mb = Metabase(metabase_session='YourMetabaseSession',  retry_errors=None, retry_attempts=3, limit_per_host=5, timeout=600, verbose=True, domain=None)

metabase_session: Your Metabase Session.
retry_errors: None to retry with any error, a list of errors to retry with these errors only, contain matching. Default is None.
retry_attempts: 0 will not retry. Default is 3.
limit_per_host: The limit of connections per host. Default is 5.
timeout: Timeout in seconds for each connection. Default is 600.
verbose: Print log or not. Default is True.
domain: Not required for queries with URL, SQL queries is required. Default is None.

Query with any URL

Example URLs:

url = 'https://your-domain.com/question/123456-example?created_at=past3months' # Saved question
url = 'https://your-domain.com/question#eW91cl9xdWVyeQ==...' # Table or unsaved SQL query

Don't need to find parameters from payload, just paste the URL from the browser.

data = mb.query(url=url, format='json')

format: Support JSON, CSV, XLSX formats. Default is 'json'.

Add filters easily with a dict.

filter = {
    'order_id': [123456, 456789, 789012], # Unlimited values in list, WOW!
    'status': 'Completed'
}
data = mb.query(url=url, filter=filter, filter_chunk_size=5000)

filter: One dict for a list of dicts.
filter_chunk_size: If you have a bulk value filter, the package will splits your values into chunks to send the requests, and then concat the results into a single data.

One URL with multiple filters.

filters = [
    {'created_at': '2024-08-01~2024-08-05'},
    {'created_at': '2024-08-06~2024-08-10'},
    {'created_at': '2024-08-11~2024-08-15'}
]

results = mb.query(url=url, filter=filters)

# We support a function to combine JSON and CSV data. Use it if you see data sets are the same columns.
from metabase_query.utils import combine_results
data = combine_results(results=[r['data'] for r in results], format='json')

Many URLs.

urls = [
    'https://your-domain.com/question/123456-example?created_at=2024-08-01~2024-08-05',
    'https://your-domain.com/question/123456-example?created_at=2024-08-06~2024-08-10',
    'https://your-domain.com/question/123456-example?created_at=2024-08-11~2024-08-15'
]
results = mb.query(url=urls)

A URL list with a filter list.

urls = [
    'https://your-domain.com/question/123456-example',
    'https://your-domain.com/question/123456-example',
    'https://your-domain.com/question/123456-example'
]

filters = [
    {'created_at': '2024-08-01~2024-08-05'},
    {'created_at': '2024-08-06~2024-08-10'},
    {'created_at': '2024-08-11~2024-08-15'}
]

results = mb.query(url=urls, filter=filters)

SQL query

sql = '''
SELECT * FROM your_table LIMIT 1000
'''

database = '1-presto'

data = mb.sql_query(sql=sql, database=database, format='json')

sql: One SQL query or a list of SQL queries.
database: One database ID or a list or database IDs follow SQL list. Look at the database slug on the browser.

Many SQL queries.

sql_1 = '''
SELECT * FROM your_table WHERE created_at BETWEEN DATE '2024-08-01' AND '2024-08-05'
'''

sql_2 = '''
SELECT * FROM your_table WHERE created_at BETWEEN DATE '2024-08-06' AND '2024-08-10'
'''

sqls = [sql_1, sql_2]

results = mb.sql_query(sql=sqls, database=database)

Goood luck!

Project details

Release history Release notifications | RSS feed

1.0.6

Oct 17, 2024

1.0.5

Aug 19, 2024

1.0.4

Aug 18, 2024

1.0.3

Aug 18, 2024

1.0.2

Aug 18, 2024

This version

1.0.1

Aug 18, 2024

1.0.0

Aug 18, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

metabase_query-1.0.1-py3-none-any.whl (14.1 kB view details)

Uploaded Aug 18, 2024 Python 3

File details

Details for the file metabase_query-1.0.1-py3-none-any.whl.

File metadata

Download URL: metabase_query-1.0.1-py3-none-any.whl
Upload date: Aug 18, 2024
Size: 14.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.0 CPython/3.11.5

File hashes

Hashes for metabase_query-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a26aacc7c264225ee46a3ae6f3b5b8a6bf305ea19aaa0751aa7ec16e78367e21`
MD5	`91b00ab4e6c6a0e7acdff29316b8f6bd`
BLAKE2b-256	`82349318b49cb6e312c1f2b049e7089aaa3d48061598f8f55ee20429278e548b`