Skip to main content

A Python client and CLI for the Metabase API with automatic chunking for large dataset exports

Project description

PyMetabase

The most complete Python API wrapper and CLI for Metabase

PyPI Python License Tests Downloads


Export millions of rows, manage cards, dashboards, collections, execute parameterized queries, and access every Metabase API endpoint โ€” from Python or the command line.

๐Ÿ” Why PyMetabase?

Feature PyMetabase Other Libraries
Full Metabase API coverage โœ… Partial
CLI tool โœ… โŒ
Automatic chunking for large exports โœ… โŒ
Parameterized queries ({{param}}) โœ… โŒ
Session token persistence & reuse โœ… โŒ
Resume interrupted exports โœ… โŒ
rclone-style remote management โœ… โŒ
Cards, Dashboards, Collections API โœ… Rare
Generic API access (any endpoint) โœ… Rare
Search, Users, Tables/Fields API โœ… โŒ

๐Ÿ“ฆ Installation

pip install pymetabase

Optional extras:

pip install pymetabase[excel]    # Excel export (pandas + openpyxl)
pip install pymetabase[parquet]  # Parquet export (pyarrow)

โšก Quick Start

Python โ€” Export your first dataset in 4 lines

from pymetabase import Metabase

with Metabase(url="https://metabase.example.com", username="analyst@company.com", password="pass") as mb:
    result = mb.export(database="Production", query="SELECT * FROM customers", output="customers.jsonl")
    print(f"Exported {result.total_rows:,} rows in {result.duration_seconds:.1f}s")

CLI โ€” Zero code, full power

# Step 1: Configure your Metabase server (one-time setup)
pymetabase config

# Step 2: Start working
pymetabase export -d Production -q "SELECT * FROM customers" -o customers.jsonl
pymetabase cards                    # see all saved questions
pymetabase search "revenue"         # find anything across your instance
pymetabase query -d Production -q "SELECT COUNT(*) FROM orders WHERE status = {{status}}" -p status=completed

๐Ÿ“– Features

๐Ÿ”Œ Generic API Access

Hit any Metabase endpoint โ€” no wrapper method needed:

with Metabase(url=url, username=user, password=pwd) as mb:
    # Fetch all cards
    all_cards = mb.get("/api/card")

    # Create a new question
    mb.post("/api/card", json={
        "name": "Daily Active Users",
        "dataset_query": {"database": 1, "type": "native", "native": {"query": "SELECT ..."}},
        "display": "line",
        "visualization_settings": {},
    })

    # Update and delete
    mb.put("/api/card/1", json={"name": "Weekly Active Users"})
    mb.delete_resource("/api/card/1")
# Same from the command line
pymetabase api get /api/card
pymetabase api post /api/card --json-body '{"name": "New Question", ...}'

๐Ÿ“‹ Cards (Saved Questions)

with Metabase(...) as mb:
    # Browse saved questions
    cards = mb.list_cards()
    card = mb.get_card(42)

    # Run a saved question with parameters
    result = mb.execute_card(42, parameters={"date_filter": "2024-01-01"})

    # Create a new question programmatically
    mb.create_card(
        name="Monthly Revenue by Region",
        dataset_query={
            "database": 1,
            "type": "native",
            "native": {"query": "SELECT region, SUM(amount) FROM sales GROUP BY region"},
        },
        collection_id=5,
    )

    # Housekeeping
    mb.update_card(42, name="Monthly Revenue by Region (v2)")
    mb.delete_card(99)

๐Ÿ“Š Dashboards

with Metabase(...) as mb:
    dashboards = mb.list_dashboards()
    dash = mb.get_dashboard(1)   # includes all cards on the dashboard

    mb.create_dashboard("Q4 Executive Report", collection_id=3, description="KPIs and trends")
    mb.update_dashboard(1, name="Q4 Executive Report โ€” Final")
    mb.delete_dashboard(1)

๐Ÿ“ Collections

with Metabase(...) as mb:
    collections = mb.list_collections()
    items = mb.get_collection_items("root")   # list everything at the top level
    items = mb.get_collection_items(7)         # or inside a specific collection

    mb.create_collection("Data Engineering", parent_id=1, color="#509EE3")

๐Ÿ”Ž Search

Find cards, dashboards, collections, and tables in one call:

results = mb.search("revenue", models=["card", "dashboard"])
pymetabase search "revenue"

๐Ÿ‘ค Users

me = mb.get_current_user()               # who am I?
user = mb.get_user(1)                     # specific user
users = mb.list_users()                   # all users (admin only)
pymetabase whoami
pymetabase users

๐Ÿ—„๏ธ Tables & Fields

Explore your database schema without SQL:

tables = mb.list_tables("Production")     # all tables in a database
table = mb.get_table(1)                   # table details
metadata = mb.get_table_metadata(1)       # columns, types, FKs
field = mb.get_field(42)                  # single field info
values = mb.get_field_values(42)          # distinct values (great for filters)

๐Ÿงฉ Parameterized Queries

Use {{param}} placeholders in SQL โ€” PyMetabase handles the Metabase template-tag wiring for you:

with Metabase(...) as mb:
    # Filter orders by status and minimum amount
    rows = mb.query(
        "SELECT * FROM orders WHERE status = {{status}} AND total > {{min_total}}",
        database="Production",
        parameters={"status": "shipped", "min_total": 500},
    )

    # Export with parameters โ€” same syntax
    mb.export(
        database="Production",
        query="SELECT * FROM signups WHERE created_at > {{since}}",
        output="recent_signups.jsonl",
        parameters={"since": "2024-01-01"},
    )
# CLI โ€” pass parameters with -p key=value
pymetabase query -d Production \
    -q "SELECT * FROM orders WHERE status = {{status}}" \
    -p status=shipped

pymetabase export -d Production \
    -q "SELECT * FROM users WHERE role = {{role}}" \
    -o admins.jsonl -p role=admin

๐Ÿ“ค Export

Large Dataset Export

Automatic chunking for datasets with millions of rows. No memory issues, no timeouts:

with Metabase(...) as mb:
    result = mb.export(
        database="Production",
        query="SELECT * FROM events",       # 10M+ rows? No problem
        output="events.jsonl",
        chunk_size=500_000,                  # fetch 500K rows at a time
        checkpoint_file="progress.json",     # resume if interrupted
        progress_callback=lambda cur, total, rate: print(f"{cur:,}/{total:,} rows ({rate:.0f}/s)"),
    )
    print(f"Done โ€” {result.total_rows:,} rows, {result.chunks} chunks, {result.duration_seconds:.0f}s")

How it works under the hood:

  1. Counts total rows with SELECT COUNT(*)
  2. Wraps your query with ROW_NUMBER() OVER () and fetches in offset chunks
  3. Retries failed chunks with exponential backoff
  4. Saves progress to a checkpoint file so you can resume after crashes
  5. Works with any SQL โ€” CTEs, JOINs, subqueries, window functions

Export Formats

Output format is detected from the file extension:

pymetabase export -d DB -q "SELECT * FROM users" -o users.jsonl   # JSON Lines (default)
pymetabase export -d DB -q "SELECT * FROM users" -o users.json    # JSON array
pymetabase export -d DB -q "SELECT * FROM users" -o users.csv     # CSV

Table Export with Filtering

Export a table directly โ€” no SQL needed:

with Metabase(...) as mb:
    mb.export_table(
        database="Production",
        table="users",
        output="active_users.csv",
        columns=["id", "name", "email", "plan"],
        where="active = true AND plan = 'pro'",
        order_by="created_at DESC",
        limit=10_000,
    )

๐ŸŒ Remote Management

Manage multiple Metabase servers like rclone manages cloud storage:

# Configure a remote (interactive)
pymetabase config

# List all configured remotes
pymetabase listremotes
# Configured remotes (2):
#   - production (default): https://metabase.company.com
#   - staging: https://staging-metabase.company.com

# Use a specific remote for any command
pymetabase -r staging export -d DB -q "SELECT 1" -o test.jsonl

# Manage remotes
pymetabase testremote production     # verify connection
pymetabase showremote production     # show config details
pymetabase selectremote staging      # change default
pymetabase deleteremote old-server   # remove a remote

# Quick identity check
pymetabase whoami

๐Ÿ”‘ Session tokens are persisted โ€” PyMetabase caches your auth token and reuses it across runs. No re-login on every command.

Config location:

  • macOS / Linux: ~/.config/pymetabase/config.json
  • Windows: %APPDATA%\pymetabase\config.json

โš™๏ธ Configuration

PyMetabase supports multiple configuration methods. Highest priority wins:

1. Direct Parameters

mb = Metabase(url="https://metabase.company.com", username="analyst@company.com", password="...")

2. Remotes (recommended for CLI)

pymetabase config           # interactive setup
pymetabase -r myserver export ...

3. Credentials File

[{"SERVER_NAME": "https://metabase.company.com", "USERNAME": "analyst@company.com", "PASSWORD": "..."}]
mb = Metabase(credentials_file="credentials.json")

4. Config File (YAML)

metabase:
  url: https://metabase.company.com
  username: analyst@company.com
  password: secret

defaults:
  chunk_size: 500000
  format: jsonl

retry:
  max_retries: 3
  delay: 1.0
  timeout: 600

5. Environment Variables

export METABASE_URL=https://metabase.company.com
export METABASE_USERNAME=analyst@company.com
export METABASE_PASSWORD=secret

๐Ÿ“˜ CLI Reference

Command Description
export Export SQL query results to file
export-table Export a table with optional column/filter selection
query Execute SQL and print results (-f table|json|jsonl|csv)
cards List all saved questions
card <id> Show card details
execute-card <id> Run a saved question (with optional -p params)
dashboards List all dashboards
dashboard <id> Show dashboard details and its cards
collections List all collections
collection <id> Show items inside a collection
search <query> Search cards, dashboards, collections, tables
users List all users (admin only)
whoami Show current authenticated user
list-databases List available databases
list-tables List tables in a database
api <method> <endpoint> Raw API request (GET/POST/PUT/DELETE)
config Add, edit, or delete remotes
listremotes List configured remotes
selectremote Set the default remote
showremote <name> Show remote configuration
deleteremote <name> Delete a remote
testremote [name] Test connection to a remote

๐Ÿ“˜ Python API Reference

Metabase Client

Metabase(
    url: str = None,
    username: str = None,
    password: str = None,
    config_file: str = None,
    credentials_file: str = None,
    remote_name: str = None,     # for token persistence
    persist_token: bool = True,  # reuse sessions across runs
)

Methods

Category Methods
Generic API get(endpoint) post(endpoint, **kwargs) put(endpoint, **kwargs) delete_resource(endpoint)
Cards list_cards() get_card(id) create_card(...) update_card(id, ...) delete_card(id) execute_card(id, parameters=)
Dashboards list_dashboards() get_dashboard(id) create_dashboard(...) update_dashboard(id, ...) delete_dashboard(id)
Collections list_collections() get_collection(id) get_collection_items(id) create_collection(...)
Search search(query, models=, limit=)
Users get_current_user() list_users() get_user(id)
Tables/Fields list_tables(db) get_table(id) get_table_metadata(id) get_field(id) get_field_values(id)
Queries query(sql, database, parameters=)
Export export(query, output, database, parameters=, chunk_size=, ...) export_table(table, output, ...)
Connection connect() disconnect()

ExportResult

result.total_rows         # int โ€” total rows exported
result.chunks             # int โ€” number of chunks used
result.duration_seconds   # float โ€” wall-clock time
result.output_file        # str โ€” path to output file
result.rate_per_second    # float โ€” rows/sec throughput
result.format             # str โ€” "jsonl", "json", or "csv"

๐Ÿ› ๏ธ Requirements

  • Python 3.9+
  • requests sqlglot pyyaml

๐Ÿค Contributing

Contributions are welcome! Please open an issue or submit a pull request.

git clone https://github.com/mokarimi19/pymetabase.git
cd pymetabase
pip install -e ".[dev]"
pytest

๐Ÿ“„ License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pymetabase-0.1.1.tar.gz (125.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pymetabase-0.1.1-py3-none-any.whl (33.5 kB view details)

Uploaded Python 3

File details

Details for the file pymetabase-0.1.1.tar.gz.

File metadata

  • Download URL: pymetabase-0.1.1.tar.gz
  • Upload date:
  • Size: 125.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.11

File hashes

Hashes for pymetabase-0.1.1.tar.gz
Algorithm Hash digest
SHA256 eb0cbf5f9c59ac00817099d01681a8238fcafcbdc4c1f20759fa1dfd138b9aaf
MD5 770e75ad2180c90ecbaebdf1da3b0926
BLAKE2b-256 27b4b08aadd4ec19f9d2aa61eff5bfbfad304667f2d54c06dca14c4c5ede9b55

See more details on using hashes here.

File details

Details for the file pymetabase-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: pymetabase-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 33.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.11

File hashes

Hashes for pymetabase-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 cb1fb6bbc9a7a7f2620604472cbe3ccbe04f7933c86850adc375f30cfaf31878
MD5 cacb0916507a65fb32f997b8b7b95d60
BLAKE2b-256 dba0492d7307c5fc575794250ec3113ebeb114ac8869165eadcedcdf0a46b075

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page