Analyze everything, for agents. Personal data analysis engine with DuckDB.

maestro-analyze

Python 3.11+ | License: MIT

Import any dataset, profile it, query it with SQL, and generate 25+ chart types -- all from the command line. DuckDB backend. Zero config. Built for AI agents and humans who prefer terminals over dashboards.


Quickstart

$ pip install maestro-analyze
$ manalyze import sales.csv
  Imported: sales (12,847 rows, 9 columns)

$ manalyze profile sales
  sales -- 12,847 rows x 9 columns
  Revenue is right-skewed (median $420, mean $1,230)
  3 outliers detected in profit_margin (> 3 sigma)
  Missing: 2.1% in region column

$ manalyze chart sales --type bar --x region --y revenue
  Chart saved: charts/sales.html

$ manalyze query "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY 2 DESC"

Features

  • DuckDB backend -- import CSV, Excel, JSON, Parquet into a local analytical database. SQL everything.
  • Auto-profiling -- descriptive stats, distribution analysis, anomaly detection, missing data audit.
  • 25+ chart types -- interactive Plotly charts and pure SVG charts. Auto-selects a chart type when you omit --type.
  • Trend analysis -- time series aggregation with configurable frequency (daily, weekly, monthly, quarterly, yearly).
  • Comparison analysis -- group-by metrics with statistical insights.
  • Plugin architecture -- drop a .py file in ~/.maestro/analyst/plugins/charts/ and it auto-registers.
  • Agent-native -- every command returns structured output. Designed as a tool for AI coding agents.
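The auto-profiling bullet, together with the Quickstart output (mean versus median skew, outliers beyond 3 sigma, share of missing values), can be illustrated with a small standard-library sketch. This is not maestro-analyze's implementation: `profile_column`, its return keys, and the sample data are hypothetical and only show the kind of checks such a profile might run.

```python
import statistics


def profile_column(values, sigma=3.0):
    """Summarize one numeric column; None marks a missing value.

    Hypothetical helper for illustration, not part of maestro-analyze.
    """
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("column has no non-missing values")
    missing_pct = 100.0 * (len(values) - len(present)) / len(values)
    mean = statistics.fmean(present)
    median = statistics.median(present)
    stdev = statistics.pstdev(present)  # population standard deviation
    # A value is flagged when it lies more than `sigma` deviations from the mean.
    outliers = [v for v in present if stdev > 0 and abs(v - mean) > sigma * stdev]
    return {
        "missing_pct": round(missing_pct, 1),
        "mean": mean,
        "median": median,
        "right_skewed": mean > median,  # crude skew signal: mean pulled above median
        "outliers": outliers,
    }


# Made-up sample: 21 ordinary values, one extreme value, one missing value.
revenue = [400, 420, 440] * 7 + [9000, None]
summary = profile_column(revenue)
print(summary)
```

A single extreme value drags the mean well above the median, which is the pattern the Quickstart profile reports as "right-skewed".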

Chart Types

Plotly (interactive HTML/PNG/SVG)

Type             Description
bar              Bar chart
line             Line chart
scatter          Scatter plot
histogram        Histogram
pie              Pie chart
box              Box plot
bubble           Bubble chart (size + color dimensions)
sankey           Sankey flow diagram
treemap          Hierarchical treemap
wordcloud        Word cloud
radar            Radar / spider chart
heatmap          Correlation heatmap
distribution     Distribution histogram with KDE
funnel           Funnel chart
table            Formatted data table
event_band       Line chart with event period bands
stacked_area     Stacked area composition
bland_altman     Agreement / validation plot
coverage_matrix  Data completeness heatmap
trust_radar      Multi-axis scoring radar

SVG (pure, no JavaScript dependency)

Type             Description
bump             Ranking trajectories over time
heatmap_grid     2D categorical grid heatmap
lollipop         Horizontal lollipop chart
event_timeline   Vertical event timeline
slope            Period comparison slope chart
sparkline        Tiny inline chart
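To make "pure SVG, no JavaScript dependency" concrete, here is a minimal sketch of how a sparkline can be written as a standalone SVG string using only the Python standard library. It does not reproduce the package's own `sparkline` builder; the function name, default sizes, and styling are assumptions chosen for illustration.

```python
def sparkline_svg(values, width=120, height=24, pad=2):
    """Return a standalone <svg> string with one polyline through the values.

    Hypothetical helper for illustration, not maestro-analyze's builder.
    """
    if len(values) < 2:
        raise ValueError("need at least two points")
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero on a flat series
    step = (width - 2 * pad) / (len(values) - 1)
    points = []
    for i, v in enumerate(values):
        x = pad + i * step
        # SVG's y axis grows downward, so larger values get smaller y.
        y = height - pad - (v - lo) / span * (height - 2 * pad)
        points.append(f"{x:.1f},{y:.1f}")
    joined = " ".join(points)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" '
        f'height="{height}" viewBox="0 0 {width} {height}">'
        f'<polyline fill="none" stroke="currentColor" stroke-width="1.5" '
        f'points="{joined}"/></svg>'
    )


svg = sparkline_svg([1, 3, 2, 5])
print(svg)
```

The result is plain markup, so it can be embedded in HTML or saved as an .svg file without any script runtime.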

CLI Reference

Import data

manalyze import data.csv                          # auto-detect table name from filename
manalyze import data.xlsx --table quarterly        # custom table name
manalyze import data.parquet --db project.duckdb   # custom database file

List tables

manalyze tables

Profile a table

manalyze profile sales                            # auto-insights + descriptive stats

SQL queries

manalyze query "SELECT * FROM sales WHERE revenue > 1000" --limit 100

Generate charts

manalyze chart sales --type bar --x region --y revenue
manalyze chart sales --type scatter --x cost --y revenue --color region
manalyze chart sales --type bubble --x cost --y revenue --size volume --color region
manalyze chart sales --type sankey --source origin --target destination --values volume
manalyze chart sales --type treemap --names category --parents parent --values revenue
manalyze chart sales --type bump --x year --y rank --color entity
manalyze chart sales --sql "SELECT region, SUM(revenue) r FROM sales GROUP BY 1" --type bar --x region --y r
manalyze chart sales --type line --format png      # export as PNG instead of HTML
manalyze chart sales --type bar --fragment          # output <div> fragment for embedding
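For agents that assemble these invocations programmatically, a small helper can build the argument list instead of concatenating strings. The sketch below uses only flags that appear in the examples above (--type, --x, --y, --color, --format, --fragment); the helper name `chart_command` and the keyword-to-flag mapping are assumptions, not part of the package.

```python
import shlex


def chart_command(table, chart_type=None, **options):
    """Build an argv list for `manalyze chart`.

    Hypothetical helper: each keyword becomes a `--keyword value` pair,
    and a keyword set to True becomes a bare `--keyword` switch.
    """
    argv = ["manalyze", "chart", table]
    if chart_type is not None:
        argv += ["--type", chart_type]
    for flag, value in options.items():
        if value is True:
            argv.append(f"--{flag}")
        elif value is not None and value is not False:
            argv += [f"--{flag}", str(value)]
    return argv


argv = chart_command("sales", "bar", x="region", y="revenue", fragment=True)
print(shlex.join(argv))
# To execute it (requires manalyze on PATH):
#   subprocess.run(argv, capture_output=True, text=True, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems when column names or SQL contain spaces.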

Trend analysis

manalyze trend sales --date order_date --metric revenue              # monthly by default
manalyze trend sales --date order_date --metric revenue --freq W     # weekly
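Conceptually, a trend at a given frequency is a sum of the metric grouped by a period key derived from the date column. The following standard-library sketch shows that idea for monthly and weekly buckets. It is an illustration only: the package's actual aggregation is not documented here, and the "M" code for monthly, the ISO-week labelling, and the sample rows are assumptions.

```python
from collections import defaultdict
from datetime import date


def period_key(day, freq):
    """Map a date to a period label. "W" uses ISO weeks; "M" is an assumed code."""
    if freq == "M":
        return day.strftime("%Y-%m")
    if freq == "W":
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    raise ValueError(f"unsupported frequency: {freq!r}")


def aggregate(rows, freq="M"):
    """Sum (date, value) pairs per period and return them in period order."""
    totals = defaultdict(float)
    for day, value in rows:
        totals[period_key(day, freq)] += value
    return dict(sorted(totals.items()))


# Made-up sample rows: (order_date, revenue)
rows = [
    (date(2024, 1, 1), 100.0),
    (date(2024, 1, 3), 50.0),
    (date(2024, 1, 8), 25.0),
    (date(2024, 2, 1), 10.0),
]
monthly = aggregate(rows, "M")
weekly = aggregate(rows, "W")
print(monthly)
print(weekly)
```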

Comparison analysis

manalyze compare sales --group region --metrics "revenue,profit,volume"
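A comparison of this kind boils down to grouping rows by one column and summarizing each metric within every group. The sketch below does this with per-group means using only the standard library. It is not the package's algorithm: the function name, the choice of the mean as the summary statistic, and the sample rows are assumptions.

```python
from collections import defaultdict
from statistics import fmean


def compare(rows, group, metrics):
    """Return {group value: {metric: mean}} for a list of dict rows.

    Hypothetical helper for illustration, not part of maestro-analyze.
    """
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group]].append(row)
    return {
        key: {m: fmean([r[m] for r in members]) for m in metrics}
        for key, members in buckets.items()
    }


# Made-up sample rows
rows = [
    {"region": "north", "revenue": 100, "profit": 10},
    {"region": "north", "revenue": 300, "profit": 30},
    {"region": "south", "revenue": 50, "profit": 5},
]
result = compare(rows, "region", ["revenue", "profit"])
print(result)
```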

Export

manalyze export sales                              # CSV by default
manalyze export sales --format parquet             # Parquet
manalyze export sales --format json -o out.json    # JSON to specific path

Drop a table

manalyze drop old_data

Python SDK

from maestro_analyze.core.store import Store
from maestro_analyze.engine.analyzer import Analyzer
from maestro_analyze.engine.charts import make_chart, save_chart

# Import and query
with Store("workspace.duckdb") as store:
    store.import_file("sales.csv")
    df = store.query("SELECT region, SUM(revenue) as rev FROM sales GROUP BY 1")

# Auto-profile
with Store("workspace.duckdb") as store:
    analyzer = Analyzer(store)
    result = analyzer.profile("sales")
    for insight in result.insights:
        print(f"{insight.title}: {insight.detail}")

# Generate charts programmatically (df is the query result from the first block)
fig = make_chart(df, "bar", x="region", y="rev", title="Revenue by Region")
path = save_chart(fig, "revenue_by_region", "html")

Architecture

CLI (typer)  -->  Store (DuckDB)  -->  Analyzer (profiling, trend, compare)
                                  -->  Chart Engine (auto-select or manual)
                                         |
                                  chart_builders/
                                    __init__.py        # auto-discovers via pkgutil
                                    _base.py           # BaseChartBuilder ABC
                                    _svg_base.py       # SVG helpers
                                    bar.py             # one file per chart type
                                    ...                # 25+ built-in types
                                  ~/.maestro/analyst/plugins/charts/
                                    my_chart.py        # user plugins (auto-registered)

Adding a custom chart

Drop a .py file in ~/.maestro/analyst/plugins/charts/:

from maestro_analyze.engine.chart_builders._base import BaseChartBuilder

class WaterfallBuilder(BaseChartBuilder):
    name = "waterfall"
    description = "Waterfall chart for incremental changes"

    def build(self, *, df, **kw):
        import plotly.graph_objects as go
        fig = go.Figure(go.Waterfall(x=df[kw["x"]], y=df[kw["y"]]))
        return fig

Available on next run -- no registration code needed.
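One way drop-in registration like this can work is to import every .py file in the plugin directory and collect the subclasses of the base builder. The sketch below demonstrates that mechanism with the standard library only. It is an assumption about the general technique, not the package's loader: `BaseChartBuilder` here is a stand-in class, `discover_plugins` is a hypothetical name, and the plugin is written to a temporary directory rather than ~/.maestro/analyst/plugins/charts/.

```python
import importlib.util
import tempfile
from pathlib import Path


class BaseChartBuilder:
    """Stand-in for the real base class, used only in this sketch."""
    name = ""

    def build(self, *, df, **kw):
        raise NotImplementedError


def discover_plugins(plugin_dir):
    """Import each .py file in plugin_dir and register builder subclasses by name."""
    registry = {}
    for path in sorted(Path(plugin_dir).glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip private helper modules
        spec = importlib.util.spec_from_file_location(f"user_chart_{path.stem}", path)
        module = importlib.util.module_from_spec(spec)
        # Sketch-only shortcut: inject the stand-in base class so the plugin
        # file below does not need to import a real package.
        module.BaseChartBuilder = BaseChartBuilder
        spec.loader.exec_module(module)
        for obj in vars(module).values():
            if (
                isinstance(obj, type)
                and issubclass(obj, BaseChartBuilder)
                and obj is not BaseChartBuilder
            ):
                registry[obj.name] = obj
    return registry


PLUGIN_SOURCE = '''
class WaterfallBuilder(BaseChartBuilder):
    name = "waterfall"
    description = "Waterfall chart for incremental changes"

    def build(self, *, df, **kw):
        return {"type": "waterfall", "rows": len(df)}
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "waterfall.py").write_text(PLUGIN_SOURCE, encoding="utf-8")
    registry = discover_plugins(tmp)

print(sorted(registry))
```

Keying the registry on the class-level `name` attribute is what would let a value passed to --type resolve to a builder without explicit registration code.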


How it compares

                 maestro-analyze              Metabase                 Jupyter                    Apache Superset
Setup            pip install                  Docker + database        pip install + kernel       Docker + database + Redis
Interface        CLI + Python SDK             Web UI                   Notebook                   Web UI
Agent-friendly   Yes (structured CLI output)  No (browser-only)        Partial (kernel protocol)  No (browser-only)
Chart types      25+ built-in + plugins       ~15                      Unlimited (matplotlib)     ~30
Data backend     DuckDB (embedded)            PostgreSQL / MySQL       pandas (in-memory)         SQLAlchemy
Auto-profiling   Built-in                     No                       Manual                     No
Config required  None                         Database, SMTP, secrets  None                       Database, cache, auth
Target user      Agents + CLI developers      Business analysts        Data scientists            BI teams

maestro-analyze is not a replacement for full BI platforms. It is a zero-config analysis tool designed to be called by AI agents or used directly from the terminal.


Optional extras

pip install maestro-analyze[ml]       # scikit-learn, scipy (clustering, statistical tests)
pip install maestro-analyze[polars]   # Polars DataFrame support
pip install maestro-analyze[all]      # everything

Development setup

git clone https://github.com/maestro-ai-stack/maestro-analyze.git
cd maestro-analyze
python3.11 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest tests/ -v

Contributing

Core improvements -- open issues and PRs on maestro-ai-stack/maestro-analyze.

Custom chart types -- contribute built-in builders via PR, or distribute your own as plugin files that users drop into ~/.maestro/analyst/plugins/charts/.


License

MIT


Built by Maestro, a Singapore AI product studio.
