Natural language to typed Polars query plans

polars-nlq

polars-nlq turns natural language questions into a typed Polars query plan, then executes that plan against a DataFrame or LazyFrame.

The library is built around two functions:

  • nl_query(...) to generate a validated Plan with instructor
  • execute_plan(...) to execute that Plan with Polars

Warning: plans created by LLMs can be incorrect and should be reviewed by a human before use.

Install

uv sync

Quick Example

import instructor
import polars as pl
from openai import OpenAI

from polars_nlq import execute_plan, nl_query

# Columns: name,city,sales
q1 = pl.scan_csv("sales.csv")

# Local OpenAI-compatible endpoint
openai_client = OpenAI(base_url="http://localhost:8080/v1", api_key="dummy")
client = instructor.from_openai(openai_client)

plan = nl_query(client, q1.collect_schema(), "sum of sales by city, with at least 20 sales")

results_lf = execute_plan(q1, plan)  # execute_plan always returns a LazyFrame

print(results_lf.collect())

Plan Model

Plans are typed with Pydantic models and validated before execution.

  • Expressions: col, lit, unary, binary, func, when_then_otherwise
  • Operations: select, with_columns, filter, groupby_agg, sort, limit

This gives you a clear contract between LLM output and execution, and plans can be serialized and reused.
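To make the "clear contract" idea concrete, here is a minimal sketch of how expression nodes could be typed with Pydantic discriminated unions. The class and field names mirror what is visible in the example plan output (`kind`, `name`, `value`, `op`, `left`, `right`, `predicate`), but the definitions are assumptions written for illustration, not the library's actual models, and the sketch assumes Pydantic v2.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, ValidationError


# Hypothetical models for illustration; polars-nlq's real definitions may differ.
class Col(BaseModel):
    kind: Literal["col"] = "col"
    name: str


class Lit(BaseModel):
    kind: Literal["lit"] = "lit"
    value: Union[int, float, str, bool, None]


class Binary(BaseModel):
    kind: Literal["binary"] = "binary"
    op: Literal["eq", "gt", "gte", "lt", "lte"]
    left: "Expr"
    right: "Expr"


# The "kind" field tells Pydantic which expression model to validate against.
Expr = Annotated[Union[Col, Lit, Binary], Field(discriminator="kind")]
Binary.model_rebuild()  # resolve the forward references to Expr


class Filter(BaseModel):
    op: Literal["filter"] = "filter"
    predicate: Expr


# Build a filter equivalent to: total_sales >= 20
filter_op = Filter(
    predicate=Binary(op="gte", left=Col(name="total_sales"), right=Lit(value=20))
)

# Serialize to JSON and load it back: the round trip preserves the plan step.
payload = filter_op.model_dump_json()
restored = Filter.model_validate_json(payload)

# Output with an unknown expression kind is rejected before anything executes.
try:
    Filter.model_validate({"op": "filter", "predicate": {"kind": "sql", "text": "x"}})
    rejected = False
except ValidationError:
    rejected = True
```

In a design like this, malformed LLM output fails validation instead of reaching Polars, and the JSON payload can be stored and replayed later.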

Example plan from the query above:

ops=[
    GroupByAgg(
        op='groupby_agg',
        by=[Col(kind='col', name='city')],
        maintain_order=False,
        named_by={},
        aggs=[NamedExpr(expr=Func(kind='func', name='sum', args=[Col(kind='col', name='sales')]), alias='total_sales')],
        named_aggs={}
    ),
    Filter(
        op='filter',
        predicate=Binary(kind='binary', op=<BinaryOp.GTE: 'gte'>, left=Col(kind='col', name='total_sales'), right=Lit(kind='lit', value=20))
    ),
    Select(
        op='select',
        exprs=[NamedExpr(expr=Col(kind='col', name='city'), alias=None), NamedExpr(expr=Col(kind='col', name='total_sales'), alias=None)]
    )
]

API

nl_query(client, schema, question, model="local-model") -> Plan

  • client: instructor-wrapped client that supports chat.completions.create
  • schema: mapping-like schema (for example LazyFrame.collect_schema())
  • question: natural language prompt
  • model: model name passed to chat.completions.create (defaults to "local-model")

Returns a validated Plan.

execute_plan(source, plan) -> pl.LazyFrame

  • source: pl.DataFrame or pl.LazyFrame
  • plan: Plan instance or compatible dict

Returns the query as a pl.LazyFrame. Nothing is computed until you call .collect() on it.

Run Tests

uv run pytest -q

Limitations

  • Derived columns are not supported in plans (for example, grouping by year from a date column).
  • Joins are not supported.
