Natural language to typed Polars query plans

polars-nlq

polars-nlq turns natural language questions into a typed Polars query plan, then executes that plan against a DataFrame or LazyFrame.

The library is built around two functions:

  • nl_query(...) to generate a validated Plan with instructor
  • execute_plan(...) to execute that Plan with Polars

Warning: plans created by LLMs can be incorrect and should be reviewed by a human before use.

Install

pip install polars-nlq

Quick Example

import instructor
import polars as pl
from openai import OpenAI

from polars_nlq import execute_plan, nl_query

# Columns: name,city,sales
q1 = pl.scan_csv("sales.csv")

# Local OpenAI-compatible endpoint
openai_client = OpenAI(base_url="http://localhost:8080/v1", api_key="dummy")
client = instructor.from_openai(openai_client)

plan = nl_query(client, q1.collect_schema(), "sum of sales by city, with at least 20 sales")

results_lf = execute_plan(q1, plan)  # execute_plan always returns a LazyFrame

print(results_lf.collect())

Plan Model

Plans are typed with Pydantic models and validated before execution.

  • Expressions: col, lit, unary, binary, func, when_then_otherwise
  • Operations: select, with_columns, filter, groupby_agg, sort, limit

Because a plan is validated data rather than generated code, it gives you a clear contract between LLM output and execution, and it can be serialized and reused.

Example plan from the query above:

ops=[GroupByAgg(op='groupby_agg', by=[Col(kind='col', name='city')], maintain_order=False, named_by={}, aggs=[NamedExpr(expr=Func(kind='func', name='sum', args=[Col(kind='col', name='sales')]), alias='total_sales')], named_aggs={}), Filter(op='filter', predicate=Binary(kind='binary', op=<BinaryOp.GTE: 'gte'>, left=Col(kind='col', name='total_sales'), right=Lit(kind='lit', value=20))), Select(op='select', exprs=[NamedExpr(expr=Col(kind='col', name='city'), alias=None), NamedExpr(expr=Col(kind='col', name='total_sales'), alias=None)])]

API

nl_query(client, schema, question, model="local-model") -> Plan

  • client: instructor-wrapped client that supports chat.completions.create
  • schema: mapping-like schema (for example LazyFrame.collect_schema())
  • question: natural language prompt
  • model: model name passed to chat.completions.create (defaults to "local-model")

Returns a validated Plan.

execute_plan(source, plan) -> pl.LazyFrame

  • source: pl.DataFrame or pl.LazyFrame
  • plan: Plan instance or compatible dict

Returns the query results as a lazy pl.LazyFrame. Nothing is computed until you call .collect() on it.
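Since plan may be a compatible dict, one way to reuse a reviewed plan is to store it as JSON and load it again later. The sketch below uses only the standard library. The dict mirrors the field names visible in the example plan's repr, and the assumption that execute_plan accepts exactly this shape (with the binary operator written as the string "gte") has not been verified against the library.

```python
import json

# Dict form of the example plan; field names are copied from the repr shown
# in the Plan Model section. The exact dict shape accepted by execute_plan
# is an assumption.
plan_dict = {
    "ops": [
        {
            "op": "groupby_agg",
            "by": [{"kind": "col", "name": "city"}],
            "maintain_order": False,
            "named_by": {},
            "aggs": [
                {
                    "expr": {
                        "kind": "func",
                        "name": "sum",
                        "args": [{"kind": "col", "name": "sales"}],
                    },
                    "alias": "total_sales",
                }
            ],
            "named_aggs": {},
        },
        {
            "op": "filter",
            "predicate": {
                "kind": "binary",
                "op": "gte",
                "left": {"kind": "col", "name": "total_sales"},
                "right": {"kind": "lit", "value": 20},
            },
        },
        {
            "op": "select",
            "exprs": [
                {"expr": {"kind": "col", "name": "city"}, "alias": None},
                {"expr": {"kind": "col", "name": "total_sales"}, "alias": None},
            ],
        },
    ]
}

# Serialize the plan (for example to a file or a database column) ...
plan_json = json.dumps(plan_dict, indent=2)

# ... and restore it later without calling the LLM again.
restored_plan = json.loads(plan_json)

# The restored dict could then be passed along, for example:
# results_lf = execute_plan(q1, restored_plan)
```

Storing the reviewed plan rather than the question means the same query runs each time, independent of model output.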

Run Tests

uv run pytest -q

Limitations

  • Derived columns are not supported in plans (for example, grouping by year from a date column).
  • Joins are not supported.
