Natural language to typed Polars query plans

polars-nlq

polars-nlq turns natural language questions into a typed Polars query plan, then executes that plan against a DataFrame or LazyFrame.

The library is built around two functions:

  • nl_query(...) to generate a validated Plan with instructor
  • execute_plan(...) to execute that Plan with Polars

Warning: plans created by LLMs can be incorrect and should be reviewed by a human before use.

Install

pip install polars-nlq

Quick Example

import instructor
import polars as pl
from openai import OpenAI

from polars_nlq import execute_plan, nl_query

# Columns: name,city,sales
q1 = pl.scan_csv("sales.csv")

# Local OpenAI-compatible endpoint
openai_client = OpenAI(base_url="http://localhost:8080/v1", api_key="dummy")
client = instructor.from_openai(openai_client)

plan = nl_query(client, q1.collect_schema(), "sum of sales by city, with at least 20 sales")

results_lf = execute_plan(q1, plan)  # execute_plan always returns a LazyFrame

print(results_lf.collect())

Plan Model

Plans are typed with Pydantic models and validated before execution.

  • Expressions: col, lit, unary, binary, func, when_then_otherwise
  • Operations: select, with_columns, filter, groupby_agg, sort, limit

This gives you a clear contract between LLM output and execution, and plans can be serialized and reused.
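To illustrate the contract described above, here is a minimal sketch of a discriminated-union plan model in Pydantic v2. The classes below (Col, Lit, Binary, Filter, MiniPlan) are hypothetical stand-ins written for this example, not the models shipped by polars-nlq, which likely carry more variants and fields. The sketch shows a JSON round trip and the rejection of malformed output before anything executes.

```python
from typing import Annotated, List, Literal, Union

from pydantic import BaseModel, Field, ValidationError


# Hypothetical stand-ins for illustration; not the models shipped by polars-nlq.
class Col(BaseModel):
    kind: Literal["col"] = "col"
    name: str


class Lit(BaseModel):
    kind: Literal["lit"] = "lit"
    value: Union[int, float, str]


# The "kind" field tells Pydantic which expression model to build.
Leaf = Annotated[Union[Col, Lit], Field(discriminator="kind")]


class Binary(BaseModel):
    kind: Literal["binary"] = "binary"
    op: Literal["gte", "lt", "eq"]
    left: Leaf
    right: Leaf


class Filter(BaseModel):
    op: Literal["filter"] = "filter"
    predicate: Binary


class MiniPlan(BaseModel):
    ops: List[Filter]


plan = MiniPlan(
    ops=[
        Filter(
            predicate=Binary(
                op="gte", left=Col(name="total_sales"), right=Lit(value=20)
            )
        )
    ]
)

# Round trip through JSON: the restored plan equals the original.
payload = plan.model_dump_json()
restored = MiniPlan.model_validate_json(payload)

# Output with an unknown expression kind is rejected at validation time.
bad = {
    "ops": [
        {
            "op": "filter",
            "predicate": {
                "kind": "binary",
                "op": "gte",
                "left": {"kind": "nope", "name": "x"},
                "right": {"kind": "lit", "value": 1},
            },
        }
    ]
}
try:
    MiniPlan.model_validate(bad)
    rejected = False
except ValidationError:
    rejected = True
```

Because validation happens when the model is constructed, a stored JSON plan can be reloaded and checked again later without calling the LLM.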

Example plan from the query above:

ops=[
    GroupByAgg(
        op='groupby_agg',
        by=[Col(kind='col', name='city')],
        maintain_order=False,
        named_by={},
        aggs=[
            NamedExpr(
                expr=Func(kind='func', name='sum', args=[Col(kind='col', name='sales')]),
                alias='total_sales'
            )
        ],
        named_aggs={}
    ),
    Filter(
        op='filter',
        predicate=Binary(
            kind='binary',
            op=<BinaryOp.GTE: 'gte'>,
            left=Col(kind='col', name='total_sales'),
            right=Lit(kind='lit', value=20)
        )
    ),
    Select(
        op='select',
        exprs=[
            NamedExpr(expr=Col(kind='col', name='city'), alias=None),
            NamedExpr(expr=Col(kind='col', name='total_sales'), alias=None)
        ]
    )
]

API

nl_query(client, schema, question, model="local-model") -> Plan

  • client: instructor-wrapped client that supports chat.completions.create
  • schema: mapping-like schema (for example LazyFrame.collect_schema())
  • question: natural language prompt
  • model: model name passed to chat.completions.create (defaults to "local-model")

Returns a validated Plan.

execute_plan(source, plan) -> pl.LazyFrame

  • source: pl.DataFrame or pl.LazyFrame
  • plan: Plan instance or compatible dict

Returns the query results as a pl.LazyFrame. Nothing is computed until you call .collect() on it.

Run Tests

uv run pytest -q

Limitations

  • Derived columns are not supported in plans (for example, grouping by year from a date column).
  • Joins are not supported.
