Natural language to typed Polars query plans

polars-nlq

polars-nlq turns natural language questions into a typed Polars query plan, then executes that plan against a DataFrame or LazyFrame.

The library is built around two functions:

  • nl_query(...) to generate a validated Plan with instructor
  • execute_plan(...) to execute that Plan with Polars

Warning: plans created by LLMs can be incorrect and should be reviewed by a human before use.

Install

pip install polars-nlq

Quick Example

import instructor
import polars as pl
from openai import OpenAI

from polars_nlq import execute_plan, nl_query

# Columns: name,city,sales
q1 = pl.scan_csv("sales.csv")

# Local OpenAI-compatible endpoint
openai_client = OpenAI(base_url="http://localhost:8080/v1", api_key="dummy")
client = instructor.from_openai(openai_client)

plan = nl_query(client, q1.collect_schema(), "sum of sales by city, with at least 20 sales")

results_lf = execute_plan(q1, plan)  # execute_plan always returns a LazyFrame

print(results_lf.collect())

Plan Model

Plans are typed with Pydantic models and validated before execution.

  • Expressions: col, lit, unary, binary, func, when_then_otherwise
  • Operations: select, with_columns, filter, groupby_agg, sort, limit

This gives you a clear contract between LLM output and execution, and plans can be serialized and reused.
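To make the "contract" idea concrete, here is a minimal sketch of how a Pydantic discriminated union can validate, reject, and round-trip a plan-like step. The `MiniCol`, `MiniLit`, and `MiniFilter` classes are hypothetical stand-ins invented for this illustration, not the library's actual models, and the sketch assumes Pydantic v2.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, ValidationError


# Hypothetical stand-ins for illustration; not the library's real classes.
class MiniCol(BaseModel):
    kind: Literal["col"] = "col"
    name: str


class MiniLit(BaseModel):
    kind: Literal["lit"] = "lit"
    value: Union[int, float, str, bool]


# The "kind" field tells Pydantic which model to validate against.
MiniExpr = Annotated[Union[MiniCol, MiniLit], Field(discriminator="kind")]


class MiniFilter(BaseModel):
    op: Literal["filter"] = "filter"
    left: MiniExpr
    cmp: Literal["gte", "lte", "eq"]
    right: MiniExpr


step = MiniFilter(left=MiniCol(name="total_sales"), cmp="gte", right=MiniLit(value=20))

# Serialize to JSON and load it back; the round trip preserves the step.
payload = step.model_dump_json()
restored = MiniFilter.model_validate_json(payload)

# Malformed output (an unknown expression kind) is rejected before execution.
try:
    MiniFilter.model_validate(
        {
            "left": {"kind": "sql", "text": "1=1"},
            "cmp": "gte",
            "right": {"kind": "lit", "value": 20},
        }
    )
    rejected = False
except ValidationError:
    rejected = True
```

The point of the design is that anything the LLM returns either parses into a known expression or operation type or fails loudly, so free-form text never reaches the executor.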

Example plan for the Quick Example question above:

ops=[
    GroupByAgg(
        op='groupby_agg',
        by=[Col(kind='col', name='city')],
        maintain_order=False,
        named_by={},
        aggs=[
            NamedExpr(
                expr=Func(kind='func', name='sum', args=[Col(kind='col', name='sales')]),
                alias='total_sales'
            )
        ],
        named_aggs={}
    ),
    Filter(
        op='filter',
        predicate=Binary(
            kind='binary',
            op=<BinaryOp.GTE: 'gte'>,
            left=Col(kind='col', name='total_sales'),
            right=Lit(kind='lit', value=20)
        )
    ),
    Select(
        op='select',
        exprs=[
            NamedExpr(expr=Col(kind='col', name='city'), alias=None),
            NamedExpr(expr=Col(kind='col', name='total_sales'), alias=None)
        ]
    )
]

API

nl_query(client, schema, question, model="local-model") -> Plan

  • client: instructor-wrapped client that supports chat.completions.create
  • schema: mapping-like schema (for example LazyFrame.collect_schema())
  • question: natural language prompt
  • model: model name passed to chat.completions.create (defaults to "local-model")

Returns a validated Plan.

execute_plan(source, plan) -> pl.LazyFrame

  • source: pl.DataFrame or pl.LazyFrame
  • plan: Plan instance or compatible dict

Returns the query as a pl.LazyFrame. Nothing is computed until you call .collect() on it.

Run Tests

uv run pytest -q

Limitations

  • Derived columns are not supported in plans (for example, grouping by year from a date column).
  • Joins are not supported.
