Natural language to typed Polars query plans
polars-nlq
polars-nlq turns natural language questions into a typed Polars query plan, then executes that plan against a DataFrame or LazyFrame.
The library is built around two functions:
- nl_query(...) to generate a validated Plan with instructor
- execute_plan(...) to execute that Plan with Polars
Warning: plans created by LLMs can be incorrect and should be reviewed by a human before use.
Install
pip install polars-nlq
Quick Example
import instructor
import polars as pl
from openai import OpenAI
from polars_nlq import execute_plan, nl_query
# Columns: name,city,sales
q1 = pl.scan_csv("sales.csv")
# Local OpenAI-compatible endpoint
openai_client = OpenAI(base_url="http://localhost:8080/v1", api_key="dummy")
client = instructor.from_openai(openai_client)
plan = nl_query(client, q1.collect_schema(), "sum of sales by city, with at least 20 sales")
results_lf = execute_plan(q1, plan) # execute_plan always returns a LazyFrame
print(results_lf.collect())
Plan Model
Plans are typed with Pydantic models and validated before execution.
- Expressions: col, lit, unary, binary, func, when_then_otherwise
- Operations: select, with_columns, filter, groupby_agg, sort, limit
This gives you a clear contract between LLM output and execution, and plans can be serialized and reused.
Example plan from the query above:
ops=[GroupByAgg(op='groupby_agg', by=[Col(kind='col', name='city')], maintain_order=False, named_by={}, aggs=[NamedExpr(expr=Func(kind='func', name='sum', args=[Col(kind='col', name='sales')]), alias='total_sales')], named_aggs={}), Filter(op='filter', predicate=Binary(kind='binary', op=<BinaryOp.GTE: 'gte'>, left=Col(kind='col', name='total_sales'), right=Lit(kind='lit', value=20))), Select(op='select', exprs=[NamedExpr(expr=Col(kind='col', name='city'), alias=None), NamedExpr(expr=Col(kind='col', name='total_sales'), alias=None)])]
API
nl_query(client, schema, question, model="local-model") -> Plan
- client: instructor-wrapped client that supports chat.completions.create
- schema: mapping-like schema (for example LazyFrame.collect_schema())
- question: natural language prompt
- model: model name passed to chat.completions.create (defaults to "local-model")
Returns a validated Plan.
execute_plan(source, plan) -> pl.LazyFrame
- source: pl.DataFrame or pl.LazyFrame
- plan: Plan instance or compatible dict
Returns the query as a pl.LazyFrame. Nothing is collected until you call .collect() on the result.
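Since execute_plan accepts a compatible dict, a reviewed plan can in principle be stored as JSON and reloaded later. The sketch below uses only the standard library. The dict shape is an assumption, transcribed by hand from the field names in the example plan's repr, and the exact shape the library accepts is not confirmed here.

```python
import json

# Assumed dict form of the example plan; keys mirror the repr's field names.
plan_dict = {
    "ops": [
        {
            "op": "groupby_agg",
            "by": [{"kind": "col", "name": "city"}],
            "maintain_order": False,
            "named_by": {},
            "aggs": [
                {
                    "expr": {
                        "kind": "func",
                        "name": "sum",
                        "args": [{"kind": "col", "name": "sales"}],
                    },
                    "alias": "total_sales",
                }
            ],
            "named_aggs": {},
        },
        {
            "op": "filter",
            "predicate": {
                "kind": "binary",
                "op": "gte",
                "left": {"kind": "col", "name": "total_sales"},
                "right": {"kind": "lit", "value": 20},
            },
        },
        {
            "op": "select",
            "exprs": [
                {"expr": {"kind": "col", "name": "city"}, "alias": None},
                {"expr": {"kind": "col", "name": "total_sales"}, "alias": None},
            ],
        },
    ]
}

# Serialize the reviewed plan to text, then load it back unchanged.
text = json.dumps(plan_dict, indent=2)
restored = json.loads(text)
assert restored == plan_dict
```

The restored dict could then be passed as the plan argument to execute_plan(source, restored), assuming the shape above matches what the Plan model validates.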
Run Tests
uv run pytest -q
Limitations
- Derived columns are not supported in plans (for example, grouping by year from a date column).
- Joins are not supported.