Skip to main content

A library to transpile Pandas syntax to SQL strings.

Project description

pdql

Lightweight Python library that allows you to write SQL queries using familiar Pandas syntax. It functions as a "lazy compiler," building a syntax tree from your operations and transpiling them into standard SQL strings without executing them or requiring a database connection.

Installation

Clone the repository and set up the environment using the provided Makefile:

git clone https://github.com/marcinz606/pdql
cd pdql
make setup

Or install using pip:

pip install pdql

Usage

Persistent Dialect & Filtering

from pdql.dataframe import SQLDataFrame
from pdql.dialects import BigQueryDialect

# Initialize with a specific dialect
df = SQLDataFrame("my_table", dialect=BigQueryDialect())

# Filters use dialect-specific quoting (backticks for BigQuery)
query = df[df["age"] > 21]

print(query.to_sql())
# SELECT * FROM `my_table` WHERE (`my_table`.`age` > 21)

Common Table Expressions (CTEs)

from pdql.dataframe import SQLDataFrame

# Define a subquery
sub = SQLDataFrame("raw_data")[["id", "val"]]
sub = sub[sub["val"] > 10]

# Use it as a source and define the CTE
df = SQLDataFrame("filtered").with_cte("filtered", sub)

print(df.to_sql())
# WITH "filtered" AS (SELECT "id", "val" FROM "raw_data" WHERE ("raw_data"."val" > 10)) SELECT * FROM "filtered"

Subqueries & Aliasing

inner = SQLDataFrame("orders").groupby("user_id").agg({"amount": "sum"}).alias("totals")
outer = SQLDataFrame(inner)
query = outer[outer["amount_sum"] > 1000]

print(query.to_sql())
# SELECT * FROM (SELECT "user_id", SUM("amount") AS "amount_sum" FROM "orders" GROUP BY "user_id") AS "totals" WHERE ("totals"."amount_sum" > 1000)

Ordering & Limits

from pdql.expressions import SQLFunction

# Order by columns or expressions/functions
query = df.sort_values(["created_at", SQLFunction("rand")], ascending=[False, True]).head(10)

print(query.to_sql())
# SELECT * FROM "my_table" ORDER BY "my_table"."created_at" DESC, RAND() ASC LIMIT 10

DML Operations

df = SQLDataFrame("users")

# Generate INSERT
insert_sql = df.insert({"name": "Alice", "status": "active"})

# Generate DELETE based on current filters
delete_sql = df[df["status"] == "inactive"].delete()

Development

Use the Makefile for standard tasks:

  • Run Tests: make test
  • Format Code: make format
  • Linting: make lint
  • Build Package: make build

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pdql-0.1.1.tar.gz (6.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pdql-0.1.1-py3-none-any.whl (8.1 kB view details)

Uploaded Python 3

File details

Details for the file pdql-0.1.1.tar.gz.

File metadata

  • Download URL: pdql-0.1.1.tar.gz
  • Upload date:
  • Size: 6.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.9.28 {"installer":{"name":"uv","version":"0.9.28","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for pdql-0.1.1.tar.gz
Algorithm Hash digest
SHA256 d8417b30bdb24e740a6dfbb23f08c218d941f45c60f2640d8f96e04a3cc5414a
MD5 32f394457cf250018f2e2bf89a00bfe2
BLAKE2b-256 c3b8fc600b2d219aff08be6c69c3c4d5ebb1355e850d884e3c700c0836dfb650

See more details on using hashes here.

File details

Details for the file pdql-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: pdql-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 8.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.9.28 {"installer":{"name":"uv","version":"0.9.28","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for pdql-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 450a29f96a90a367011458d9bcfd3944da7277325789d02994cf3845f638ae60
MD5 e7ea42e971aadb8b5f3b76d6fadd44dc
BLAKE2b-256 c77983249d562979062e370bb07807aad8486ab4461c0cfe3f980caf5d68b75f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page