# polars-db

Query databases using Polars syntax — the Python equivalent of R's dbplyr.
A Python library that lets you query relational databases using Polars syntax.
Write expressions like `pdb.col("age") > 30` and polars-db translates them into
the appropriate SQL dialect for your backend.
## Motivation

In Python, there is no standard way to manipulate database tables with DataFrame syntax. You either write raw SQL strings or learn a library-specific API such as SQLAlchemy or Ibis, neither of which feels like the DataFrame code you already write. R solved this years ago with dbplyr, which lets dplyr code run transparently against databases. polars-db brings the same idea to the Polars ecosystem.
| Library | API style | Target audience |
|---|---|---|
| SQLAlchemy | SQLAlchemy's own API | General Python → SQL |
| Ibis | Ibis's own API | General DataFrame → SQL |
| SQLFrame | PySpark API | PySpark users → SQL |
| polars-db | Polars API | Polars users → SQL |
- **Zero learning cost** — If you already know Polars, you can query databases immediately. No new API to learn.
- **Type-safe unified pipelines** — Local processing (Polars) and database queries (polars-db) share the same API, eliminating the implicit type coercion, NaN/None confusion, and dtype mismatches that plague pandas-based workflows.
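The "one expression, two backends" idea can be illustrated with a minimal, self-contained sketch. This is *not* polars-db's internals — `Col` and `Cmp` are hypothetical stand-ins — but it shows how a single expression tree can either evaluate against in-memory data or render itself as a SQL predicate:

```python
# Hypothetical sketch (not polars-db's actual internals): one expression
# tree, two backends -- evaluate against a local row, or render as SQL.
from dataclasses import dataclass


@dataclass
class Col:
    name: str

    def __gt__(self, value):
        # Building an expression does no work; it just records the comparison.
        return Cmp(self, ">", value)


@dataclass
class Cmp:
    col: Col
    op: str
    value: object

    def eval_local(self, row: dict) -> bool:
        # Local backend: apply the comparison to an in-memory row.
        return row[self.col.name] > self.value

    def to_sql(self) -> str:
        # Database backend: render the same expression as a SQL predicate.
        return f"{self.col.name} {self.op} {self.value!r}"


expr = Col("age") > 30
print(expr.eval_local({"age": 42}))  # True
print(expr.to_sql())                 # age > 30
```

The same object answers both questions, which is what lets local and database pipelines share an API.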
## Installation

```shell
pip install polars-db
```
DuckDB and SQLite work out of the box. For other databases, install the corresponding extras:
```shell
# PostgreSQL
pip install polars-db[postgres]

# MySQL
pip install polars-db[mysql]

# SQL Server
pip install polars-db[sqlserver]

# BigQuery
pip install polars-db[bigquery]

# Multiple backends at once
pip install polars-db[postgres,mysql]
```
## Supported Databases

| Database | Extras | Driver | Connection string example |
|---|---|---|---|
| DuckDB | (none) | duckdb | `duckdb:///:memory:` |
| SQLite | (none) | sqlite3 (stdlib) | `sqlite:///path/to/db.sqlite` |
| PostgreSQL | `postgres` | psycopg2 | `postgresql://user:pass@host:5432/dbname` |
| MySQL | `mysql` | PyMySQL | `mysql://user:pass@host:3306/dbname` |
| SQL Server | `sqlserver` | pymssql | `mssql://user:pass@host:1433/dbname` |
| BigQuery | `bigquery` | google-cloud-bigquery | `bigquery://project/dataset` |
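These connection strings follow the standard URL shape, so the pieces (scheme, credentials, host, port, database name) can be decomposed with Python's stdlib. This is purely illustrative — `pdb.connect()` does its own parsing:

```python
# Decompose a database connection string with the stdlib URL parser.
from urllib.parse import urlparse

url = urlparse("postgresql://user:pass@host:5432/dbname")
print(url.scheme)    # postgresql  -> selects the backend/dialect
print(url.username)  # user
print(url.hostname)  # host
print(url.port)      # 5432
print(url.path)      # /dbname    -> leading slash, then the database name
```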
## Usage

### Connecting

```python
import polars_db as pdb

conn = pdb.connect("postgresql://user:pass@localhost:5432/mydb")
```
### SELECT Queries

Build queries with the same API as a Polars LazyFrame, then call `collect()` to
execute.
```python
# Basic select / filter / sort / limit
df = (
    conn.table("users")
    .filter(pdb.col("age") > 30)
    .select("name", "age")
    .sort("age", descending=True)
    .limit(10)
    .collect()
)
```
```python
# GROUP BY with aggregation
df = (
    conn.table("sales")
    .group_by("product_id")
    .agg(
        pdb.col("amount").sum().alias("total"),
        pdb.col("id").count().alias("num_sales"),
    )
    .sort("total", descending=True)
    .collect()
)
```
```python
# JOIN
users = conn.table("users")
orders = conn.table("orders")
df = (
    users.join(orders, on="user_id", how="left")
    .select("name", "amount")
    .collect()
)
```
```python
# Window functions
df = (
    conn.table("sales")
    .with_columns(
        pdb.col("amount").sum().over("dept").alias("dept_total"),
    )
    .collect()
)
```
```python
# Cumulative sum with ordering
df = (
    conn.table("sales")
    .with_columns(
        pdb.col("amount")
        .cum_sum()
        .over("dept", order_by="date")
        .alias("running_total"),
    )
    .collect()
)
```
```python
# CASE WHEN
df = (
    conn.table("users")
    .with_columns(
        pdb.when(pdb.col("age") >= 18)
        .then(pdb.lit("adult"))
        .otherwise(pdb.lit("minor"))
        .alias("category"),
    )
    .collect()
)
```
### Inspecting Generated SQL

```python
query = conn.table("users").filter(pdb.col("age") > 30)
print(query.show_query())
# SELECT * FROM users WHERE age > 30
```
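The SQL a query renders to depends on the backend's dialect. As a rough, hypothetical sketch of why that matters (this is not polars-db's translator — `render_limit` is invented for illustration), consider how a row limit is spelled differently in T-SQL:

```python
# Hypothetical sketch of per-dialect rendering: the same logical query
# is spelled differently depending on the target dialect.
def render_limit(table: str, n: int, dialect: str) -> str:
    if dialect == "sqlserver":
        # T-SQL has no LIMIT clause; it uses TOP instead.
        return f"SELECT TOP {n} * FROM {table}"
    # Most other dialects (PostgreSQL, MySQL, SQLite, DuckDB) use LIMIT.
    return f"SELECT * FROM {table} LIMIT {n}"


print(render_limit("users", 10, "postgres"))   # SELECT * FROM users LIMIT 10
print(render_limit("users", 10, "sqlserver"))  # SELECT TOP 10 * FROM users
```

Inspecting the generated SQL with `show_query()` is the easiest way to see what your backend will actually receive.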
### DDL / Raw SQL

The query builder focuses on SELECT statements. For DDL (`CREATE TABLE`, etc.)
and DML (`INSERT`, etc.), use `execute_raw()`:

```python
conn.execute_raw("CREATE TABLE users (id INT, name TEXT, age INT)")
conn.execute_raw("INSERT INTO users VALUES (1, 'Alice', 30)")
conn.execute_raw("DROP TABLE IF EXISTS users")
```
> **Note:** `execute_raw()` executes SQL as-is. Never pass unsanitized external input via string concatenation.
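When a statement must include external values, the safe route is DB-API parameter binding rather than string building. Whether `execute_raw()` itself accepts parameters is backend-dependent, so this sketch uses the stdlib `sqlite3` driver directly to show the pattern:

```python
# Safe parameter binding, DB-API style, shown with the stdlib sqlite3 driver.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INT, name TEXT, age INT)")

# BAD:  conn.execute(f"INSERT INTO users VALUES (1, '{name}', 30)")
#       -- a name containing a quote breaks the statement (or worse).
# GOOD: let the driver bind the values; quoting and escaping are its job.
conn.execute("INSERT INTO users VALUES (?, ?, ?)", (1, "Alice", 30))

rows = conn.execute("SELECT name, age FROM users").fetchall()
print(rows)  # [('Alice', 30)]
```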
## Development

### Prerequisites
- Python 3.13+
- uv
- Docker (for integration tests)
### Setup

```shell
git clone https://github.com/b-trout/polars-db.git
cd polars-db
uv sync --all-groups --all-extras
```
### Common Commands

This project uses Poe the Poet as a task runner.

```shell
# Lint & format
uv run poe lint        # ruff check
uv run poe format      # ruff format
uv run poe type-check  # ty check

# Unit tests (no database required)
uv run poe test-unit

# Integration tests (per backend)
docker compose up -d
POLARS_DB_TEST_BACKEND=postgres uv run poe test-integration

# Run all pre-commit checks (format -> lint -> type-check -> yaml -> docker-lint)
uv run poe pre-commit
```
## License

This project is licensed under the MIT License.