Duckboat

GitHub | Docs | PyPI

Unsightly to some, but gets the job done.

Duckboat is a SQL-based Python dataframe library for ergonomic interactive data analysis and exploration.

pip install git+https://github.com/ajfriend/duckboat

Duckboat allows you to chain SQL snippets (often omitting select * and from ...) to incrementally and lazily build up complex queries.

Duckboat is a light wrapper around the DuckDB relational API, which remains easily accessible if you'd like to use DuckDB more directly. Expressions are evaluated lazily and optimized by DuckDB, so queries stay fast and avoid materializing intermediate tables or moving data around unnecessarily.

import duckboat as uck

csv = 'https://raw.githubusercontent.com/allisonhorst/palmerpenguins/main/inst/extdata/penguins.csv'

uck.Table(csv).do(
    "where sex = 'female' ",
    'where year > 2008',
    'select *, cast(body_mass_g as double) as grams',
    'select species, island, avg(grams) as avg_grams group by 1,2',
    'select * replace (round(avg_grams, 1) as avg_grams)',
    'order by avg_grams',
)
┌───────────┬───────────┬───────────┐
│  species  │  island   │ avg_grams │
│  varchar  │  varchar  │  double   │
├───────────┼───────────┼───────────┤
│ Adelie    │ Torgersen │    3193.8 │
│ Adelie    │ Dream     │    3357.5 │
│ Adelie    │ Biscoe    │    3446.9 │
│ Chinstrap │ Dream     │    3522.9 │
│ Gentoo    │ Biscoe    │    4786.3 │
└───────────┴───────────┴───────────┘
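
One way to picture what a chain like the one above is doing: each snippet can be read as the tail of a query whose source is everything built so far. The sketch below is only an illustration of that idea, not Duckboat's actual implementation. The `LazyChain` class and the `penguins` table name are hypothetical, and the generated string assumes DuckDB's FROM-first syntax, where `from tbl where ...` implies `select *`. Nothing is executed; the snippets are just recorded and turned into one nested query on request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyChain:
    """Hypothetical stand-in for a lazily built query (not Duckboat's real class)."""

    source: str                     # a table name, or any SQL source usable after "from"
    snippets: tuple[str, ...] = ()  # SQL fragments recorded so far, in order

    def do(self, *snippets: str) -> "LazyChain":
        # Only record the snippets; no query is run at this point.
        cleaned = tuple(s.strip() for s in snippets)
        return LazyChain(self.source, self.snippets + cleaned)

    def sql(self) -> str:
        # Wrap the query built so far in a subquery and let the next snippet
        # supply the remaining clauses, e.g. "from (...) where year > 2008".
        query = f"select * from {self.source}"
        for snippet in self.snippets:
            query = f"from ({query}) {snippet}"
        return query


chain = LazyChain("penguins").do(
    "where sex = 'female' ",
    "select species, avg(body_mass_g) as avg_grams group by 1",
    "order by avg_grams",
)
print(chain.sql())
```

Because `do` returns a new object instead of running anything, intermediate steps can be kept in variables and extended in different directions, and the database only ever sees the final nested query.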

Philosophy

This approach results in a mixture of Python and SQL that, I think, is semantically very similar to Google's Pipe Syntax for SQL: We can leverage our existing knowledge of SQL, while making a few small changes to make it more ergonomic and composable.

When doing interactive data analysis, I find this approach easier to read and write than fluent APIs (like in Polars or Ibis) or typical Pandas code. If some operation is easier in other libraries, Duckboat makes it straightforward to translate between them, either directly or through Apache Arrow.

Feedback

I'd love to hear any feedback on the approach here, so feel free to reach out through Issues or Discussions.
