Duckboat

GitHub | Docs | PyPI

Unsightly to some, but gets the job done.

Duckboat is a SQL-based Python dataframe library for ergonomic interactive data analysis and exploration.

pip install git+https://github.com/ajfriend/duckboat

Duckboat allows you to chain SQL snippets (often omitting select * and from ...) to incrementally and lazily build up complex queries.

Duckboat is a light wrapper around the DuckDB relational API, which remains easily accessible if you'd like to use DuckDB more directly. Expressions are evaluated lazily and optimized by DuckDB, so queries are fast and avoid both materializing intermediate tables and transferring data.

import duckboat as uck

csv = 'https://raw.githubusercontent.com/allisonhorst/palmerpenguins/main/inst/extdata/penguins.csv'

uck.Table(csv).do(
    "where sex = 'female' ",
    'where year > 2008',
    'select *, cast(body_mass_g as double) as grams',
    'select species, island, avg(grams) as avg_grams group by 1,2',
    'select * replace (round(avg_grams, 1) as avg_grams)',
    'order by avg_grams',
)
┌───────────┬───────────┬───────────┐
│  species  │  island   │ avg_grams │
│  varchar  │  varchar  │  double   │
├───────────┼───────────┼───────────┤
│ Adelie    │ Torgersen │    3193.8 │
│ Adelie    │ Dream     │    3357.5 │
│ Adelie    │ Biscoe    │    3446.9 │
│ Chinstrap │ Dream     │    3522.9 │
│ Gentoo    │ Biscoe    │    4786.3 │
└───────────┴───────────┴───────────┘
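To make the chaining idea concrete, here is a rough sketch of how partial snippets could be folded into one nested query. This is not Duckboat's actual implementation: the helpers `wrap` and `chain` are hypothetical, the keyword matching is deliberately naive, and the standard-library `sqlite3` module stands in for DuckDB so the example runs without extra dependencies. The table contents are made-up toy rows, not the penguins dataset.

```python
import re
import sqlite3

# Clauses that must appear after FROM in a complete SELECT statement.
_TAIL_CLAUSE = re.compile(r"\b(where|group\s+by|having|order\s+by|limit)\b", re.IGNORECASE)


def wrap(snippet, source_query):
    """Complete a partial SQL snippet so it reads from `source_query`.

    Hypothetical helper: the keyword matching is naive, so it would misfire
    on keywords inside string literals or on snippets with their own FROM.
    """
    snippet = snippet.strip()
    if not snippet.lower().startswith("select"):
        snippet = "select * " + snippet  # supply the omitted `select *`
    match = _TAIL_CLAUSE.search(snippet)
    split_at = match.start() if match else len(snippet)
    head, tail = snippet[:split_at].rstrip(), snippet[split_at:]
    # Supply the omitted `from ...` by nesting the previous step as a subquery.
    return f"{head} from ({source_query}) as _prev {tail}".rstrip()


def chain(table, *snippets):
    """Fold snippets into one nested query string; nothing is executed here."""
    query = f"select * from {table}"
    for snippet in snippets:
        query = wrap(snippet, query)
    return query


# Toy data (invented for this sketch).
con = sqlite3.connect(":memory:")
con.execute("create table penguins (species text, sex text, year int, body_mass_g int)")
con.executemany(
    "insert into penguins values (?, ?, ?, ?)",
    [
        ("Adelie", "female", 2009, 3000),
        ("Adelie", "female", 2009, 3400),
        ("Adelie", "male", 2009, 4000),
        ("Gentoo", "female", 2009, 4600),
        ("Gentoo", "female", 2007, 5000),
    ],
)

query = chain(
    "penguins",
    "where sex = 'female'",
    "where year > 2008",
    "select species, avg(body_mass_g) as avg_grams group by species",
    "order by avg_grams",
)
rows = con.execute(query).fetchall()  # the single nested query runs only here
print(rows)
```

Because `chain` only builds a string, the database sees one query at the end and is free to optimize across all the steps, which is the spirit of the lazy evaluation described above.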

Philosophy

This approach results in a mixture of Python and SQL that, I think, is semantically very similar to Google's Pipe Syntax for SQL: We can leverage our existing knowledge of SQL, while making a few small changes to make it more ergonomic and composable.

When doing interactive data analysis, I find this approach easier to read and write than fluent APIs (like in Polars or Ibis) or typical Pandas code. If some operation is easier in other libraries, Duckboat makes it straightforward to translate between them, either directly or through Apache Arrow.

Feedback

I'd love to hear any feedback on the approach here, so feel free to reach out through Issues or Discussions.

Download files

Source Distribution

duckboat-0.13.0.tar.gz (6.9 kB)

Uploaded Source

Built Distribution

duckboat-0.13.0-py3-none-any.whl (9.6 kB)

Uploaded Python 3

File details

Details for the file duckboat-0.13.0.tar.gz.

File metadata

  • Download URL: duckboat-0.13.0.tar.gz
  • Size: 6.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for duckboat-0.13.0.tar.gz

  • SHA256: 69786bead62ffea2cd4abbd60521c4ae57eea516a2ab60a263bff86d2c4dbb13
  • MD5: 0d6e1153cbb7d75b66c467bc80dc6941
  • BLAKE2b-256: 7a4e522d29835e136983739434c397c78dcce55e40c0cd062f72bab8e4242c91
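
The published digests can be checked against a downloaded file before installing it. Below is a minimal sketch, assuming the source distribution was saved locally as duckboat-0.13.0.tar.gz in the current directory. The helper names `sha256_of` and `matches_published_hash` are illustrative only and are not part of Duckboat or any PyPI tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 listed above for duckboat-0.13.0.tar.gz.
EXPECTED_SHA256 = "69786bead62ffea2cd4abbd60521c4ae57eea516a2ab60a263bff86d2c4dbb13"

sdist = Path("duckboat-0.13.0.tar.gz")  # assumed local download location
if sdist.exists():
    ok = matches_published_hash(sdist, EXPECTED_SHA256)
    print("hash OK" if ok else "hash MISMATCH")
```

The same check applies to the wheel by swapping in its file name and its SHA256 value.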

Provenance

The following attestation bundles were made for duckboat-0.13.0.tar.gz:

Publisher: pypi_publish.yml on ajfriend/duckboat

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file duckboat-0.13.0-py3-none-any.whl.

File metadata

  • Download URL: duckboat-0.13.0-py3-none-any.whl
  • Size: 9.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for duckboat-0.13.0-py3-none-any.whl

  • SHA256: 895dfcc22c68fbbbde114b44b6e93208090651452057827d227525880694a0b0
  • MD5: 16e55774f1f74b1f9493b635ef75bdae
  • BLAKE2b-256: dadf6ff95a02902e5c4e9f7844b73eb9aa163c9cccb8cd89997b28eb9d5ae62f

Provenance

The following attestation bundles were made for duckboat-0.13.0-py3-none-any.whl:

Publisher: pypi_publish.yml on ajfriend/duckboat

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
