An experimental data-wrangling library built on DuckDB for pipelining SQL snippets.

Project description

Duckboat

ajfriend.github.io/duckboat | github.com/ajfriend/duckboat

Unsightly to some, but gets the job done.

Duckboat is a SQL-based Python dataframe library for ergonomic interactive data analysis and exploration.

pip install git+https://github.com/ajfriend/duckboat

Duckboat allows you to chain SQL snippets (often omitting select * and from ...) to incrementally and lazily build up complex queries.

Duckboat is a light wrapper around the DuckDB relational API, which remains easily accessible if you'd like to use DuckDB more directly. Expressions are evaluated lazily and optimized by DuckDB, so queries are fast and avoid both materializing intermediate tables and transferring data.

import duckboat as uck

csv = 'https://raw.githubusercontent.com/allisonhorst/palmerpenguins/main/inst/extdata/penguins.csv'

uck.Table(csv).do(
    "where sex = 'female' ",
    'where year > 2008',
    'select *, cast(body_mass_g as double) as grams',
    'select species, island, avg(grams) as avg_grams group by 1,2',
    'select * replace (round(avg_grams, 1) as avg_grams)',
    'order by avg_grams',
)
┌───────────┬───────────┬───────────┐
│  species  │  island   │ avg_grams │
│  varchar  │  varchar  │  double   │
├───────────┼───────────┼───────────┤
│ Adelie    │ Torgersen │    3193.8 │
│ Adelie    │ Dream     │    3357.5 │
│ Adelie    │ Biscoe    │    3446.9 │
│ Chinstrap │ Dream     │    3522.9 │
│ Gentoo    │ Biscoe    │    4786.3 │
└───────────┴───────────┴───────────┘
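One way to picture how a chain like the one above could become a single lazy query: wrap each snippet around the previous query using DuckDB's FROM-first syntax, where `from tbl where ...` implies `select *`. The helper `compose` below is hypothetical and only builds a string. It is a sketch of the idea, not a description of Duckboat's internals, which are not described here.

```python
def compose(source: str, *snippets: str) -> str:
    """Nest SQL snippets into one query string using FROM-first syntax.

    Hypothetical helper for illustration only; not part of Duckboat.
    """
    query = f"from {source}"
    for snippet in snippets:
        # The previous query becomes the subquery the next snippet reads from.
        query = f"from ({query}) {snippet.strip()}"
    return query


sql = compose(
    "penguins",
    "where sex = 'female' ",
    "where year > 2008",
    "select species, island, avg(body_mass_g) as avg_grams group by 1,2",
    "order by avg_grams",
)
print(sql)
```

Since each step only adds another layer of nesting, nothing needs to run until the final query is handed to the database, which is what lets the optimizer see the whole pipeline.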

Philosophy

This approach results in a mixture of Python and SQL that, I think, is semantically very similar to Google's Pipe Syntax for SQL: We can leverage our existing knowledge of SQL, while making a few small changes to make it more ergonomic and composable.

When doing interactive data analysis, I find this approach easier to read and write than fluent APIs (like in Polars or Ibis) or typical Pandas code. If some operation is easier in other libraries, Duckboat makes it straightforward to translate between them, either directly or through Apache Arrow.

Feedback

I'd love to hear any feedback on the approach here, so feel free to reach out through Issues or Discussions.

Download files

Source Distribution

duckboat-0.11.0.tar.gz (5.8 kB)

Built Distribution

duckboat-0.11.0-py3-none-any.whl (7.7 kB, Python 3)
