Bundlebase

Like Docker, but for data.

Documentation | PyPI | Issues

Features

  • Multiple Formats: Support for Parquet, CSV, JSON, and more
  • Version Control: Built-in commit system for data pipeline versioning
  • Python Native: Seamless async/sync Python API with type hints
  • High Performance: Rust-powered core with Apache Arrow columnar format
  • Fluent API: Chain operations with intuitive, readable syntax

Installation

pip install bundlebase

Quick Start

Async API

import bundlebase

# Create a new bundle and chain operations
c = await (bundlebase.create()
    .attach("data.parquet")
    .filter("age >= 18")
    .remove_column("ssn")
    .rename_column("fname", "first_name"))

# Convert to pandas
df = await c.to_pandas()

# Commit changes
await c.commit("Cleaned customer data")
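The snippet above uses top-level `await`, which works in a notebook or an asyncio-aware REPL but not in a plain script. A minimal sketch of wrapping the same chain in a coroutine follows. It uses only the calls shown above; the function name `clean_customers` and the `bb` parameter (a stand-in for the imported `bundlebase` module, so the function can be exercised without the package) are assumptions made for illustration.

```python
import asyncio


async def clean_customers(bb, path):
    """Run the Quick Start pipeline and return the exported result.

    `bb` is expected to be the imported `bundlebase` module; taking it as a
    parameter is an assumption of this sketch, not a library convention.
    """
    bundle = await (
        bb.create()
        .attach(path)
        .filter("age >= 18")
        .remove_column("ssn")
        .rename_column("fname", "first_name")
    )
    return await bundle.to_pandas()
```

In a script, this would be driven with `asyncio.run(clean_customers(bundlebase, "data.parquet"))` after `import bundlebase`.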

Sync API

import bundlebase.sync as dc

# Same operations, no await needed
c = (dc.create()
    .attach("data.parquet")
    .filter("age >= 18")
    .remove_column("ssn")
    .rename_column("fname", "first_name"))

df = c.to_pandas()
c.commit("Cleaned customer data")

Streaming Large Datasets

Process datasets larger than available RAM by streaming them in batches:

import bundlebase

# Stream batches instead of loading everything
c = await bundlebase.open("huge_dataset.parquet")

total_rows = 0
async for batch in bundlebase.stream_batches(c):
    # Each batch is ~100MB, not the entire dataset
    total_rows += batch.num_rows
    # Memory is freed after each iteration

print(f"Processed {total_rows} rows")
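The consuming side of that loop can be tried without a real dataset. The sketch below assumes only that batches arrive through an async iterator and expose `num_rows`, as in the example above; `FakeBatch` and `fake_stream` are hypothetical stand-ins and not part of Bundlebase.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeBatch:
    """Hypothetical stand-in for a record batch; only num_rows is modeled."""
    num_rows: int


async def fake_stream(total_rows, batch_rows):
    """Yield batches one at a time, the way a streaming reader would."""
    remaining = total_rows
    while remaining > 0:
        size = min(batch_rows, remaining)
        yield FakeBatch(num_rows=size)
        remaining -= size


async def count_rows(batches):
    """Aggregate over an async iterator while holding one batch at a time."""
    total = 0
    async for batch in batches:
        total += batch.num_rows
    return total


total = asyncio.run(count_rows(fake_stream(total_rows=2_500, batch_rows=1_000)))
print(f"Processed {total} rows")
```

Because `count_rows` keeps only a running total, its memory use depends on the batch size rather than on the dataset size.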

Core Operations

Data Loading

c = await bundlebase.create()
c = c.attach("data.parquet")      # Parquet files
c = c.attach("data.csv")          # CSV files
c = c.attach("data.json")         # JSON files

Data Transformation

c = c.filter("active = true")              # Filter rows
c = c.select(["id", "name", "email"])      # Select columns
c = c.remove_column("temp_field")          # Remove columns
c = c.rename_column("old", "new")          # Rename columns
c = c.select("SELECT * FROM self WHERE ...")  # SQL queries

Data Export

df = await c.to_pandas()    # → pandas DataFrame
df = await c.to_polars()    # → polars DataFrame
arr = await c.to_numpy()    # → NumPy array
data = await c.to_dict()    # → Python dict

Indexing

c = c.create_index("email")        # Create index for fast lookups
c = c.rebuild_index("email")       # Rebuild existing index
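Why an index speeds up lookups can be illustrated in plain Python. This is a conceptual sketch only: how Bundlebase stores its indexes is not described here, and the `scan` and `build_index` helpers and the sample rows are made up for illustration.

```python
from collections import defaultdict

rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "a@example.com"},
]


def scan(rows, column, value):
    """Lookup without an index: every row is inspected."""
    return [row for row in rows if row[column] == value]


def build_index(rows, column):
    """Map each distinct value to the positions of the rows that hold it."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[column]].append(pos)
    return index


email_index = build_index(rows, "email")

# Lookup with an index: jump straight to the matching positions.
hits = [rows[pos] for pos in email_index.get("a@example.com", [])]
```

The index costs one pass over the data to build, after which each lookup touches only the matching rows. It also has to be refreshed when the underlying data changes.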

Joining

c = await bundlebase.create()
c = c.attach("customers.parquet")
c = c.join(
    "orders.parquet",
    left_on="customer_id",
    right_on="id",
    join_type="inner"
)
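What `join_type="inner"` keeps can be shown with a small plain-Python sketch. The sample rows and the `inner_join` helper are hypothetical; only the column names `customer_id` and `id` mirror the snippet above, and the sketch follows standard inner-join semantics rather than Bundlebase's implementation.

```python
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
    {"customer_id": 3, "name": "Linus"},
]
orders = [
    {"id": 1, "total": 30},
    {"id": 1, "total": 12},
    {"id": 3, "total": 99},
    {"id": 4, "total": 5},
]


def inner_join(left, right, left_on, right_on):
    """Pair each left row with every right row whose key matches."""
    # Group the right-hand rows by key so each left row needs one dict lookup.
    by_key = {}
    for row in right:
        by_key.setdefault(row[right_on], []).append(row)

    joined = []
    for left_row in left:
        for right_row in by_key.get(left_row[left_on], []):
            joined.append({**left_row, **right_row})
    return joined


result = inner_join(customers, orders, left_on="customer_id", right_on="id")
```

Rows without a match on either side are dropped: the customer with `customer_id` 2 and the order with `id` 4 do not appear in `result`, while the customer with two matching orders appears twice.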

Development

Prerequisites

  • Rust (latest stable)
  • Python 3.9+
  • Poetry

Setup

# Install Python dependencies
poetry install

# Build Rust extension
maturin develop

# Run tests
cargo test              # Rust tests
poetry run pytest       # Python tests

Contributing

Contributions are welcome!

License

Distributed under the Apache 2.0 license.
