Bundlebase

Like Docker, but for data.

Documentation | PyPI | Issues

Features

  • Multiple Formats: Support for Parquet, CSV, JSON, and more
  • Version Control: Built-in commit system for data pipeline versioning
  • Python Native: Seamless async/sync Python API with type hints
  • High Performance: Rust-powered core with Apache Arrow columnar format
  • Fluent API: Chain operations with intuitive, readable syntax

Installation

pip install bundlebase
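
To confirm the install from Python, one option is to query the package metadata with the standard library. This is a minimal sketch; `installed_version` is a hypothetical helper name and not part of Bundlebase.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if `pip install bundlebase` succeeded, otherwise None.
print(installed_version("bundlebase"))
```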

Quick Start

Async API

import bundlebase

# Create a new bundle and chain operations
c = await (bundlebase.create()
    .attach("data.parquet")
    .filter("age >= 18")
    .remove_column("ssn")
    .rename_column("fname", "first_name"))

# Convert to pandas
df = await c.to_pandas()

# Commit changes
await c.commit("Cleaned customer data")
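
The snippet above uses `await` at top level, which works in a notebook or an async REPL. In a plain script the same calls need to run inside a coroutine. The sketch below reuses only the calls shown above; `clean_customers` is a hypothetical name, and the import is deferred into the function body so the module can be loaded even where Bundlebase is not installed.

```python
import asyncio


async def clean_customers(path: str):
    """Run the Quick Start pipeline on `path` and return a pandas DataFrame."""
    # Deferred import: only needed when the coroutine actually runs.
    import bundlebase

    c = await (bundlebase.create()
        .attach(path)
        .filter("age >= 18")
        .remove_column("ssn")
        .rename_column("fname", "first_name"))
    return await c.to_pandas()


# In a script, drive the coroutine with asyncio.run:
# df = asyncio.run(clean_customers("data.parquet"))
```

`asyncio.run` creates an event loop, runs the coroutine to completion, and closes the loop, so it should be called once at the entry point of the script rather than inside other coroutines.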

Sync API

import bundlebase.sync as dc

# Same operations, no await needed
c = (dc.create()
    .attach("data.parquet")
    .filter("age >= 18")
    .remove_column("ssn")
    .rename_column("fname", "first_name"))

df = c.to_pandas()
c.commit("Cleaned customer data")

Streaming Large Datasets

Process data larger than RAM efficiently:

import bundlebase

# Stream batches instead of loading everything
c = await bundlebase.open("huge_dataset.parquet")

total_rows = 0
async for batch in bundlebase.stream_batches(c):
    # Each batch is roughly 100 MB, not the entire dataset
    total_rows += batch.num_rows
    # Memory is freed after each iteration

print(f"Processed {total_rows} rows")
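
The loop above can be factored into a helper that accepts any async iterable of batches, which makes the aggregation logic testable without a real dataset. This is a sketch under one assumption taken from the example above: each batch exposes a `num_rows` attribute. `count_rows` and `fake_batches` are hypothetical names, not Bundlebase functions.

```python
import asyncio
from types import SimpleNamespace


async def count_rows(batches) -> int:
    """Sum `num_rows` over an async iterable of batches, one batch at a time."""
    total = 0
    async for batch in batches:
        total += batch.num_rows
    return total


async def fake_batches(sizes):
    """Hypothetical stand-in for a batch stream: yields objects with `num_rows`."""
    for n in sizes:
        yield SimpleNamespace(num_rows=n)


print(asyncio.run(count_rows(fake_batches([3, 5, 2]))))  # prints 10
```

With a real bundle, the same helper would presumably be called as `await count_rows(bundlebase.stream_batches(c))`, mirroring the loop shown earlier.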

Core Operations

Data Loading

c = await bundlebase.create()
c = c.attach("data.parquet")      # Parquet files
c = c.attach("data.csv")          # CSV files
c = c.attach("data.json")         # JSON files

Data Transformation

c = c.filter("active = true")              # Filter rows
c = c.select(["id", "name", "email"])      # Select columns
c = c.remove_column("temp_field")          # Remove columns
c = c.rename_column("old", "new")          # Rename columns
c = c.select("SELECT * FROM self WHERE ...") # SQL queries
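
To make the intended effect of these operations concrete, here is a plain-Python illustration on a list of dict rows. It assumes the operations carry their usual relational meaning; it does not use Bundlebase and says nothing about how Bundlebase executes them. The sample rows and the `full_name` column are made up for the example.

```python
rows = [
    {"id": 1, "name": "Ada", "email": "ada@example.com", "active": True, "temp_field": "x"},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "active": False, "temp_field": "y"},
]

# filter("active = true"): keep only rows where the predicate holds
kept = [r for r in rows if r["active"]]

# select(["id", "name", "email"]): keep only the listed columns, in that order
# (this also drops temp_field, as remove_column("temp_field") would)
selected = [{k: r[k] for k in ("id", "name", "email")} for r in kept]

# rename_column("name", "full_name"): same values under a new column name
renamed = [
    {("full_name" if k == "name" else k): v for k, v in r.items()}
    for r in selected
]

print(renamed)
# [{'id': 1, 'full_name': 'Ada', 'email': 'ada@example.com'}]
```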

Data Export

df = await c.to_pandas()    # → pandas DataFrame
df = await c.to_polars()    # → polars DataFrame
arr = await c.to_numpy()    # → NumPy array
data = await c.to_dict()    # → Python dict

Indexing

c = c.create_index("email")        # Create index for fast lookups
c = c.rebuild_index("email")       # Rebuild existing index

Joining

c = await bundlebase.create()
c = c.attach("customers.parquet")
c = c.join(
    "orders.parquet",
    left_on="customer_id",
    right_on="id",
    join_type="inner"
)
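
For readers unfamiliar with the `left_on` / `right_on` convention, the sketch below shows a standard inner join in plain Python: a row pair is emitted only when the left row's `left_on` value equals the right row's `right_on` value. It assumes `join_type="inner"` follows ordinary SQL inner-join semantics; `inner_join` and the sample rows are hypothetical and are not part of Bundlebase.

```python
from collections import defaultdict


def inner_join(left, right, left_on, right_on):
    """Inner-join two lists of dict rows on left[left_on] == right[right_on]."""
    # Build a lookup from join key to the matching right-hand rows.
    by_key = defaultdict(list)
    for rrow in right:
        by_key[rrow[right_on]].append(rrow)

    joined = []
    for lrow in left:
        # .get avoids inserting empty lists into the defaultdict for misses.
        for rrow in by_key.get(lrow[left_on], []):
            # On a column-name clash the left value wins (a choice made for this sketch).
            joined.append({**rrow, **lrow})
    return joined


left = [{"customer_id": 1, "name": "Ada"}, {"customer_id": 2, "name": "Bob"}]
right = [{"id": 1, "total": 30}, {"id": 1, "total": 12}, {"id": 3, "total": 7}]

result = inner_join(left, right, left_on="customer_id", right_on="id")
print(len(result))  # prints 2
```

Rows without a match on either side (customer 2 and the row with `id` 3 here) are dropped, which is what distinguishes an inner join from a left or outer join.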

Development

Prerequisites

  • Rust (latest stable)
  • Python 3.9+
  • Poetry

Setup

# Install Python dependencies
poetry install

# Build Rust extension
maturin develop

# Run tests
cargo test              # Rust tests
poetry run pytest       # Python tests

Contributing

Contributions are welcome!

License

Distributed under the Apache 2.0 license.
