Bundlebase

Like Docker, but for data.

Documentation | PyPI | Issues

Features

  • Multiple Formats: Support for Parquet, CSV, JSON, and more
  • Version Control: Built-in commit system for data pipeline versioning
  • Python Native: Seamless async/sync Python API with type hints
  • High Performance: Rust-powered core with Apache Arrow columnar format
  • Fluent API: Chain operations with intuitive, readable syntax

Installation

pip install bundlebase
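
To confirm that the install succeeded and see which version was picked up, the standard library's importlib.metadata can be queried. This is a minimal sketch; the helper name installed_version is illustrative and not part of Bundlebase.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a version string, or None when bundlebase is not installed.
    print(installed_version("bundlebase"))
```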

Quick Start

Async API

import asyncio

import bundlebase


async def main():
    # Create a new bundle and chain operations
    c = await (bundlebase.create()
        .attach("data.parquet")
        .filter("age >= 18")
        .remove_column("ssn")
        .rename_column("fname", "first_name"))

    # Convert to pandas
    df = await c.to_pandas()

    # Commit changes
    await c.commit("Cleaned customer data")


asyncio.run(main())
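
The chain above is awaited once, at the end, which suggests that the operations are recorded first and executed only when awaited. The sketch below illustrates that general pattern with a hypothetical LazyChain class working on plain Python rows. It is not Bundlebase's implementation: the class, its callable predicates, and the sample data are assumptions made for illustration, and it resolves to a list of rows rather than a bundle.

```python
import asyncio
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class LazyChain:
    """Hypothetical builder: records steps and runs them only when awaited."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows
        self._steps: List[Callable[[List[Row]], List[Row]]] = []

    def filter(self, predicate: Callable[[Row], bool]) -> "LazyChain":
        self._steps.append(lambda rows: [r for r in rows if predicate(r)])
        return self  # returning self is what allows chaining

    def remove_column(self, name: str) -> "LazyChain":
        self._steps.append(
            lambda rows: [{k: v for k, v in r.items() if k != name} for r in rows]
        )
        return self

    def rename_column(self, old: str, new: str) -> "LazyChain":
        self._steps.append(
            lambda rows: [{(new if k == old else k): v for k, v in r.items()} for r in rows]
        )
        return self

    async def _run(self) -> List[Row]:
        rows = self._rows
        for step in self._steps:
            rows = step(rows)
            await asyncio.sleep(0)  # yield to the event loop between steps
        return rows

    def __await__(self):
        # Delegating to a coroutine makes the builder itself awaitable.
        return self._run().__await__()


async def main() -> List[Row]:
    rows = [
        {"fname": "Ada", "age": 36, "ssn": "000-00-0000"},
        {"fname": "Tim", "age": 12, "ssn": "111-11-1111"},
    ]
    return await (
        LazyChain(rows)
        .filter(lambda r: r["age"] >= 18)
        .remove_column("ssn")
        .rename_column("fname", "first_name")
    )


result = asyncio.run(main())
print(result)
```

Nothing runs until the await, so every step added before it is applied in order in a single pass.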

Sync API

import bundlebase.sync as dc

# Same operations, no await needed
c = (dc.create()
    .attach("data.parquet")
    .filter("age >= 18")
    .remove_column("ssn")
    .rename_column("fname", "first_name"))

df = c.to_pandas()
c.commit("Cleaned customer data")
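
The sync module mirrors the async API without await. How bundlebase.sync does this internally is not described here; one common way to build such a facade is to run each coroutine to completion with asyncio.run. The sketch below shows that approach with hypothetical names (make_sync, count_rows) that are not part of Bundlebase.

```python
import asyncio
import functools
from typing import Any, Callable, Coroutine, TypeVar

T = TypeVar("T")


def make_sync(async_fn: Callable[..., Coroutine[Any, Any, T]]) -> Callable[..., T]:
    """Wrap a coroutine function so it can be called without await."""

    @functools.wraps(async_fn)
    def wrapper(*args: Any, **kwargs: Any) -> T:
        # asyncio.run creates an event loop, runs the coroutine, then closes the loop.
        # It raises RuntimeError when called from inside an already running loop.
        return asyncio.run(async_fn(*args, **kwargs))

    return wrapper


async def count_rows(rows: list) -> int:
    """Hypothetical async operation standing in for real I/O."""
    await asyncio.sleep(0)
    return len(rows)


count_rows_sync = make_sync(count_rows)
print(count_rows_sync([1, 2, 3]))
```

A wrapper of this kind cannot be called from code that is already running inside an event loop, which is one reason to keep separate async and sync entry points.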

Streaming Large Datasets

Process data larger than RAM efficiently:

import asyncio

import bundlebase


async def main():
    # Stream batches instead of loading everything
    c = await bundlebase.open("huge_dataset.parquet")

    total_rows = 0
    async for batch in bundlebase.stream_batches(c):
        # Each batch is ~100MB, not the entire dataset
        total_rows += batch.num_rows
        # Memory is freed after each iteration

    print(f"Processed {total_rows} rows")


asyncio.run(main())
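
The point of the loop above is that only one batch is held in memory at a time. The same idea can be sketched with the standard library alone: an async generator that yields fixed-size batches from any iterable. The names iter_batches and count_streamed_rows are hypothetical, and batches here are sized by row count rather than by bytes.

```python
import asyncio
from itertools import islice
from typing import AsyncIterator, Iterable, List


async def iter_batches(rows: Iterable[int], batch_size: int) -> AsyncIterator[List[int]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
        await asyncio.sleep(0)  # give other tasks a chance to run


async def count_streamed_rows(rows: Iterable[int], batch_size: int) -> int:
    total = 0
    async for batch in iter_batches(rows, batch_size):
        total += len(batch)  # only the current batch is referenced here
    return total


total_rows = asyncio.run(count_streamed_rows(range(10_000), batch_size=1_024))
print(f"Processed {total_rows} rows")
```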

Core Operations

Data Loading

c = await bundlebase.create()
c = c.attach("data.parquet")      # Parquet files
c = c.attach("data.csv")          # CSV files
c = c.attach("data.json")         # JSON files

Data Transformation

c = c.filter("active = true")              # Filter rows
c = c.select(["id", "name", "email"])      # Select columns
c = c.remove_column("temp_field")          # Remove columns
c = c.rename_column("old", "new")          # Rename columns
c = c.select("SELECT * FROM self WHERE ...")  # SQL queries
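
A row filter followed by a column selection corresponds to a WHERE clause plus a column list in SQL. The sketch below shows that equivalence using the standard library's sqlite3 module as a stand-in engine, with a table named self to mirror the query form above. SQLite is an assumption made for illustration only; Bundlebase's SQL dialect is not described here, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE self (id INTEGER, name TEXT, email TEXT, active INTEGER)")
conn.executemany(
    "INSERT INTO self VALUES (?, ?, ?, ?)",
    [
        (1, "Ada", "ada@example.com", 1),
        (2, "Tim", "tim@example.com", 0),
        (3, "Eve", "eve@example.com", 1),
    ],
)

# Filter rows and keep three columns, expressed as a single query.
rows = conn.execute(
    "SELECT id, name, email FROM self WHERE active = 1 ORDER BY id"
).fetchall()
conn.close()
print(rows)
```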

Data Export

df = await c.to_pandas()    # → pandas DataFrame
df = await c.to_polars()    # → polars DataFrame
arr = await c.to_numpy()    # → NumPy array
data = await c.to_dict()    # → Python dict

Indexing

c = c.create_index("email")        # Create index for fast lookups
c = c.rebuild_index("email")       # Rebuild existing index
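
An index trades build time and memory for faster lookups on a column. The structure Bundlebase uses is not described here; the sketch below shows the general idea with a dictionary that maps each column value to the positions of matching rows. The helpers build_index and lookup, and the sample rows, are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def build_index(rows: List[Row], column: str) -> Dict[Any, List[int]]:
    """Map each value in `column` to the positions of the rows that contain it."""
    index: Dict[Any, List[int]] = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return dict(index)


def lookup(rows: List[Row], index: Dict[Any, List[int]], value: Any) -> List[Row]:
    """Return matching rows via the index instead of scanning every row."""
    return [rows[i] for i in index.get(value, [])]


rows = [
    {"id": 1, "email": "ada@example.com"},
    {"id": 2, "email": "tim@example.com"},
    {"id": 3, "email": "ada@example.com"},
]
email_index = build_index(rows, "email")
print(lookup(rows, email_index, "ada@example.com"))
```

If the underlying rows change, an index like this goes stale and has to be rebuilt, which is the role a rebuild step plays.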

Joining

c = await bundlebase.create()
c = c.attach("customers.parquet")
c = c.join(
    "orders.parquet",
    left_on="customer_id",
    right_on="id",
    join_type="inner"
)
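
An inner join keeps only the rows whose key appears on both sides. The sketch below implements that as a simple hash join over plain Python rows, reusing the left_on and right_on parameter names from the call above. The function inner_join and the sample data are hypothetical and say nothing about how Bundlebase executes joins.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def inner_join(left: List[Row], right: List[Row], left_on: str, right_on: str) -> List[Row]:
    """Hash join: keep only pairs of rows whose key values match."""
    # Build a lookup from key value to the right-hand rows carrying it.
    buckets: Dict[Any, List[Row]] = defaultdict(list)
    for row in right:
        buckets[row[right_on]].append(row)

    joined: List[Row] = []
    for lrow in left:
        for rrow in buckets.get(lrow[left_on], []):
            # On a column-name clash the left-hand value wins.
            joined.append({**rrow, **lrow})
    return joined


customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Tim"},
]
orders = [
    {"id": 1, "total": 30},
    {"id": 1, "total": 12},
    {"id": 3, "total": 99},
]
result = inner_join(customers, orders, left_on="customer_id", right_on="id")
print(result)
```

Rows without a partner on the other side (the second customer and the last order in this sample) are dropped, which is what distinguishes an inner join from an outer join.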

Development

Prerequisites

  • Rust (latest stable)
  • Python 3.9+
  • Poetry

Setup

# Install Python dependencies
poetry install

# Build Rust extension
maturin develop

# Run tests
cargo test              # Rust tests
poetry run pytest       # Python tests

Contributing

Contributions are welcome!

License

Distributed under the Apache 2.0 license.
