Bundlebase

Like Docker, but for data.

Documentation | PyPI | Issues

Features

  • Multiple Formats: Support for Parquet, CSV, JSON, and more
  • Version Control: Built-in commit system for data pipeline versioning
  • Python Native: Seamless async/sync Python API with type hints
  • High Performance: Rust-powered core with Apache Arrow columnar format
  • Fluent API: Chain operations with intuitive, readable syntax

Installation

pip install bundlebase

Quick Start

Async API

import bundlebase

# Create a new bundle and chain operations
# (top-level await needs a running event loop, e.g. a notebook or an async function)
c = await (bundlebase.create()
    .attach("data.parquet")
    .filter("age >= 18")
    .remove_column("ssn")
    .rename_column("fname", "first_name"))

# Convert to pandas
df = await c.to_pandas()

# Commit changes
await c.commit("Cleaned customer data")
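
In a regular script there is no running event loop, so a bare `await` at module level is a syntax error; the usual pattern is to put the calls in a coroutine and hand it to `asyncio.run`. The sketch below shows that pattern, and also one way a fluent chain can be awaited once at the end. `FakeBundle` is a hypothetical stand-in so the example runs without bundlebase installed; it is not bundlebase's implementation.

```python
import asyncio


class FakeBundle:
    """Hypothetical stand-in for a bundle: records operations, resolves on await."""

    def __init__(self):
        self.ops = []

    def attach(self, path):
        self.ops.append(("attach", path))
        return self  # returning self is what makes chaining possible

    def filter(self, predicate):
        self.ops.append(("filter", predicate))
        return self

    async def _resolve(self):
        await asyncio.sleep(0)  # placeholder for real asynchronous work
        return self

    def __await__(self):
        # Lets callers write `await FakeBundle().attach(...).filter(...)`
        return self._resolve().__await__()


async def main():
    # Inside a coroutine, `await` is always valid
    return await FakeBundle().attach("data.parquet").filter("age >= 18")


bundle = asyncio.run(main())
print(bundle.ops)
```

With the real library, the body of `main()` would hold the `bundlebase.create()` chain shown above.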

Sync API

import bundlebase.sync as dc

# Same operations, no await needed
c = (dc.create()
    .attach("data.parquet")
    .filter("age >= 18")
    .remove_column("ssn")
    .rename_column("fname", "first_name"))

df = c.to_pandas()
c.commit("Cleaned customer data")

Streaming Large Datasets

Process datasets larger than RAM by streaming them in batches:

import bundlebase

# Stream batches instead of loading everything
c = await bundlebase.open("huge_dataset.parquet")

total_rows = 0
async for batch in bundlebase.stream_batches(c):
    # Each batch is ~100 MB, not the entire dataset
    total_rows += batch.num_rows
    # Memory is freed after each iteration

print(f"Processed {total_rows} rows")
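
The loop above stays within memory because it folds each batch into a running total instead of collecting batches. As a minimal sketch of that pattern, the example below uses a hypothetical async generator, `fake_batches`, in place of `bundlebase.stream_batches`, with plain lists of dicts standing in for real batches.

```python
import asyncio


async def fake_batches(n_rows, batch_size):
    """Hypothetical stand-in for a batch stream: yields lists of row dicts."""
    for start in range(0, n_rows, batch_size):
        stop = min(start + batch_size, n_rows)
        yield [{"id": i, "amount": i * 1.5} for i in range(start, stop)]
        await asyncio.sleep(0)  # hand control back to the event loop


async def summarize(n_rows=10, batch_size=4):
    total_rows = 0
    total_amount = 0.0
    async for batch in fake_batches(n_rows, batch_size):
        # Fold each batch into running totals; no batch is retained
        total_rows += len(batch)
        total_amount += sum(row["amount"] for row in batch)
    return total_rows, total_amount


total_rows, total_amount = asyncio.run(summarize())
print(total_rows, total_amount)
```

Any aggregate that can be updated incrementally (counts, sums, minima, maxima) fits this shape; results that need all rows at once do not.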

Core Operations

Data Loading

c = await bundlebase.create()
c = c.attach("data.parquet")      # Parquet files
c = c.attach("data.csv")          # CSV files
c = c.attach("data.json")         # JSON files

Data Transformation

c = c.filter("active = true")              # Filter rows
c = c.select(["id", "name", "email"])      # Select columns
c = c.remove_column("temp_field")          # Remove columns
c = c.rename_column("old", "new")          # Rename columns
c = c.select("SELECT * FROM self WHERE ...")  # SQL queries
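
The filter strings above read like SQL predicates, and the last call passes a full statement against a table named `self`. As a rough analogy only, the sketch below uses the standard-library `sqlite3` module to show what filter, select, and rename mean in SQL terms. The table contents are made up, and this is not how bundlebase executes queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "self" (id INTEGER, name TEXT, "old" TEXT, active INTEGER)')
conn.executemany(
    'INSERT INTO "self" VALUES (?, ?, ?, ?)',
    [(1, "Ada", "x", 1), (2, "Bob", "y", 0), (3, "Cy", "z", 1)],
)

# filter("active = true")     -> WHERE clause (SQLite stores booleans as 0/1)
# select(["id", "name", ...]) -> column list
# rename_column("old", "new") -> column alias
rows = conn.execute(
    'SELECT id, name, "old" AS "new" FROM "self" WHERE active = 1 ORDER BY id'
).fetchall()
print(rows)
conn.close()
```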

Data Export

df = await c.to_pandas()    # → pandas DataFrame
df = await c.to_polars()    # → polars DataFrame
arr = await c.to_numpy()    # → NumPy array
data = await c.to_dict()    # → Python dict

Indexing

c = c.create_index("email")        # Create index for fast lookups
c = c.rebuild_index("email")       # Rebuild existing index
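
To illustrate why an index speeds up lookups, here is a minimal sketch that maps each value of a column to the positions of the rows containing it. `build_index` and the sample rows are hypothetical; bundlebase's actual index format is not described here and may differ.

```python
from collections import defaultdict

rows = [
    {"id": 1, "email": "ada@example.com"},
    {"id": 2, "email": "bob@example.com"},
    {"id": 3, "email": "ada@example.com"},
]


def build_index(rows, column):
    """Map each distinct value of `column` to the positions of matching rows."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return dict(index)


email_index = build_index(rows, "email")

# Lookup touches only the matching positions instead of scanning every row
matches = [rows[i]["id"] for i in email_index.get("ada@example.com", [])]
print(matches)
```

An index like this goes stale when the underlying rows change, which is the usual reason to rebuild one.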

Joining

c = await bundlebase.create()
c = c.attach("customers.parquet")
c = c.join(
    "orders.parquet",
    left_on="customer_id",
    right_on="id",
    join_type="inner"
)
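
For readers unfamiliar with `left_on`, `right_on`, and an inner join type, the sketch below implements the same semantics over plain Python dicts: rows are paired when the left key equals the right key, and unmatched rows are dropped. `inner_join` is a hypothetical helper and the data is made up (here the `customer_id` column sits on the left table); it illustrates the join semantics, not bundlebase's join engine.

```python
def inner_join(left, right, left_on, right_on):
    """Hash join: keep only pairs of rows whose key values match."""
    by_key = {}
    for row in right:
        by_key.setdefault(row[right_on], []).append(row)

    joined = []
    for lrow in left:
        for rrow in by_key.get(lrow[left_on], []):
            # Right-hand columns win on name clashes in this sketch
            joined.append({**lrow, **rrow})
    return joined


orders = [
    {"order": "A", "customer_id": 1},
    {"order": "B", "customer_id": 2},
    {"order": "C", "customer_id": 9},  # no matching customer: dropped
]
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}, {"id": 3, "name": "Cy"}]

result = inner_join(orders, customers, left_on="customer_id", right_on="id")
print(result)
```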

Development

Prerequisites

  • Rust (latest stable)
  • Python 3.9+
  • Poetry

Setup

# Install Python dependencies
poetry install

# Build Rust extension
maturin develop

# Run tests
cargo test              # Rust tests
poetry run pytest       # Python tests

Contributing

Contributions are welcome!

License

Distributed under the Apache 2.0 license.
