Shared local IO execution layer for DeltaCAT read/write clients.
deltacat-io-core
deltacat-io-core is the shared local execution layer for DeltaCAT reads and
writes.
It is used by both:
- deltacat-client for direct thin-plan execution
- deltacat for shared local execution and compatibility wrappers
Naming
- distribution/package name: deltacat-io-core
- Python import module: deltacat_io_core
The distribution uses dashes for consistency with deltacat-client. The import
module keeps underscores because Python module names cannot contain -.
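
The naming rule above can be illustrated with a minimal sketch. The helper names here (normalize_distribution_name, guess_import_name) are hypothetical and are not part of deltacat-io-core; the dash-to-underscore mapping is a common convention, not something packaging tools guarantee for every project.

```python
import re


def normalize_distribution_name(name: str) -> str:
    """Normalize a distribution name the way PEP 503 describes:
    lowercase, with runs of '-', '_' and '.' collapsed to a single dash."""
    return re.sub(r"[-_.]+", "-", name).lower()


def guess_import_name(distribution: str) -> str:
    """Guess the import module name by replacing dashes with underscores.

    This is only a convention (assumed here for illustration); a project
    may ship an import module with an unrelated name.
    """
    return normalize_distribution_name(distribution).replace("-", "_")


# A dash makes the distribution name an invalid Python identifier,
# which is why the import module uses underscores instead.
print("deltacat-io-core".isidentifier())      # False
print(guess_import_name("deltacat-io-core"))  # deltacat_io_core
```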
Scope
deltacat-io-core owns the code that should behave the same regardless of
whether the caller is using the thin client or the thick DeltaCAT package.
Today that includes:
- direct execution of thin Plan objects
- MOR execution for thin and thick paths
- local file materialization and manifest building
- schema alignment and table conversion helpers
- sort-aware file ordering and manifest handling
- shared compaction/MOR helper layers and model types
- format-specific local readers/writers
Non-Goals
deltacat-io-core does not own:
- server routes or REST/MCP request handling
- authoritative catalog/storage mutations
- native Ray job orchestration surfaces
- public end-user API shape for deltacat or deltacat-client
It is a shared implementation layer, not the top-level user product.
Architecture
The current read architecture is:
- The server resolves a thin Plan.
- client.catalog.read(plan=...) executes that plan directly through deltacat-io-core.
- dc.read_table(plan=...) for thin plans also executes through the same shared path.
There is no longer a runtime bridge back into thick DeltaCAT for thin plan execution. The plan contract is expected to carry the metadata required for direct execution.
The current write architecture is:
- The client stages local files or materializes local data through shared helpers.
- The authoritative commit still happens through DeltaCAT server/native boundaries.
- Shared write-preparation and manifest logic lives in deltacat-io-core.
Installation
Base install:
uv pip install deltacat-io-core
Optional extras:
- deltacat-io-core[io] for local file readers/writers (pyarrow, fastavro)
- deltacat-io-core[pandas] for Pandas conversions
- deltacat-io-core[polars] for Polars conversions and lazy scan helpers
- deltacat-io-core[daft] for Daft conversions and lazy scan helpers
- deltacat-io-core[lance] for Lance dataset support
- deltacat-io-core[all] for the full local IO stack
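
To check which optional stacks are importable in the current environment, something like the following sketch may help. It uses only the standard library. The EXTRA_MODULES mapping is an assumption made for illustration: the pyarrow and fastavro entries come from the list above, while the import module names for the other extras (in particular lance) are guesses and should be confirmed against the package metadata.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to the import modules it should provide.
# Only the "io" entry is stated in the README; the rest are illustrative guesses.
EXTRA_MODULES = {
    "io": ["pyarrow", "fastavro"],
    "pandas": ["pandas"],
    "polars": ["polars"],
    "daft": ["daft"],
    "lance": ["lance"],
}


def _importable(module_name: str) -> bool:
    """Return True if the top-level module can be located without importing it."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for malformed or partially initialized modules.
        return False


def available_extras(extra_modules=None) -> dict:
    """Report, per extra, whether every module it needs is importable."""
    mapping = EXTRA_MODULES if extra_modules is None else extra_modules
    return {
        extra: all(_importable(name) for name in modules)
        for extra, modules in mapping.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"deltacat-io-core[{extra}]: {'available' if ok else 'missing'}")
```

Checking with find_spec avoids importing heavy libraries just to probe for them.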
Read Capabilities
The shared read executor currently handles:
- schema-table reads
- schemaless manifest-table reads
- MOR reads
- direct pyarrow, pandas, polars, numpy, daft, and ray_dataset outputs where supported
- lazy pyarrow_parquet
- lazy lance
It also enforces direct validation for unsupported combinations, for example:
- schemaless + pyarrow_parquet
- schemaless + lance
- mixed-content lazy plans for format-specific readers
- unknown content types in the shared path
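
The rejected combinations above can be sketched as an up-front validation step. This is not the library's implementation: validate_read, UnsupportedReadError, and the KNOWN_CONTENT_TYPES set are hypothetical names and values chosen for illustration. Only the rules themselves are taken from the list.

```python
# Output names that map to a format-specific lazy reader (from the list above).
LAZY_FORMAT_OUTPUTS = {"pyarrow_parquet", "lance"}

# Assumed set of recognized content types; the real set may differ.
KNOWN_CONTENT_TYPES = {"parquet", "lance", "avro"}


class UnsupportedReadError(ValueError):
    """Raised when a requested read combination is not supported (hypothetical)."""


def validate_read(output: str, content_types, schemaless: bool = False) -> None:
    """Reject unsupported combinations before any file is opened."""
    content = set(content_types)

    unknown = content - KNOWN_CONTENT_TYPES
    if unknown:
        raise UnsupportedReadError(f"unknown content types: {sorted(unknown)}")

    if output in LAZY_FORMAT_OUTPUTS:
        if schemaless:
            raise UnsupportedReadError(
                f"schemaless tables cannot be read as lazy {output}"
            )
        if len(content) > 1:
            raise UnsupportedReadError(
                f"lazy {output} requires a single content type, got {sorted(content)}"
            )


# A schema-table Parquet plan passes; a schemaless lazy Lance read is rejected.
validate_read("pyarrow_parquet", ["parquet"])
try:
    validate_read("lance", ["lance"], schemaless=True)
except UnsupportedReadError as err:
    print(f"rejected: {err}")
```

Failing fast in one place keeps the thin and thick entry points consistent, since both would surface the same error for the same plan.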
Polars / Daft Capability Matrix
The shared executor applies the same capability decision in thin
execute_read_plan(...) execution and in thick reads that delegate into
that shared path.
| Engine | Content | v1 behavior |
|---|---|---|
| Polars | Parquet | Lazy scan via pl.scan_parquet(...) when the existing local preconditions hold |
| Polars | Lance | Explicit eager fallback; no reader-level Lance row-filter pushdown |
| Polars | PackDS | Same as Lance; PackDS plans stay on the explicit eager Lance fallback |
| Daft | Parquet | Lazy scan via shared build_daft_lazy_scan(...) when the group is local/shared-eligible |
| Daft | Lance | Lazy only for a single dataset on the shared local path; multi-dataset falls back eagerly |
| Daft | PackDS | Same as Lance under PackDS v5: a single pruned episode dataset can use native lazy Lance scanning; multi-episode plans fall back eagerly |
Notes:
- Mixed-schema lazy eligibility on the shared path requires per-file schema_id lookups plus top-level schema information with resolvable field types, whether that comes from schema_serialized or a typed top-level schema summary.
- On the shared Daft path, non-identity Parquet content encodings (for example .parquet.gz) stay on the eager PyArrow path.
- When the process is pinned to DAFT_RUNNER=ray, the shared local Daft lazy path declines and falls back to the eager shared path instead of spawning a Ray-backed local lazy scan.
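
Read as a decision function, the matrix and notes above might look like the sketch below. It is a simplified model only: choose_read_mode and its parameters (dataset_count, local_eligible, identity_encoding) are hypothetical names, and the real executor's preconditions are richer than the booleans used here.

```python
import os

KNOWN_ENGINES = {"polars", "daft"}
KNOWN_CONTENT = {"parquet", "lance", "packds"}


def choose_read_mode(
    engine: str,
    content: str,
    *,
    dataset_count: int = 1,
    local_eligible: bool = True,
    identity_encoding: bool = True,
    env=None,
) -> str:
    """Return "lazy" or "eager" for an engine/content pair (illustrative only)."""
    env = os.environ if env is None else env
    engine, content = engine.lower(), content.lower()
    if engine not in KNOWN_ENGINES or content not in KNOWN_CONTENT:
        raise ValueError(f"unsupported combination: {engine}/{content}")

    if engine == "polars":
        # Parquet can scan lazily; Lance and PackDS use the explicit eager fallback.
        return "lazy" if content == "parquet" and local_eligible else "eager"

    # Daft: a process pinned to the Ray runner never takes the local lazy path.
    if env.get("DAFT_RUNNER") == "ray" or not local_eligible:
        return "eager"
    if content == "parquet":
        # Non-identity encodings such as .parquet.gz stay on the eager path.
        return "lazy" if identity_encoding else "eager"
    # Lance and PackDS: lazy only when a single dataset remains.
    return "lazy" if dataset_count == 1 else "eager"


print(choose_read_mode("polars", "lance", env={}))                 # eager
print(choose_read_mode("daft", "packds", dataset_count=1, env={}))  # lazy
print(choose_read_mode("daft", "packds", dataset_count=3, env={}))  # eager
```

Passing env explicitly keeps the function testable without mutating the process environment.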
Write Capabilities
The shared write layer currently covers:
- write input normalization
- local data materialization
- manifest construction for existing files and datasets
- schema/read compatibility helpers
- standard catalog write orchestration slices
Authoritative catalog mutation, commit, retention, and compaction boundaries still remain on the native/server side where they belong.
Relationship To Other Packages
Use deltacat-client when you want the public thin client.
Use deltacat when you want the thick/native package.
Use deltacat-io-core directly only if you are intentionally building against
the shared execution layer itself.