Memory-mapped columnar binary format for fast random-access I/O on structured arrays.
ColStore
A memory-mapped columnar binary format for fast, memory-efficient I/O on
structured arrays. colstore lets you write a tabular dataset to a single
.cstore file once and then load arbitrary row/column subsets without
materializing the rest. Internally, columns are stored back-to-back as raw
NumPy bytes, reads use np.memmap, and fancy-index gathers run through a
parallel C++ kernel (OpenMP + software prefetching) bound via Cython. Process
memory stays bounded by the size of the output you ask for; the source file
is never fully read into RAM.
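To make the bounded-memory claim concrete, here is a minimal NumPy-only sketch of a gather over a memory-mapped column. It is not colstore's C++ kernel: `gather_rows` and the raw `price.bin` file are hypothetical names used only for illustration, and visiting the indices in sorted order is an assumed locality trick rather than documented behavior.

```python
import os
import tempfile

import numpy as np


def gather_rows(path, dtype, n_rows, indices, offset=0):
    """Return column[indices] from a raw binary column without loading all of it."""
    # The memmap only reserves address space; pages are read when touched.
    column = np.memmap(path, dtype=dtype, mode="r", offset=offset, shape=(n_rows,))
    indices = np.asarray(indices, dtype=np.intp)
    # Visit rows in ascending file order, then restore the caller's order.
    order = np.argsort(indices, kind="stable")
    out = np.empty(indices.shape[0], dtype=column.dtype)
    out[order] = column[indices[order]]
    return out


# Demo: a 1M-row float32 column written as raw bytes (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "price.bin")
    np.arange(1_000_000, dtype=np.float32).tofile(path)
    picked = gather_rows(path, np.float32, 1_000_000, [999_999, 5, 42])
```

Only the output array (three float32 values here) is allocated by the process; the operating system pages in just the parts of the file that the selected rows touch.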
Install
pip install colstore
Building from source needs a C++17 compiler and CMake ≥ 3.18. On macOS install
libomp (brew install libomp) to get the parallel kernel; without it the
build still succeeds but the kernel runs single-threaded.
Quick start
import numpy as np
import pandas as pd
from colstore import ColStore
# Example inputs (illustrative; any DataFrame with fixed-size numeric columns works).
df = pd.DataFrame({
    "price": np.random.rand(1000).astype(np.float32),
    "qty": np.arange(1000, dtype=np.int32),
})
indices = [1, 5, 9]
# Write and open in one call. `.cstore` is the canonical extension.
ds = ColStore.from_dataframe(df, "data.cstore")
# Indexing returns lazy views; no data is read yet.
ds['price']                          # ColumnView
ds[100:200]                          # TableView
ds[100:200, 'price']                 # ColumnView
ds[100:200, ['price', 'qty']]        # TableView
ds[[1, 5, 9], ['price', 'qty']]      # TableView (fancy rows + cols)
# Materialize through one of the to_* methods.
ds['price'].to_array()                          # 1D ndarray
ds[indices, ['price', 'qty']].to_dict()         # dict of 1D arrays
ds[indices, ['price', 'qty']].to_record()       # structured ndarray
ds[indices, ['price', 'qty']].to_dataframe()    # pandas DataFrame
Writing from other sources
from colstore import ColStore
import numpy as np
# From a dict of 1D arrays.
ColStore.from_dict(
    {"x": np.arange(100, dtype=np.float32), "y": np.arange(100, dtype=np.int64)},
    "data.cstore",
)
# From a structured (record) array.
records = np.empty(100, dtype=[("price", np.float32), ("qty", np.int32)])
ColStore.from_records(records, "data.cstore")
Each factory returns an opened ColStore ready to read from.
Configuration
from colstore import set_max_workers, set_default_madvise, set_default_backend
set_max_workers(8) # parallel gathers across columns
set_default_madvise("sequential") # OS read-ahead hint for sorted-index reads
set_default_backend("cpp") # gather kernel: cpp | numpy | numba
On-disk format
[magic 8B = b"CSTORE\x00\x01"]
[manifest_len 8B (u64 little-endian)]
[manifest_json]
[zero-padding to 64-byte alignment]
[column_0 raw bytes][column_1 raw bytes]...[column_n raw bytes]
The manifest is a small JSON object recording format_version, n_rows,
and per-column {name, dtype}. Column dtypes are preserved byte-for-byte;
columns are stored back-to-back with no per-row overhead.
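The layout above is simple enough to sketch with the standard library and NumPy. The following is an illustrative reader and writer, not colstore's implementation. Several details are assumptions: the manifest key `columns`, the value `1` for `format_version`, dtypes serialized as NumPy dtype strings such as `<f4`, and column data starting at the first 64-byte boundary after the manifest. `write_sketch` and `open_sketch` are hypothetical names.

```python
import json
import os
import struct
import tempfile

import numpy as np

MAGIC = b"CSTORE\x00\x01"
ALIGN = 64


def write_sketch(path, columns):
    """Write a dict of equal-length 1D arrays using the layout described above."""
    n_rows = len(next(iter(columns.values())))
    manifest = json.dumps({
        "format_version": 1,  # assumed value
        "n_rows": n_rows,
        # "columns" is an assumed key name for the per-column entries.
        "columns": [{"name": k, "dtype": v.dtype.str} for k, v in columns.items()],
    }).encode("utf-8")
    with open(path, "wb") as f:
        f.write(MAGIC)
        f.write(struct.pack("<Q", len(manifest)))  # u64 little-endian
        f.write(manifest)
        f.write(b"\x00" * (-f.tell() % ALIGN))  # zero-pad to a 64-byte boundary
        for arr in columns.values():
            f.write(np.ascontiguousarray(arr).tobytes())


def open_sketch(path):
    """Return {name: np.memmap}; only the header and manifest are read here."""
    with open(path, "rb") as f:
        if f.read(8) != MAGIC:
            raise ValueError("not a .cstore file")
        (manifest_len,) = struct.unpack("<Q", f.read(8))
        manifest = json.loads(f.read(manifest_len))
    n_rows = manifest["n_rows"]
    offset = 16 + manifest_len
    offset += -offset % ALIGN  # skip the zero padding
    views = {}
    for col in manifest["columns"]:
        dtype = np.dtype(col["dtype"])
        views[col["name"]] = np.memmap(
            path, dtype=dtype, mode="r", offset=offset, shape=(n_rows,)
        )
        offset += dtype.itemsize * n_rows  # columns sit back-to-back
    return views


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.cstore")
    write_sketch(path, {
        "x": np.arange(100, dtype=np.float32),
        "y": np.arange(100, dtype=np.int64),
    })
    cols = open_sketch(path)
    x_rows = np.array(cols["x"][[1, 5, 9]])  # copies three rows out of the map
    y_rows = np.array(cols["y"][10:13])
    del cols  # release the memory maps before the directory is removed
```

Because every column has a fixed item size, the byte offset of any row is `data_start + column_offset + row * itemsize`, which is what makes random access possible without an index structure.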
Supported dtypes
Fixed-size only: float32, float64, int8/16/32/64, uint8/16/32/64,
bool. Object dtype (strings, Python objects) is rejected at write time:
the design point is zero-copy random access, which requires a fixed stride.
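A write-time check along these lines could enforce the list above. This is a sketch under assumptions, not colstore's actual validation: `check_column_dtype` is a hypothetical helper, and the choice of `TypeError` is a guess at a reasonable exception type.

```python
import numpy as np

# NumPy dtype kinds: b = bool, i = signed int, u = unsigned int, f = float.
ALLOWED_KINDS = frozenset("biuf")
ALLOWED_FLOAT_SIZES = (4, 8)  # float32 and float64 only


def check_column_dtype(name, array):
    """Return the column's dtype, or raise if it has no fixed-size supported layout."""
    dtype = np.asarray(array).dtype
    supported = dtype.kind in ALLOWED_KINDS and (
        dtype.kind != "f" or dtype.itemsize in ALLOWED_FLOAT_SIZES
    )
    if not supported:
        raise TypeError(f"column {name!r}: unsupported dtype {dtype}")
    return dtype


ok_dtype = check_column_dtype("price", np.zeros(3, dtype=np.float32))

try:
    check_column_dtype("label", np.array(["a", "b"], dtype=object))
    rejected = False
except TypeError:
    rejected = True
```

Checking `dtype.kind` rather than comparing against a list of type objects keeps the test independent of byte order, so `<f4` and `>f4` are both accepted as float32.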
Layout
colstore/
├── pyproject.toml              # scikit-build-core build
├── CMakeLists.txt              # Cython + C++ build
├── include/colstore/
│   └── gather.hpp              # public C++ header
├── src/
│   ├── cpp/gather.cpp          # OpenMP + prefetch kernel
│   ├── cython/_gather.pyx      # dtype-dispatched binding
│   └── colstore/               # Python package
│       ├── __init__.py
│       ├── config.py
│       ├── format.py
│       ├── kernels.py
│       ├── view.py             # ColumnView + TableView
│       └── store.py
└── tests/                      # pytest suite
License
MIT.