KORE Binary Format - High-performance columnar compression with 5-8x compression ratio
Kore: Killer Optimized Record Exchange
A high-performance, columnar file format for analytics with cloud storage connectors.
Kore is a Rust-based columnar file format designed for efficient storage and analysis of structured data. The base library has zero external dependencies; cloud connectors for AWS S3, Azure Blob Storage, and Google Cloud Storage are optional features.
Quick Start
Install (Python)
pip install kore-fileformat
Verify
import kore_fileformat
print(kore_fileformat.__version__)  # prints the installed version, e.g. 1.0.0
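If the import itself might fail (for example in a fresh virtual environment), the installed version can also be read from package metadata without importing the module. This is a minimal sketch using only the standard library; the helper name `installed_version` is illustrative and not part of Kore.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "kore-fileformat" is the PyPI distribution name used in the install step.
    found = installed_version("kore-fileformat")
    print(found if found is not None else "kore-fileformat is not installed")
```

Returning `None` instead of raising keeps the check usable in setup scripts that decide whether to run `pip install`.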
Use Rust
[dependencies]
kore_fileformat = { version = "1.0.0", features = ["s3"] }
Documentation
| Guide | Purpose |
|---|---|
| PYTHON_USER_GUIDE.md | Python installation, usage, examples, cloud integration |
| DOCKER_EMULATORS_GUIDE.md | Docker setup, LocalStack, Azurite, GCS emulator testing |
| DOCUMENTATION_INDEX.md | Master index, reading paths by role, feature matrix |
| CI_CD_SECRETS_SETUP.md | GitHub Actions setup, registry secrets, publishing config |
| V1_1_ROADMAP.md | Next release plan, Azure/GCS implementation, timeline |
| PROJECT_COMPLETION_SUMMARY.md | v1.0.0 deliverables, test results, distribution channels |
Start here: DOCUMENTATION_INDEX.md for role-based reading paths.
Features
Core Library
- Zero External Dependencies: Lightweight base crate
- Columnar Format: Optimized for analytical queries
- Compression: Built-in data compression
- Multi-Platform: Windows, macOS, Linux, web
Cloud Connectors
- AWS S3: Full implementation in v1.0.0
- Azure Blob Storage: Full implementation coming in v1.1.0
- Google Cloud Storage: Full implementation coming in v1.1.0
Language Bindings
- Python: PyO3 wheel for Python 3.9-3.12
- Java: JNI bindings and Maven package
- JavaScript: NAPI module for Node.js
- Go: Coming in v1.2.0
DevOps & CI/CD
- GitHub Actions: 10 automated jobs for testing and publishing
- Docker Support: Integration tests with emulators
- Multi-Registry Publishing: crates.io, PyPI, Maven Central, npm
What's Included in v1.0.0
| Component | Status | Details |
|---|---|---|
| Base Library | Production | Columnar format, compression, serialization |
| S3 Connector | Production | Read/write to AWS S3 with LocalStack testing |
| Azure Connector | Prepared | Stub implementations, full SDK in v1.1.0 |
| GCS Connector | Prepared | Stub implementations, full SDK in v1.1.0 |
| Python Bindings | Production | Wheel installation, PyPI distribution |
| Java Bindings | Production | JNI library, Maven Central distribution |
| JavaScript Bindings | Production | NAPI addon, npm distribution |
| Integration Tests | Complete | 4 comprehensive tests with emulators |
| Documentation | Complete | 8 guides, 2000+ lines, 50+ examples |
Use Cases
Data Analytics
Process large datasets efficiently with columnar storage:
import kore_fileformat
# Store analytics data in columnar format for fast queries
Cloud Data Lakes
Store data directly in S3 (Azure and GCS connectors are planned for v1.1.0):
from kore_fileformat import S3Reader
reader = S3Reader(region='us-east-1')
data = reader.read_file('my-bucket', 'path/to/data.kore')
Multi-Language Projects
Use Kore from Python, Java, or JavaScript in the same project:
# Python: import kore_fileformat
# Java: import com.kore.cloud.S3Reader;
# JS: const kore = require('kore-fileformat');
Architecture
Modular Design
kore_fileformat/
├── core/          # Base library (zero dependencies)
├── cloud/         # Cloud connectors (optional)
│   ├── s3/        # AWS S3 (working)
│   ├── azure/     # Azure Blob (v1.1+)
│   └── gcs/       # Google Cloud (v1.1+)
└── bindings/      # Language bindings
    ├── python/    # PyO3 wheel
    ├── java/      # JNI library
    └── napi/      # Node.js addon
Feature Gates
# Base: zero external dependencies
kore_fileformat = "1.0.0"
# With S3
kore_fileformat = { version = "1.0.0", features = ["s3"] }
# With all cloud (v1.1.0+)
kore_fileformat = { version = "1.0.0", features = ["s3", "azure", "gcs"] }
# With Python bindings
# Use: pip install kore-fileformat
Performance
Kore is designed for analytics workloads:
- Compression: 5-10x reduction on typical datasets
- Query Speed: Columnar format enables fast aggregations
- Storage: 10-50 MB files with millions of rows
- Cloud: Direct S3/Azure/GCS integration (no intermediate files)
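The compression figure above is a ratio of raw size to stored size, so it is easy to measure on your own data. The sketch below shows the arithmetic; it uses `zlib` from the standard library purely as a stand-in codec, since this README does not document Kore's writer API or its compression scheme. The function names are illustrative.

```python
import zlib


def compression_ratio(raw_size: int, compressed_size: int) -> float:
    """Return raw_size / compressed_size; 5.0 means a 5x reduction."""
    if raw_size < 0 or compressed_size <= 0:
        raise ValueError("raw_size must be >= 0 and compressed_size must be > 0")
    return raw_size / compressed_size


def zlib_ratio(payload: bytes) -> float:
    """Compress payload with zlib (a stand-in, NOT Kore's codec) and report the ratio."""
    return compression_ratio(len(payload), len(zlib.compress(payload)))


if __name__ == "__main__":
    # A low-cardinality column, the kind of data columnar layouts tend to compress well.
    column = b"".join(b"status=%d\n" % (i % 3) for i in range(100_000))
    print(f"{zlib_ratio(column):.1f}x")
```

To check the claim for Kore itself, pass the on-disk sizes of a source file and its `.kore` counterpart to `compression_ratio`.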
Installation
Requirements
- Rust: 1.70+ (for building from source)
- Python: 3.9-3.12 (for Python wheel)
- Java: 17+ (for Java bindings)
- Node.js: 14+ (for JavaScript bindings)
- Docker: 20.10+ (for testing with emulators)
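A setup script can check the Python requirement before attempting the install. This is a small sketch; the bounds are copied from the list above, and `python_supported` is a hypothetical helper rather than something the package ships.

```python
import sys
from typing import Sequence

# Supported range taken from the requirements list (inclusive on both ends).
MIN_PYTHON = (3, 9)
MAX_PYTHON = (3, 12)


def python_supported(version_info: Sequence[int] = sys.version_info) -> bool:
    """Return True if the major.minor version falls inside the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


if __name__ == "__main__":
    if not python_supported():
        raise SystemExit("kore-fileformat wheels target Python 3.9-3.12")
    print("Python version is within the supported range")
```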
From PyPI (Recommended for Python)
pip install kore-fileformat
From crates.io (Rust)
cargo add kore_fileformat --features s3
Build from Source
git clone https://github.com/arunkatherashala/Kore.git
cd Kore
cargo build --release --features s3
Testing
Run Unit Tests
cargo test
Run Integration Tests (requires Docker)
# Start emulators (LocalStack, Azurite, GCS)
docker-compose up -d
# Run tests
cargo test --features s3,azure,gcs --test integration_tests -- --nocapture
# Stop emulators
docker-compose down
See DOCKER_EMULATORS_GUIDE.md for detailed setup.
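Because `docker-compose up -d` returns as soon as the containers are started, the emulators may not yet accept connections when the tests begin. One way to guard against that is to poll the emulator port first. The sketch below assumes LocalStack is exposed on its default port 4566; the project's compose file may map a different port, so adjust accordingly.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly, then retry.
            time.sleep(0.2)
    return False


if __name__ == "__main__":
    # 4566 is LocalStack's default edge port (an assumption about this setup).
    ready = wait_for_port("127.0.0.1", 4566)
    print("S3 emulator reachable" if ready else "S3 emulator not reachable")
```

A successful TCP connection only shows the port is open; the service behind it may still need a moment to initialize.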
Cloud Integration
AWS S3 (v1.0.0 - Working)
from kore_fileformat import S3Reader
reader = S3Reader(region='us-east-1')
data = reader.read_file('bucket', 'object.kore')
reader.write_file('bucket', 'object.kore', data)
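The two calls above compose naturally into a copy helper. The sketch below relies only on `read_file` and `write_file` as shown in this README; the type returned by `read_file` is not documented here, so it is treated as opaque. `KoreReader`, `copy_object`, and `InMemoryReader` are hypothetical names introduced for illustration, and the in-memory class lets the helper be tried without AWS credentials.

```python
from typing import Any, Dict, Protocol, Tuple


class KoreReader(Protocol):
    """Structural type for the two calls shown above (hypothetical, not a Kore class)."""

    def read_file(self, bucket: str, key: str) -> Any: ...

    def write_file(self, bucket: str, key: str, data: Any) -> Any: ...


def copy_object(reader: KoreReader, src: Tuple[str, str], dst: Tuple[str, str]) -> Any:
    """Read the (bucket, key) pair in src and write the same payload to dst."""
    data = reader.read_file(*src)
    reader.write_file(*dst, data)
    return data


class InMemoryReader:
    """Dict-backed stand-in that mimics the read_file/write_file shape."""

    def __init__(self) -> None:
        self.objects: Dict[Tuple[str, str], Any] = {}

    def read_file(self, bucket: str, key: str) -> Any:
        return self.objects[(bucket, key)]

    def write_file(self, bucket: str, key: str, data: Any) -> None:
        self.objects[(bucket, key)] = data


if __name__ == "__main__":
    fake = InMemoryReader()
    fake.write_file("raw", "events.kore", b"example-bytes")
    copy_object(fake, ("raw", "events.kore"), ("archive", "events.kore"))
    print(sorted(fake.objects))
```

With the real package installed, an `S3Reader(region='us-east-1')` instance could be passed in place of `InMemoryReader`, assuming its methods behave as in the snippet above.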
Azure Blob Storage (v1.1.0 - Coming Soon)
from kore_fileformat import AzureBlobReader
reader = AzureBlobReader('account', 'key')
data = reader.read_file('container', 'blob.kore')
Google Cloud Storage (v1.1.0 - Coming Soon)
from kore_fileformat import GcsReader
reader = GcsReader('project-id')
data = reader.read_file('bucket', 'object.kore')
Contributing
We welcome contributions! Here's how:
- Report Issues: GitHub Issues
- Discuss Ideas: GitHub Discussions
- Submit PRs: Fork, branch, code, and create a pull request
See V1_1_ROADMAP.md for planned features and how to help.
Roadmap
v1.0.0 (Current)
- S3 connector with full API
- Python, Java, JavaScript bindings
- Integration tests with emulators
- Complete documentation
v1.1.0 (Q2 2026)
- Azure Blob Storage full implementation
- Google Cloud Storage full implementation
- Performance optimizations
- Streaming support
v2.0.0 (Q4 2026)
- Go language bindings
- Multi-region support
- Caching layer
- Advanced compression
See V1_1_ROADMAP.md for detailed phases and milestones.
Distribution Channels
Latest Versions
| Platform | Package | Version | Link |
|---|---|---|---|
| PyPI | kore-fileformat | 1.0.0 | PyPI |
| Crates.io | kore_fileformat | 1.0.0 | Crates.io |
| Maven | com.arun.kore:kore-cloud-java | 1.0.0 | Coming v1.1 |
| npm | kore-fileformat | 1.0.0 | Coming v1.1 |
Security
Features
- Zero external dependencies in base library
- Optional SDKs are version-pinned and updated regularly
- Integration tests verify cloud connectivity
- GitHub Actions security scanning
Reporting Security Issues
Please email: arunkatherashala@gmail.com
Support & Community
Getting Help
- Documentation: DOCUMENTATION_INDEX.md
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: arunkatherashala@gmail.com
Stay Updated
- GitHub: Star the repository
- Releases: Watch for v1.1.0 announcement
- Email: Subscribe to release notifications
License
Kore is licensed under the Apache License 2.0.
Copyright 2024-2026 Sai Arun Kumar Ktherashala
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Author
Sai Arun Kumar Ktherashala
- Email: arunkatherashala@gmail.com
- GitHub: @arunkatherashala
- LinkedIn: Sai Arun Kumar
What's Next?
For Users
- Read: PYTHON_USER_GUIDE.md or DOCKER_EMULATORS_GUIDE.md
- Install: pip install kore-fileformat
- Explore: Check DOCUMENTATION_INDEX.md for your role
For Contributors
- Review: V1_1_ROADMAP.md for v1.1.0 features
- Clone: git clone https://github.com/arunkatherashala/Kore.git
- Setup: Follow DOCKER_EMULATORS_GUIDE.md
- Code: Create feature branch and submit PR
For DevOps
- Setup: CI_CD_SECRETS_SETUP.md for automated publishing
- Monitor: GitHub Actions workflows on each push
- Release: Tag v1.0.1 or v1.1.0 to trigger publishing
Project Status
| Phase | Status | Delivered |
|---|---|---|
| Phase 1: Core Library | Complete | Base Kore format, compression, serialization |
| Phase 2: Cloud SDKs | Partial | S3 working, Azure/GCS coming v1.1 |
| Phase 3: Language Bindings | Complete | Python, Java, JavaScript production-ready |
| Phase 4: Integration Tests | Complete | 4 comprehensive tests with emulators |
| Phase 5: CI/CD & Publishing | Complete | 10 automated jobs, multi-registry support |
| Documentation | Complete | 8 guides, 2000+ lines, 50+ examples |
Thank You!
Thank you for choosing Kore! We're excited to see what you build.
Questions? Open an issue or discussion on GitHub.
Want to help? Check V1_1_ROADMAP.md for features to implement.
Found a bug? Report it on GitHub Issues.
Latest Release: v1.0.0
Last Updated: May 14, 2026
Status: Production Ready
Let's build amazing data infrastructure together!