lakeflush optimizes data lakes by consolidating small files into larger bundles for big data workloads. Reduces storage overhead and improves processing efficiency.
lakeflush
Efficiently consolidate millions of small files into larger bundles to solve common big data challenges:
✅ Reduces storage overhead – Minimize metadata bloat in HDFS/S3
✅ Boosts processing speed – Fewer files = Faster Spark/Hadoop jobs
✅ Seamless integration – Works with existing data lakes (S3, on-prem)
✅ Smart bundling – Configurable size thresholds and compression
✅ Multi-format support – Handles text, JSON, and CSV files
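The bundling idea behind the features above can be sketched with the standard library alone. This is a minimal illustration, not lakeflush's actual API: the function `bundle_small_files`, its parameters (`max_bundle_bytes`, `compress`, `pattern`), and the `bundle-NNNNN` naming scheme are all hypothetical, and the sketch assumes each input file holds newline-delimited records on local disk.

```python
import gzip
from pathlib import Path


def bundle_small_files(src_dir, out_dir, max_bundle_bytes=128 * 1024 * 1024,
                       compress=True, pattern="*.json"):
    """Concatenate small files into larger bundles (hypothetical helper).

    A new bundle is started whenever adding the next file would push the
    uncompressed bundle size past max_bundle_bytes. A single file larger
    than the threshold still gets a bundle of its own.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    opener = gzip.open if compress else open
    suffix = ".jsonl.gz" if compress else ".jsonl"

    bundles = []
    handle = None
    current_size = 0
    for path in sorted(Path(src_dir).glob(pattern)):
        data = path.read_bytes()
        if not data.endswith(b"\n"):
            data += b"\n"  # keep records from adjacent files on separate lines
        # Roll over to a new bundle if this file would exceed the threshold.
        if handle is None or (current_size and current_size + len(data) > max_bundle_bytes):
            if handle is not None:
                handle.close()
            bundle_path = out_dir / f"bundle-{len(bundles):05d}{suffix}"
            handle = opener(bundle_path, "wb")
            bundles.append(bundle_path)
            current_size = 0
        handle.write(data)
        current_size += len(data)
    if handle is not None:
        handle.close()
    return bundles
```

Calling `bundle_small_files("raw/", "bundled/", max_bundle_bytes=64 * 1024 * 1024)` would return the list of bundle paths it wrote. The threshold is measured on uncompressed bytes, so compressed bundles on disk will usually be smaller than the configured size.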
Ideal for:
- IoT sensor data
- Log file aggregation
- ML training datasets
- Data lake optimization
Works on Python 3.11+
pip install lakeflush
File details
Details for the file lakeflush-0.1.0.tar.gz.
File metadata
- Download URL: lakeflush-0.1.0.tar.gz
- Upload date:
- Size: 21.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7ec7936c5a1a01e4e87f9483daf73efc649aaade64c0d24468e13eae58450683 |
| MD5 | 167c510dd31b3da4e41e399069d0608e |
| BLAKE2b-256 | 0dcd8f5384bb4bf993f0aeb2d6a45d0406d4879bd0dc3d1bf6543b6ee6b69883 |
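A downloaded archive can be checked against the SHA256 digest listed above before installing it. The snippet below is a small sketch using Python's `hashlib`; the helper names `sha256_of` and `verify_download` are illustrative, and the local path passed in is an assumption about where the file was saved.

```python
import hashlib

# SHA256 digest published for lakeflush-0.1.0.tar.gz
EXPECTED_SDIST_SHA256 = "7ec7936c5a1a01e4e87f9483daf73efc649aaade64c0d24468e13eae58450683"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SDIST_SHA256):
    """Return True if the file at path matches the expected digest."""
    return sha256_of(path) == expected.lower()
```

For example, `verify_download("lakeflush-0.1.0.tar.gz")` should return `True` for an intact copy of the source distribution. The wheel can be checked the same way by passing its own digest as `expected`.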
File details
Details for the file lakeflush-0.1.0-py3-none-any.whl.
File metadata
- Download URL: lakeflush-0.1.0-py3-none-any.whl
- Upload date:
- Size: 31.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 309f15d037e82556acb02604032094da6e5df7e2d763ce703232a7ae16ffb786 |
| MD5 | 163853903ebd6939575058942ace12c0 |
| BLAKE2b-256 | e8305dac2d576102945b863ddaef5afc730be98de517cefafe0195e7b1778569 |