lakeflush optimizes data lakes by consolidating small files into larger bundles for big data workloads, reducing storage overhead and improving processing efficiency.

Project description

lakeflush

Efficiently consolidate millions of small files into larger bundles to solve common big data challenges:

  • Reduces storage overhead – minimizes metadata bloat in HDFS/S3
  • Boosts processing speed – fewer files mean faster Spark/Hadoop jobs
  • Seamless integration – works with existing data lakes (S3, on-prem)
  • Smart bundling – configurable size thresholds and compression
  • Multi-format support – handles text, JSON, and CSV files
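
To make the idea of size-threshold bundling concrete, here is a minimal standard-library sketch. It does not use or reproduce lakeflush's API, which this page does not document; the names `plan_bundles`, `write_bundle`, and `max_bundle_bytes` are hypothetical and chosen only for illustration. It assumes line-oriented inputs (logs, JSON Lines, headerless CSV) that can be safely concatenated.

```python
import gzip
from pathlib import Path


def plan_bundles(files, max_bundle_bytes):
    """Greedily group (path, size) pairs into bundles under a size threshold.

    Hypothetical helper for illustration only. A file larger than the
    threshold ends up alone in its own bundle.
    """
    bundles, current, current_size = [], [], 0
    for path, size in files:
        # Close the current bundle if adding this file would exceed the limit.
        if current and current_size + size > max_bundle_bytes:
            bundles.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        bundles.append(current)
    return bundles


def write_bundle(paths, out_path, compress=True):
    """Concatenate the given files into one bundle, optionally gzip-compressed."""
    opener = gzip.open if compress else open
    with opener(out_path, "wb") as out:
        for p in paths:
            data = Path(p).read_bytes()
            out.write(data)
            # Keep records from adjacent files on separate lines.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
```

In this sketch, sizes could come from `Path.stat().st_size` for local files; for S3 or HDFS the listing and reads would need the corresponding client library instead. CSV files with header rows would need extra handling so the header is not repeated inside a bundle.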

Ideal for:

  • IoT sensor data
  • Log file aggregation
  • ML training datasets
  • Data lake optimization

Works on Python 3.11+

pip install lakeflush

Download files

Source Distribution

lakeflush-0.1.0.tar.gz (21.3 kB view details)

Uploaded Source

Built Distribution

lakeflush-0.1.0-py3-none-any.whl (31.3 kB view details)

Uploaded Python 3

File details

Details for the file lakeflush-0.1.0.tar.gz.

File metadata

  • Download URL: lakeflush-0.1.0.tar.gz
  • Upload date:
  • Size: 21.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for lakeflush-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       7ec7936c5a1a01e4e87f9483daf73efc649aaade64c0d24468e13eae58450683
MD5          167c510dd31b3da4e41e399069d0608e
BLAKE2b-256  0dcd8f5384bb4bf993f0aeb2d6a45d0406d4879bd0dc3d1bf6543b6ee6b69883
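
A downloaded file can be checked against its published SHA256 digest with Python's standard `hashlib` module. The sketch below is one way to do it; the helper names `sha256_of` and `matches_published` are hypothetical, and the local file path is assumed to be wherever the archive was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example (assumes the sdist was saved in the current directory):
# matches_published(
#     "lakeflush-0.1.0.tar.gz",
#     "7ec7936c5a1a01e4e87f9483daf73efc649aaade64c0d24468e13eae58450683",
# )
```

The same check applies to the wheel, using the SHA256 digest listed for that file.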

File details

Details for the file lakeflush-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: lakeflush-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 31.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for lakeflush-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       309f15d037e82556acb02604032094da6e5df7e2d763ce703232a7ae16ffb786
MD5          163853903ebd6939575058942ace12c0
BLAKE2b-256  e8305dac2d576102945b863ddaef5afc730be98de517cefafe0195e7b1778569
