
Using Machine Learning to learn how to Compress


Try it live at https://shrynk.ai

Features

  • ✓ Compresses your data smartly, using machine learning to choose the method
  • ✓ Takes user requirements as weights for size, write_time and read_time
  • ✓ Trains and caches a model from the packaged data, covering the compression methods available on the system
  • CLI for compressing and decompressing

CLI

shrynk compress myfile.json       # will yield e.g. myfile.json.gz or myfile.json.bz2
shrynk decompress myfile.json.gz  # will yield myfile.json

shrynk compress myfile.csv --size 0 --write 1 --read 0

shrynk benchmark myfile.csv                  # shows benchmark results
shrynk benchmark --predict myfile.csv        # will also show the current prediction
shrynk benchmark --save --predict myfile.csv # will add the result to the training data too
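
To give a feel for what a benchmark of this kind measures, here is a minimal sketch that times three standard-library codecs on raw bytes and records size, write time and read time. It is an illustration only: `benchmark_bytes` and `CODECS` are hypothetical names, and shrynk's own benchmark may use different methods and measurements.

```python
import bz2
import gzip
import lzma
import time

# Hypothetical codec table, not part of shrynk: (compress, decompress) pairs.
CODECS = {
    "gz": (gzip.compress, gzip.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "xz": (lzma.compress, lzma.decompress),
}


def benchmark_bytes(data: bytes) -> dict:
    """Return {codec: {"size", "write_time", "read_time"}} for each codec."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(data)
        write_time = time.perf_counter() - start

        start = time.perf_counter()
        restored = decompress(packed)
        read_time = time.perf_counter() - start

        assert restored == data  # the round trip must be lossless
        results[name] = {
            "size": len(packed),
            "write_time": write_time,
            "read_time": read_time,
        }
    return results


sample = b"a,b,c\n1,2,3\n" * 1000
for codec, stats in benchmark_bytes(sample).items():
    print(codec, stats)
```

The timings vary from run to run and machine to machine, which is one reason to benchmark on data that resembles your own.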

Usage

Installation:

pip install shrynk

Then in Python:

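import pandas as pd
from shrynk import save, load

my_df = pd.DataFrame({"a": [1, 2, 3]})  # any DataFrame works
file_path = save(my_df, "mypath.csv")  # returns the path actually written
# e.g. mypath.csv.bz2
loaded_df = load(file_path)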
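
Because the saved path carries the compression suffix (e.g. mypath.csv.bz2), a loader can work out how to read the file from the name alone. The sketch below shows that idea with the standard library only. `read_text` and `OPENERS` are hypothetical names, and the suffix mapping is an assumption, not shrynk's actual implementation.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Assumed suffix-to-opener mapping; shrynk's own mapping may differ.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def read_text(path: str) -> str:
    """Read a text file, decompressing it when the suffix names a known codec."""
    suffix = os.path.splitext(path)[1]
    opener = OPENERS.get(suffix, open)  # fall back to a plain, uncompressed read
    with opener(path, "rt", encoding="utf-8") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "mypath.csv.bz2")
    with bz2.open(target, "wt", encoding="utf-8") as handle:
        handle.write("a\n1\n2\n3\n")
    print(read_text(target))
```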

Add your own data

For more control, run the benchmarks on your own data and train a model with your own weights:

import pandas as pd
from shrynk import PandasCompressor

df = pd.DataFrame({"a": [1, 2, 3]})

pdc = PandasCompressor("default")
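pdc.run_benchmarks(df)  # adds benchmark results for df to the "default" data

pdc.train_model(size=3, write=1, read=1)  # size weighs three times as much as write or read time

pdc.predict(df)  # predicts a compression method for df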
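
To illustrate what weights such as size=3, write=1, read=1 express, the sketch below scores a few benchmark rows by scaling each metric to the 0..1 range and taking a weighted sum, where the lowest score wins. This is not shrynk's model, which is trained on benchmark data; `pick_method` is a hypothetical helper, and the numbers are made up for illustration.

```python
def pick_method(benchmarks, size=1.0, write=1.0, read=1.0):
    """Pick the method with the lowest weighted, min-max scaled cost."""
    weights = {"size": size, "write_time": write, "read_time": read}
    scores = {name: 0.0 for name in benchmarks}
    for metric, weight in weights.items():
        values = [row[metric] for row in benchmarks.values()]
        low, high = min(values), max(values)
        span = (high - low) or 1.0  # avoid dividing by zero when all values tie
        for name, row in benchmarks.items():
            scores[name] += weight * (row[metric] - low) / span
    return min(scores, key=scores.get)


# Made-up numbers for illustration only, not real measurements.
example = {
    "csv+gzip": {"size": 400, "write_time": 0.20, "read_time": 0.10},
    "csv+bz2": {"size": 300, "write_time": 0.50, "read_time": 0.30},
    "csv": {"size": 1000, "write_time": 0.05, "read_time": 0.04},
}

print(pick_method(example, size=3, write=1, read=1))  # csv+gzip
print(pick_method(example, size=0, write=1, read=0))  # csv
```

With size weighted heavily, the balanced gzip row wins. With only write time counting, the uncompressed file wins because it is the fastest to write.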

Download files


Files for shrynk, version 0.1.19
Filename                            Size    File type  Python version
shrynk-0.1.19-py2.py3-none-any.whl  4.5 MB  Wheel      py2.py3
shrynk-0.1.19.tar.gz                2.8 MB  Source     None
