Compresses files in memory and replaces the original by a .gz file when there is no space on drive.

Project description

cmemgzip

cmemgzip v.0.4.4

A Python 3 Open Source utility created by Carles Mateo.

http://blog.carlesmateo.com/cmemgzip

cmemgzip is designed for those occasions when a server's drives are full, there is no free disk space, and we don't want to delete core files/dumps or logs. cmemgzip reads the file in binary mode and keeps it completely in memory, compresses it in memory, checks that it has write permission on the folder (by creating an empty file), deletes the original file, and then writes the compressed file from memory. It can also load the file in blocks and compress those blocks, at the cost of a small loss of compression efficiency. The parameter -m=XXM or -m=YYG is used for that. Refer to the PDF manual for more details.

The resulting file can later be decompressed with gzip/gunzip or viewed with zcat.
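The sequence described above can be sketched in a few lines of Python. This is only an illustration of the idea using the standard gzip module, not cmemgzip's actual source; the function name compress_in_memory is hypothetical.

```python
import gzip
import os


def compress_in_memory(path):
    """Replace `path` with `path + '.gz'`, holding all data in RAM meanwhile.

    Hypothetical helper: it mirrors the steps described in the text,
    not the real cmemgzip implementation.
    """
    gz_path = path + ".gz"

    # 1. Read the whole file in binary mode and keep it in memory.
    with open(path, "rb") as f:
        original = f.read()

    # 2. Compress entirely in memory.
    compressed = gzip.compress(original)

    # 3. Check write permission on the folder by creating an empty file.
    #    Mode "xb" fails if the target already exists, so nothing is overwritten.
    with open(gz_path, "xb"):
        pass

    # 4. Only now delete the original to free space, then write from memory.
    os.remove(path)
    with open(gz_path, "wb") as f:
        f.write(compressed)
    return gz_path
```

Note the ordering: the original is removed only after the data is safely compressed in memory and the folder has been proven writable.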

The default mode: Allocate all the file in memory

To do its job, your server or instance must have enough free memory to hold the whole file plus its compressed version.

For example, in order to cmemgzip a 2.7GB core dump file, you will need:

2.7 GB for the original file + 270 MB for the compressed file: about 2.97 GB in total, so plan for approximately 3.1 GB of free RAM.
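The arithmetic can be written out as a small helper. This is a rough estimate under the assumption that only the original data and its compressed copy are held in memory; the function name is hypothetical and decimal units (1 GB = 1000 MB) are assumed.

```python
MB = 1000 ** 2
GB = 1000 ** 3


def full_mode_memory_bytes(original_bytes, compressed_bytes):
    """Estimate RAM needed when the whole file is held in memory.

    Assumption: interpreter and buffer overhead are ignored.
    """
    return original_bytes + compressed_bytes


needed = full_mode_memory_bytes(2700 * MB, 270 * MB)
print(f"{needed / GB:.2f} GB")  # prints 2.97 GB
```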

The Block mode: Use a chunk size

In block mode you specify how many megabytes or gigabytes are read from the file per block. Each block is compressed in memory, and then the next block is loaded.

For example, to compress a 2 GB log file using a small amount of memory, you can run:

cmemgzip -m=100M myfile.log

This loads the file in blocks of 100 MB and compresses them in memory. For a 2 GB log file that results in 200 MB once compressed, using blocks of 100 MB, the memory requirements for cmemgzip would be around 300 MB. However, you can specify a block size of 10 MB, and then the memory required will be only around 220 MB. It depends on how well the file compresses. As a general rule for logs, the bigger the block size, the greater the disk space savings.
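One way to compress by blocks is sketched below. It assumes each block is compressed as an independent gzip member (concatenated members still form a valid gzip stream); whether cmemgzip does exactly this is an assumption, and compress_by_blocks is a hypothetical name.

```python
import gzip
import io


def compress_by_blocks(src, block_size):
    """Read `src` (a binary file object) block by block; return gzip bytes.

    Assumption: every block becomes its own gzip member. Smaller blocks
    give the compressor less context, hence slightly larger output.
    """
    out = io.BytesIO()
    while True:
        block = src.read(block_size)
        if not block:
            break
        # Only one uncompressed block is alive at a time; the compressed
        # output accumulates in memory until it can be written to disk.
        out.write(gzip.compress(block))
    return out.getvalue()
```

In this sketch peak memory is roughly one block plus the compressed output so far, which is in line with the 100 MB + 200 MB estimate above.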

Compressing multiple files

Just provide a mask with * instead of a file name.

For instance:

cmemgzip /var/log/*

Risks

With great power comes great responsibility. Like every tool that works with files, this tool must be used very carefully. If you have many processes competing to write to the drive, they may quickly fill the space recovered by deleting the original file and make it impossible to write the compressed version. As of version 0.2, in that (extreme) situation cmemgzip asks for another destination in which to store the compressed file. This should not happen unless the server is under extreme load. If you compress logs or core dumps, the compression ratio is so high and the space gain so large (from a 2.7 GB uncompressed core dump file to 268 MB when compressed) that this is very unlikely to happen. Use it wisely, at your own risk.
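The fallback behaviour can be pictured with the following sketch. It is not cmemgzip's code: write_or_ask and the ask_destination callback are hypothetical names, and the real tool prompts the user interactively.

```python
import contextlib
import os


def write_or_ask(data, gz_path, ask_destination):
    """Write `data` to `gz_path`; on failure ask for another destination.

    `ask_destination(failed_path, error)` is a hypothetical callback that
    returns a new path, for example on a different drive.
    """
    while True:
        try:
            with open(gz_path, "wb") as f:
                f.write(data)
            return gz_path
        except OSError as error:  # for example "No space left on device"
            # Remove any partial file so it does not waste space.
            with contextlib.suppress(OSError):
                os.remove(gz_path)
            gz_path = ask_destination(gz_path, error)
```

The compressed data stays in memory throughout, so nothing is lost while a new destination is chosen.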

Files avoided

cmemgzip checks that the files to compress are at least 100 bytes in size, and cancels the process if the compressed version is bigger than the original file (typically when you attempt to compress an already compressed file).

It also avoids deleting the original file if the compressed version is equal to or bigger than the original in size.

It also skips files whose names end in .gz .gzip .zip .bzip .bzip2 .rar .xz
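These rules can be expressed as two small checks. The sketch below is an illustration only: the names are hypothetical, and matching the suffixes case-sensitively is an assumption.

```python
import os

# Suffixes and minimum size taken from the text above.
SKIPPED_SUFFIXES = (".gz", ".gzip", ".zip", ".bzip", ".bzip2", ".rar", ".xz")
MIN_SIZE_BYTES = 100


def should_skip(path):
    """Return True for files that would not be compressed at all."""
    if path.endswith(SKIPPED_SUFFIXES):
        return True
    return os.path.getsize(path) < MIN_SIZE_BYTES


def worth_replacing(original_size, compressed_size):
    """Only replace the original when the .gz is strictly smaller."""
    return compressed_size < original_size
```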

Installation

Install from PIP for Python 3:

pip3 install cmemgzip

Here is the page for the PIP package: https://pypi.org/project/cmemgzip/

If you don't have pip on your system, you can install it on Ubuntu servers with:

apt install python3-pip

Cloning from the repository:

git clone https://gitlab.com/carles.mateo/cmemgzip.git

Release notes:

Version 0.4.1 has been tested with Ubuntu and Windows 10 64-bit. The previous version, 0.4, was tested with Ubuntu, Windows 10 Professional, Mac OS X, and Ubuntu 20.04 LTS on a Raspberry Pi 4.

Version 0.4.1 autodetects Windows and disables colors.

Be careful not to use cmemgzip on log files of programs that keep a file descriptor (fd) open to them, as deleting the original log file will not return the space to the filesystem. This is typically the case with some web servers. You should stop the web server first, or deal with the fds.
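On Linux, one way to check beforehand whether a process still holds a file open is to scan /proc. The sketch below is not part of cmemgzip; pids_holding_open is a hypothetical helper, it relies on the Linux /proc layout, and without root it only sees processes you are allowed to inspect.

```python
import os


def pids_holding_open(path, proc_root="/proc"):
    """Return the PIDs that have `path` open (Linux only, best effort)."""
    target = os.path.realpath(path)
    holders = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                link = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed in the meantime
            if link == target:
                holders.append(int(entry))
                break
    return holders
```

If the returned list is not empty, stopping or reloading those processes first avoids deleting a file whose space would not be released.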
