A Python package for text compression using Huffman coding.

iCompress-The-Text-File-Compressor

iCompress is a Python-based file compression tool that uses the Huffman coding algorithm to compress and decompress files. It reduces the size of text files by encoding their contents in a more compact binary format, and it can restore compressed files to their original form.


Features:

  1. File Compression: Compresses text files using the Huffman coding algorithm.
  2. File Decompression: Decompresses previously compressed files to restore the original content.
  3. Efficient Encoding: Uses a frequency-based Huffman tree to generate an optimal prefix code for the bytes in a file.
  4. Cross-Platform: Written in Python, making it compatible with any system that supports Python 3.

Requirements:

To use this tool, you need to have the following installed on your system:

  • Python 3: Ensure Python 3 is installed. You can download it from python.org.

Project Structure:

The project is organized as follows:

iCompress/
    __init__.py
    compress.py          # Main script for compressing files
    decompress.py        # Main script for decompressing files
    services/
        compression_core/
            encoding_and_decoding.py  # Handles encoding and decoding logic
            frequency_tree.py         # Constructs the Huffman tree
            frequency.py              # Calculates frequency of bytes
            padding.py                # Handles padding for binary strings
            prepare_dictionary.py     # Prepares the Huffman dictionary
        file_service/
            file_metadata.py          # Manages file metadata
            file_write.py             # Handles writing to files
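
Padding is needed because a Huffman-encoded bit string is rarely a whole number of bytes long. The following is a minimal sketch of one common scheme, not the actual contents of padding.py: the function names (pad_bits, unpad_bits, bits_to_bytes) and the one-byte header that records the padding length are assumptions made for illustration.

```python
def pad_bits(bits: str) -> str:
    """Pad a bit string to a whole number of bytes.

    Assumed layout: the first 8 bits record how many zero bits (0-7)
    were appended at the end.
    """
    extra = (8 - len(bits) % 8) % 8
    return f"{extra:08b}" + bits + "0" * extra


def unpad_bits(padded: str) -> str:
    """Reverse pad_bits: read the header, then drop the trailing zeros."""
    extra = int(padded[:8], 2)
    body = padded[8:]
    return body[:-extra] if extra else body


def bits_to_bytes(padded: str) -> bytes:
    """Pack a padded bit string (length divisible by 8) into bytes."""
    return int(padded, 2).to_bytes(len(padded) // 8, "big")


padded = pad_bits("10110")          # 5 bits, so 3 padding bits are added
packed = bits_to_bytes(padded)      # header byte followed by one data byte
assert unpad_bits(padded) == "10110"
```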

How It Works:

Compression:

  1. The compress.py script reads the input file and calculates the frequency of each byte in the file.
  2. A Huffman tree is constructed based on the frequency of each byte. Bytes with higher frequencies are assigned shorter binary codes.
  3. Using the Huffman tree, each byte in the input file is encoded into a binary string of variable length.
  4. The encoded binary strings, along with the Huffman tree, are written to the output file. The tree is necessary for decompression.
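
The compression steps above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the code in compress.py: the function names build_codes and encode, and the representation of the tree as nested tuples, are assumptions.

```python
import heapq
from collections import Counter
from itertools import count


def build_codes(data: bytes) -> dict[int, str]:
    """Return a {byte value: bit string} Huffman code table for data."""
    freq = Counter(data)  # step 1: frequency of each byte
    if not freq:
        return {}
    if len(freq) == 1:
        # A file with one distinct byte still needs a one-bit code.
        return {next(iter(freq)): "0"}

    tie = count()  # tie-breaker so heapq never has to compare tree nodes
    # Leaves are ints (byte values); internal nodes are (left, right) tuples.
    heap = [(n, next(tie), byte) for byte, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:  # step 2: repeatedly merge the two rarest nodes
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, next(tie), (left, right)))

    codes: dict[int, str] = {}

    def walk(node, prefix: str) -> None:
        if isinstance(node, int):
            codes[node] = prefix
        else:
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")

    walk(heap[0][2], "")
    return codes


def encode(data: bytes, codes: dict[int, str]) -> str:
    """Step 3: replace every byte with its variable-length code."""
    return "".join(codes[b] for b in data)


data = b"abracadabra"
codes = build_codes(data)
bits = encode(data, codes)
print(len(bits), "bits instead of", 8 * len(data))
```

The most frequent byte here, "a", receives a one-bit code, which is where the saving comes from. Writing the bits and the tree to disk (step 4) is left out of the sketch.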

Decompression:

  1. The decompress.py script reads the compressed file and reconstructs the Huffman tree from the stored metadata.
  2. Using the reconstructed tree, the binary strings in the compressed file are decoded back into their original bytes.
  3. The original bytes are written to the output file, restoring the original content.
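
Decoding works because Huffman codes are prefix-free: no code is the beginning of another, so bits can be consumed left to right without separators. The sketch below assumes the code table has already been rebuilt from the stored metadata; the decode function and the sample table are hypothetical and are not taken from decompress.py.

```python
def decode(bits: str, codes: dict[int, str]) -> bytes:
    """Decode a bit string using a prefix-free {byte value: bit string} table."""
    lookup = {code: byte for byte, code in codes.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in lookup:  # a complete code has been read
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return bytes(out)


# Hypothetical code table, used only for illustration.
codes = {ord("a"): "0", ord("b"): "10", ord("c"): "11"}
restored = decode("0100110", codes)
print(restored)
```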

Usage:

Compressing a File:

To compress a file, navigate to the project directory in your terminal and run the following command:

python compress.py <input_file_path> <binary_file_path>
  • <input_file_path>: Path to the file you want to compress.
  • <binary_file_path>: Path where the compressed binary file will be saved.

Example:

python compress.py example.txt output.bin

This will compress example.txt and save the compressed output as output.bin.


Decompressing a File:

To decompress a file, run the following command. Note that the output path comes first, followed by the compressed file:

python decompress.py <output_file_path> <binary_file_path>
  • <output_file_path>: Path where the decompressed file will be saved.
  • <binary_file_path>: Path to the compressed binary file.

Example:

python decompress.py decompressed_example.txt output.bin

This will decompress output.bin and save the decompressed content as decompressed_example.txt.


Example Workflow:

  1. Compress a File:

    • Input: example.txt
    • Command: python compress.py example.txt output.bin
    • Output: output.bin
  2. Decompress the File:

    • Input: output.bin
    • Command: python decompress.py decompressed_example.txt output.bin
    • Output: decompressed_example.txt
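
Because Huffman coding is lossless, the decompressed file should match the original byte for byte. One way to check this after running the workflow is a small helper built on the standard filecmp module; the name is_lossless is an assumption for illustration and is not part of iCompress.

```python
import filecmp


def is_lossless(original_path: str, restored_path: str) -> bool:
    """Return True if both files have identical contents."""
    # shallow=False forces a byte-by-byte comparison instead of
    # trusting file size and modification time.
    return filecmp.cmp(original_path, restored_path, shallow=False)


# Usage after the workflow above:
#   is_lossless("example.txt", "decompressed_example.txt")
```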

Limitations:

  • This tool is designed for text files. Binary files whose byte values are close to evenly distributed (for example, already-compressed formats) may not yield significant size reductions.
  • The compressed file includes metadata (the Huffman tree), so for very small inputs the compressed file can be larger than the original.
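
To estimate in advance whether a file is likely to benefit, you can compute the Shannon entropy of its bytes. Entropy is a lower bound on the average code length of any symbol-by-symbol code, and a Huffman code is always within one bit per symbol of it. The helper below is a sketch for illustration; entropy_bits_per_byte is a hypothetical name and is not part of iCompress.

```python
import math
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(
        (n / total) * math.log2(total / n) for n in Counter(data).values()
    )


text = b"abracadabra"
uniform = bytes(range(256))  # every byte value exactly once

# Values near 8 mean there is little for Huffman coding to exploit;
# lower values suggest a larger potential saving.
print(entropy_bits_per_byte(text))
print(entropy_bits_per_byte(uniform))
```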

Future Enhancements:

  • Add support for compressing binary files.
  • Implement multi-threading for faster compression and decompression.
  • Provide a graphical user interface (GUI) for ease of use.

License:

This project is distributed under the MIT License. See the LICENSE file for more details.


Acknowledgments:

This project uses the Huffman coding algorithm, a widely used method for lossless data compression. Special thanks to the Python community for providing excellent libraries and resources.


Contact:

For any questions or feedback, please contact the project maintainer at rahulrathod315@example.com.


Download files

Source Distribution

icompress-0.1.2.tar.gz (8.7 kB view details)

Uploaded Source

Built Distribution

icompress-0.1.2-py3-none-any.whl (9.9 kB view details)

Uploaded Python 3

File details

Details for the file icompress-0.1.2.tar.gz.

File metadata

  • Download URL: icompress-0.1.2.tar.gz
  • Upload date:
  • Size: 8.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for icompress-0.1.2.tar.gz
Algorithm    Hash digest
SHA256       57dc0bfbdef3797a3fbd136c1ca8feb3663a9191b0f3f037d0a6a1925d6951a0
MD5          f30d66d28a47a5a88558530ccfa583ce
BLAKE2b-256  48a4826c6eec6cadd68bc3c1e897992f3520a555fef2f2e2f9e54ce4956ccf73
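
A downloaded archive can be checked against the SHA256 digest listed above with the standard hashlib module. This is a minimal sketch: the helper name sha256_of is an assumption, and the file path in the usage comment assumes the archive was saved in the current directory.

```python
import hashlib

# SHA256 digest listed above for icompress-0.1.2.tar.gz.
EXPECTED = "57dc0bfbdef3797a3fbd136c1ca8feb3663a9191b0f3f037d0a6a1925d6951a0"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage, assuming the archive is in the current directory:
#   sha256_of("icompress-0.1.2.tar.gz") == EXPECTED
```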

File details

Details for the file icompress-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: icompress-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 9.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for icompress-0.1.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       eef51ff3e55670626e5b97cdbd7768ea73e011ada6af50852b4d9e89e7391d96
MD5          a2e5a8c5debc87a51a486a8d0cf962a3
BLAKE2b-256  65ce7b3a0736480e274f04a4c3379b53f1541891cf91015b476908275714b8b3
