A Python package for text compression using Huffman coding.
iCompress-The-Text-File-Compressor
iCompress is a Python-based file compression tool that uses the Huffman coding algorithm to compress and decompress files. This project is designed to reduce the size of text files by encoding their contents into a more compact binary format. It also provides the capability to decompress the compressed files back to their original form.
Features:
- File Compression: Compresses text files using the Huffman coding algorithm.
- File Decompression: Decompresses previously compressed files to restore the original content.
- Efficient Encoding: Builds a Huffman tree from byte frequencies to generate an optimal prefix code for the file contents.
- Cross-Platform: Written in Python, making it compatible with any system that supports Python 3.
Requirements:
To use this tool, you need to have the following installed on your system:
- Python 3: Ensure Python 3 is installed. You can download it from python.org.
Project Structure:
The project is organized as follows:
iCompress/
__init__.py
compress.py # Main script for compressing files
decompress.py # Main script for decompressing files
services/
compression_core/
encoding_and_decoding.py # Handles encoding and decoding logic
frequency_tree.py # Constructs the Huffman tree
frequency.py # Calculates frequency of bytes
padding.py # Handles padding for binary strings
prepare_dictionary.py # Prepares the Huffman dictionary
file_service/
file_metadata.py # Manages file metadata
file_write.py # Handles writing to files
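Padding is needed because a Huffman-encoded bit string is rarely a multiple of 8 bits long, while files are written in whole bytes. The following is a minimal sketch of one common scheme, not the actual contents of padding.py: it assumes the pad length is stored in a leading 8-bit header, and the function names `pad_bits`, `unpad_bits`, `bits_to_bytes`, and `bytes_to_bits` are hypothetical.

```python
def pad_bits(bits: str) -> str:
    """Pad a '0'/'1' string to a multiple of 8 bits.

    Assumed layout (hypothetical): an 8-bit header holding the number of
    pad bits, then the payload, then that many '0' pad bits.
    """
    extra = (8 - len(bits) % 8) % 8
    header = format(extra, "08b")
    return header + bits + "0" * extra


def unpad_bits(padded: str) -> str:
    """Reverse pad_bits: read the header and drop the trailing pad bits."""
    extra = int(padded[:8], 2)
    body = padded[8:]
    return body[:len(body) - extra] if extra else body


def bits_to_bytes(padded: str) -> bytes:
    """Pack a bit string whose length is a multiple of 8 into bytes."""
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def bytes_to_bits(data: bytes) -> str:
    """Expand bytes back into a '0'/'1' string, 8 bits per byte."""
    return "".join(format(b, "08b") for b in data)
```

With this layout, the 5-bit payload "10110" becomes two bytes: one header byte recording 3 pad bits, and one byte holding the payload followed by the padding.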
How It Works:
Compression:
- The compress.py script reads the input file and calculates the frequency of each byte in the file.
- A Huffman tree is constructed based on the frequency of each byte. Bytes with higher frequencies are assigned shorter binary codes.
- Using the Huffman tree, each byte in the input file is encoded into a binary string of variable length.
- The encoded binary strings, along with the Huffman tree, are written to the output file. The tree is necessary for decompression.
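The compression steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code in compress.py: the function names `build_codes` and `encode` are hypothetical, and the tree is represented implicitly by merging partial code tables on a heap rather than by explicit node objects.

```python
import heapq
from collections import Counter


def build_codes(data: bytes) -> dict[int, str]:
    """Map each byte value in `data` to a Huffman code made of '0'/'1'."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct byte still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie breaker, {byte: code so far}).
    # The tie breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {b: ""}) for i, (b, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {b: "0" + c for b, c in left.items()}
        merged.update({b: "1" + c for b, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(data: bytes, codes: dict[int, str]) -> str:
    """Concatenate the variable-length code of every input byte."""
    return "".join(codes[b] for b in data)
```

For the input `b"aaaabbc"`, the most frequent byte `a` receives a 1-bit code while `b` and `c` receive 2-bit codes, so the 7 input bytes (56 bits) encode to 10 bits before padding and metadata are added.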
Decompression:
- The decompress.py script reads the compressed file and reconstructs the Huffman tree from the stored metadata.
- Using the reconstructed tree, the binary strings in the compressed file are decoded back into their original bytes.
- The original bytes are written to the output file, restoring the original content.
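Decoding can be sketched as a walk down the tree, one bit at a time, emitting a byte whenever a leaf is reached. The sketch below is an assumption-laden illustration rather than the logic in decompress.py: it assumes the stored metadata can be turned into a byte-to-code table, represents the tree as nested dicts, and uses the hypothetical names `build_tree` and `decode`.

```python
def build_tree(codes: dict[int, str]) -> dict:
    """Rebuild a tree from a {byte: code} table.

    Internal nodes are dicts keyed by '0'/'1'; leaves are int byte values.
    Every code is assumed to be at least one bit long.
    """
    root: dict = {}
    for byte, code in codes.items():
        node = root
        for bit in code[:-1]:
            node = node.setdefault(bit, {})
        node[code[-1]] = byte
    return root


def decode(bits: str, root: dict) -> bytes:
    """Walk the tree bit by bit and emit a byte at every leaf."""
    out = bytearray()
    node = root
    for bit in bits:
        node = node[bit]
        if isinstance(node, int):
            out.append(node)
            node = root  # restart at the root for the next symbol
    if node is not root:
        raise ValueError("bit stream ends in the middle of a code")
    return bytes(out)
```

Because Huffman codes are prefix-free, no code is the beginning of another, so the walk never needs to look ahead or backtrack. Any padding bits must be stripped before decoding; otherwise they would be read as extra symbols or trigger the error above.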
Usage:
Compressing a File:
To compress a file, navigate to the project directory in your terminal and run the following command:
python compress.py <input_file_path> <binary_file_path>
- <input_file_path>: Path to the file you want to compress.
- <binary_file_path>: Path where the compressed binary file will be saved.
Example:
python compress.py example.txt output.bin
This will compress example.txt and save the compressed output as output.bin.
Decompressing a File:
To decompress a file, run the following command:
python decompress.py <output_file_path> <binary_file_path>
- <output_file_path>: Path where the decompressed file will be saved.
- <binary_file_path>: Path to the compressed binary file.
Example:
python decompress.py decompressed_example.txt output.bin
This will decompress output.bin and save the decompressed content as decompressed_example.txt.
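After running both commands, it can be useful to confirm that the round trip was lossless and to see how much space was saved. The helper below is a small sketch using only the Python standard library; it is not part of iCompress, and the function names `round_trip_ok` and `compression_ratio` are hypothetical.

```python
import filecmp
import os


def round_trip_ok(original_path: str, restored_path: str) -> bool:
    """Return True if the two files have byte-for-byte identical contents."""
    # shallow=False forces a content comparison instead of comparing
    # only file size and modification time.
    return filecmp.cmp(original_path, restored_path, shallow=False)


def compression_ratio(original_path: str, compressed_path: str) -> float:
    """Return compressed size divided by original size (lower is better)."""
    original_size = os.path.getsize(original_path)
    if original_size == 0:
        raise ValueError("original file is empty")
    return os.path.getsize(compressed_path) / original_size
```

For the file names used in the examples above, `round_trip_ok("example.txt", "decompressed_example.txt")` should return True if decompression restored the original content, and `compression_ratio("example.txt", "output.bin")` reports the size of the compressed file relative to the input.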
Example Workflow:
- Compress a File:
  - Input: example.txt
  - Command: python compress.py example.txt output.bin
  - Output: output.bin
- Decompress the File:
  - Input: output.bin
  - Command: python decompress.py decompressed_example.txt output.bin
  - Output: decompressed_example.txt
Limitations:
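- This tool is designed for text files. Binary files, especially ones that are already compressed (such as images or archives), tend to have near-uniform byte frequencies, so Huffman coding may not reduce their size significantly.
- The compressed file includes metadata (the Huffman tree), so for very small inputs the output can be larger than the original file.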
Future Enhancements:
- Add support for compressing binary files.
- Implement multi-threading for faster compression and decompression.
- Provide a graphical user interface (GUI) for ease of use.
License:
This project is distributed under the MIT License. See the LICENSE file for more details.
Acknowledgments:
This project uses the Huffman coding algorithm, a widely used method for lossless data compression. Special thanks to the Python community for providing excellent libraries and resources.
Contact:
For any questions or feedback, please contact the project maintainer at [rahulrathod315@example.com].
File details
Details for the file icompress-0.1.2.tar.gz.
File metadata
- Download URL: icompress-0.1.2.tar.gz
- Upload date:
- Size: 8.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 57dc0bfbdef3797a3fbd136c1ca8feb3663a9191b0f3f037d0a6a1925d6951a0 |
| MD5 | f30d66d28a47a5a88558530ccfa583ce |
| BLAKE2b-256 | 48a4826c6eec6cadd68bc3c1e897992f3520a555fef2f2e2f9e54ce4956ccf73 |
File details
Details for the file icompress-0.1.2-py3-none-any.whl.
File metadata
- Download URL: icompress-0.1.2-py3-none-any.whl
- Upload date:
- Size: 9.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | eef51ff3e55670626e5b97cdbd7768ea73e011ada6af50852b4d9e89e7391d96 |
| MD5 | a2e5a8c5debc87a51a486a8d0cf962a3 |
| BLAKE2b-256 | 65ce7b3a0736480e274f04a4c3379b53f1541891cf91015b476908275714b8b3 |