hamming-check: File integrity checker
hamming-check
A command-line tool and Python library to encode and decode data using a Hamming code algorithm with a configurable block size (in bytes).
Hamming Code
Hamming codes are a family of error-correcting codes that can detect and correct errors introduced when data is stored or transmitted from a sender to a receiver. The technique was developed by R. W. Hamming.
You can find more about it in its Wikipedia article, the MSU notes, and the awesome videos by 3Blue1Brown: Hamming pt1 and Hamming pt2.
Installing
Locally
Clone the repo.
git clone git@github.com:Tomcat-42/hamming_check.git
Run setup.py
sudo python setup.py install
Using pip
hamming_check is available on PyPI.
sudo pip install hamming_check
Command Line Interface
Description
hamming_check is a CLI tool for creating a secure copy of a file as a Hamming-encoded output file, and for repairing single-bit corruptions in that secure file. It can also detect double-bit corruptions, but it cannot fix them.
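The "correct one flipped bit, detect two" behaviour described above is what a textbook extended Hamming code (often called SECDED) provides. The sketch below illustrates that general technique on a list of bits; it is not the hamming_check implementation, and the names `secded_encode` and `secded_decode` and the status strings are hypothetical, chosen only for this example.

```python
from typing import List, Tuple


def secded_encode(data_bits: List[int]) -> List[int]:
    """Encode bits into an extended Hamming codeword (index 0 = overall parity)."""
    m = len(data_bits)
    r = 0
    while (1 << r) < m + r + 1:  # smallest r with 2**r >= m + r + 1
        r += 1
    n = m + r
    code = [0] * (n + 1)
    data = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):  # not a power of two, so this is a data position
            code[pos] = next(data)
    # XOR together the positions of all set data bits ...
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos
    # ... and choose the parity bits (power-of-two positions) so the XOR becomes 0.
    for i in range(r):
        code[1 << i] = (syndrome >> i) & 1
    code[0] = sum(code[1:]) % 2  # overall parity: total number of ones is even
    return code


def secded_decode(code: List[int]) -> Tuple[List[int], str]:
    """Return (data_bits, status), fixing a single flipped bit when possible."""
    code = list(code)
    syndrome = 0
    for pos in range(1, len(code)):
        if code[pos]:
            syndrome ^= pos
    odd_parity = sum(code) % 2 == 1
    if syndrome == 0 and not odd_parity:
        status = "no_error"
    elif odd_parity and syndrome < len(code):
        # One flip: the syndrome is its position (0 = the overall parity bit).
        code[syndrome] ^= 1
        status = "single_error_corrected"
    else:
        # Two (or more) flips: detectable, but not correctable.
        status = "double_error_detected"
    data = [code[pos] for pos in range(1, len(code)) if pos & (pos - 1)]
    return data, status


word = secded_encode([1, 0, 1, 1, 0, 0, 1, 0])
word[5] ^= 1  # simulate a single-bit corruption
print(secded_decode(word)[1])  # single_error_corrected
```

With two flipped bits the overall parity stays even while the syndrome is non-zero, which is why the decoder can report the corruption but cannot tell which bits to repair.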
Usage
usage: hamming_check [-h] (-e | -d) [-v] [-b BUFFER_SIZE]
                     [input_file] [output_file]

positional arguments:
  input_file            file used for reading data. If not specified,
                        data is read from stdin.
  output_file           file used for writing data. If not specified,
                        data is written to stdout.

options:
  -h, --help            show this help message and exit
  -e, --encode          encode a file into a hamming-encoded file
  -d, --decode          decode a hamming-encoded file into a file
  -v, --verbose         increase output verbosity (can be used
                        multiple times)
  -b BUFFER_SIZE, --buffer-size BUFFER_SIZE
                        change the buffer size (in bytes) used for
                        encoding/decoding
- input_file: original file that will be secure copied or a secure file that will be recovered. If not provided, data will be read from STDIN.
- output_file: secure file that will be created from a file or a file that will be recovered from a secure file. If not provided, data will be written to STDOUT.
- -e|--encode: Sets the encoding operation. File -> Secure File.
- -d|--decode: Sets the decoding operation. Secure File -> File with error checking/correction.
- -b|--buffer-size: Sets the number of bytes used for each Hamming code block; the default is 1. Higher values tend to speed up encoding.
- -v: Sets the verbosity. If not provided, the tool runs in quiet mode; with -v, only errors are printed; -vv prints the result of the encoding/decoding operations; and -vvv prints all of the Hamming algorithm steps.
- -h: Prints the help text.
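To get a feel for how the buffer size affects overhead, the sketch below computes the textbook parity-bit count for an extended Hamming code: the smallest r with 2^r >= m + r + 1 for m data bits, plus one overall parity bit. It is only an estimate under that assumption; the helper name `estimate_encoded_size` is hypothetical, and the value the library actually uses is the one returned by `get_number_of_output_bytes()`.

```python
from typing import Tuple


def estimate_encoded_size(buffer_size: int) -> Tuple[int, int]:
    """Return (parity_bits, encoded_bytes) for one block of `buffer_size` bytes.

    Assumes a textbook extended Hamming code; the real tool may lay out
    its blocks differently.
    """
    data_bits = buffer_size * 8
    parity_bits = 0
    while (1 << parity_bits) < data_bits + parity_bits + 1:
        parity_bits += 1
    total_bits = data_bits + parity_bits + 1  # +1 for the overall parity bit
    encoded_bytes = (total_bits + 7) // 8     # round up to whole bytes
    return parity_bits, encoded_bytes


for size in (1, 2, 64, 4096):
    parity, encoded = estimate_encoded_size(size)
    print(f"{size} data bytes: {parity} parity bits, about {encoded} encoded bytes")
```

Larger blocks carry proportionally less overhead, but each block can still recover from only one flipped bit, so a bigger buffer also means less protection per byte.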
Examples
- Encode the file cat.jpg into the secure file cat.jpg.wham
hamming_check -e cat.jpg cat.jpg.wham
- Decode the secure file cat.jpg.wham into the file cat.jpg
hamming_check -d cat.jpg.wham cat.jpg
- Encode the file cat.jpg into the secure file cat.jpg.wham using a 4096-byte Hamming code
hamming_check -e -b 4096 cat.jpg cat.jpg.wham
- Decode the secure file cat.jpg.wham into the file cat.jpg using a 4096-byte Hamming code
hamming_check -d -b 4096 cat.jpg.wham cat.jpg
- Encode the string "test" into the secure file file.txt.wham
echo -n "test" | hamming_check -e file.txt.wham
- Encode the string "test" and print the encoded result to STDOUT
echo -n "test" | hamming_check -e
- Decode the encoded string and print the decoded result to STDOUT
echo -n <STR> | hamming_check -d
- Decode the encoded string and save the result to file.txt
echo -n <STR> | hamming_check -d file.txt
- Decode the file.txt.wham and print the results to STDOUT
hamming_check -d file.txt.wham
hamming_check library
Description
hamming_check is a library for encoding and decoding binary data using the Hamming code.
Usage
Hamming Module
Encodes and decodes data using a Hamming code with a given buffer_size in bytes.
from hamming_check import Hamming, DecodeStatus, DecodeResult, VerbosityTypes
...
hamming = Hamming(buffer_size=1, verbose=VerbosityTypes.QUIET)
size_of_encoded_data = hamming.get_number_of_output_bytes()
encoded_data = hamming.encode(b't')
...
decoded_result = hamming.decode(encoded_data)
decoded_data, decoded_status = decoded_result.get_data(), decoded_result.get_status()
io Module
Abstractions over files and bytes. The Bytes class inherits from bitarray, and the File class is just a wrapper around the Python file interface.
from hamming_check import Hamming, DecodeStatus, DecodeResult, VerbosityTypes, File, Bytes
...
hamming = Hamming(buffer_size=2, verbose=VerbosityTypes.QUIET)
input_file = File(open("input_file.txt", "rb"), bytes_per_read=2)
output_file = File(open("output_file.txt", "wb"))
# read each chunk, encode it, flip its first bit, then write it out
for data in input_file:
    encoded_data = hamming.encode(data)
    bits = Bytes(encoded_data)  # named `bits` to avoid shadowing the built-in `bytes`
    bits[0] ^= 1
    output_file.write(bits.tobytes())
input_file.close()
output_file.close()
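If you want to simulate the same kind of corruption without the Bytes class, a standard-library sketch is shown below. The helper `flip_bit` is hypothetical and not part of hamming_check; it assumes most-significant-bit-first numbering, which is bitarray's default ("big") bit order, so bit 0 is the top bit of the first byte.

```python
def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with one bit inverted (bit 0 = MSB of byte 0)."""
    if not 0 <= bit_index < len(data) * 8:
        raise IndexError("bit index out of range")
    corrupted = bytearray(data)
    byte_index, offset = divmod(bit_index, 8)
    corrupted[byte_index] ^= 0x80 >> offset  # mask selecting the chosen bit
    return bytes(corrupted)


original = b"\x00\x00"
damaged = flip_bit(original, 0)
print(damaged.hex())  # 8000
```

Flipping the same bit twice restores the original data, which makes the helper convenient for round-trip tests.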
Example
Send an encoded file over the network and check it for corruption.
Client Code
- client.py: Reads an image 4096 bytes at a time, encodes each chunk, adds random noise to the encoded data, and sends it over the network.
#!/usr/bin/env python
import socket
from argparse import ArgumentParser
from random import randint, random

from hamming_check.hamming import Hamming


def main():
    # parse the command-line arguments
    parser = ArgumentParser()
    parser.add_argument("-p", "--port", type=int, default=8080)
    parser.add_argument("-f", "--file", type=str)
    parser.add_argument("-b", "--bytes", type=int, default=4096)
    parser.add_argument("-d", "--double-noise", action="store_true")
    args = parser.parse_args()

    # open the socket connection and the file
    s = socket.socket()
    s.connect(("localhost", args.port))
    filetosend = open(args.file, "rb")

    # Hamming encoder for chunks of args.bytes bytes
    hamming = Hamming(args.bytes)
    bytes_to_send = hamming.get_number_of_output_bytes()

    # send the encoded chunks
    while data := filetosend.read(args.bytes):
        encoded_data = bytearray(hamming.encode(data))
        # 30% chance of sending the data with noise
        if random() < 0.3:
            print("Sending data with noise")
            # randint is inclusive on both ends, so stop at the last valid index
            encoded_data[randint(0, bytes_to_send - 1)] ^= 1 << randint(0, 7)
            # if enabled, 50% chance of flipping a second bit
            if args.double_noise and random() > 0.5:
                print("Sending data with double noise")
                encoded_data[randint(0, bytes_to_send - 1)] ^= 1 << randint(0, 7)
        # sendall keeps sending until the whole chunk has been written
        s.sendall(encoded_data)

    filetosend.close()
    s.sendall(b"DONE")
    print("Done Sending.")
    s.shutdown(2)
    s.close()
    exit(0)


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\nExiting...")
Server Code
- server.py: Receives encoded data over the network, decodes it, tries to recover noisy data, and then saves it to an output file.
#!/usr/bin/env python
import socket
from argparse import ArgumentParser

from hamming_check.hamming import DecodeResult, DecodeStatus, Hamming
from hamming_check.types.verbosity_types import VerbosityTypes


def main() -> None:
    # parse the command-line arguments
    parser = ArgumentParser()
    parser.add_argument("-p", "--port", type=int, default=8080)
    parser.add_argument("-f", "--file", type=str)
    parser.add_argument("-b", "--bytes", type=int, default=4096)
    args = parser.parse_args()

    # open the listening socket and accept a single client
    s = socket.socket()
    s.bind(("localhost", args.port))
    s.listen(1)
    c, a = s.accept()
    filetodown = open(args.file, "wb")

    # Hamming decoder for chunks of args.bytes bytes
    hamming = Hamming(args.bytes, VerbosityTypes.QUIET)
    bytes_to_receive = hamming.get_number_of_output_bytes()

    while True:
        # wait for one full encoded chunk
        data = c.recv(bytes_to_receive, socket.MSG_WAITALL)
        if data == b"DONE" or len(data) == 0:
            print("Done Receiving.")
            break
        decoded_result = hamming.decode(data)
        # any status other than DecodeStatus.NO_ERROR or
        # DecodeStatus.SINGLE_ERROR_CORRECTED means the chunk is corrupted
        bytes_received, status = decoded_result.get_data(), decoded_result.get_status()
        if status == DecodeStatus.SINGLE_ERROR_CORRECTED:
            print("One error detected, and corrected")
        elif status == DecodeStatus.DOUBLE_ERROR_DETECTED:
            print("Two errors detected, your file is corrupted")
        filetodown.write(bytes_received)
        filetodown.flush()

    filetodown.close()
    c.shutdown(2)
    c.close()
    s.close()


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\nBye!")
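The server depends on every recv call returning one complete encoded chunk, which it requests with socket.MSG_WAITALL. A loop-based alternative that does not rely on that flag could look like the sketch below; `recv_exact` is a hypothetical helper, not part of hamming_check or the example scripts.

```python
import socket


def recv_exact(conn: socket.socket, size: int) -> bytes:
    """Read `size` bytes, returning fewer only if the peer closes the connection."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = conn.recv(remaining)
        if not chunk:  # empty read: the peer closed the connection
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Local demonstration using a connected pair of sockets.
sender, receiver = socket.socketpair()
sender.sendall(b"abcdefghij")
sender.close()
print(recv_exact(receiver, 4))  # b'abcd'
print(recv_exact(receiver, 4))  # b'efgh'
print(recv_exact(receiver, 4))  # b'ij' (short read: the sender closed)
receiver.close()
```

A short final read, as in the last call, is also how a receiver can notice that the stream ended in the middle of a chunk.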
Putting it all together
Run server code
./examples/send_over_network/server.py -f out.jpg
Run client code
./examples/send_over_network/client.py -f ./examples/send_over_network/really_cool_cat.jpg
Check out.jpg
Even though noise was added to the data, the server was able to recover the image.