A Python library for constructing Merkle Trees and validating Proofs
Project description
pymerkle: A Python library for constructing Merkle Trees and validating Proofs
Complete documentation can be found at pymerkle.readthedocs.org
Pymerkle provides a class for binary balanced Merkle-trees (with possibly odd number of leaves) capable of generating audit-proofs, consistency-proofs and inclusion-tests. It supports all hash functions (including SHA3 variations) and encoding types, whereas defense against second-preimage attack is by default activated. It further provides flexible mechanisms for validating the generated proofs and thus easy verification of encrypted data.
It is a zero dependency library (with the inessential exception of tqdm
for displaying progress bars).
Installation
pip3 install pymerkle --pre
Quick example
from pymerkle import MerkleTree, validateProof, validationReceipt
tree = MerkleTree() # Empty SHA256/UTF-8 Merkle-tree with
# defense against second-preimage attack
for _ in range(665): # Update the tree with 666 records
tree.encryptRecord(bytes('%d-th record' % _, 'utf-8'))
_audit = tree.auditProof('12-th record') # Request audit-proof for the given record
validateProof(target=tree.rootHash(), proof=_audit) # Quick validation of the above proof (True)
# Store the tree's current state for later use
oldhash = tree.rootHash()
sublength = tree.length()
# Further encryption of files and objects
tree.encryptObject({'a': 0, 'b': 1}) # One new leaf storing the provided
# object's digest
tree.encryptFileContent('../path/to/file') # One new leaf storing the digest
# of the provided file's content
tree.encryptFilePerLog('../logs/sample_log') # Many new leaves (one for each
# line of the provided file)
_consistency = tree.consistencyProof(oldhash, sublength) # Request consistency-proof for the
# stored state of the Merkle-tree
_receipt = validationReceipt(
target=tree.rootHash(),
proof=_consistency) # Validate proof with receipt
Tree structure
Contrary to most implementations, the Merkle-tree is here always binary balanced, with all nodes except for the exterior ones (leaves) having two parents. This is achieved as follows: upon appending a block of new leaves, instead of promoting a lonely leaf to the next level or duplicating it, a bifurcation node gets created so that trees with the same number of leaves have always identical structure and input clashes among growing strategies be avoided. This standardization is further crucial for:
- fast generation of consistency-proof paths (based on additive decompositions in decreasing powers of 2)
- fast recalculation of the root-hash after appending a new leaf, since only the hashes at the tree's left-most branch need be recalculated
- memory efficiency, since the height as well as total number of nodes with respect to the tree's length is controlled to the minimum. For example, a tree with 9 leaves has 17 nodes in the present implementation, whereas the total number of nodes in the structure described here is 20.
This topology turns out to be identical with that of a binary Sekura tree, depicted in Section 5.4 of
this paper. Follow the straightforward algorithm of the
MerkleTree.update()
method for further insight.
Encryption
Direct encryption of text (string, bytes, bytearray), JSON (dict) and files is supported.
Use accordingly any of the following methods of the MerkleTree
class (all of them internally invoking
the .update()
method for appending newly-created leaves): .encryptRecord()
, .encryptFileConent()
,
.encryptFilePerLog()
, .encryptObject()
, .encryptObjectFromFile()
, .encryptFilePerObject()
See API or Usage for details about arguments and precise functionality.
Proof validation
Direct validation of a Merkle-proof is performed using the validateProof()
function, which modifies the status
of the provided proof appropriately and returns the corresponding boolean. A more elaborate validation
procedure includes generating a receipt with the validation result and storing at will the generated receipt
as a .json
file. This is achieved using the validationReceipt()
function like in the above quick example.
See API or Usage for details about arguments and precise functionality.
Security
Defense against second-preimage attack
Defense against second-preimage attack is by default activated. Roughly speaking, it consists in the following security measures:
-
Before calculating the hash of a leaf, prepend the corresponding record with the null hexadecimal
0x00
-
Before calculating the hash of any interior node, prepend both of its parents' hashes with the unit hexadecimal
0x01
Read the
tests/test_defense.py
file
inside the project's repository to see how to perform second-preimage attacks against the current implementation. In order to disable
defense against second-preimage attack (say, for testing purposes), set security
equal to False
at construction:
tree = MerkleTree(security=False)
Deviation from bitcoin specification
In contrast to the bitcoin specification for Merkle-trees, lonely leaves are not duplicated in order for the tree to remain genuinely binary. Instead, creating bifurcation nodes at the rightmost branch allows the tree to remain both binary and balanced upon any update (see Tree structure above). As a consequence, even if strict security mode were deactivated (see above), the current implementation is structurally invulnerable to denial-of-service attacks exploiting the vilnerability described here (reported as CVE-2012-2459).
Persistence
On-disc persistence is not currently supported.
Given an instance of the MekleTree
class, the minimum required information can be exported with the
.export()
method into a .json
file; invoking the .loadFromFile()
static method reloads the
Merkle-tree from that file in its previously stored state. This can be useful for transmitting the tree's
current state to a trusted party or retrieving the tree from a backup file. Reconstruction of the tree
is uniquely determined by the sequence of stored hashes (see the section Tree structure to understand why).
See API or Usage for details about arguments and precise functionality.
Running tests
You need to have installed pytest
. From inside the project's root directory type
pytest tests/
to run all tests. This might take up to 15-20 minutes, since crypto parts of the code are tested against all possible
combinations of hash algorithm, encoding type and security mode (1620 combinations in total). Use the syntax and flag
arguments of the pytest
command to run only specific tests or have useful info about the tests printed.
Benchmarks
From inside the project's root directory, type
python benchmarks/ -r
to run the script within the benchmarks/
directory. To see further options about configuring your benchmarks, type
python benchmarks/ -h
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for pymerkle-4.0.0b3-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 48a63a74bbabcb797ecf843ec9771aa6ec4cf512742db6e52fd8f8484aa9fa95 |
|
MD5 | 12ca7e2893f73180f507cbbdc0677540 |
|
BLAKE2b-256 | a6ee9b2aa2dd7cd3970cb1acff48e190fd0d67b081c03ace4ed3762c4d51b874 |