Python implementation of multiformats protocols.

multiformats: A Python implementation of multiformat protocols

This is a fully compliant Python implementation of the multiformat protocols.

You can install the latest release from PyPI as follows:

pip install --upgrade multiformats



The varint module implements the unsigned-varint spec. Functionality is provided by the encode and decode functions, converting between non-negative int values and the corresponding varint bytes:

>>> from multiformats import varint
>>> varint.encode(128)
b'\x80\x01'
>>> varint.decode(b'\x80\x01')
128
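
To make the byte layout concrete, here is a minimal standard-library sketch of unsigned-varint encoding and decoding. It is not the package's implementation: the names `encode_varint` and `decode_varint` are hypothetical, and the spec's extra rules (maximum length, minimal encoding) are left out for brevity.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as an unsigned varint."""
    if value < 0:
        raise ValueError("unsigned varints cannot represent negative integers")
    out = bytearray()
    while True:
        group = value & 0x7F          # lowest 7 bits go out first
        value >>= 7
        if value:
            out.append(group | 0x80)  # high bit set: more bytes follow
        else:
            out.append(group)         # high bit clear: this is the last byte
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a bytestring that holds exactly one unsigned varint."""
    value = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            if index != len(data) - 1:
                raise ValueError("unexpected bytes after the end of the varint")
            return value
    raise ValueError("varint is empty or truncated")
```

With these helpers, `encode_varint(128)` gives `b'\x80\x01'` and `decode_varint(b'\x80\x01')` gives `128`, matching the session above.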

For advanced usage, see the API documentation.


The multicodec module implements the multicodec spec. The Multicodec class provides a container for multicodec data:

>>> from multiformats import multicodec
>>> from multiformats.multicodec import Multicodec
>>> Multicodec("identity", "multihash", 0x00, "permanent", "raw binary")
Multicodec(name='identity', tag='multihash', code=0,
           status='permanent', description='raw binary')

Core functionality is provided by the get, exists, wrap and unwrap functions. The get and exists functions can be used to check whether a multicodec with given name or code is known, and if so to get the corresponding object:

>>> multicodec.exists("identity")
True
>>> multicodec.exists(code=0x01)
True
>>> multicodec.get("identity")
Multicodec(name='identity', tag='multihash', code=0,
           status='permanent', description='raw binary')
>>> multicodec.get(code=0x01)
Multicodec(name='cidv1', tag='cid', code=1,
           status='permanent', description='CIDv1')

The wrap and unwrap functions can be used to wrap raw binary data into multicodec data (prepending the varint-encoded multicodec code) and to unwrap multicodec data into a pair consisting of the multicodec and the raw binary data:

>>> raw_data = bytes([192, 168, 0, 254])
>>> multicodec_data = multicodec.wrap("ip4", raw_data)
>>> raw_data.hex()
'c0a800fe'
>>> multicodec_data.hex()
'04c0a800fe'
>>> varint.encode(0x04).hex()
'04'  # 0x04 is the multicodec code for 'ip4', the first byte of multicodec_data
>>> codec, raw_data = multicodec.unwrap(multicodec_data)
>>> raw_data.hex()
'c0a800fe'
>>> codec
Multicodec(name='ip4', tag='multiaddr', code='0x04', status='permanent', description='')
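
The wrapping step itself is small enough to sketch with the standard library only. The functions below are hypothetical stand-ins that take a numeric code instead of a multicodec name, and they skip the lookup in the multicodec table that the package performs.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Unsigned varint: 7 bits per byte, least significant group first."""
    out = bytearray()
    while True:
        group, value = value & 0x7F, value >> 7
        out.append(group | 0x80 if value else group)
        if not value:
            return bytes(out)


def read_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, offset of the first byte after the varint)."""
    value = 0
    shift = 0
    while offset < len(data):
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7
    raise ValueError("truncated varint")


def wrap(code: int, raw_data: bytes) -> bytes:
    """Prepend the varint-encoded multicodec code to the raw data."""
    return encode_varint(code) + raw_data


def unwrap(multicodec_data: bytes) -> Tuple[int, bytes]:
    """Split multicodec data into (code, raw data)."""
    code, start = read_varint(multicodec_data)
    return code, multicodec_data[start:]
```

For example, `wrap(0x04, bytes([192, 168, 0, 254])).hex()` is `'04c0a800fe'`, and `unwrap` on those bytes returns the code `4` together with the original four bytes.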

The Multicodec.wrap and Multicodec.unwrap methods perform analogous functionality with an object-oriented API, additionally enforcing that the unwrapped code is actually the code of the multicodec being used:

>>> ip4 = multicodec.get("ip4")
>>> ip4
Multicodec(name='ip4', tag='multiaddr', code='0x04', status='permanent', description='')
>>> raw_data = bytes([192, 168, 0, 254])
>>> multicodec_data = ip4.wrap(raw_data)
>>> raw_data.hex()
'c0a800fe'
>>> multicodec_data.hex()
'04c0a800fe'
>>> varint.encode(0x04).hex()
'04'  # 0x04 is the multicodec code for 'ip4', the first byte of multicodec_data
>>> ip4.unwrap(multicodec_data).hex()
'c0a800fe'
>>> ip4.unwrap(bytes.fromhex('00c0a800fe')) # 'identity' multicodec data
multiformats.multicodec.err.ValueError: Found code 0x00 when unwrapping data, expected code 0x04.

The table function can be used to iterate through known multicodecs, optionally restricting to one or more tags and/or statuses:

>>> len(list(multicodec.table())) # multicodec.table() returns an iterator
>>> selected = multicodec.table(tag=["cid", "ipld", "multiaddr"], status="permanent")
>>> [m.code for m in selected]
[1, 4, 6, 41, 53, 54, 55, 56, 81, 85, 112, 113, 114, 120,
 144, 145, 146, 147, 148, 149, 150, 151, 152, 176, 177,
 178, 192, 193, 290, 297, 400, 421, 460, 477, 478, 479, 512]

For advanced usage, see the API documentation.


The multibase module implements the multibase spec. The Multibase class provides a container for multibase data:

>>> from multiformats import multibase
>>> from multiformats.multibase import Multibase
>>> Multibase(name="base16", code="f",
...           status="default", description="hexadecimal")
Multibase(name='base16', code='f', status='default', description='hexadecimal')

Core functionality is provided by the encode and decode functions, which can be used to encode a bytestring into a string using a chosen multibase encoding and to decode a string into a bytestring using the multibase encoding specified by its first character:

>>> multibase.encode(b"Hello World!", "base32")
'bjbswy3dpeblw64tmmqqq'
>>> multibase.decode('bjbswy3dpeblw64tmmqqq')
b'Hello World!'
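
As an illustration of what the leading character means, the following sketch reproduces the base32 case with Python's `base64` module. It covers only the `b` (base32, lower case, no padding) encoding, and the function names are hypothetical rather than part of the package.

```python
import base64


def encode_base32(data: bytes) -> str:
    """Multibase base32: prefix 'b', lower-case RFC 4648 alphabet, no padding."""
    body = base64.b32encode(data).decode("ascii").lower().rstrip("=")
    return "b" + body


def decode_base32(text: str) -> bytes:
    """Decode a multibase string whose first character selects base32."""
    if not text.startswith("b"):
        raise ValueError("expected the multibase code 'b' for base32")
    body = text[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that b32decode expects
    return base64.b32decode(body)
```

Here `encode_base32(b"Hello World!")` yields `'bjbswy3dpeblw64tmmqqq'`, the same string as in the session above.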

The multibase encoding specified by a given string is accessible using the from_str function:

>>> multibase.from_str('bjbswy3dpeblw64tmmqqq')
Multibase(encoding='base32', code='b',
          description='rfc4648 case-insensitive - no padding')

The exists and get functions can be used to check whether a multibase with given name or code is known, and if so to get the corresponding object:

>>> multibase.exists("base32")
True
>>> multibase.get("base32")
Multibase(encoding='base32', code='b',
          description='rfc4648 case-insensitive - no padding')
>>> multibase.exists(code="f")
True
>>> multibase.get(code="f")
Multibase(encoding="base16", code="f",
          status="default", description="hexadecimal")

For advanced usage, see the API documentation.


The multihash module implements the multihash spec.

Core functionality is provided by the digest, wrap and unwrap functions, or by the correspondingly named methods Multihash.digest, Multihash.wrap and Multihash.unwrap of the Multihash class. The digest function and the Multihash.digest method can be used to create a multihash digest directly from data:

>>> data = b"Hello world!"
>>> digest = multihash.digest(data, "sha2-256")
>>> digest.hex()
>>> sha2_256 = multihash.get("sha2-256")
>>> digest = sha2_256.digest(data)
>>> digest.hex()

By default, the full digest produced by the hash function is used. Optionally, a smaller digest size can be specified to produce truncated hashes:

>>> digest = multihash.digest(data, "sha2-256", size=20)
#        optional truncated hash size, in bytes ^^^^^^^
>>> digest.hex()
'1214c0535e4be2b79ffd93291305436bf889314e4a3f' # 20-bytes truncated hash
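
The structure of such a digest (hash function code, digest length, then the possibly truncated hash) can be sketched with `hashlib`. This is an illustration for sha2-256 only, with a hypothetical function name; it assumes that both the code 0x12 and the length fit in a single varint byte, which holds for this hash function.

```python
import hashlib
from typing import Optional


def sha2_256_multihash(data: bytes, size: Optional[int] = None) -> bytes:
    """Build a sha2-256 multihash digest, optionally truncated to `size` bytes."""
    raw = hashlib.sha256(data).digest()
    if size is not None:
        if not 0 < size <= len(raw):
            raise ValueError("size must be between 1 and 32 bytes for sha2-256")
        raw = raw[:size]
    # 0x12 is the sha2-256 code; values below 128 are single-byte varints.
    return bytes([0x12, len(raw)]) + raw
```

Calling `sha2_256_multihash(b"Hello world!", size=20).hex()` reproduces the truncated digest shown above.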

The unwrap function can be used to extract the raw digest from a multihash digest:

>>> digest.hex()
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'
>>> raw_digest = multihash.unwrap(digest)
>>> raw_digest.hex()
'c0535e4be2b79ffd93291305436bf889314e4a3f'

The Multihash.unwrap method performs the same functionality, but additionally checks that the multihash digest is valid for the multihash:

>>> raw_digest = sha2_256.unwrap(digest)
>>> raw_digest.hex()
'c0535e4be2b79ffd93291305436bf889314e4a3f'
>>> sha1 = multihash.get("sha1")
>>> (sha2_256.code, sha1.code)
(18, 17)
>>> sha1.unwrap(digest)
err.ValueError: Decoded code 18 differs from multihash code 17.
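
The validation performed on unwrapping can be pictured as follows. This is a standard-library sketch under the assumption that a multihash digest is laid out as varint code, varint length, raw digest; `unwrap_multihash` and its `expected_code` parameter are hypothetical names, not the package's API.

```python
from typing import Optional, Tuple


def read_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, offset of the first byte after the varint)."""
    value = 0
    shift = 0
    while offset < len(data):
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7
    raise ValueError("truncated varint")


def unwrap_multihash(digest: bytes, expected_code: Optional[int] = None) -> bytes:
    """Return the raw digest, checking the declared length and, optionally, the code."""
    code, pos = read_varint(digest, 0)
    size, pos = read_varint(digest, pos)
    raw = digest[pos:]
    if len(raw) != size:
        raise ValueError(f"Declared length {size} differs from actual length {len(raw)}.")
    if expected_code is not None and code != expected_code:
        raise ValueError(f"Decoded code {code} differs from multihash code {expected_code}.")
    return raw
```

Passing `expected_code=0x12` accepts the sha2-256 digest from the session above, while `expected_code=0x11` (sha1) raises a `ValueError`.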

The wrap function and Multihash.wrap method can be used to wrap a raw digest into a multihash digest:

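>>> raw_digest.hex()
'c0535e4be2b79ffd93291305436bf889314e4a3f'
>>> multihash.wrap(raw_digest, "sha2-256").hex()
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'
>>> sha2_256.wrap(raw_digest).hex()
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'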

The multihash multicodec specified by a given multihash digest is accessible using the from_digest function:

>>> multihash.from_digest(digest)
Multicodec(name='sha2-256', tag='multihash', code='0x12',
           status='permanent', description='')

Note that both the multihash code and the digest length are encoded as varints (see varint usage above) and can span multiple bytes:

>>> multihash.get("skein1024-1024")
Multicodec(name='skein1024-1024', tag='multihash', code='0xb3e0',
           status='draft', description='')
>>> multihash.digest(data, "skein1024-1024").hex()
'e0e702800192e08f5143...' # 3+2+128 = 133 bytes in total
#^^^^^^     3-bytes varint for hash function code 0xb3e0
#      ^^^^ 2-bytes varint for hash digest length 128
>>> from multiformats import varint
>>> hex(varint.decode(bytes.fromhex("e0e702")))
'0xb3e0'
>>> varint.decode(bytes.fromhex("8001"))
128
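
Because both header fields are variable-length, a reader has to consume them one after the other before it knows where the hash bytes start. A small sketch of that step, using only the standard library and a hypothetical `multihash_header` helper:

```python
from typing import Tuple


def read_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, offset of the first byte after the varint)."""
    value = 0
    shift = 0
    while offset < len(data):
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7
    raise ValueError("truncated varint")


def multihash_header(digest: bytes) -> Tuple[int, int, int]:
    """Return (hash function code, digest length, header size in bytes)."""
    code, pos = read_varint(digest, 0)
    length, pos = read_varint(digest, pos)
    return code, length, pos
```

Applied to the first bytes of the skein1024-1024 digest above, `multihash_header(bytes.fromhex("e0e7028001"))` returns `(0xb3e0, 128, 5)`: a 5-byte header followed by 128 bytes of hash, 133 bytes in total.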

Data and digests are all bytes objects (above, we represented them as hex strings for clarity):

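>>> raw_digest
b'\xc0S^K\xe2\xb7\x9f\xfd\x93)\x13\x05Ck\xf8\x891NJ?'
>>> digest
b'\x12\x14\xc0S^K\xe2\xb7\x9f\xfd\x93)\x13\x05Ck\xf8\x891NJ?'
# ^^^^     0x12 -> multihash multicodec "sha2-256"
#     ^^^^ 0x14 -> truncated hash length of 20 bytes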

If you wish to produce digests for objects of other types, you should encode them into bytes first. For example, the to_bytes(length, byteorder) method can be used to obtain a bytes representation of an integer with given number of bytes and byte ordering, while the encode(encoding) method can be used to obtain a bytes representation of a string with given encoding:

>>> (400).to_bytes(4, byteorder="big")
b'\x00\x00\x01\x90'
>>> (400).to_bytes(4, byteorder="little")
b'\x90\x01\x00\x00'
>>> "Hello world!".encode("utf-8")
b'Hello world!'
>>> "Hello world!".encode("utf-16")
b'\xff\xfeH\x00e\x00l\x00l\x00o\x00 \x00w\x00o\x00r\x00l\x00d\x00!\x00'
>>> "Hello world!".encode("utf-16-le")
b'H\x00e\x00l\x00l\x00o\x00 \x00w\x00o\x00r\x00l\x00d\x00!\x00'
>>> "Hello world!".encode("utf-16-be")
b'\x00H\x00e\x00l\x00l\x00o\x00 \x00w\x00o\x00r\x00l\x00d\x00!'
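
One way to keep such conversions consistent is to route every value through a single helper before hashing. The function below is a hypothetical convenience, not part of the package, and its choices (UTF-8 for strings, minimal-length big-endian for non-negative integers) are assumptions that both the producer and the verifier of a digest would need to share.

```python
def to_bytes_for_hashing(obj, encoding: str = "utf-8") -> bytes:
    """Convert bytes, str or non-negative int values to bytes in a fixed way."""
    if isinstance(obj, (bytes, bytearray)):
        return bytes(obj)
    if isinstance(obj, str):
        return obj.encode(encoding)
    if isinstance(obj, int) and not isinstance(obj, bool):
        if obj < 0:
            raise ValueError("negative integers need an explicit signed encoding")
        length = max(1, (obj.bit_length() + 7) // 8)  # at least one byte, even for 0
        return obj.to_bytes(length, byteorder="big")
    raise TypeError(f"no byte conversion defined for {type(obj).__name__}")
```

Note that different encodings of the same value produce different bytes, and therefore different digests.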

For advanced usage, see the API documentation.


The cid module implements the CID spec.

Core functionality is provided by the CID class, which can be imported directly from multiformats:

>>> from multiformats import CID

CIDs can be decoded from bytestrings or (multi)base encoded strings:

>>> cid = CID.decode("zb2rhe5P4gXftAwvA4eXQ5HJwsER2owDyS9sKaQRRVQPn93bA")
>>> cid
CID('base58btc', 1, 'raw',
    '12206e6ff7950a36187a801613426e858dce686cd7d7e3c0fc42ee0330072d245c95')

CIDs can be created programmatically, and their fields accessed individually:

>>> cid = CID("base58btc", 1, "raw",
... "12206e6ff7950a36187a801613426e858dce686cd7d7e3c0fc42ee0330072d245c95")
>>> cid.base
Multibase(name='base58btc', code='z',
          status='default', description='base58 bitcoin')
>>> cid.codec
Multicodec(name='raw', tag='ipld', code='0x55',
           status='permanent', description='raw binary')
>>> cid.hashfun
Multicodec(name='sha2-256', tag='multihash', code='0x12',
           status='permanent', description='')
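>>> cid.digest.hex()
'12206e6ff7950a36187a801613426e858dce686cd7d7e3c0fc42ee0330072d245c95'
>>> cid.raw_digest.hex()
'6e6ff7950a36187a801613426e858dce686cd7d7e3c0fc42ee0330072d245c95'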

CIDs can be converted to bytestrings or (multi)base encoded strings:

>>> str(cid)
>>> bytes(cid).hex()
>>> cid.encode("base32") # encode with different multibase
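
For orientation, the binary form of a CIDv1 is the varint-encoded version, followed by the varint-encoded content codec, followed by the multihash digest. The sketch below assembles these parts with the standard library; `cidv1_bytes` and `cidv1_base32` are hypothetical helpers that cover only CIDv1 and only the base32 string form.

```python
import base64


def encode_varint(value: int) -> bytes:
    """Unsigned varint: 7 bits per byte, least significant group first."""
    out = bytearray()
    while True:
        group, value = value & 0x7F, value >> 7
        out.append(group | 0x80 if value else group)
        if not value:
            return bytes(out)


def cidv1_bytes(codec_code: int, multihash_digest: bytes) -> bytes:
    """Concatenate version (1), content codec code and multihash digest."""
    return encode_varint(1) + encode_varint(codec_code) + multihash_digest


def cidv1_base32(codec_code: int, multihash_digest: bytes) -> str:
    """String form using multibase base32 (prefix 'b', lower case, no padding)."""
    body = base64.b32encode(cidv1_bytes(codec_code, multihash_digest)).decode("ascii")
    return "b" + body.lower().rstrip("=")
```

With the 'raw' codec (0x55) and a 32-byte sha2-256 multihash, the binary form starts with the bytes `01 55 12 20`, which is why base32 strings for such CIDs begin with `bafkrei`.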

Additionally, the CID.peer_id static method can be used to pack the raw hash of a public key into a CIDv1 PeerID, according to the PeerID spec:

>>> pk_bytes = bytes.fromhex(
... "1498b5467a63dffa2dc9d9e069caf075d16fc33fdd4c3b01bfadae6433767d93")
... # a 32-byte Ed25519 public key
>>> peer_id = CID.peer_id(pk_bytes)
>>> peer_id
CID('base32', 1, 'libp2p-key',
'00201498b5467a63dffa2dc9d9e069caf075d16fc33fdd4c3b01bfadae6433767d93')
#^^   0x00 = 'identity' multihash used (public key length <= 42)
#  ^^ 0x20 = raw hash digest length of 32 bytes
>>> str(peer_id)
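
The packing rule hinted at in the comments can be sketched as follows. This is an illustration rather than the package's code: the function name is hypothetical, the 42-byte threshold comes from the comment above, and the fallback to sha2-256 for longer keys is an assumption based on the PeerID spec.

```python
import hashlib

LIBP2P_KEY = 0x72  # multicodec code for 'libp2p-key'


def peer_id_cid_bytes(pk_bytes: bytes) -> bytes:
    """Binary CIDv1 for a PeerID: version, 'libp2p-key' codec, multihash."""
    if len(pk_bytes) <= 42:
        # Short keys are embedded as-is with the 'identity' multihash (code 0x00).
        multihash = bytes([0x00, len(pk_bytes)]) + pk_bytes
    else:
        # Assumption: longer keys are hashed with sha2-256 (code 0x12).
        raw = hashlib.sha256(pk_bytes).digest()
        multihash = bytes([0x12, len(raw)]) + raw
    return bytes([0x01, LIBP2P_KEY]) + multihash
```

For the 32-byte key above, the result is `01 72 00 20` followed by the key bytes themselves.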

For advanced usage, see the API documentation.


The multiaddr module implements the multiaddr spec.

Core functionality is provided by the Proto class:

>>> from multiformats import Proto
>>> ip4 = Proto("ip4")
>>> ip4
>>> str(ip4)
>>> ip4.codec
Multicodec(name='ip4', tag='multiaddr', code='0x04',
           status='permanent', description='')

Slash notation is used to attach address values to protocols:

>>> a = ip4/"192.168.1.1"
>>> a
Addr('ip4', '192.168.1.1')
>>> str(a)
'/ip4/192.168.1.1'
>>> bytes(a).hex()
'04c0a80101'

Address values can be specified as strings, integers, or bytes-like objects:

>>> ip4/"192.168.1.1"
Addr('ip4', '192.168.1.1')
>>> ip4/bytes([192, 168, 1, 1])
Addr('ip4', '192.168.1.1')
>>> udp = Proto("udp")
>>> udp/9090 # int 9090 is converted to str "9090"
Addr('udp', '9090')

Slash notation is also used to encapsulate multiple protocol/address segments into a multiaddr:

>>> quic = Proto("quic") # no addr required
>>> ma = ip4/"127.0.0.1"/udp/9090/quic
>>> ma
Multiaddr(Addr('ip4', '127.0.0.1'), Addr('udp', '9090'), Proto('quic'))
>>> str(ma)
'/ip4/127.0.0.1/udp/9090/quic'

Bytes for multiaddrs are computed according to the (TLV)+ multiaddr encoding:

>>> bytes(ip4/"127.0.0.1").hex()
'047f000001'
>>> bytes(udp/9090).hex()
'91022382'
>>> bytes(quic).hex()
'cc03'
>>> bytes(ma).hex()
'047f00000191022382cc03'
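
To show how the encoded bytes come about, here is a standard-library sketch for three protocols only. The table `PROTOCOLS` and the function `encode_multiaddr` are hypothetical; the codes (0x04 for ip4, 0x0111 for udp, 0x01cc for quic) are the multicodec codes for those protocols, and all three have fixed-size values, so no length prefix is written.

```python
import ipaddress
from typing import Iterable, Optional, Tuple


def encode_varint(value: int) -> bytes:
    """Unsigned varint: 7 bits per byte, least significant group first."""
    out = bytearray()
    while True:
        group, value = value & 0x7F, value >> 7
        out.append(group | 0x80 if value else group)
        if not value:
            return bytes(out)


# Protocol name -> multicodec code (only three protocols, for illustration).
PROTOCOLS = {"ip4": 0x04, "udp": 0x0111, "quic": 0x01CC}


def encode_multiaddr(segments: Iterable[Tuple[str, Optional[str]]]) -> bytes:
    """Encode (protocol, value) segments as varint code followed by the value bytes."""
    out = bytearray()
    for name, value in segments:
        out += encode_varint(PROTOCOLS[name])
        if name == "ip4":
            out += ipaddress.IPv4Address(value).packed    # 4 bytes
        elif name == "udp":
            out += int(value).to_bytes(2, byteorder="big")  # 2-byte port
        # 'quic' carries no value
    return bytes(out)
```

Encoding `[("ip4", "127.0.0.1"), ("udp", "9090"), ("quic", None)]` gives `047f000001` + `91022382` + `cc03`.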

The parse and decode functions create multiaddrs from their human-readable strings and encoded bytes respectively:

>>> from multiformats import multiaddr
>>> s = '/ip4/127.0.0.1/udp/9090/quic'
>>> multiaddr.parse(s)
Multiaddr(Addr('ip4', '127.0.0.1'), Addr('udp', '9090'), Proto('quic'))
>>> b = bytes.fromhex('047f00000191022382cc03')
>>> multiaddr.decode(b)
Multiaddr(Addr('ip4', '127.0.0.1'), Addr('udp', '9090'), Proto('quic'))
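
The decoding direction can be sketched in the same spirit: read a varint protocol code, look up how many value bytes follow, and repeat until the input is exhausted. The lookup table and `decode_multiaddr` below are hypothetical and cover only ip4, udp and quic.

```python
import ipaddress
from typing import Tuple


def read_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, offset of the first byte after the varint)."""
    value = 0
    shift = 0
    while offset < len(data):
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7
    raise ValueError("truncated varint")


# Multicodec code -> (protocol name, size of the value in bytes).
PROTOCOLS = {0x04: ("ip4", 4), 0x0111: ("udp", 2), 0x01CC: ("quic", 0)}


def decode_multiaddr(data: bytes) -> str:
    """Turn encoded multiaddr bytes back into the human-readable string."""
    parts = []
    pos = 0
    while pos < len(data):
        code, pos = read_varint(data, pos)
        if code not in PROTOCOLS:
            raise ValueError(f"unsupported protocol code {hex(code)}")
        name, size = PROTOCOLS[code]
        value = data[pos:pos + size]
        if len(value) != size:
            raise ValueError(f"truncated value for protocol {name}")
        pos += size
        parts.append(name)
        if name == "ip4":
            parts.append(str(ipaddress.IPv4Address(value)))
        elif name == "udp":
            parts.append(str(int.from_bytes(value, byteorder="big")))
    return "/" + "/".join(parts)
```

Decoding `bytes.fromhex('047f00000191022382cc03')` with this helper gives `'/ip4/127.0.0.1/udp/9090/quic'`.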

For uniformity of API, the same functionality as the Proto class is provided by the proto function:

>>> ip4 = multiaddr.proto("ip4")
>>> ip4

For advanced usage, see the API documentation.


The API documentation for this package is automatically generated by pdoc.


Please see the contributing file.


MIT © Hashberg Ltd.


Source Distribution: multiformats-0.1.2.post1.tar.gz (174.6 kB)

Built Distribution: multiformats-0.1.2.post1-py3-none-any.whl (47.3 kB, Python 3)
