Skip to main content

Pure-Python Reed Solomon encoder/decoder

Project description

PyPI-Status PyPI-Versions PyPI-Downloads

Build-Status Coverage

A pure-python universal errors-and-erasures Reed-Solomon Codec , based on the wonderful tutorial at wikiversity, written by “Bobmath” and “LRQ3000”.


Installation

pip install --upgrade reedsolo

Usage

Basic usage with high-level RSCodec class

# Initialization
>>> from reedsolo import RSCodec
>>> rsc = RSCodec(10)  # 10 ecc symbols

# Encoding
>>> rsc.encode([1,2,3,4])
b'\x01\x02\x03\x04,\x9d\x1c+=\xf8h\xfa\x98M'
>>> rsc.encode(bytearray([1,2,3,4]))
bytearray(b'\x01\x02\x03\x04,\x9d\x1c+=\xf8h\xfa\x98M')
>>> rsc.encode(b'hello world')
b'hello world\xed%T\xc4\xfd\xfd\x89\xf3\xa8\xaa'
# Note that chunking is supported transparently to encode any string length.

# Decoding (repairing)
>>> rsc.decode(b'hello world\xed%T\xc4\xfd\xfd\x89\xf3\xa8\xaa')[0]
b'hello world'
>>> rsc.decode(b'heXlo worXd\xed%T\xc4\xfdX\x89\xf3\xa8\xaa')[0]     # 3 errors
b'hello world'
>>> rsc.decode(b'hXXXo worXd\xed%T\xc4\xfdX\x89\xf3\xa8\xaa')[0]     # 5 errors
b'hello world'
>>> rsc.decode(b'hXXXo worXd\xed%T\xc4\xfdXX\xf3\xa8\xaa')[0]        # 6 errors - fail
Traceback (most recent call last):
  ...
reedsolo.ReedSolomonError: Too many (or few) errors found by Chien Search for the errata locator polynomial!

>>> rsc = RSCodec(12)  # using 2 more ecc symbols (to correct max 6 errors or 12 erasures)
>>> rsc.encode(b'hello world')
b'hello world?Ay\xb2\xbc\xdc\x01q\xb9\xe3\xe2='
>>> rsc.decode(b'hello worXXXXy\xb2XX\x01q\xb9\xe3\xe2=')[0]         # 6 errors - ok, but any more would fail
b'hello world'
>>> rsc.decode(b'helXXXXXXXXXXy\xb2XX\x01q\xb9\xe3\xe2=', erase_pos=[3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 16])[0]  # 12 erasures - OK
b'hello world'

Note: this shows that we can decode twice as many erasures (where we provide the location of errors ourselves) than errors (with unknown locations). This is the cost of error correction compared to erasure correction.

# Checking
>> rsc.check(b'hello worXXXXy\xb2XX\x01q\xb9\xe3\xe2=')  # Tampered message will return False
[False]
>> rmes, rmesecc, errata_pos = rsc.decode(b'hello worXXXXy\xb2XX\x01q\xb9\xe3\xe2=')
>> rsc.check(rmesecc)  # Corrected message will return True
[True]
>> print('Number of detected errors and erasures: %i, their positions: %s' % (len(errata_pos), list(errata_pos)))
Number of detected errors and erasures: 6, their positions: [16, 15, 12, 11, 10, 9]

# To use longer chunks or bigger values than 255 (may be very slow)
>> rsc = RSCodec(12, nsize=4095)  # always use a power of 2 minus 1
>> rsc = RSCodec(12, c_exp=12)  # alternative way to set nsize=4095
>> mes = 'a' * (4095-12)
>> mesecc = rsc.encode(mes)
>> mesecc[2] = 1
>> mesecc[-1] = 1
>> rmes, rmesecc, errata_pos = rsc.decode(mesecc)
>> rsc.check(mesecc)
[False]
>> rsc.check(rmesecc)
[True]

Low-level usage via direct access to math functions

If you want full control, you can skip the API and directly use the library as-is. Here’s how:

First you need to init the precomputed tables:

>> import reedsolo as rs
>> rs.init_tables(0x11d)

Pro tip: if you get the error: ValueError: byte must be in range(0, 256), please check that your prime polynomial is correct for your field. Pro tip2: by default, you can only encode messages of max length and max symbol value = 256. If you want to encode bigger messages, please use the following (where c_exp is the exponent of your Galois Field, eg, 12 = max length 2^12 = 4096):

>> prim = rs.find_prime_polys(c_exp=12, fast_primes=True, single=True)
>> rs.init_tables(c_exp=12, prim=prim)

Let’s define our RS message and ecc size:

>> n = 255  # length of total message+ecc
>> nsym = 12  # length of ecc
>> mes = "a" * (n-nsym)  # generate a sample message

To optimize, you can precompute the generator polynomial:

>> gen = rs.rs_generator_poly_all(n)

Then to encode:

>> mesecc = rs.rs_encode_msg(mes, nsym, gen=gen[nsym])

Let’s tamper our message:

>> mesecc[1] = 0

To decode:

>> rmes, recc, errata_pos = rs.rs_correct_msg(mesecc, nsym, erase_pos=erase_pos)

Note that both the message and the ecc are corrected (if possible of course). Pro tip: if you know a few erasures positions, you can specify them in a list erase_pos to double the repair power. But you can also just specify an empty list.

You can check how many errors and/or erasures were corrected, which can be useful to design adaptive bitrate algorithms:

>> print('A total of %i errata were corrected over all chunks of this message.' % len(errata_pos))

If the decoding fails, it will normally automatically check and raise a ReedSolomonError exception that you can handle. However if you want to manually check if the repaired message is correct, you can do so:

>> rs.rs_check(rmes + recc, nsym)

Note: if you want to use multiple reedsolomon with different parameters, you need to backup the globals and restore them before calling reedsolo functions:

>> rs.init_tables()
>> global gf_log, gf_exp, field_charac
>> bak_gf_log, bak_gf_exp, bak_field_charac = gf_log, gf_exp, field_charac

Then at anytime, you can do:

>> global gf_log, gf_exp, field_charac
>> gf_log, gf_exp, field_charac = bak_gf_log, bak_gf_exp, bak_field_charac
>> mesecc = rs.rs_encode_msg(mes, nsym)
>> rmes, recc, errata_pos = rs.rs_correct_msg(mesecc, nsym)

The globals backup is not necessary if you use RSCodec, it will be automatically managed.

Read the sourcecode’s comments for more info about how it works, and for the various parameters you can setup if you need to interface with other RS codecs.

Extended description

The code of wikiversity is here consolidated into a nice API with exceptions handling. The algorithm can correct up to 2*e+v <= nsym, where e is the number of errors, v the number of erasures and nsym = n-k = the number of ECC (error correction code) symbols. This means that you can either correct exactly floor(nsym/2) errors, or nsym erasures (errors where you know the position), and a combination of both errors and erasures. The code should work on pretty much any reasonable version of python (2.4-3.7), but I’m only testing on 2.7 and 3.7. Python 3.8 should work except for Cython which is currently incompatible with this version.

The codec has quite reasonable performances if you either use PyPy on the pure-python implementation (reedsolo.py) or either if you compile the Cython extension creedsolo.py (which is about 2x faster than PyPy). You can expect encoding rate of several MB/s.

This library is also thoroughly unit tested so that any encoding/decoding case should be covered.

To use the Cython implementation, you need to pip install cython and a C++ compiler (Microsoft Visual C++ 14.0 for Windows and Python 3.7). Then you can simply cd to the root of the folder where creedsolo.pyx is, and type python setup.py build_ext –inplace. Alternatively, you can generate just the C++ code by typing cython -3 creedsolo.pyx.

Authors

This module was conceived and developed by Tomer Filiba.

It was further extended and is currently maintained by Stephen Karl Larroque.

License

This software is released to the Public Domain.

If the Public Domain is not adequate for your purpose, you can instead consider this module under the MIT License as you prefer.

Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

reedsolo-1.4.10.tar.gz (262.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

reedsolo-1.4.10-cp37-cp37m-win_amd64.whl (181.4 kB view details)

Uploaded CPython 3.7mWindows x86-64

File details

Details for the file reedsolo-1.4.10.tar.gz.

File metadata

  • Download URL: reedsolo-1.4.10.tar.gz
  • Upload date:
  • Size: 262.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.7.3

File hashes

Hashes for reedsolo-1.4.10.tar.gz
Algorithm Hash digest
SHA256 be9717c4536e8542915908d9af7392ac50107b5cc891028b4bb8cea88f3b1432
MD5 d3df3faea9a9013d39b4fd09e9165440
BLAKE2b-256 1c9eb976539d30f3ae1a1d094b7874332088dc31e517b16547de027016d0ad1c

See more details on using hashes here.

File details

Details for the file reedsolo-1.4.10-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: reedsolo-1.4.10-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 181.4 kB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.7.3

File hashes

Hashes for reedsolo-1.4.10-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 9e0089892f0e0bf78bb02f0623f5cf7a764c39994ae1a1333226af46bec49444
MD5 79d5bf7fabd12a78e3ecebd00771a89f
BLAKE2b-256 55e46d216ee63b880d5fdda6334718d91ee78f1e6d262041f8b206e6b6ddcc56

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page