Robust, easy-to-use generic DICOM anonymizer that also anonymizes demographics CSV spreadsheets using hashed IDs.
Easy-to-use text extractor for PDF, DOC, DOCX and other document types, built on the awesome Textract, with OCR (via Tesseract) when necessary.
Easy out-of-core computing of recursive dicts.
Helps ensure file fixity (long-term storage of data) via redundant error-correcting codes and hash auditing.
Universal errors-and-erasures Reed-Solomon codec (error-correcting code) in pure Python, with extensive documentation.
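
To illustrate the hash auditing mentioned for file fixity, here is a minimal sketch using only the Python standard library. It is not taken from any of the projects listed here; the function names (sha256_of_file, build_manifest, audit) and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use constant for large files.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of_file(full)
    return manifest


def audit(root, manifest):
    """Return the sorted relative paths that are missing or whose digest changed."""
    damaged = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or sha256_of_file(full) != expected:
            damaged.append(rel)
    return sorted(damaged)
```

A manifest built once with build_manifest can be stored next to the data and passed to audit later. A sketch like this only detects damage; repairing a damaged file is the role of the error-correcting codes.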