Like GNU cat, but with autocompletion for S3.
Python framework for fast Vector Space Modelling
Utils for streaming large files (S3, HDFS, GCS, Azure Blob Storage, gzip, bz2...)
Joins large dataframes together
Fast & simple summary for large CSV files
Geographical queries made easy.
Persistent dict in Python, backed by sqlite3 and pickle, multithread-safe.
Counter for large datasets
Derives type annotations from Sphinx comments in Python source
UNIX cat with read support for S3, SSH, etc.
Tools for indexing gzip files to support random-like access.
Uploads videos to liveleak.com
Performs Elasticsearch bulk and scroll tasks