Joins large dataframes together
Utils for streaming large files (S3, HDFS, GCS, Azure Blob Storage, gzip, bz2...)
Geographical queries made easy.
Persistent dict in Python, backed by sqlite3 and pickle, multithread-safe.
Counter for large datasets
Derives type annotations from Sphinx comments in Python source
Python framework for fast Vector Space Modelling
UNIX cat with read support for S3, SSH, etc.
Tools for indexing gzip files to support random-like access.
Fast & simple summary for large CSV files
Uploads videos to liveleak.com
Performs Elasticsearch bulk and scroll tasks