10 projects
smart-open
Utils for streaming large files (S3, HDFS, GCS, Azure Blob Storage, gzip, bz2...)
gensim
Python framework for fast Vector Space Modelling
csvinsight
Fast & simple summary for large CSV files
sqlitedict
Persistent dict in Python, backed by sqlite3 and pickle, multithread-safe.
bounter
Counter for large datasets
koshka
GNU cat over the network with autocompletion
datawelder
Joins large dataframes together
pygeons
Geographical queries made easy.
kutuzov
Derives type annotations from Sphinx comments in Python source
gzipi
Tools for indexing gzip files to support random-like access.