30 projects
distributed
Distributed scheduler for Dask
dask
Parallel PyData with Task Scheduling
gcsfs
Convenient Filesystem interface over GCS
s3fs
Convenient Filesystem interface over S3
fsspec
File-system specification
dask-kubernetes
Native Kubernetes integration for Dask
adlfs
Access Azure Data Lake Gen1 with fsspec and Dask
stac-geoparquet
None
fastparquet
Python support for Parquet file format
partd
Appendable key-value storage
pandas
Powerful data structures for data analysis, time series, and statistics
dask-ml
A library for distributed and parallel machine learning
dask-glm
Generalized Linear Models with Dask
xstac
xstac
kbatch-proxy
Proxy batch job requests to Kubernetes.
kbatch
Submit batch jobs to Kubernetes.
rechunker
A library for rechunking arrays
stac-table
Generate STAC Collections for tabular datasets.
dask-xgboost
Interactions between Dask and XGBoost
stac_vrt
Quickly build a GDAL VRT from a STAC Item Collection.
jupyterhub_mlflow_auth
Tornado-based proxy server for MLflow and JupyterHub.
papermill-mlflow-handler
MLflow handler for papermill.
mlflow_nbconvert
mlflow-nbconvert
cachey
Caching mindful of computation/storage costs
cyberpandas
IP Address type for pandas
dask-tensorflow
Interactions between Dask and TensorFlow
engarde
A Python package for defensive data analysis.
knotr
Reproducible report generation tool.
dsadd
A Python package for defensive data analysis.
python-cps
A Python package for working with the [Current Population Survey](http://www.census.gov/cps/).