A collection of popular datasets for deep learning.

Project description

dbcollection is a library for downloading/parsing/managing datasets via simple methods.
It was built from the ground up to be cross-platform (Windows, Linux, macOS) and
cross-language (Python, Lua, Matlab, etc.). This is achieved by using the popular HDF5
file format to store (meta)data of manually parsed datasets and Python for scripting.
By doing so, this library can target any platform that supports Python and any language
that has bindings for HDF5.

This package makes it easy to manage and load datasets by using HDF5 files as
metadata storage. Because all the necessary metadata is stored on disk, huge
datasets can be used with low memory usage. Also, once a dataset is set up, it is
set up forever! Users can reuse it as many times as they want/need for a myriad of
tasks without having to set up a dataset each time they hack some code. This lets
users focus on more important tasks, such as fast prototyping, without having to
spend time managing datasets or creating/modifying scripts to load/fetch data from
disk.
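
The "set up once, then fetch from disk" idea can be illustrated with a small sketch. This is not dbcollection's implementation and does not use its API or HDF5: it is a standard-library stand-in in which a flat binary file plays the role of the HDF5 metadata file. The record layout, the file name, and the names `setup_once`, `fetch`, and `demo` are all hypothetical, chosen only for illustration.

```python
import os
import struct
import tempfile

# Hypothetical fixed-size record: (image id, label, width, height).
# "<" selects little-endian with no padding, so each record is 10 bytes.
RECORD = struct.Struct("<IHHH")


def setup_once(path, records):
    """Write the metadata file only if it does not exist yet."""
    if os.path.exists(path):
        return False  # already set up; reuse the file on disk
    with open(path, "wb") as f:
        for rec in records:
            f.write(RECORD.pack(*rec))
    return True


def fetch(path, index):
    """Read one record by seeking to its offset; the rest stays on disk."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))


def demo():
    path = os.path.join(tempfile.mkdtemp(), "toy_metadata.bin")
    records = [(i, i % 10, 28, 28) for i in range(1000)]
    first = setup_once(path, records)   # writes the file
    second = setup_once(path, records)  # no-op: the file already exists
    return first, second, fetch(path, 42)
```

Calling `demo()` writes the file on the first `setup_once` call, skips the work on the second, and then reads a single record without loading the other 999 into memory. HDF5 offers the same kind of partial read, with the addition of named, typed, multi-dimensional datasets that other languages can open through their own bindings.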

Main features

Here are some of the key features dbcollection provides:

- Simple API to load/download/setup/manage datasets
- Simple API to fetch data of a dataset
- All data is stored on disk, resulting in reduced RAM usage (useful for large datasets)
- Datasets only need to be set up once
- Cross-platform (Windows, Linux, macOS)
- Easily extensible to other languages that have support for HDF5 files
- Concurrent/parallel data access is possible thanks to the HDF5 file format
- A diverse list of popular datasets is available for use
- All datasets were manually parsed, so some of their quirks have already been
solved for you

Download files

Filename                                    Size     File type  Python version
dbcollection-0.1.11-py2.py3-none-any.whl    11.7 MB  Wheel      3.5
dbcollection-0.1.11.tar.gz                  11.5 MB  Source     None
