Dataset downloading/batching/processing in NumPy

All dataset utilities (downloading/loading/batching/processing) in NumPy

This is a research project under development, not an official product. Expect bugs and sharp edges; please help by trying it out and reporting bugs. Reference docs

What is numpy-datasets and why use it?

  • First, numpy-datasets offers out-of-the-box dataset downloading and loading that relies only on NumPy and core Python libraries.
  • Second, numpy-datasets offers utilities such as (mini-)batching, i.e. looping through a dataset one chunk at a time, as well as preprocessing techniques well suited to machine learning and deep learning pipelines.
  • Third, numpy-datasets offers many options for dealing transparently with very large datasets, for example automatic mini-batching with caching of the next batch ahead of time, online preprocessing, and the like.
  • Fourth, numpy-datasets does not focus only on computer vision datasets; it also offers many time-series datasets, and the collection of implemented datasets is constantly growing.
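
To make the batching and next-batch caching ideas above concrete, here is a minimal sketch in plain NumPy and the standard library. It is not the numpy-datasets API: the names `iterate_minibatches` and `prefetch` are hypothetical helpers written for illustration only, and the library's own functions may look quite different.

```python
import queue
import threading

import numpy as np


def iterate_minibatches(x, y, batch_size, shuffle=False, seed=0):
    """Yield (x_batch, y_batch) chunks; the last chunk may be smaller.

    Hypothetical helper, not part of numpy-datasets.
    """
    n = len(x)
    indices = np.arange(n)
    if shuffle:
        np.random.default_rng(seed).shuffle(indices)
    for start in range(0, n, batch_size):
        idx = indices[start:start + batch_size]
        yield x[idx], y[idx]


def prefetch(iterator, depth=1):
    """Prepare up to `depth` upcoming items in a background thread.

    Hypothetical helper: while the consumer works on the current batch,
    the worker thread is already producing the next one.
    """
    sentinel = object()
    buffer = queue.Queue(maxsize=depth)

    def worker():
        try:
            for item in iterator:
                buffer.put(item)
        finally:
            # always signal the end so the consumer does not wait forever
            buffer.put(sentinel)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buffer.get()
        if item is sentinel:
            return
        yield item


# toy dataset: 10 samples with 2 features each
x = np.arange(20, dtype=np.float32).reshape(10, 2)
y = np.arange(10)

batches = prefetch(iterate_minibatches(x, y, batch_size=4))
batch_shapes = [xb.shape for xb, yb in batches]
print(batch_shapes)  # [(4, 2), (4, 2), (2, 2)]
```

Any per-batch preprocessing (normalization, augmentation) could be applied inside the generator before the `yield`, so that it runs online and in the background thread rather than on the whole dataset up front.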

Minimal Example

import symjax as sj
import symjax.tensor as T

# create our variable to be optimized
mu = T.Variable(T.random.normal((), seed=1))

# create our cost
cost = T.exp(-(mu-1)**2)

# get the gradient, notice that it is itself a tensor that can then
# be manipulated as well
g = sj.gradients(cost, mu)
print(g)

# (Tensor: shape=(), dtype=float32)

# create the compiled function that will compute the cost and apply
# the update onto the variable
f = sj.function(outputs=cost, updates={mu:mu-0.2*g})

for i in range(10):
    print(f())

# 0.008471076
# 0.008201109
# 0.007946267
# ...
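
For readers who want to follow the update rule without a symbolic library, the same loop can be sketched in plain NumPy with the gradient derived by hand: for cost = exp(-(mu - 1)^2), the derivative with respect to mu is -2 (mu - 1) exp(-(mu - 1)^2). The starting value of `mu` below is an assumption chosen for illustration, so the printed costs are not expected to match the numbers above.

```python
import numpy as np


def cost_and_grad(mu):
    """Return the cost exp(-(mu - 1)^2) and its derivative with respect to mu."""
    cost = np.exp(-(mu - 1.0) ** 2)
    grad = -2.0 * (mu - 1.0) * cost
    return cost, grad


mu = 3.0  # assumed starting value, not the seeded random value used above
costs = []
for _ in range(10):
    cost, grad = cost_and_grad(mu)
    costs.append(cost)
    # same update as in the example: mu <- mu - 0.2 * gradient
    mu = mu - 0.2 * grad

for c in costs:
    print(c)
```

Each step moves `mu` in the direction that lowers the cost, so the printed values decrease from one iteration to the next.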

Installation

Installation is done directly with pip, as described in this guide.
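
As a quick check after installing, the following sketch reports whether the distribution is visible to the current Python environment. It assumes the distribution name is `numpy-datasets` (taken from the release file names) and that it was installed with something like `pip install numpy-datasets`; `installed_version` is a hypothetical helper and only uses the standard library.

```python
import importlib.metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        return None


# "numpy-datasets" is assumed to be the distribution name on the package index
version = installed_version("numpy-datasets")
print("numpy-datasets:", version if version is not None else "not installed")
```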

Download files

Source distribution: numpy-datasets-0.0.2.tar.gz (69.2 kB)
Built distribution: numpy_datasets-0.0.2-py3-none-any.whl (79.6 kB, Python 3)
