Optimizing compiler for evaluating mathematical expressions on CPUs and GPUs.

Project description

Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy. Theano features:

  • tight integration with NumPy: an interface similar to NumPy’s, with numpy.ndarray used internally in Theano-compiled functions.

  • transparent use of a GPU: perform data-intensive computations up to 140x faster than on a CPU (support for float32 only).

  • efficient symbolic differentiation: Theano can compute derivatives for functions of one or many inputs.

  • speed and stability optimizations: avoid nasty bugs when computing expressions such as log(1 + exp(x)) for large values of x.

  • dynamic C code generation: evaluate expressions faster.

  • extensive unit-testing and self-verification: includes tools for detecting and diagnosing bugs and/or potential problems.

Theano has been powering large-scale computationally intensive scientific research since 2007, but it is also approachable enough to be used in the classroom (IFT6266 at the University of Montreal).
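
A minimal sketch of that workflow (illustrative, with arbitrary names): declare symbolic inputs, build an expression, ask Theano for a derivative, and compile the graph into a callable.

    import theano
    import theano.tensor as T

    # Declare typed symbolic inputs (no values yet).
    x = T.dscalar('x')
    y = T.dscalar('y')

    # Build a symbolic expression graph.
    z = x ** 2 + y

    # Symbolic differentiation: dz/dx = 2*x.
    gz = theano.grad(z, x)

    # Compile the graph (optimization and C code generation happen here).
    f = theano.function([x, y], [z, gz])

    print(f(3.0, 1.0))  # -> [array(10.0), array(6.0)]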

Release Notes

Theano 0.9.0 (20th of March, 2017)

This is the final release of Theano 0.9.0, with many new features, interface changes, improvements, and bug fixes.

We recommend that everybody update to this version.

Highlights (since 0.8.0):
  • Better Python 3.5 support

  • Better numpy 1.12 support

  • Conda packages for Mac, Linux and Windows

  • Support for newer Mac and Windows versions

  • More Windows integration:

    • Theano scripts (theano-cache and theano-nose) now work on Windows

    • Better support for Windows line endings in C code

    • Support for spaces in paths on Windows

  • Scan improvements:

    • More scan optimizations, with faster compilation and gradient computation

    • Support for checkpointing in scan (a speed/memory trade-off, useful for long sequences; a scan sketch follows this list)

    • Fixed broadcast checking in scan

  • Graphs improvements:

    • More numerical stability by default for some graphs

    • Better handling of corner cases for theano functions and graph optimizations

    • More graph optimizations with faster compilation and execution

    • Smaller and more readable graphs

  • New GPU back-end:

    • Removed warp-synchronous programming to get good results with newer CUDA drivers

    • More pooling support on GPU when cuDNN isn’t available

    • Full support of ignore_border option for pooling

    • Inplace storage for shared variables

    • float16 storage

    • Using the PCI bus ID of graphics cards for a better mapping between Theano device numbers and nvidia-smi numbers

    • Fixed offset error in GpuIncSubtensor

  • Less C code compilation

  • Added support for bool dtype

  • Updated and more complete documentation

  • Bug fixes related to merge optimizer and shape inference

  • Many other bug fixes, crash fixes, and warning improvements

A total of 123 people contributed to this release since 0.8.0, see list below.
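
To ground the scan items above, here is a minimal theano.scan sketch (a plain running sum over a vector). The checkpointing variant noted in the highlights uses the same style of interface while trading recomputation for memory on long sequences; this example is illustrative, not taken from the release notes.

    import numpy
    import theano
    import theano.tensor as T

    seq = T.dvector('seq')

    # fn receives (current sequence element, previous accumulator value).
    result, updates = theano.scan(
        fn=lambda x, acc: acc + x,
        sequences=seq,
        outputs_info=T.as_tensor_variable(numpy.float64(0.0)),
    )

    cumsum = theano.function([seq], result)
    print(cumsum([1.0, 2.0, 3.0]))  # -> [ 1.  3.  6.]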

Interface changes:
  • Merged CumsumOp/CumprodOp into CumOp

  • In MRG module:

    • Replaced method multinomial_wo_replacement() with new method choice()

    • Random generator now tries to infer the broadcast pattern of its output

  • New pooling interface

  • Pooling parameters can change at run time

  • Moved softsign out of sandbox to theano.tensor.nnet.softsign

  • Using floatX dtype when converting empty list/tuple

  • roll() now makes the shift modulo the size of the axis being rolled

  • round() now defaults to the same mode as NumPy: half_to_even (see the sketch below)
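
A quick sketch of the new round() default, which now matches numpy.round (halves round to the nearest even integer):

    import theano
    import theano.tensor as T

    x = T.dvector('x')
    f = theano.function([x], T.round(x))  # half_to_even is now the default

    print(f([0.5, 1.5, 2.5]))  # -> [ 0.  2.  2.], same as numpy.round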

Convolution updates:
  • Support for full and half modes in 2D and 3D convolutions, including conv3d2d

  • Allowed pooling of empty batches

  • Implemented the conv2d_transpose convenience function

  • Multi-core convolution and pooling on CPU

  • New abstract 3D convolution interface, similar to the 2D one

  • Dilated convolution (sketched below)
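
A hedged sketch of the abstract 2D convolution interface with two of the options above, half mode and dilation. The filter_dilation parameter name is an assumption based on the 0.9-era interface; check the conv2d documentation before relying on it.

    import numpy
    import theano
    import theano.tensor as T
    from theano.tensor.nnet import conv2d

    inputs = T.tensor4('inputs')    # (batch, channels, rows, cols)
    filters = T.tensor4('filters')  # (n_filters, channels, f_rows, f_cols)

    out = conv2d(inputs, filters,
                 border_mode='half',       # pad so output spatial size matches input (odd kernels)
                 filter_dilation=(2, 2))   # dilated ("atrous") convolution; name assumed

    f = theano.function([inputs, filters], out)
    x = numpy.random.randn(2, 1, 8, 8)
    w = numpy.random.randn(4, 1, 3, 3)
    print(f(x, w).shape)  # (2, 4, 8, 8) given the padding above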

GPU:
  • cuDNN: support for version 5.1, wrapping batch normalization (2D and 3D) and RNN functions

  • Multi-GPU synchronous updates (via Platoon, using NCCL)

  • Gemv (matrix-vector product) speed-up for special shapes

  • cuBLAS gemv workaround when reducing on an axis with a dimension of size 0

  • Warn user that some cuDNN algorithms may produce unexpected results in certain environments for convolution backward filter operations

  • GPUMultinomialFromUniform op now supports multiple dtypes

  • Support for MaxAndArgMax for some axis combinations

  • Support for solve (using cusolver), erfinv and erfcinv

  • Implemented GpuAdvancedSubtensor

New features:
  • OpFromGraph now allows gradient overriding for every input

  • Added Abstract Ops for batch normalization that use cuDNN when available and pure Theano CPU/GPU alternatives otherwise

  • Added gradient of solve, tensorinv (CPU), tensorsolve (CPU), searchsorted (CPU), DownsampleFactorMaxGradGrad (CPU)

  • Added Multinomial Without Replacement

  • Allowed partial evaluation of compiled functions

  • More Rop support

  • Indexing now supports ellipsis: a[..., 3], a[1, ..., 3] (see the sketch after this list)

  • Added theano.tensor.{tensor5,dtensor5, ...}

  • compiledir_format now supports device

  • Added new Theano flag conv.assert_shape to check user-provided shapes at runtime (for debugging)

  • Added new Theano flag cmodule.age_thresh_use

  • Added new Theano flag cuda.enabled

  • Added new Theano flag nvcc.cudafe to enable faster compilation and import with old CUDA back-end

  • Added new Theano flag print_global_stats to print some global statistics (time spent) at the end

  • Added new Theano flag profiling.ignore_first_call, useful to profile the new gpu back-end

  • Removed ProfileMode (use the Theano flag profile=True instead)
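
A small sketch of the ellipsis indexing mentioned above:

    import numpy
    import theano
    import theano.tensor as T

    a = T.tensor4('a')
    f = theano.function([a], [a[..., 3], a[1, ..., 3]])

    y, z = f(numpy.zeros((2, 3, 4, 5)))
    print(y.shape, z.shape)  # (2, 3, 4) and (3, 4)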

Others:
  • Split op now has C code for CPU and GPU

  • theano-cache list now includes compilation times

  • Sped up argmax on GPU when only the argmax (not the max) is needed

  • More stack traces in error messages

  • Sped up the Cholesky gradient

  • log(sum(exp(...))) is now stability-optimized (see the sketch below)
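
The stability item above is easy to demonstrate: the naive graph below would overflow in exp() for large inputs, and the optimizer now rewrites it into a stable log-sum-exp form.

    import numpy
    import theano
    import theano.tensor as T

    x = T.dvector('x')
    naive = T.log(T.sum(T.exp(x)))   # rewritten by the optimizer for stability
    f = theano.function([x], naive)

    print(f(numpy.array([1000.0, 1000.0])))  # ~1000.693 rather than inf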

Other more detailed changes:
  • Added Jenkins (gpu tests run on pull requests in addition to daily buildbot)

  • Removed old benchmark directory and other old files not used anymore

  • Use of 64-bit indexing in sparse ops to allow matrices with more than 2^31 - 1 elements

  • Allowed more than one output to be a destructive in-place operation

  • More support for negative axes

  • Added the keepdims parameter to the norm function (sketched below)

  • Made the scan gradient more deterministic
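
A short sketch of the keepdims addition above; the norm method on tensor variables is used here, and the exact entry point is an assumption. Keeping the reduced axis as size 1 lets the result broadcast against the original matrix.

    import numpy
    import theano
    import theano.tensor as T

    m = T.dmatrix('m')
    row_norms = m.norm(2, axis=1, keepdims=True)  # L2 norm per row, shape (rows, 1)
    f = theano.function([m], m / row_norms)       # broadcasts row-wise

    print(f(numpy.array([[3.0, 4.0], [6.0, 8.0]])))  # each row now has unit L2 norm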

Committers since 0.8.0:
  • Frederic Bastien

  • Arnaud Bergeron

  • Pascal Lamblin

  • Steven Bocco

  • Ramana Subramanyam

  • Simon Lefrancois

  • Gijs van Tulder

  • Benjamin Scellier

  • khaotik

  • Chiheb Trabelsi

  • Chinnadhurai Sankar

  • Cesar Laurent

  • Reyhane Askari

  • Mohammad Pezeshki

  • Alexander Matyasko

  • Alexandre de Brebisson

  • Mathieu Germain

  • Nan Rosemary Ke

  • Pierre Luc Carrier

  • Olivier Mastropietro

  • Thomas George

  • Saizheng Zhang

  • Iulian Vlad Serban

  • Francesco Visin

  • Caglar

  • Faruk Ahmed

  • Harm de Vries

  • Samira Shabanian

  • Vincent Dumoulin

  • Nicolas Ballas

  • Jakub Sygnowski

  • Jan Schlüter

  • Samira Ebrahimi Kahou

  • Mikhail Korobov

  • Fei Wang

  • Kv Manohar

  • Jesse Livezey

  • Kelvin Xu

  • Matt Graham

  • Ruslana Makovetsky

  • Sina Honari

  • Bryn Keller

  • Ciyong Chen

  • Vitaliy Kurlin

  • Zhouhan LIN

  • Gokula Krishnan

  • Kumar Krishna Agrawal

  • Ozan Çağlayan

  • Vincent Michalski

  • affanv14

  • Amjad Almahairi

  • Ray Donnelly

  • Tim Cooijmans

  • happygds

  • mockingjamie

  • Christos Tsirigotis

  • Florian Bordes

  • Ilya Kulikov

  • RadhikaG

  • Taesup (TS) Kim

  • Ying Zhang

  • Anton Chechetka

  • Karthik Karanth

  • Kirill Bobyrev

  • Rebecca N. Palmer

  • Yang Zhang

  • Yaroslav Ganin

  • Jonas Degrave

  • Liwei Cai

  • Lucas Beyer

  • Michael Harradon

  • Morgan Stuart

  • Tim Gasper

  • Xavier Bouthillier

  • p

  • texot

  • Andrés Gottlieb

  • Ben Poole

  • Bhavishya Pohani

  • Carl Thomé

  • David Bau

  • Dimitar Dimitrov

  • Evelyn Mitchell

  • Fei Zhan

  • Fuchai

  • Fábio Perez

  • Gennadiy Tupitsin

  • Gilles Louppe

  • Greg Ciccarelli

  • He

  • Huan Zhang

  • Kaixhin

  • Kevin Keraudren

  • Maltimore

  • Marc-Alexandre Cote

  • Marco

  • Marius F. Killinger

  • Martin Drawitsch

  • Maxim Kochurov

  • Micah Bojrab

  • Neil

  • Nizar Assaf

  • Rithesh Kumar

  • Rizky Luthfianto

  • Robin Millette

  • Roman Ring

  • Sander Dieleman

  • Sebastin Santy

  • Shawn Tan

  • Wazeer Zulfikar

  • Wojciech Głogowski

  • Yann N. Dauphin

  • gw0 [http://gw.tnode.com/]

  • hexahedria

  • hsintone

  • jakirkham

  • joncrall

  • root

  • superantichrist

  • tillahoffmann

  • valtron

  • wazeerzulfikar

  • you-n-g

Download files

Download the file for your platform.

Source Distribution

Theano-0.9.0.tar.gz (3.1 MB)

