Optimizing compiler for evaluating mathematical expressions on CPUs and GPUs.

Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy_. Theano features:

* **tight integration with NumPy:** a similar interface to NumPy's. numpy.ndarrays are also used internally in Theano-compiled functions.
* **transparent use of a GPU:** perform data-intensive computations up to 140x faster than on a CPU (support for float32 only).
* **efficient symbolic differentiation:** Theano can compute derivatives for functions of one or many inputs.
* **speed and stability optimizations:** avoid nasty bugs when computing expressions such as log(1 + exp(x)) for large values of x.
* **dynamic C code generation:** evaluate expressions faster.
* **extensive unit-testing and self-verification:** includes tools for detecting and diagnosing bugs and/or potential problems.
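The stability item above can be illustrated without Theano. The following is a minimal sketch in plain Python, not Theano's own rewrite: the function names are made up for this example, and the stable form shown is one standard rearrangement of log(1 + exp(x)).

```python
import math


def softplus_naive(x):
    # Direct translation of log(1 + exp(x)); math.exp overflows for large x.
    return math.log(1.0 + math.exp(x))


def softplus_stable(x):
    # Algebraically identical form: max(x, 0) + log1p(exp(-|x|)).
    # exp() only ever receives a non-positive argument, so it cannot overflow.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Both forms agree where the naive one is well behaved.
for x in (-5.0, 0.0, 5.0):
    assert math.isclose(softplus_naive(x), softplus_stable(x))

# For a large input the naive form fails while the stable form does not.
try:
    softplus_naive(1000.0)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True

large_value = softplus_stable(1000.0)
```

For large x the result is essentially x itself, which is what the stable form returns.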

Theano has been powering large-scale computationally intensive scientific
research since 2007, but it is also approachable enough to be used in the
classroom (IFT6266 at the University of Montreal).

.. _NumPy:

.. _NEWS:

Release Notes

Theano 0.6rc2 (November 21st, 2012)

* Fix a few regressions introduced in 0.6rc1.
* A few new features.
* Speed ups.
* Scan fixes.
* Crash fixes.

Committers for this rc2 only:
Razvan Pascanu
Pascal Lamblin
Frederic Bastien
Ian Goodfellow
Jeremiah Lowin
Caglar Gulcehre
Jey Kottalam
Matthew Rocklin

Regression in 0.6rc1 fixed:
* Fix the scan gradient dtype issue. In 0.6rc1, some upcasts were inserted. (Razvan P.)
* grad() now behaves as it did before 0.6rc1 for floats, i.e. the grad dtype is the same as that of the inputs inside the graph. If you ask for the direct grad, it returns the computed dtype. (Pascal L.)

Wrong results fix:
* Scan did not return the correct results in some cases. (Razvan P., reported by Jeremiah L.)
This happened if you had a state with only negative taps and the output of the state was a function of some sequence.
If you had multiple states, there was no problem.
* Fixed bug in Scan with multiple outputs,
where one output would sometimes overwrite another one. (Razvan P.)
* Clip.grad treated the gradient with respect to the clipping boundary as always 0. (Ian G.)
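To see why the gradient with respect to a clipping boundary is not always 0, here is a minimal scalar sketch in plain Python. It is not Theano's implementation; the helper names are hypothetical, and the choice of derivative exactly at x == lo or x == hi is an arbitrary convention of this example.

```python
def clip(x, lo, hi):
    # Scalar clip, assuming lo <= hi.
    return min(max(x, lo), hi)


def clip_grads(x, lo, hi):
    """Return (d/dx, d/dlo, d/dhi) of clip(x, lo, hi), assuming lo <= hi."""
    if x < lo:
        # The output equals lo, so it moves with lo and ignores x.
        return (0.0, 1.0, 0.0)
    if x > hi:
        # The output equals hi, so it moves with hi and ignores x.
        return (0.0, 0.0, 1.0)
    # Inside the interval the output equals x.
    return (1.0, 0.0, 0.0)


def finite_diff(f, args, index, eps=1e-6):
    # Forward finite difference of f with respect to args[index].
    bumped = list(args)
    bumped[index] += eps
    return (f(*bumped) - f(*args)) / eps


point = (-2.0, -1.0, 1.0)  # x is below the lower bound
analytic = clip_grads(*point)
numeric = tuple(finite_diff(clip, point, i) for i in range(3))
```

At this point the output tracks the lower bound, so the derivative with respect to lo is 1 and the other two are 0.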

Interface change:
* Unaligned ndarrays are no longer supported in Python code. (Frederic B.)
We did not support them in C code, and supporting them in Python code made
the detection harder.
* We now officially support only scipy >= 0.7.2 and numpy >= 1.5.0. (Frederic B.)
We weren't and aren't testing with older versions.
* The theano.sparse.SparseType is available even when scipy is not (Frederic B.)
* Fixes issue where members of consider_constant grad parameter
were treated differently from Constant variables. (Ian G.)
* Remove the parameter g_cost to theano.grad(). (Ian G.)
Use the new, more powerful parameter known_grads instead.

NumPy interface support:
* theano.tensor.where is an alias for theano.tensor.switch to support NumPy semantics. (Ian G.)
* TensorVariable objects now have dot, argmin, argmax, clip, conj, repeat, trace, std, round,
ravel and argsort functions and the real and imag properties, like numpy.ndarray objects.
The functionality was already available in Theano. (abalkin)

Speed up:
* A C version of the SoftMax op (Razvan P.)
There was already C code for the softmax with bias.
* Faster GpuIncSubtensor (Ian G.)
* Faster copy on the GPU for 4d tensor. (Ian G.)
* The fix of flatten infer_shape re-enables an optimization. (Pascal L.)
* The bug was introduced in 0.6rc1.
* Enable inc_subtensor on the GPU when updating it with a float64 dtype. (Ian G.)
It was causing an optimization warning.
* Make DeepCopy reuse preallocated memory. (Frederic B.)
* Move the convolution to the GPU when the image shape and logical image shape differ. (Frederic Bastien)
* C code for the View Op (Razvan P., Pascal L.)

New Feature:
* Added a monitoring mode "MonitorMode" as a debugging tool. (Olivier D.)
* Allow integer axes when keepdims==True (Jeremiah Lowin)
* Add erfinv and erfcinv op. (Jey Kottalam)
* Added tensor.batched_dot(). (Caglar Gulcehre)
It uses scan behind the scenes, but makes this easier to do.
* theano.get_constant_value(x) (Frederic B.)
This tries to return x as a constant int.
It does some constant folding to try to convert x into an int.
Used by some optimizations.
* Add {MPIRecv,MPIRecvWait,MPISend,MPISendWait} (Matthew Rocklin)
Theano does not automatically use them. It is up to you to use them and split your computation.
* Added theano.sandbox.linalg.eig (abalkin)
* Started some support for Python3 (abalkin)
Setup supports python3 now.
It calls 2to3 during the setup.
Python3 is not fully supported as we didn't update the C code.
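As an illustration of what the tensor.batched_dot() item above computes, here is a minimal plain-Python sketch of a batched matrix product over nested lists. It only shows the semantics (one matrix product per batch index); it is not how Theano implements it, and the helper names are chosen for this example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a_ik * b_kj for a_ik, b_kj in zip(row, col))
             for col in zip(*b)]
            for row in a]


def batched_dot(xs, ys):
    """For each batch index i, compute the matrix product of xs[i] and ys[i]."""
    if len(xs) != len(ys):
        raise ValueError("both inputs must have the same batch size")
    return [matmul(x, y) for x, y in zip(xs, ys)]


# Two batches of 2x2 matrices.
xs = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]
ys = [[[5, 6], [7, 8]], [[9, 10], [11, 12]]]
result = batched_dot(xs, ys)
```

Each entry of the result is the product of the matching pair of matrices, so the second entry here is unchanged because it is multiplied by the identity.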

Crash Fix:
* Fix a crash related to scan.grad due to the new mechanism. (Ian G.)
* Fix an optimization warning. Now it gets optimized. (Frederic B.)
* Fix crash introduced in 0.6rc1 in theano.grad (Ian G.)
* Fix crash introduced in 0.6rc1 in the grad of scan (Razvan P.)
* Fix crash introduced in 0.6rc1 in the grad of clip (Ian G.)
Also implement the gradient on the min/max bound.
* Fix crash in the grad of tensor.switch for int (Ian G.)
* Fix crash when mixing shared variable on the GPU and sparse dot. (Pascal L.)
* Fix a crash where a dtype number was sometimes returned
that is equivalent to, but not the same as, the one expected. (Pascal L., reported by Rami Al-Rfou)
* Better error msg (Ian G.)
* Move all sparse random functions back to sandbox as they don't have a state inside Theano. (Pascal L.)
They were moved outside the sandbox in 0.6rc1.
* LoadFromDisk now only supports some memmap modes. (Pascal L.)
Otherwise, this was causing errors, segmentation faults or wrong results.
* Fix import problem on PiCloud (Jeremiah Lowin)
* You need to use the c|py linker with the default
environment. Otherwise, you need to create your own environment.
* Fix a crash during optimization when we take a subtensor of a constant with a non constant index. (Ian G.)
* Better handling and error message of gradients on integer. (Ian G.)
* Fixes a crash where Scan assumed all TypeErrors raised by the grad function were due to undefined gradients (Ian G.)

* Doc typo fixes, Doc updates, Better error messages: Olivier D., David W.F., Frederic B., James B., Matthew Rocklin, Ian G., abalkin.

Theano 0.6rc1 (October 1st, 2012)

* Bug fixes, crash fixes, CPU and GPU speed up.
* theano_var.eval({other_var: val[,...]}) to simplify the usage of Theano (Ian G.)
* New default linker `cvm`. This is the execution engine that tells what op to run in which order.
It is now implemented in C and enables lazy evaluation of ifelse op.
* Faster theano.function compilation. (Pascal L., Ian G.)
* Big sparse submodule update and documentation of it. (Nicolas Bouchard)
* Use GPU asynchronous functionality (Frederic B.)
* Better Windows support.

Known bugs:
* A few crash cases that will be fixed by the final release.

Bug fixes:
* Outputs of Scan nodes could contain corrupted values: some parts of the
output would be repeated a second time, instead of the correct values.
It happened randomly, and quite infrequently, but the bug has been present
(both in Python and Cython) since April 2011. (Pascal L.)
* In Sparse sandbox, fix the grad of theano.sparse.sandbox.sp.row_scale.
It did not return the right number of elements. (Frederic B.)
* set_subtensor(x[int vector], new_value) when moved to the GPU
was transformed into inc_subtensor on the GPU. Now we have a correct
(but slow) GPU implementation.
Note 1: set_subtensor(x[slice[,...]], new_value) was working correctly
in all cases as well as all inc_subtensor.
Note 2: If your code was affected by the incorrect behavior, we now print
a warning by default (Frederic B.)
* Fixed an issue whereby config values were used as default arguments,
with those defaults then stuck at old values if the config variables were
changed during program execution. (David W-F)
* Fixed many subtle bugs involving mutable default arguments which may have
led to unexpected behaviour, such as objects sharing instance variables
they were not supposed to share. (David W-F)
* Correctly record the GPU device number used when we let the driver select it.
(Frederic B.)
* Min, max with NaN in inputs did not return the right output. (Pascal L.)
* The grad of TensorDot, was returning the wrong shape for some combination of axes.
We now raise NotImplementedError in those cases. (Frederic B.)
* conv2d with subsample >2 returned wrong values. (Pascal L.)
* Fixed when mode==valid, disabled when mode==full
* theano.sparse.CSMGrad op (generated by the grad of CSM) didn't
handle unsorted input correctly and gradient that is sparser
than the input. In that case, a bad result was returned. But this could
happen only when a sparse input of a Theano function was not
sorted. This happens for example with sparse advanced indexing from
scipy. The result is most of the time NaN in the graph.
(Yann Dauphin)
* theano.sparse._dot(CSC matrix, dense) optimized version UsmmCSCDense didn't correctly
handle non-contiguous inputs/outputs. (Pascal L.)
* Fix a corner case in CVM updates. (Pascal L.)
This happened if the update to a shared variable is itself after optimization.
The CVM was not used by default.
* Fix the view_map of sparse.Transpose and sparse.sandbox.sp.RowScale. (Frederic B.)
This probably didn't cause problems as there is only the UsmmCscDense op
(used by calls to Usmm with a CSC matrix) that could interfere with them.
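The two default-argument items in the list above (config values frozen as defaults, and mutable default arguments) come from the same Python rule: default values are evaluated once, when the function is defined. A minimal sketch in plain Python follows. The `Config` class, its `floatX` attribute and all function names here are hypothetical stand-ins for this illustration, not Theano code.

```python
class Config(object):
    # Hypothetical stand-in for a configuration object.
    floatX = "float64"


config = Config()


def dtype_frozen(dtype=config.floatX):
    # The default was read once, at definition time, and never updates.
    return dtype


def dtype_live(dtype=None):
    # Reading the config inside the body picks up later changes.
    if dtype is None:
        dtype = config.floatX
    return dtype


config.floatX = "float32"  # changed during program execution
frozen = dtype_frozen()
live = dtype_live()


def append_shared(item, bucket=[]):
    # The same list object is reused by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A new list is created for each call that omits `bucket`.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_a = append_shared(1)
shared_b = append_shared(2)
fresh_a = append_fresh(1)
fresh_b = append_fresh(2)
```

Using `None` as the default and resolving the real value inside the body avoids both problems.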

* Deprecated the Module class (Ian G.)
This was a predecessor of SharedVariable with a less pythonic philosophy.

Interface changes:
* Now the base version requirements are numpy >= 1.5.0 and the optional scipy >= 0.7.2.
* In Theano 0.5, we removed the deprecated sharedvar.value property.
Now we raise an error if you access it. (Frederic B.)
* theano.function does not accept duplicate inputs, so function([x, x], ...)
does not work anymore. (Pascal L.)
* theano.function now raises an error if some of the provided inputs are
not part of the computational graph needed to compute the output, for
instance, function([x, y], [y]). You can use the kwarg
``on_unused_input={'raise', 'warn', 'ignore'}`` to control this.
(Pascal L.)
* New Theano flag "on_unused_input" that defines the default value of the
previous point. (Frederic B.)
* tensor.alloc() now raises an error during graph build time
when we try to create fewer dimensions than the number of dimensions
the provided value has. In the past, the error was at run time.
(Frederic B.)
* Remove theano.Value and related stuff (Ian G.)
This was a test of what ended up as SharedVariable.
* Renamed Env to FunctionGraph, and object attribute "env" to "fgraph" (Ian G.)
Deprecation warning printed when you try to access the "env" attribute.
* Renamed the FunctionGraph.nodes attribute to FunctionGraph.apply_nodes (Ian G.)
* Warn when we don't handle correctly the parameter in Theano flags `nvcc.flags`
(Frederic B.)
* Do not reorder the user flags passed to the compiler. They get set after other flags. (Frederic B.)
* Make setuptools optional (Ilan Schnell)
* We warn when a user tries to use an old GPU with which Theano is untested.
This could cause crashes and will also be very slow. (Frederic B.)
* Make theano.grad able to differentiate between not implemented, undefined and disconnected grad.
Op.grad function should return theano.gradient.{grad_not_implemented,grad_undefined} or
something of DisconnectedType (Ian G.)
* Make theano.grad expect to always receive a float or undefined
gradient and enforce that op with integer output values always
return 0. (Ian G.)

New memory output contract (was mentioned in the release notes of Theano 0.5):
* Now the output memory received can be preallocated by other stuff.
In the past it was always the previous output an Apply node allocated.
So this means that the shape and strides can be different from previous calls
and there can be links to this memory at other places.
This means it could receive preallocated output that is not c_contiguous.
But we don't do that now. (Pascal L.)
* New Theano flags to test this DebugMode.check_preallocated_output (Pascal L.)
* Updated a few ops to respect this contract (Pascal L.)

New Features:
* GPU scan now works (does not crash) when there is a mixture of float32 and other dtypes.
* theano_var.eval({other_var: val[,...]}) to simplify the usage of Theano (Ian G.)
* debugprint new param ids=["CHAR", "id", "int", ""]
This makes the identifier printed to be a unique char, the Python id, a
unique int, or not have it printed. We changed the default to be "CHAR"
as this is more readable. (Frederic B.)
* debugprint new param stop_on_name=[False, True]. If True, we don't print
anything below an intermediate variable that has a name. Defaults to False.
(Frederic B.)
* debugprint does not print anymore the "|" symbol in a column after the last input. (Frederic B.)
* If you use Enthought Python Distribution (EPD) now we use its blas
implementation by default. (Frederic B., Graham Taylor, Simon McGregor)
* MRG random now raises an error with a clear message when the passed shape
contains dimensions with bad value like 0. (Frederic B. reported by Ian G.)
* "CudaNdarray[*] = ndarray" works in more cases (Frederic B.)
* "CudaNdarray[*] += ndarray" works in more cases (Frederic B.)
* We add dimensions to CudaNdarray to automatically broadcast more frequently.
(Frederic B.)
* New theano flag cmodule.warn_no_version. Default False. If True,
will print a warning when compiling one or more Op with C code that
can't be cached because there is no c_code_cache_version() function
associated to at least one of those Ops. (Frederic B.)
* CPU alloc now always generates C code (Pascal L.)
* New Theano flag cmodule.warn_no_version=False. When True, warn when an op
with C code is not versioned (which forces it to be recompiled every time).
(Frederic B.)
* C code reuses preallocated outputs (only done by Scan) (Pascal L.)
* Garbage collection of intermediate results during Theano function calls
for Ops with C code (Pascal L.)
* Theano flag compiledir_format now supports the parameter "numpy_version" and "g++". (Frederic B.)
* Theano GPU variables, shared variables and constants now support <, <=,
> and >= similar to those not on the GPU.
* AdvancedIncSubtensor now supports the set_instead_of_inc parameter. (Eric L.)
* Added Advanced Indexing support to inc_subtensor and set_subtensor. (Eric L.)
* theano.tensor.{any,all,std,var,mean,prod,sum,argmin,argmax,min,max,max_and_argmax}
have a new parameter keepdims (Eric L.)
This allows the result to be broadcast correctly against the input data, e.g. to normalize it.
* The Updates objects now check that the keys are SharedVariable when we pass them
in the __init__ function. (Pascal L.)
* Set a Theano Variable name on transposed op when the input has one (Frederic B).
* The cvm linker now supports garbage collection (enabled by default). (James B., Arnaud B., Pascal L.)
* The cvm linker is now the default linker.
This makes the "loop" around the execution of apply node in C. So this lowers the overhead.
* theano_variable[numpy.newaxis] is now supported (James B.)
* Enable ifelse on the GPU. (Frederic B.)
* Correctly support numpy.memmap everywhere (Pascal L.)
We had partial support for them before. Just use the normal tensor operation
on them and it should work.
But be careful not to exhaust your computer memory! (we always generate normal ndarray)
* Add an optimization that stabilizes log(softmax(x)). (Ian G.)
* Re-enable the Images2Neibs grad. It was not broken, the problem was how we tested it. (Frederic B.)
* If `theano_fn.trust_input` is set to True, do not check if the inputs are good
when calling the theano function. (Frederic B.)
* Add theano.tensor.blas.gem{m,v} as shortcut.
* theano.grad(..., add_names=True). False for the old
behavior. Otherwise it tries to name the grad variables. (Ian G.)
* theano-nose (Pascal L.)
A wrapper around nosetests that adds needed extensions.
* --profile-time option, to print time spent in each test (Eric L.)
* --batch option, to run tests in batches to lower memory requirements.
* m = mean(log(1 - sigm(x)))
x - scalar * theano.grad(m, x)
There is a stabilization optimization for this.
Now it is applied more frequently. (Pascal L.)
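To show why an expression like log(1 - sigm(x)) needs stabilizing, here is a minimal plain-Python sketch. It relies on the identity 1 - sigm(x) = 1 / (1 + exp(x)), so log(1 - sigm(x)) = -log(1 + exp(x)). The function names are made up for this example, and this is only one way to stabilize the expression, not necessarily the rewrite Theano applies.

```python
import math


def sigm(x):
    return 1.0 / (1.0 + math.exp(-x))


def log1msigm_naive(x):
    # For large x, sigm(x) rounds to exactly 1.0 and log(0.0) fails.
    return math.log(1.0 - sigm(x))


def log1msigm_stable(x):
    # -log(1 + exp(x)), written so that exp() never sees a positive argument.
    return -(max(x, 0.0) + math.log1p(math.exp(-abs(x))))


# Both forms agree for moderate inputs.
for x in (-3.0, 0.0, 3.0):
    assert math.isclose(log1msigm_naive(x), log1msigm_stable(x))

# For a large input the naive form breaks down.
try:
    log1msigm_naive(40.0)
    naive_failed = False
except ValueError:
    naive_failed = True

stable_value = log1msigm_stable(40.0)
```

The stable form returns a value very close to -x for large x, where the naive form cannot produce a result at all.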

New Op/functions:
* Added element-wise operation theano.tensor.{GammaLn,Psi} (John Salvatier, Nicolas Bouchard)
* Added element-wise operation theano.tensor.{arcsin,arctan,arccosh,arcsinh,arctanh,exp2,arctan2} (Nicolas Bouchard)
* Added element-wise operation theano.tensor.{gamma,conj,complex_from_polar,expm1,deg2rad,rad2deg,trunc} (Nicolas Bouchard)
* Added theano.tensor.argsort that wraps numpy.argsort (Hani Almousli).
* Added theano.tensor.diff that wraps numpy.diff (Nicolas B.)
* Added theano.tensor.bincount that wraps numpy.bincount (Nicolas B., Pascal L, Frederic B.)
* Added theano.tensor.squeeze (Nicolas B.)
This removes broadcasted dimensions from the variable.
Theano-esque version of numpy.squeeze.
* Added theano.tensor.repeat that wraps numpy.repeat (Nicolas B. + PL)
* Added theano.tensor.bartlett that wraps numpy.bartlett (Eric L.)
* Added theano.tensor.fill_diagonal that wraps numpy.fill_diagonal (Eric L., Frederic B.)
* Added tensor.square that is an alias for tensor.sqr as NumPy (Ian G.)
* Added theano.tensor.load(path, dtype, broadcastable, mmap_mode=None) op
that allows loading a .npy file in a theano graph (Matthew Rocklin)
* Kronecker product op. (Eric L.)
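For readers unfamiliar with the Kronecker product mentioned in the last item, here is a minimal plain-Python sketch over nested lists. It illustrates the mathematical operation only; the function name and list-of-rows representation are choices made for this example, not the interface of the Theano op.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows.

    Every entry a[i][j] is replaced by the block a[i][j] * b.
    """
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a
            for b_row in b]


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
result = kron(a, b)
```

The product of an m x n matrix with a p x q matrix has shape (m * p) x (n * q), so the result here is 4 x 4.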

Speed up:
* CPU convolutions are now parallelized (Frederic B.)
By default use all cores/hyper-threads.
To control it, use the `OMP_NUM_THREADS=N` environment variable where N is the number of
parallel threads to use. By default it is equal to the number of CPU cores/hyper
threads that you have.
There is a new Theano flag `openmp` to allow/disallow openmp op.
If your BLAS library is parallelized, this flag won't affect it, but the
env variable will.
* Remove a corner case causing duplicated dot22/gemm in the graph. (Frederic B., Ian G.)
* Enable fusion of elemwise that have the same clients multiple times. (Frederic B.)
* New optimization: Remove reduction over broadcastable dimensions (James B., Frederic B.)
* Faster theano.function compilation. (Pascal L., Ian G.)
* Remove GPU transfer around specify_shape op. (Frederic B.)
* Implemented/tested MANY op.infer_shape method (Eric Larsen)
This allows Theano to make better shape inference.
* Implement Solve.infer_shape (Matthew Rocklin)
* Scan memory optimizations now work more frequently. (Razvan P.)
There was a warning printed by the subtensor optimization in those cases.
* Faster rng_mrg Python code. (mostly used for tests) (Frederic B.)

Speed up GPU:
* Convolution on the GPU now checks the generation of the card to make
it faster in some cases (especially medium/big output image) (Frederic B.)
* We had hardcoded 512 as the maximum number of threads per block. Newer cards
support up to 1024 threads per block.
* Faster GpuAdvancedSubtensor1, GpuSubtensor, GpuAlloc (Frederic B.)
* We now pass the GPU architecture to nvcc when compiling (Frederic B.)
* Now we use the GPU function async feature by default. (Frederic B.)
Set the environment variable `CUDA_LAUNCH_BLOCKING` to `1` to disable this
for profiling or debugging.
* Faster creation of CudaNdarray objects (Frederic B.)
* Now some Max reductions are implemented on the GPU. (Ian G.)
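The two environment variables named in the speed-up lists above, `OMP_NUM_THREADS` and `CUDA_LAUNCH_BLOCKING`, have to be set before the process that uses them starts. A minimal sketch of building such an environment from Python follows. The helper name and its parameters are hypothetical, and the script name in the trailing comment is a placeholder.

```python
import os


def build_env(num_threads=None, blocking_gpu=False, base=None):
    """Return a copy of the environment with the requested variables set."""
    env = dict(os.environ if base is None else base)
    if num_threads is not None:
        # Number of parallel threads for OpenMP code.
        env["OMP_NUM_THREADS"] = str(int(num_threads))
    if blocking_gpu:
        # "1" disables asynchronous GPU calls, for profiling or debugging.
        env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


# Start from an empty base here so the result is easy to inspect.
env = build_env(num_threads=2, blocking_gpu=True, base={})

# Possible use, with a placeholder script name:
#   subprocess.call([sys.executable, "my_script.py"], env=build_env(num_threads=2))
```

Passing the dictionary to a child process keeps the parent's own environment untouched.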

Sparse Sandbox graduate (moved from theano.sparse.sandbox.sp):
* sparse.remove0 (Frederic B., Nicolas B.)
* sparse.sp_sum(a, axis=None) (Nicolas B.)
* bugfix: the not structured grad was returning a structured grad.
* sparse.{col_scale,row_scale,ensure_sorted_indices,clean} (Nicolas B.)
* sparse.{diag,square_diagonal} (Nicolas B.)

* Support for uint* dtype.
* Implement theano.sparse.mul(sparse1, sparse2) when both inputs don't
have the same sparsity pattern. (Frederic B.)
* New Ops: sparse.{expm1,deg2rad,rad2deg,trunc} (Nicolas B.)
* New Ops: sparse.{sqrt,sqr,log1p,floor,ceil,sgn,round_half_to_even} (Nicolas B.)
* New Ops: sparse.{arctanh,tanh,arcsinh,sinh,arctan,arcsin,tan,sin} (Nicolas B.)
* New functions: structured_{add,exp,log,pow,minimum,maximum,sigmoid} (Yann D., Nicolas B.)
* Optimized op: StructuredAddSV, StructuredAddSVCSR (inserted automatically)
* New Op: sparse.mul_s_v multiplication of sparse matrix by broadcasted vector (Yann D.)
* New Op: sparse.Cast() (Yann D., Nicolas B.)
* Add sparse_variable.astype() and theano.sparse.cast() and
theano.sparse.{b,w,i,l,f,d,c,z}cast() as their tensor equivalent (Nicolas B.)
* Op class: SamplingDot (Yann D., Nicolas B.)
* Optimized version: SamplingDotCsr, StructuredDotCSC
* Optimizations to insert the optimized version: local_sampling_dot_csr, local_structured_add_s_v
* New Ops: sparse.{Multinomial,Poisson,Binomial} (Yann D., NB)
* Implement the CSMProperties grad method (Yann Dauphin)
* Move optimizations to theano/sparse/ (Nicolas B.)

New flags:
* `profile=True` flag now prints the sum of all printed profiles. (Frederic B.)
* It works with the linkers vm/cvm (default).
* Also print compile time, optimizer time and linker time.
* Also print a summary by op class.
* new flag "profile_optimizer" (Frederic B.)
when profile=True, will also print the time spent in each optimizer.
Useful to find optimization bottleneck.
* new flag "cmodule.remove_gxx_opt" (Frederic B.)
If True, will remove -O* parameter passed to g++.
This is useful to debug, in gdb, modules compiled by Theano.
The parameter -g is passed by default to g++.
* new flag cmodule.compilation_warning
if True, will print compilation warning.
* new flag `allow_gc` (Frederic B.)
When False, do not garbage collect intermediate results when they are not needed.
This uses more memory, but allocates memory less frequently, so it is faster.
* new flag `vm.lazy` (Frederic B.)
Useful only for the vm linkers. When lazy is None,
auto detect if lazy evaluation is needed and use the appropriate
version. If lazy is True/False, force the version used between
Loop/LoopGC and Stack.
* new flag `cxx`. This is the C++ compiler to use. If empty do not compile C code. (Frederic B.)
* New flag `print_active_device` that defaults to True. (Matthew R.)
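Theano flags such as the ones listed above are commonly passed through the `THEANO_FLAGS` environment variable as a comma-separated list of `name=value` pairs. That variable is not described in these notes, so treat the following as a sketch under that assumption. The helper name is hypothetical, and the boolean spelling (`True`/`False`) follows the wording used in the list above.

```python
def format_theano_flags(flags):
    """Join a dict of flag names and values into a name=value,... string."""
    # Sorting keeps the output deterministic regardless of dict ordering.
    return ",".join("%s=%s" % (name, value)
                    for name, value in sorted(flags.items()))


flags = {
    "profile": True,
    "profile_optimizer": True,
    "allow_gc": False,
    "openmp": True,
}
flags_value = format_theano_flags(flags)

# Possible use: put the string in the environment before importing Theano,
# for example os.environ["THEANO_FLAGS"] = flags_value
```

The resulting string can also be placed on the command line in front of the command that starts Python.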

* Added in the tutorial documentation on how to extend Theano.
This explains how to make a Theano Op from a Python function.
(Frederic B.)
* New installation instructions for Windows using EPD (Pascal L.)
* New installation on Windows by using a Linux VM from ContinuumIO (Frederic B.)
* Revisions of Theano tutorial and addition of exercises to it. (Eric L.)
* New tutorial on Sparse variable. (Nicolas B., Sebastien Lemieux, Frederic Bastien)
* Installation documentation for CentOS6 (Frederic B.)
* Installation documentation for Ubuntu (with GPU) (Frederic B., Matthias Zoehrer)
* Doc typo fixes, Doc updates, Better error messages: Olivier D., David W.F., Frederic B., James B., Matthew Rocklin, Ian G.
* Python Memory Management tutorial (Steven Pigeon, Olivier D.)

* Math framework for complex gradients (Pascal L.)

Internal changes:
* Define new exceptions MissingInputError and UnusedInputError, and use them
in theano.function, instead of TypeError and ValueError. (Pascal L.)
* Better handling of bitwidth and max values of integers and pointers
across platforms (Pascal L.)
* Made a few Ops with C code versioned to reduce compilation time.
(Frederic B, Pascal L.)
* Better deletion of files in the compiledir (Frederic B.)
* Safer import on sort op (Nicolas Pinto)
* hash_from_dict for elemwise op (Frederic B.)
* Renamed BadCLinkerOutput into BadThunkOutput. (PL)
* tensor.utils.shape_of_variables (Matthew R.)
* Add the numpy abi version and g++/nvcc version in the key of compiled code. (Frederic B.)
* env.replace_all_validate_remove (Frederic B.)
This allows global optimizer to ensure it removed some nodes from the graph.
This is a generic way to catch errors that would otherwise duplicate
* It was used for GEMM and Scan optimization (Frederic B., Razvan P.)
* Fix how exceptions are raised in GPU code (James B.)
* Made code respect pep8: OD, Fred, Pascal L., Nicolas Bouchard, Eric Larsen and others.
* TensorType and CudaNdarrayType now have a value_zeros method that calls CudaNdarray.zeros or
numpy.zeros with the right dtype. (Pascal L., Olivier D.)
This allows the same code to work with both types.
* Renamed FunctionGraph.extend function to FunctionGraph.attach_feature. (Ian G.)
* New exception MissingGXX when we try to compile but there is no cxx compiler. (Frederic B.)
* New fct theano.gof.utils.give_variables_names(...) that gives unique names to variables. (Matthew R.)
* Use the new NumPy C-API most of the time, for later NumPy releases. (Frederic B.)
* New theano.gof.sched.sort_apply_nodes() that will allow other execution ordering. (Matthew R.)
* New attribute sort_schedule_fn, a way to specify a scheduler to use. (Matthew R.)

Crash Fix:
* Fix import conflict name (usaar33, Frederic B.)
* This makes Theano work with PiCloud.
* Do not try to use the BLAS library when blas.ldflags is manually set to an
empty string (Frederic B., Pascal L.)
* When importing theano on a computer without GPU with the Theano
flags 'device' or 'init_gpu_device' set to gpu* (Frederic B., reported by Luo Heng)
* Optimization printed a useless error when scipy was not available. (Frederic B.)
* GPU conv crash/slowdown on newer hardware (James B.)
* Better error handling in GPU conv (Frederic B.)
* GPU optimization that moves element-wise Ops to the GPU. Crash happened in
a particular execution order of this optimization and the
element-wise fusion optimization when upcasting some inputs to
float32 (to compute them on the GPU).
(Frederic B., reported by Sander Dieleman)
* GpuReshape in some particular case when the input is not contiguous
(Frederic B., reported by Sander Dieleman)
* GpuSoftmaxWithBias with shape (0, N) with N > 1.
(Frederic B., reported by Razvan P.)
* Fix crash under 64-bit Windows, when taking subtensors of the form a[n:]
(Pascal L., reported by Simon McGregor)
* Fixed issue with the MaxAndArgmax Op not properly preserving broadcastable
dimensions, which could typically result in optimization crashes (Olivier D.)
* Fixed crash when concatenating some arrays with specific broadcasting
patterns (Olivier D.)
* Work around a known issue with nvcc 4.1 on MacOS X. (Graham Taylor)
* In advanced indexing, if some inputs are constant, no need to call constant(...)
on their value any more. (Pascal L., reported by John Salvatier)
* Fix crash on GPU when the GpuSubtensor didn't put the right stride
when the result tensor had a dimension with size of 1. (Pascal L.,
reported by Graham T.)
* Fix scan crash that made it not run on the GPU in one case. (Guillaume D.)
* If you grad again a random state, don't crash (Razvan P.)
* GpuDownsampleFactorMax and its grad with inputs dimensions 0 and 1 bigger than 65535.
(Frederic B. reported by Gabe Schwartz)
* Potential crash due to parallel compilation when importing theano.sandbox.cuda
(Olivier D.)
* Crash fix on python 2.4 with slicing. (Pascal L.)
* grad of argmin and argmax (Razvan P.)
* Don't compute the Rop for shared variables with updates (mostly random).
We don't use them and they caused crashes. (Razvan P.)
* MaxAndArgmax.grad() when one of the gradients it receives is None. (Razvan P, reported by Mark Fenner)
* Fix crash of GpuSum when the shape of some dimension was 0. (Frederic B.)

* Use less memory (Olivier D.) (fix crash on 32-bit computers)
* Fix test with Theano flag "blas.ldflags=". (Frederic B., Pascal L.)
* Fix crash with advanced subtensor and numpy constant.
* Fix random tests crash due to random value. (Pascal L.)
* Always introduce Alloc node when calling alloc and let the optimizer remove them if needed.
This allows DebugMode to catch some shape error. (Pascal L.)
* DebugMode now checks the view_map for all types of Theano variables.
It was doing only variables of tensor type. (Frederic B.)

* Remove python warning for some python version. (Gabe Schwartz)
* Remove useless fill op in fast_compile mode to make the graph more readable. (Frederic B.)
* Remove GpuOuter as it is a subset of the new GpuGer (Frederic B.)
* All CPU tests (without SciPy) are now run
with the default mode on all Pull Requests.
This should make the trunk more stable. (Frederic B.)
* Our nightly buildbot now checks on python 2.4 (Frederic B.)
This should make the trunk work on it more frequently.

Other thanks:
* blaxill reported an error introduced into the trunk.

New stuff that will probably be reworked/removed before the release:
* Better PyCUDA sharing of the GPU context. (fix crash at exit) (Frederic B.)
TODO: there is still a crash at exit!
