Troubleshooting
- Why do I get a network error when I install Theano
- How to solve TypeError: object of type ‘TensorVariable’ has no len()
- How to test that Theano works properly
Why do I get a network error when I install Theano
If you are behind a proxy, you must do some extra configuration steps before starting the installation. You must set the http_proxy environment variable to the proxy address. Using bash this is accomplished with the command export http_proxy="http://user:pass@my.site:port/"
You can also provide the --proxy=[user:pass@]url:port parameter to pip. The [user:pass@] portion is optional.
How to solve TypeError: object of type ‘TensorVariable’ has no len()
If you receive this error, it is because the Python function len cannot be implemented on Theano variables.
Python requires that len returns an integer, yet it cannot be done as Theano's variables are symbolic. However, var.shape[0] can be used as a workaround.
This error message cannot be made more explicit because the relevant aspects of Python's internals cannot be modified.
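A minimal sketch of the failing pattern and the shape-based workaround (the variable names are just for illustration):

    import theano.tensor as T

    x = T.vector('x')

    # len(x) raises a TypeError: the length of a symbolic variable is not
    # known until the compiled function is actually run.
    # n = len(x)

    # Workaround: use the symbolic shape instead.
    n = x.shape[0]   # a symbolic integer scalar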
Occasionally Theano may fail to allocate X bytes of memory even though the driver reports Y bytes free out of Z bytes in total, where X is far less than Y and Z (i.e. X << Y < Z).
This scenario arises when an operation requires allocation of a large contiguous block of memory but no blocks of sufficient size are available.
GPUs do not have virtual memory and as such all allocations must be assigned to a contiguous memory region. CPUs do not have this limitation because of their support for virtual memory. Multiple allocations on a GPU can result in memory fragmentation, which can make it more difficult to find contiguous regions of memory of sufficient size during subsequent memory allocations.
A known example is related to writing data to shared variables. When updating a shared variable, Theano will allocate new space if the size of the data does not match the size of the space already assigned to the variable. This can lead to memory fragmentation, which means that a contiguous block of memory of sufficient capacity may not be available even if the free memory overall is large enough.
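As an illustrative sketch of the shared-variable case (array sizes are arbitrary), updating with data of the same shape and dtype reuses the existing storage, while a different size forces a fresh allocation:

    import numpy as np
    import theano

    w = theano.shared(np.zeros((1024, 1024), dtype='float32'))

    # Same size and dtype: the space already assigned to w is reused.
    w.set_value(np.ones((1024, 1024), dtype='float32'))

    # Different size: Theano allocates a new block, which can fragment
    # GPU memory over time.
    w.set_value(np.ones((2048, 2048), dtype='float32'))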
theano.function returns a float64 when the inputs are float32 and int{32, 64}
It should be noted that using float32 and int{32, 64} together inside a function would provide float64 as output.
Since the GPU can't compute this kind of output, it would be preferable not to use those dtypes together.
To help you find where float64 variables are created, see the warn_float64 Theano flag.
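A small sketch of the upcast and one way to avoid it (variable names are illustrative):

    import theano.tensor as T

    x = T.fvector('x')   # float32 vector
    i = T.iscalar('i')   # int32 scalar

    y = x * i            # float32 combined with int32 is upcast to float64
    print(y.dtype)       # 'float64'

    # Casting the integer to float32 keeps the result in float32:
    z = x * T.cast(i, 'float32')
    print(z.dtype)       # 'float32'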
An easy way to check that something could be wrong is by making sure THEANO_FLAGS has the desired values, as well as the ~/.theanorc file (and that the two do not conflict).
Also, check the following outputs from an interactive session (e.g. ipython):
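A minimal sanity check could look like this (printing config.device is an extra suggestion, useful to confirm which device your flags actually selected):

    import theano

    print(theano.__file__)       # which Theano installation is actually imported
    print(theano.__version__)    # the version of that installation
    print(theano.config.device)  # device picked up from THEANO_FLAGS / ~/.theanorc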
Once you have installed Theano, you should run the test suite.
- python -c "import numpy; numpy.test()"
- python -c "import scipy; scipy.test()"
- THEANO_FLAGS=''; python -c "import theano; theano.test()"
All Theano tests should pass (skipped tests and known failures are normal). If some test fails on your machine, you are encouraged to tell us what went wrong on the theano-users@googlegroups.com mailing list.
Theano's tests should NOT be run with device=cuda or they will fail. The tests automatically use the GPU, if any, when needed. If you don't want Theano to ever use the GPU when running tests, you can set config.device to cpu and config.force_device to True.
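One way to enforce this is to set the corresponding flags through the environment before Theano is first imported; a minimal sketch:

    import os

    # device and force_device are the Theano flags mentioned above.
    # THEANO_FLAGS must be set before the first "import theano".
    os.environ["THEANO_FLAGS"] = "device=cpu,force_device=True"

    import theano
    print(theano.config.device, theano.config.force_device)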
Why is my code so slow/uses so much memory
There are a few things you can easily do to change the trade-off between speed and memory usage. If nothing is said, this affects both CPU and GPU memory usage.
Could speed up and lower memory usage:
- cuDNN: the default cuDNN convolution uses less memory than the Theano version, but some flags allow it to use more memory. GPU only.
Could raise memory usage but speed up computation:
- config.gpuarray.preallocate = 1 # Preallocates the GPU memory and then manages it in a smart way. Does not raise the memory usage much, but if you are at the limit of available GPU memory you might need to specify a lower value. GPU only.
- config.allow_gc = False
- config.optimizer_excluding = low_memory, GPU only for now.
Could lower the memory usage, but raise computation time:
- config.scan.allow_gc = True # Probably not a significant slowdown on the GPU if the memory cache is not disabled.
- Use batch_normalization(). It uses less memory than building a corresponding Theano graph.
- Disable one or more scan optimizations:
  - optimizer_excluding=scan_pushout_dot1
  - optimizer_excluding=scanOp_pushout_output
- Disable all optimizations tagged as raising memory usage: optimizer_excluding=more_mem (currently only the 3 scan optimizations above).
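As a minimal sketch of setting such flags (the values are illustrative and must be adapted to your situation; THEANO_FLAGS has to be set before the first import of theano):

    import os

    # Favour lower memory usage at the cost of some speed.
    os.environ["THEANO_FLAGS"] = "scan.allow_gc=True,optimizer_excluding=more_mem"

    import theano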
If you want to analyze the memory usage during computation, the simplest way is to let the memory error happen during Theano execution and use the Theano flag exception_verbosity=high.
There are many ways to configure BLAS for Theano. This is done with the Theano flag blas.ldflags (config – Theano Configuration). The default is to use the BLAS installation information in NumPy, accessible via numpy.distutils.__config__.show(). You can tell Theano to use a different version of BLAS, in case you did not compile NumPy with a fast BLAS or if NumPy was compiled with a static library of BLAS (the latter is not supported in Theano).
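For reference, a quick way to inspect the BLAS information NumPy reports (this uses NumPy's top-level __config__ module, which carries the same build information):

    import numpy as np

    # Print the BLAS/LAPACK build information NumPy was compiled with;
    # Theano's default blas.ldflags is derived from this information.
    np.__config__.show()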
The short way to configure the Theano flag blas.ldflags is by setting the THEANO_FLAGS environment variable to blas.ldflags=XXX (in bash: export THEANO_FLAGS=blas.ldflags=XXX).
The ${HOME}/.theanorc file is the simplest way to set a relatively permanent option like this one. Add a [blas] section with an ldflags entry like this:
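For example (the flags shown are only illustrative; use the ones that match your BLAS installation, such as the ATLAS flags discussed in option 3 below):

    [blas]
    ldflags = -lf77blas -latlas -lgfortran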
For more information on the formatting of ~/.theanorc and the configuration options that you can put there, see config – Theano Configuration.
Here are some different ways to configure BLAS:
1) Disable the usage of BLAS and fall back on NumPy for dot products. To do this, set the value of blas.ldflags to the empty string (ex: export THEANO_FLAGS=blas.ldflags=). Depending on the kind of matrix operations your Theano code performs, this might slow some things down (vs. linking with BLAS directly).
2) You can install the default (reference) version of BLAS if the NumPy version (against which Theano links) does not work. If you have root or sudo access on Fedora you can do sudo yum install blas blas-devel. Under Ubuntu/Debian, sudo apt-get install libblas-dev. Then set the Theano flag blas.ldflags to -lblas. Note that the default version of BLAS is not optimized. Using an optimized version can give up to 10x speedups in the BLAS functions that we use.
3) Install the ATLAS library. ATLAS is an open source optimized version of BLAS. You can install a precompiled version on most OSes, but if you're willing to invest the time, you can compile it to have a faster version (we have seen speed-ups of up to 3x, especially on more recent computers, against the precompiled one). On Fedora, sudo yum install atlas-devel. Under Ubuntu, sudo apt-get install libatlas-base-dev libatlas-base or libatlas3gf-sse2 if your CPU supports SSE2 instructions. Then set the Theano flag blas.ldflags to -lf77blas -latlas -lgfortran. Note that these flags are sometimes OS-dependent.
4) Use a faster version like MKL, GOTO, etc. You are on your own to install it. See the documentation of that software and set the Theano flag blas.ldflags correctly (for example, for MKL this might be -lmkl -lguide -lpthread or -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide -liomp5 -lmkl_mc -lpthread).
Note
Make sure your BLAS libraries are available as dynamically-loadable libraries. ATLAS is often installed only as a static library. Theano is not able to use this static library. Your ATLAS installation might need to be modified to provide dynamically loadable libraries. (On Linux this typically means a library whose name ends with .so. On Windows this will be a .dll, and on OS X it might be either a .dylib or a .so.)
This might be just a problem with the way Theano passes compilation arguments to g++, but the problem is not fixed yet.
Note
If you have problems linking with MKL, the Intel MKL Link Line Advisor and the MKL User Guide can help you find the correct flags to use.
Note
If you have an error that contains "gfortran" in it, the problem is probably that NumPy is linked with a different BLAS than the one currently available (probably ATLAS). There are two possible fixes:
1) Uninstall ATLAS and install OpenBLAS.
2) Use the Theano flag "blas.ldflags=-lblas -lgfortran"
1) is better as OpenBLAS is faster than ATLAS and NumPy is probably already linked with it, so you won't need any other change in Theano files or Theano configuration.
It is recommended to test your Theano/BLAS integration. There are many versions of BLAS that exist and there can be up to 10x speed difference between them. Also, having Theano link directly against BLAS instead of using NumPy/SciPy as an intermediate layer reduces the computational overhead. This is important for BLAS calls to ger, gemv and small gemm operations (automatically called when needed when you use dot()). To run the Theano/BLAS speed test:
- python `python -c "import os, theano; print(os.path.dirname(theano.__file__))"`/misc/check_blas.py
This will print a table with different versions of BLAS/numbers of threads on multiple CPUs and GPUs. It will also print some Theano/NumPy configuration information. Then, it will print the running time of the same benchmarks for your installation. Try to find a CPU similar to yours in the table, and check that the single-threaded timings are roughly the same.
Theano should link to a parallel version of BLAS and use all cores when possible. By default it should use all cores. Set the environment variable OMP_NUM_THREADS=N to use N threads instead.
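If you prefer to control this from Python, a sketch (this only works if the variable is set before NumPy/Theano, and therefore the BLAS library, are loaded):

    import os

    # Ask the OpenMP-enabled BLAS to use 2 threads for this process.
    os.environ["OMP_NUM_THREADS"] = "2"

    import theano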
Mac OS
Although the above steps should be enough, running Theano on a Mac may sometimes cause unexpected crashes, typically due to multiple versions of Python or other system libraries. If you encounter such problems, you may try the following.
- You can ensure MacPorts shared libraries are given priority at run-time with export LD_LIBRARY_PATH=/opt/local/lib:$LD_LIBRARY_PATH. In order to do the same at compile time, you can add to your ~/.theanorc:
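A minimal sketch of that addition, assuming the MacPorts prefix /opt/local used above (gcc.cxxflags is the Theano option that passes extra flags to the compiler):

    [gcc]
    cxxflags = -L/opt/local/lib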