Torch7 on TX1

@jkjung, is Torch working on your TX1?

Aside from the test errors/failures I reported at comment #14, it seems to be working OK.

I just updated my Jetson TX1 to JetPack 2.3 (L4T R24.2, 64-bit) with CUDA Toolkit 8.0, and I built Torch7 from the latest source code on GitHub. When I ran the test.sh script, torch.test() and nn.test() completed successfully, as expected. However, I ran into problems with cutorch and cunn again. For example, cutorch.test() finished with 8 errors. All of these errors produced a log similar to the one below. I wonder whether it is related to the CUDA arch (sm_53) setting or to the CUDA toolkit…

/home/ubuntu/torch/install/share/lua/5.1/cutorch/test.lua:261: cuda runtime error (11) : invalid argument at /home/ubuntu/torch/extra/cutorch/lib/THC/generic/THCTensor.cu:34
stack traceback:
......

I gave up on using Torch on the TX1. Maybe NVIDIA needs to release a pre-installed image.

We figured out a couple of things here to get Torch7 building again. The CMake script from the repo rolls the Torch7 repo back a few commits and patches cutorch to build with 1 job at a time, although I also had to mount swap.
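For anyone checking whether swap is actually active before starting the cutorch build, here is a minimal sketch that reads /proc/meminfo and reports the swap total. The function names and the 4 GiB threshold are assumptions made for illustration, not values taken from this thread; adjust the threshold to whatever your build needs.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages counters)
    are plain counts. The unit suffix is ignored here.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_is_sufficient(info, min_swap_kb=4 * 1024 * 1024):
    """Return True if SwapTotal is at least min_swap_kb.

    The 4 GiB default is an assumed, illustrative threshold.
    """
    return info.get("SwapTotal", 0) >= min_swap_kb


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        current = parse_meminfo(f.read())
    print("SwapTotal (kB):", current.get("SwapTotal", 0))
    print("Enough swap for the assumed threshold:", swap_is_sufficient(current))
```

If SwapTotal comes back as 0, no swap is mounted and the build is running on physical memory alone.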

Dusty, thanks for the reply. I did build cutorch with only 1 job at a time, and I also enabled swap during compilation. I pulled the latest Torch7 from GitHub about 3 days ago, and the code compiled without problems.

What I’m concerned about is the errors thrown by cutorch.test(). Please refer to comment #23.

@dusty_nv, is it possible to release a pre-built image with Torch installed?

Build error

make[2]: *** [lib/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorMathPointwiseByte.cu.o] Error 1
Killed
CMake Error at THC_generated_THCTensorMathPointwiseDouble.cu.o.cmake:264 (message):
  Error generating file
  /tmp/luarocks_cutorch-scm-1-6065/cutorch/build/lib/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorMathPointwiseDouble.cu.o


make[2]: *** [lib/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorMathPointwiseDouble.cu.o] Error 1
cc: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.8/README.Bugs> for instructions.
CMake Error at THC_generated_THCTensorMathCompareTFloat.cu.o.cmake:264 (message):
  Error generating file
  /tmp/luarocks_cutorch-scm-1-6065/cutorch/build/lib/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorMathCompareTFloat.cu.o


make[2]: *** [lib/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorMathCompareTFloat.cu.o] Error 1
make[1]: *** [lib/THC/CMakeFiles/THC.dir/all] Error 2
make: *** [all] Error 2

Error: Build error: Failed building.

Sure thing, I just released a package for R24.2 here: https://github.com/dusty-nv/jetson-reinforcement/releases/tag/L4T-R24.2-RC1

It is denoted as RC1 because the contents will probably be updated again as Torch commits are made.
The build log you posted with THCTensorMathPointwiseDouble looks like mine before I attached swap.
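As a rough way to spot this pattern in a saved build log, the sketch below flags lines such as a bare "Killed" or "internal compiler error: Killed", which usually mean the compiler process was killed, often because the system ran out of memory. The pattern list and the function name are assumptions made for illustration; a match is only a hint, not proof that memory was the cause.

```python
import re

# Assumed signatures of a compiler process being killed mid-build.
OOM_PATTERNS = [
    re.compile(r"^Killed\s*$"),
    re.compile(r"internal compiler error: Killed"),
]


def find_oom_hints(log_text):
    """Return (line_number, line) pairs that match a kill signature."""
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        if any(p.search(line) for p in OOM_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        "make[2]: *** [lib/THC/...] Error 1\n"
        "Killed\n"
        "cc: internal compiler error: Killed (program cc1plus)\n"
    )
    for lineno, line in find_oom_hints(sample):
        print(f"line {lineno}: {line}")
```

If the log shows hits like these, attaching swap and building with 1 job at a time are the first things to try.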

I also see this error when running cutorch.test(); however, other cutorch/cudnn programs are working fine (like the DQN solver). I would recommend posting an issue about it on the cutorch GitHub, as they may have a better understanding of that particular test.

Thanks, dusty_nv.