Jetson TX1 PyTorch

Hello, I have a fresh install of JetPack 3.1 (L4T 28.1) with CUDA and cuDNN. OpenCV and the other components built without errors. My problem is that when I use this
[url]https://gist.github.com/dusty-nv/ef2b372301c00c0a9d3203e42fd83426[/url]
install procedure, the “python setup.py develop” command freezes the Jetson and then gives a segmentation fault. I opened the system monitor and saw that one of the build steps overflows the RAM.
Any idea what to do?
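For anyone hitting the same freeze, it can help to confirm the out-of-memory kill from a second terminal while the build runs, and to limit the number of parallel compile jobs. A minimal sketch, assuming a source checkout in ~/pytorch; note that MAX_JOBS is an assumption here: it only has an effect if your checkout's build scripts actually read that environment variable (recent PyTorch versions do), so if it changes nothing, enabling swap is the reliable fix.

# In a second terminal: report free memory every 5 seconds during the build
free -m -s 5

# Retry the build with fewer parallel jobs to lower peak RAM usage
cd ~/pytorch
MAX_JOBS=2 python setup.py develop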

Hi again, today after various tests (build error, change, build error, …) PyTorch built successfully with Python 3. I used this link: [url]https://github.com/andrewadare/jetson-tx2-pytorch[/url]. I also changed the CMake version; I don't know whether that was the problem or not. But I think I have solved it, and I will also try it with Python 2.

Thanks for the update, and I hope to hear the results with Python 2.
Thanks.

Hello, my update is: the Python 2 build also ran out of memory, so I rebuilt the kernel with swap enabled. After that PyTorch compiled just fine. Actually, I don't get why you didn't activate it in the first place. Now my problem is that an old version of PyTorch gets installed no matter what I do: the installed version is 0.1.10+ac9245a, but the git checkout is version 0.4.0a0.
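If an old copy of torch is shadowing the develop install, it may help to check which module Python is actually importing and to remove any previously pip-installed wheel before rebuilding. A minimal sketch (the paths and package names are the usual ones, worth double-checking on your system):

# Show the version and location of the torch module Python actually loads
python -c "import torch; print(torch.__version__); print(torch.__file__)"

# Remove a previously installed copy that may shadow the source build
pip uninstall torch

# Re-run the develop install from the source checkout
cd ~/pytorch && python setup.py develop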

If anyone has successfully installed PyTorch, can you share your version?

Hi durmushalil, I have this repo building against PyTorch v0.3.0 on TX2 (no swap necessary):

[url]https://github.com/dusty-nv/jetson-reinforcement/blob/master/CMakePreBuild.sh[/url]

PyTorch master (v0.4.0+) builds too, but PyTorch keeps their tutorials/samples updated against the latest binary release (currently v0.3.0), so to maintain compatibility with the majority of PyTorch scripts, I check out v0.3.0 in the script above.
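For reference, pinning a fresh clone to the v0.3.0 release tag looks roughly like this (a sketch, not the exact contents of the script linked above):

# Clone PyTorch with its submodules and check out the v0.3.0 tag
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
git checkout v0.3.0
git submodule update --init --recursive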

Hi,
I ran into trouble when running the “python setup.py develop” command:
"
[ 29%] Building NVCC (Device) object CMakeFiles/THC.dir/THC_generated_THCTensorMode.cu.o
/home/ubuntu/pytorch/torch/lib/THC/THCNumerics.cuh(38): warning: integer conversion resulted in a change of sign

Killed
CMake Error at THC_generated_THCTensorIndex.cu.o.cmake:267 (message):
Error generating file
/home/ubuntu/pytorch/torch/lib/build/THC/CMakeFiles/THC.dir//./THC_generated_THCTensorIndex.cu.o

"
What should I do?

If I use Docker for PyTorch, will it work?

@TTL, if you are building on TX1, you probably ran out of memory while compiling and need to enable SWAP.

Thank you, @dusty_nv! So how can I enable swap on TX1?

If the kernel version you are using needs swap enabled, see this thread: https://devtalk.nvidia.com/default/topic/916777/?comment=4807307

Then after attaching external storage (ideally via SATA or PCIe), create a SWAP partition and mount it like so: https://help.ubuntu.com/community/SwapFaq#How_do_I_add_or_modify_a_swap_partition.3F
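If the kernel already has swap support compiled in, a swap file on the attached storage works as well and is simpler than repartitioning. A minimal sketch, assuming the external drive is mounted at /mnt/sata (a hypothetical path; adjust the path and size to your setup):

# Create an 8 GB swap file on the external drive and enable it
sudo fallocate -l 8G /mnt/sata/swapfile
sudo chmod 600 /mnt/sata/swapfile
sudo mkswap /mnt/sata/swapfile
sudo swapon /mnt/sata/swapfile

# Verify that the swap space is active
free -m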

I have used the JetsonHacks tutorials. He even shows how to build the kernel with swap. It is easy to do, but my advice is to use JetPack 3.2, which has swap enabled. Also, PyTorch supports CUDA 9.0, so with 3.2 you can use the JetsonHacks swap code and then build your PyTorch.

Thanks for your advice @durmushalil, but I can't use JetPack 3.2. When I run ‘JetPack-L4T-3.2-linux-x64_b196.run’, it is always interrupted by a ‘manifest file was broken’ error. I have tried all the solutions on this forum and it still doesn't work. Is there anyone who can help me?

What happens if you try a fresh JetPack 3.2 in an empty directory?

Are you behind a network firewall? What geographic region are you downloading from?

When I ran a fresh JetPack 3.2 in an empty directory, it still output the same error: ‘manifest file is broken’. I am not behind a network firewall. After creating a swap file on the Jetson TX1, I tried to build PyTorch; it was no longer interrupted by ‘Killed’, but I got an error:
[ 85%] Building NVCC (Device) object src/ATen/CMakeFiles/ATen_cuda.dir/native/cuda/ATen_cuda_generated_TensorFactories.cu.o
In file included from tmpxft_00004a33_00000000-4_SoftMax.cudafe1.stub.c:1:0:
/tmp/tmpxft_00004a33_00000000-4_SoftMax.cudafe1.stub.c:41:17: error: parse error in template argument list
template<> __specialization_static void __wrapper__device_stub_cunn_SoftMaxForward<2, ::at::cuda::type , ::at::acc_type<double, (bool)1> , ::at::native::operator ::LogSoftMaxForwardEpilogue>( _ZN2at4cuda4typeIdEE *&__cuda_0,_ZN2at4cuda4typeIdEE *&__cuda_1,int &__cuda_2){__device_stub__ZN2at6native66_GLOBAL__N__42_tmpxft_00004a33_00000000_7_SoftMax_cpp1_ii_826a462619cunn_SoftMaxForwardILi2EddNS1_25LogSoftMaxForwardEpilogueEEEvPT0_S5_i( (_ZN2at4cuda4typeIdEE *&)__cuda_0,(_ZN2at4cuda4typeIdEE *&)__cuda_1,(int &)__cuda_2);}}}}

What is this ‘parse error’? Should I change the CMake version? It is currently 3.11.1.

Please see this post; there has been an issue with some China-based ISPs recently, and the DNS issue is still being fixed. Please stay tuned.

Which version of PyTorch are you using? I’m able to build and run v0.3.0. Master (v0.4.0) has changes which aren’t totally ironed out yet.

After switching network operators, I was able to download some packages with JetPack v3.2 or JetPack v3.1.
I am using PyTorch v0.3, CUDA v8.0, cuDNN v5.0, CMake v3.11.1, and gcc v5.4 or gcc v4.9. Should I change the version of cuDNN? I have changed the versions of all of them to build PyTorch, except CUDA and cuDNN.

With PyTorch v0.3.0 I am using JetPack 3.2 — which comes with CUDA9 and cuDNN 7.0.5.

Here is the build script that I use. It configures this repo that uses PyTorch on Jetson.
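For reference, after checking out v0.3.0 as shown earlier in the thread, the remaining from-source steps usually look something like this (a sketch only; see the linked script for the exact packages and flags it uses):

# Install the Python-side build prerequisites
sudo apt-get install python-pip cmake
pip install -U setuptools
pip install pyyaml numpy

# Build and install PyTorch from the source checkout
cd pytorch
python setup.py install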

Thank you very much! I will try it! Hopefully I will have good news to report.

After I installed JetPack 3.2, PyTorch built fine, but I got another error:

RuntimeError: cuda runtime error (7) : too many resources requested for launch at /home/nvidia/pytorch/torch/lib/THCUNN/generic/SpatialUpSamplingBilinear.cu:63

I know that the kernel in ‘SpatialUpSamplingBilinear.cu’ missing a __launch_bounds__(1024) annotation leads to this error, but I don't know how to fix it…

Hi,

CUDA error 7 means cudaErrorLaunchOutOfResources.

This error usually indicates that the user has attempted to pass too many arguments to the device kernel, or the kernel launch specifies too many threads for the kernel’s register count.

Could you monitor the system status and share the results with us?

sudo ./tegrastats

Thanks.

Hi Experts

I am running JetPack 3.3 and Python 3 on my TX1.

Does anybody have a link to a script that installs PyTorch on the above setup?

Needed for some reinforcement learning experiments…

sojohans