Stuck running a deep learning model on Jetson TX2

Hi, I want to run a deep learning model on a Jetson TX2.

I installed PyTorch without problems, along with CUDA 9.0 and cuDNN 7.0.
(I checked PyTorch with ‘import torch’ and CUDA with ‘nvcc --version’.)
But when I try to run the ERFNet code, I get stuck with this error:

“RuntimeError : cuda runtime error(7) : too many resources requested for launch at /home/nvidia/pytorch/aten/src/THCUNN/im2col.h”

Please help me!

Hi,

This error usually occurs when running out of memory.

Could you reboot the TX2 and try again?
The installation may still be holding some memory, which a restart will release.

Thanks.

Thank you for your reply.

I rebooted, but I still cannot run the model.

$free m
total : 8032548
used : 939124
free : 6500083
shared : 28992
buff/cache : 593340
available : 6982456

I really need your help.

Please switch to topic 1032112 for future updates.
https://devtalk.nvidia.com/default/topic/1032112

This problem has been fixed, in case you’re still interested.

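The error comes from CUDA 9.0 trying to allocate more registers to each thread, so a kernel launch can request more registers than a block is allowed to use. It can be fixed by setting a launch bound (__launch_bounds__) on the CUDA kernels in im2col.h, which tells the compiler the maximum block size so that it limits register use per thread. You should be able to just pull the latest PyTorch version and reinstall it, and it should work.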
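
To make the register arithmetic concrete, here is a rough sketch in Python. It is only an illustration, not the actual PyTorch fix. The block size of 1024 threads, the 32768-register per-block limit, and the 40 registers per thread are assumptions I picked for the example, and the function names are made up. It also ignores register allocation granularity. You can check the real limit with the deviceQuery sample and the real per-thread register use with nvcc's --ptxas-options=-v.

```python
# Back-of-the-envelope check for "too many resources requested for launch".
# All numbers here are illustrative assumptions, not measurements from this thread.


def registers_needed(threads_per_block: int, regs_per_thread: int) -> int:
    """Registers one block needs: every thread gets its own set."""
    return threads_per_block * regs_per_thread


def launch_fits(threads_per_block: int, regs_per_thread: int,
                regs_per_block_limit: int) -> bool:
    """True if the block's register demand stays within the per-block limit."""
    return registers_needed(threads_per_block, regs_per_thread) <= regs_per_block_limit


def max_regs_per_thread(threads_per_block: int, regs_per_block_limit: int) -> int:
    """Largest per-thread register count that still lets the block launch.

    A launch bound gives the compiler the block size, so it can keep
    register use per thread at or below a cap like this one.
    """
    return regs_per_block_limit // threads_per_block


THREADS_PER_BLOCK = 1024      # assumed block size
REGS_PER_BLOCK_LIMIT = 32768  # assumed per-block register limit (see deviceQuery)

# Assumed case: the compiler hands each thread 40 registers.
print(launch_fits(THREADS_PER_BLOCK, 40, REGS_PER_BLOCK_LIMIT))   # False
# The cap a launch bound would have to enforce for this block size.
print(max_regs_per_thread(THREADS_PER_BLOCK, REGS_PER_BLOCK_LIMIT))  # 32
# With register use held at that cap, the launch fits.
print(launch_fits(THREADS_PER_BLOCK, 32, REGS_PER_BLOCK_LIMIT))   # True
```

With these assumed numbers, 1024 threads at 40 registers each need 40960 registers, which is over the limit, so the launch fails even though plenty of memory is free. That would also explain why rebooting did not help.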

For more details look at this thread:

Great! Thanks for sharing.