I trained a custom CNN and wanted to deploy it on the Jetson TX2. It works perfectly on three different (non-ARM) systems. However, with the same weight file, input, and source code, the TX2 gives outputs that are larger than expected.
Initially I thought it was a version issue, so I built two versions from source (v0.3.0 and v0.3.1); both give unexpected output values. Toggling the GPU/CPU flag does not help either. I also built the same version from source on another system, where it works fine.
Since PyTorch v0.3.0 and v0.3.1 both fail, I thought it might be an architecture issue, so I trained directly on the Jetson and evaluated there as well, but still got similar results.
Could you run the verification sample in the dusty-nv tutorial first?
If possible, please provide example code to help us reproduce the issue.
Maybe we can reproduce it by comparing the results between an x86 machine and a Jetson TX2?
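A minimal sketch of such a comparison (the small model below is a hypothetical stand-in, since the actual CNN is not shown in this thread): run the same script on both machines with a fixed seed and a constant input, then compare the printed statistics between the x86 machine and the TX2.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the custom CNN under discussion.
torch.manual_seed(0)
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
)
model.eval()

# Constant input so the script is reproducible on each machine.
x = torch.ones(1, 3, 32, 32)

with torch.no_grad():
    out = model(x).flatten()

# Compare these numbers across machines; a large discrepancy points
# at the platform rather than the model weights.
print('min=%.6f max=%.6f mean=%.6f'
      % (out.min().item(), out.max().item(), out.mean().item()))
```

(On PyTorch v0.3.x the input would need to be wrapped in `torch.autograd.Variable`; the statistics-comparison idea is the same.)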
Running the commands from the “Verify PyTorch” section, I get reasonable outputs:
nvidia@tegra-ubuntu:~$ python
Python 2.7.12 (default, Dec 4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__version__)
0.3.0b0+af3964a
>>> print('CUDA available: ' + str(torch.cuda.is_available()))
CUDA available: True
>>> a = torch.cuda.FloatTensor(2).zero_()
>>> print('Tensor a = ' + str(a))
Tensor a =
0
0
[torch.cuda.FloatTensor of size 2 (GPU 0)]
>>> b = torch.randn(2).cuda()
>>> print('Tensor b = ' + str(b))
Tensor b =
0.8007
2.0221
[torch.cuda.FloatTensor of size 2 (GPU 0)]
>>> c = a + b
>>> print('Tensor c = ' + str(c))
Tensor c =
0.8007
2.0221
[torch.cuda.FloatTensor of size 2 (GPU 0)]
I am guessing it happens during inference. Where can I attach the weights file and the inference scripts? Would it be possible to send them via email, or do you suggest a better way to share the code with you?
I figured out what the issue was. The input to PyTorch is somehow in the range 0–255 on the TX2, while on the two other laptops it was in 0–1, even though I do not apply any normalization explicitly and the code was identical on all machines. After dividing the input by 255 on the TX2, it works as expected.
It seems to be an issue with one of the libraries, but I am not sure what caused it in the first place.
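For anyone hitting the same thing, a small guard like the following avoids depending on how the image-loading library scales pixel values (this is a sketch; `ensure_unit_range` is a hypothetical helper, not from the actual code in this thread):

```python
import numpy as np

def ensure_unit_range(img):
    """Normalize an image array to [0, 1] regardless of how it was loaded.

    Some image-loading paths return uint8 data in 0-255 while others
    return float data already scaled to 0-1; this guard makes the model
    input consistent across platforms.
    """
    img = np.asarray(img, dtype=np.float32)
    if img.max() > 1.0:  # values look like 0-255 pixel data
        img = img / 255.0
    return img

# uint8-style input (as seen on the TX2) and float input (as on the laptops)
a = ensure_unit_range(np.array([[0, 128, 255]], dtype=np.uint8))
b = ensure_unit_range(np.array([[0.0, 0.5, 1.0]], dtype=np.float32))
print(a.max(), b.max())  # both end up within [0, 1]
```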