JP4.4 production release and PyTorch 1.6rc2 issue

dkreutz · July 16, 2020, 11:17am

I inadvertently updated my machine from JP4.4-dp “developer preview” to JP4.4-pr “production release”.

This requires me to use PyTorch 1.6rc2 (release candidate), which breaks my application (Mozilla-TTS) with the error message:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

The same application with the same configuration and data on JP4.4-dp and PyTorch 1.5 did not show this error.

Is there any way to a) roll my machine back to JP4.4-dp or b) compile/install PyTorch 1.5 for JP4.4-pr?

When will the PyTorch 1.6 final release be published, and will it address my issue?

dusty_nv · July 16, 2020, 3:40pm

Hi @dkreutz, I believe PyTorch 1.6-final is expected sometime next week or the week after, though of course it depends on when the PyTorch maintainers release it.

As for whether it will address your issue, I’m not sure; you may have to check the PyTorch Issues on GitHub or file an issue with them. It’s also unclear if this is a bug or is actually the result of a bug fix.

If Mozilla-TTS is an upstream project, you may want to file an issue with them to test against PyTorch 1.6. It seems that message means that you have a tensor.cuda(), but did not call net.cuda() (or some variation of that, perhaps with multiple tensors). You may want to go through and make sure all the tensors/models are on the GPU.
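As a rough illustration of that check, here is a minimal sketch. It is not taken from Mozilla-TTS: `TinyNet` and `off_device_tensors` are hypothetical names made up for this example, and the idea that a tensor created inside `forward()` without a `device=` argument is the culprit is only an assumption about one common cause of this message.

```python
import torch
import torch.nn as nn


# Hypothetical toy model for illustration only (not Mozilla-TTS code).
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        # Create helper tensors on the same device as the input. A bare
        # torch.zeros(...) is allocated on the CPU by default, which would
        # mix cpu and cuda:0 tensors when x lives on the GPU.
        offset = torch.zeros(x.size(0), 4, device=x.device)
        return self.linear(x + offset)


def off_device_tensors(model, device):
    """List (name, device) for parameters/buffers whose device type differs."""
    named = list(model.named_parameters()) + list(model.named_buffers())
    return [(name, str(t.device)) for name, t in named
            if t.device.type != device.type]


# Fall back to the CPU so the sketch also runs on a machine without CUDA.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TinyNet().to(device)       # moves parameters and buffers
x = torch.randn(3, 4).to(device)   # inputs have to be moved as well
y = model(x)

# An empty list means no parameter or buffer was left on another device.
print(off_device_tensors(model, device))
print(y.device)
```

Running `off_device_tensors` on the real model right before the failing call may help narrow down whether the stray tensor is a model weight or something created on the fly in the forward pass.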

As for rolling back to JP4.4-dp, unfortunately I don’t believe there is a way to do that in place; you would probably need to re-flash with the DP (L4T R32.4.2) release.

I wasn’t able to build PyTorch prior to 1.6 for JP 4.4-pr (L4T R32.4.3), because there were cuDNN errors that needed to be patched. Otherwise I would have provided the 1.5 wheels for JP 4.4-pr as well.

dkreutz · July 17, 2020, 7:08am

Thanks @dusty_nv for answering. In the meantime I have installed and successfully run the same application code, configuration and dataset (Mozilla TTS) on my Xavier-NX, which is still on JP4.4dp and PyTorch 1.5. So I conclude there might be an application issue with PyTorch 1.6.

Is PyTorch 1.6 available for JP4.4dp? I can’t find the PyTorch announcement message in this forum any more…

dusty_nv · July 17, 2020, 1:58pm

Here’s the link to the PyTorch topic: https://forums.developer.nvidia.com/t/pytorch-for-jetson-nano-version-1-5-0-now-available/72048

You could build it from source for JP 4.4 DP; I don’t personally plan on building more PyTorch wheels for the DP release. You may want to post your issue to the PyTorch GitHub and ask about changes in 1.6 that may have led to this change in behavior.

dkreutz · July 17, 2020, 3:00pm

Thanks, will try both…

dkreutz · July 18, 2020, 12:42pm

I built PyTorch 1.6rc2 from source on JP4.4 DP and see the same error.

PyTorch 1.6rc3 has been available for a few days. I am building it right now on both JP4.4 DP and GA and will report back later (the build takes 8-10h)…
