Cannot build PyTorch 2.0.0 from source on AGX Xavier

I’m trying to build PyTorch 2.0.0 from source on an AGX Xavier running JetPack 5.1.1. This looks to me like a GCC issue, not anything in either PyTorch or JetPack itself:

[4974/6104] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/kineto_client_interface.cpp.o
FAILED: caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/profiler/kineto_client_interface.cpp.o

...

/home/znmeb/Projects/AlgoCompSynth-One/JetPack5/Projects/pytorch/torch/csrc/profiler/kineto_client_interface.cpp:25:8: error: ‘void torch::profiler::impl::{anonymous}::LibKinetoClient::warmup(bool)’ marked ‘override’, but does not override
   25 |   void warmup(bool setupOpInputsCollection) override {
      |        ^~~~~~
/home/znmeb/Projects/AlgoCompSynth-One/JetPack5/Projects/pytorch/torch/csrc/profiler/kineto_client_interface.cpp:50:8: error: ‘void torch::profiler::impl::{anonymous}::LibKinetoClient::set_withstack(bool)’ marked ‘override’, but does not override
   50 |   void set_withstack(bool withStack) override {
      |        ^~~~~~~~~~~~~
/home/znmeb/Projects/AlgoCompSynth-One/JetPack5/Projects/pytorch/torch/csrc/profiler/kineto_client_interface.cpp: In constructor ‘torch::{anonymous}::RegisterLibKinetoClient::RegisterLibKinetoClient()’:
/home/znmeb/Projects/AlgoCompSynth-One/JetPack5/Projects/pytorch/torch/csrc/profiler/kineto_client_interface.cpp:69:44: error: cannot declare variable ‘client’ to be of abstract type ‘torch::profiler::impl::{anonymous}::LibKinetoClient’
   69 |     static profiler::impl::LibKinetoClient client;
      |                                            ^~~~~~
/home/znmeb/Projects/AlgoCompSynth-One/JetPack5/Projects/pytorch/torch/csrc/profiler/kineto_client_interface.cpp:21:7: note:   because the following virtual functions are pure within ‘torch::profiler::impl::{anonymous}::LibKinetoClient’:
   21 | class LibKinetoClient : public libkineto::ClientInterface {
      |       ^~~~~~~~~~~~~~~
In file included from /home/znmeb/Projects/AlgoCompSynth-One/JetPack5/Projects/pytorch/third_party/kineto/libkineto/include/libkineto.h:26,
                 from /home/znmeb/Projects/AlgoCompSynth-One/JetPack5/Projects/pytorch/torch/csrc/profiler/kineto_client_interface.cpp:2:
/home/znmeb/Projects/AlgoCompSynth-One/JetPack5/Projects/pytorch/third_party/kineto/libkineto/include/ClientInterface.h:17:16: note:    ‘virtual void libkineto::ClientInterface::prepare(bool, bool, bool, bool, bool)’
   17 |   virtual void prepare(bool, bool, bool, bool, bool) = 0;
      |                ^~~~~~~

Any ideas? I do know it didn’t run out of RAM. I’m going to try it again with MAX_JOBS=4 just to be sure, but there were no killed processes in dmesg.
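
For anyone repeating these two checks, here is a minimal sketch, assuming a POSIX-style shell. MAX_JOBS is the environment variable the PyTorch build reads to cap parallel compile jobs. The helper name find_oom_kills is made up for illustration, and the build command in the comment is an assumption about what the build script runs.

```shell
# Hypothetical helper: filter kernel log text on stdin for OOM-killer messages.
# Exits non-zero when nothing matches, like grep.
find_oom_kills() {
    grep -i -E 'out of memory|killed process|oom-killer'
}

# Usage (reading the kernel log may need sudo):
#   dmesg | find_oom_kills || echo "no OOM kills logged"
#
# Rebuild with fewer parallel jobs to lower peak memory use
# (assumed build command; substitute whatever your script runs):
#   MAX_JOBS=4 python3 setup.py bdist_wheel
```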

Hi,

We have a PyTorch 2.0 package and container for Jetson, so you don’t need to build it from source:

Container: NVIDIA L4T PyTorch | NVIDIA NGC
Packages: Installing PyTorch for Jetson Platform - NVIDIA Docs

Thanks.

I know there are pre-built wheels - I need to build for newer versions of Python than 3.8. I started with 3.8 to make sure the build script was working before I ventured off into the newer Python releases.

Hi @znmeb, I haven’t seen this error before, but my guess is that there is a mismatch with the version of the libkineto submodule.

Can you check that your /home/znmeb/Projects/AlgoCompSynth-One/JetPack5/Projects/pytorch/third_party/kineto/libkineto/include/ClientInterface.h file looks like this?

Judging by the function signatures in the error messages, my guess is that it doesn’t, and that your local copy of PyTorch’s libkineto submodule somehow got out of sync with the v2.0.0 branch. (Note that if you switch to the master branch of libkineto, the functions have changed and are similar to what you see in your errors.)
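
One way to test for this kind of mismatch without comparing the header by eye is git submodule status, which prefixes each submodule line with - if it is uninitialized, + if the checked-out commit differs from the one the superproject records, and U if it has merge conflicts. A rough sketch follows; the helper name submodules_out_of_sync is hypothetical, and the path in the usage comment is a placeholder.

```shell
# Hypothetical helper: exit 0 if any submodule of the repo at $1 is
# uninitialized ('-'), at a different commit than recorded ('+'),
# or in a conflicted state ('U'); exit non-zero otherwise.
submodules_out_of_sync() {
    count=$(git -C "$1" submodule status --recursive | grep -c '^[-+U]' || true)
    [ "$count" -gt 0 ]
}

# Usage:
#   if submodules_out_of_sync /path/to/pytorch; then
#       echo "submodules do not match the checked-out PyTorch commit"
#   fi
```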

When you cloned PyTorch, did you do it like git clone --recursive --branch v2.0.0 http://github.com/pytorch/pytorch ? If you can’t get it working, you can try setting export USE_KINETO=0 before you start compiling with setup.py.
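
If you would rather repair an existing checkout than re-clone, something along these lines should put the submodules back on the commits that the tag records. Treat it as a sketch: the helper name resync_to_tag and the path are hypothetical, and the build command in the comment is an assumption.

```shell
# Hypothetical helper: check out the given tag in the repo at $1 and move
# every submodule to the commit that tag records.
resync_to_tag() {
    repo=$1
    tag=$2
    git -C "$repo" checkout "$tag" &&
    git -C "$repo" submodule sync --recursive &&
    git -C "$repo" submodule update --init --recursive
}

# Usage:
#   resync_to_tag /path/to/pytorch v2.0.0
#
# Fallback if the profiler code still fails to compile:
#   export USE_KINETO=0
#   python3 setup.py bdist_wheel   # assumed build command
```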

That was it - I re-used an old script and the git clone sequence was incorrect. I fixed the script and it works now. Incidentally, it takes about three hours on an AGX Xavier. I don’t have an Orin yet.
