I am seeing a weird performance issue between L4T versions. Some of our Xavier AGX units show a huge performance drop on L4T 32.7.2 compared to L4T 35.1.0 and above. Based on the testing done so far, moving to any L4T 35.X version appears to alleviate the problem. Unfortunately, we have many systems already released on the older 32.X version, so we would prefer to keep parity with them.
To test this, I have a binary that allocates a chunk of memory in a std::vector and then calculates Fibonacci numbers. I have been running the same binary on both versions of L4T. On the working systems it returns in about 1 second, while on the non-working ones it takes much longer (5-15 seconds). Profiling suggests that the extra time is spent allocating the memory inside std::vector::resize().
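A minimal sketch of this kind of test is below. It is not the exact binary: the function names `fib` and `time_resize_ms`, the element type, and the buffer size are illustrative assumptions. It times the resize() call on its own so the allocation cost can be separated from the Fibonacci work.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Iterative Fibonacci (hypothetical stand-in for the compute phase).
// The result wraps modulo 2^64 for large n, which is fine for a CPU workload.
std::uint64_t fib(unsigned n) {
    std::uint64_t a = 0, b = 1;
    for (unsigned i = 0; i < n; ++i) {
        std::uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}

// Times only the resize() call and returns the elapsed time in milliseconds.
// resize() value-initializes the new elements, so the memory is actually
// written, not just reserved.
double time_resize_ms(std::vector<std::uint8_t>& buf, std::size_t bytes) {
    auto t0 = std::chrono::steady_clock::now();
    buf.resize(bytes);
    auto t1 = std::chrono::steady_clock::now();
    assert(buf.size() == bytes);  // sanity check that the resize happened
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Calling `time_resize_ms` from `main` with the same size on both L4T versions, and printing the result next to the time taken by a loop over `fib`, should show whether the slowdown is confined to the resize() call.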
Based on the reported NVIDIA PCNs, I don’t believe that our systems are affected by any PCN besides PCN208560 from 2022. I have tested on two different Xavier types and, interestingly, we don’t see the problem on our much older units that were ordered in 2021.
Comparing the 699 PNs:
Working: 699-82888-0001-400 J.0
Not Working: 699-82888-0004-401 H.0
Is there anything else that I can check to help determine what is causing this issue?