I’m porting some CUDA code from desktop to the AGX Xavier and I just started experimenting with unified memory. The code works with small amounts of data, but once used RAM reaches about 12 GB (as reported by tegrastats) I start getting the following error messages during the execution of a cuFFT plan:
I haven’t found much about these errors, except for a thread from 2019 mentioning “huge allocation,” which matches my case: I’m processing volumetric medical images that are 4 GB or 8 GB in size. The thread says the problem was supposedly fixed in JetPack 4.2, but I’m on JetPack 4.4 and still get a similar error. Any suggestions?
May I know the total memory of your device? Is it the standard 16GB module or the 32GB Xavier?
Also, could you share more details about the error?
Does it trigger an assertion or produce incorrect output values?
If so, would you mind sharing a sample that reproduces the issue?
Hi @AastaLLL, it’s a 32GB Xavier module. I’m attaching minimal code to reproduce the issue. The attached file works with N_FRAMES up to around 250, but anything above 300 triggers the issue.
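For readers without the attachment, a reproducer along these lines matches the description: a single batched FFT plan over many frames held in unified memory, where the total footprint crosses the ~12 GB mark. The frame dimensions and N_FRAMES below are my own placeholder values, not the ones from the attached file.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>

#define CHECK_CUDA(x)  do { cudaError_t e = (x); if (e != cudaSuccess) { \
    printf("CUDA error %d at line %d\n", (int)e, __LINE__); return 1; } } while (0)
#define CHECK_CUFFT(x) do { cufftResult r = (x); if (r != CUFFT_SUCCESS) { \
    printf("cuFFT error %d at line %d\n", (int)r, __LINE__); return 1; } } while (0)

int main() {
    const int N_FRAMES = 300;        // ~250 works, >300 fails per the report
    const int NX = 2048, NY = 2048;  // assumed frame size (~32 MB per frame)
    cufftComplex *data;
    size_t bytes = sizeof(cufftComplex) * (size_t)NX * NY * N_FRAMES; // ~9.6 GB

    // Unified memory: one allocation visible to both CPU and GPU on Jetson.
    CHECK_CUDA(cudaMallocManaged(&data, bytes));

    int n[2] = { NY, NX };
    cufftHandle plan;
    // One plan covering all frames; cuFFT's temporary work area for a
    // batch this large can itself be several GB on top of the data.
    CHECK_CUFFT(cufftPlanMany(&plan, 2, n,
                              NULL, 1, NX * NY,   // input layout (contiguous)
                              NULL, 1, NX * NY,   // output layout
                              CUFFT_C2C, N_FRAMES));
    CHECK_CUFFT(cufftExecC2C(plan, data, data, CUFFT_FORWARD));
    CHECK_CUDA(cudaDeviceSynchronize());

    cufftDestroy(plan);
    cudaFree(data);
    return 0;
}
```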
Hi @AastaLLL, I tried your suggestion, but it doesn’t seem to provide any additional error info. Also, I couldn’t find any examples of how to use cufftSetAutoAllocation and cufftSetWorkArea, so I’m not sure I’m calling them in the correct order. I’m attaching my modified code. Could you double-check the relevant portion?
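For reference, the documented call order is: disable auto-allocation on a freshly created handle, make the plan (which reports the required work-area size), allocate that buffer yourself, then attach it before any execution. A minimal sketch with placeholder dimensions (NY, NX, N_FRAMES are assumptions, not values from the attached file):

```cuda
#include <cuda_runtime.h>
#include <cufft.h>

// Placeholder sizes for illustration only.
static const int NX = 2048, NY = 2048, N_FRAMES = 300;

cufftResult make_plan_with_manual_workarea(cufftHandle *plan, void **workArea) {
    size_t workSize = 0;
    int n[2] = { NY, NX };

    cufftCreate(plan);                       // 1. create an empty handle
    cufftSetAutoAllocation(*plan, 0);        // 2. disable internal allocation
                                             //    (must precede cufftMakePlan*)
    cufftResult r = cufftMakePlanMany(*plan, 2, n,
                                      NULL, 1, NX * NY,
                                      NULL, 1, NX * NY,
                                      CUFFT_C2C, N_FRAMES,
                                      &workSize); // 3. returns required size
    if (r != CUFFT_SUCCESS) return r;
    cudaMalloc(workArea, workSize);          // 4. allocate the work buffer
    return cufftSetWorkArea(*plan, *workArea); // 5. attach before any exec
}
```

The key constraint is ordering: cufftSetAutoAllocation has no effect after the plan is made, and cufftSetWorkArea must be called before the first cufftExec* on that plan.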
We have root-caused this problem.
There is an issue when allocating a large chunk of temporary memory on Jetson.
Please also note that such large allocations tend to be slow on Jetson devices.
As a result, we can offer two possible workarounds:
Divide the big chunk into several smaller allocations: minimal_divide.cu (2.1 KB)