Ptxas fatal: Memory allocation failure

Abdopensky · April 6, 2019, 8:02am

Hello,

I am trying to compile my CUDA file and it gives me the following error: Ptxas fatal: Memory allocation failure. I am building a 64-bit application (debug mode) under Visual Studio 2017. As I’m using an NVIDIA Quadro P2000, I set compute_61, sm_61.

I tried release mode, which works fine!

Do you know the reason for the issue when compiling in debug mode?

Thanks

njuffa · April 6, 2019, 2:03pm

I have never encountered this. The message seems to indicate pretty clearly that PTXAS (the optimizing compiler that translates the PTX intermediate representation into machine code) requested a dynamic memory allocation which failed. Release builds use full optimization, while debug builds use no optimization whatsoever. The input to PTXAS can therefore differ a lot between debug and release builds, as can the output. The size of the code generated for a debug build could be larger or smaller than for a release build.

Hypothesis 1: There is extraordinarily little system memory available when PTXAS runs
Hypothesis 2: PTXAS needs an extraordinarily large amount of memory to do its work

(1) Are you able to reproduce the issue reliably? How much system memory is available when PTXAS runs? What is the total amount of system memory on the machine used to compile the code?

(2) How large is the CUDA source code? How large is the PTX code being passed to PTXAS (how many lines, how many kilobytes)? How long does PTXAS run before it fails with the “failed allocation” error? How long does it run in the corresponding release mode build? When you monitor PTXAS memory usage while it runs, how much memory does it use?
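The PTX size figures asked for in (2) can be collected with a few lines of script. This is a minimal sketch, assuming the intermediate PTX file has been kept on disk (for example by building with nvcc's `--keep` option); the helper name `ptx_stats` and the file name `kernel.ptx` are hypothetical.

```python
import os


def ptx_stats(path):
    """Return (line_count, size_in_kilobytes) for the file at `path`."""
    # Size on disk, converted from bytes to kilobytes.
    size_kb = os.path.getsize(path) / 1024.0
    # Count lines in binary mode so odd characters cannot raise decode errors.
    with open(path, "rb") as f:
        line_count = sum(1 for _ in f)
    return line_count, size_kb
```

Calling `ptx_stats("kernel.ptx")` on the debug and on the release PTX gives two pairs of numbers that can be compared directly.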

If PTXAS runs for a very long time in the debug build (say more than twice as long as for the release build, or more than 10 minutes) before it fails, and you can observe a continuously increasing memory usage of PTXAS during that time, this would be a good indication of a memory leak, infinite loop, or other bug within PTXAS. In which case you would want to file a bug report with NVIDIA.
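To make "continuously increasing memory usage" a little more concrete, here is a rough sketch that checks a series of memory readings. It assumes the readings (in GB) were noted by hand at regular intervals, e.g. from Task Manager; the function name and the 0.1 GB threshold are arbitrary choices for illustration, not anything PTXAS or NVIDIA defines.

```python
def is_continuously_increasing(samples_gb, min_growth_gb=0.1):
    """True if every reading exceeds the previous one by at least min_growth_gb."""
    if len(samples_gb) < 2:
        return False  # a trend needs at least two readings
    # Compare each reading with its successor.
    return all(b - a >= min_growth_gb
               for a, b in zip(samples_gb, samples_gb[1:]))
```

A series such as `[1.0, 2.5, 4.0, 6.2]` counts as continuous growth, while a quick jump followed by a plateau such as `[4.4, 15.4, 15.4, 15.4]` does not.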

Abdopensky · April 7, 2019, 4:57am

Hi njuffa,

I see now that it’s a memory usage issue.

I have 15.9GB of system memory. Once the compiler gets to the CUDA code, the used memory goes from 4.4GB to 15.4GB in a matter of 3 seconds lol. The memory usage then stays constant until the compilation fails.
I have around 35000 lines of CUDA code (Size = 2761KB). It takes 1-2 hours before it fails.

njuffa · April 7, 2019, 5:41am

That is 35 KLOC for a single kernel? And this kernel takes 1+ hours to compile, and then PTXAS blows up?

While that is large as CUDA kernels go, it should probably not cause PTXAS to chew through all your memory and then blow up with a failed allocation. Consider filing a bug with NVIDIA so the compiler folks can have a look whether PTXAS is using more memory than it should. The lengthy compilation time also seems indicative of a problem. I would expect the code to compile in maybe 10 to 15 minutes. How long does your release build take to compile the same code?
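For comparing debug and release compile times outside the IDE, a small timing wrapper is enough. This is a sketch only: the helper name `timed_run` is made up here, and the command passed to it is whatever build command applies to the project.

```python
import subprocess
import time


def timed_run(cmd):
    """Run `cmd` (a list of arguments) and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(cmd)  # blocks until the command finishes
    elapsed = time.perf_counter() - start
    return result.returncode, elapsed
```

For example, `timed_run(["nvcc", "-G", "-arch=sm_61", "-c", "kernel.cu", "-o", "kernel_debug.obj"])` would time a debug compile of a hypothetical `kernel.cu`, and the same call without `-G` would time the optimized one. The command line Visual Studio actually issues will contain more options than this.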

Abdopensky · April 7, 2019, 6:11am

Yes. I have a very large kernel of around 30,000 lines. I just commented it out so that it is no longer considered part of the source code, and the compilation went through right away. For release mode, yes, the compilation takes around 10-15 minutes.

njuffa · April 7, 2019, 6:15am

On second thought, the long compilation time may simply be a side effect of the memory usage as the system starts swapping before it runs out of memory.

I assume this is some sort of generated code, since I can’t imagine a human writing a 30,000 line kernel.

Abdopensky · April 7, 2019, 8:32am

lol I spent at least one year on this kernel. Anyway, I’ll find a way to debug.

Thanks for your help!

Cheers

Abdoulaye
