weird observation with CUDPP 2.1 - JIT compiler madness

cbuchner1 · April 15, 2014, 2:56pm

Hi,

We’re running a cudppCompact operation on an array of 65536 uint values using the CUDPP 2.1 library built for Compute 2.0, 3.0 and 3.5, all in one binary. This runs smoothly on all of these device categories.

Weirdly enough, when running this on a Compute 5.0 device such as a GTX 750 Ti, the thing takes minutes to just-in-time compile the code, and on Windows it even crashes (in all likelihood due to a stack overrun). In Debug builds I get a message that PTXAS has run out of memory before it crashes; in Release builds it just crashes.

What’s with the JIT compilation for Maxwell taking so immensely long?

Christian

njuffa · April 15, 2014, 6:27pm

Presumably the JITing is taking a long time because the code is really large. This could be because of the use of templates; however, I am not familiar with CUDPP. The fact that the JIT compiler runs out of memory would also jibe with a working hypothesis of very large code size.

It sounds like the application is built as a fat binary that contains machine code for the sm_20, sm_30, and sm_35 architectures, so no JITing is necessary on these platforms. The best course of action would be to add sm_50 to the architecture targets of the fat binary build to avoid the need to JIT on Maxwell GPUs.
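To make the reasoning concrete, here is a minimal sketch of the selection rule as I understand it: a cubin built for sm_XY runs natively only on devices with the same major revision and a minor revision of at least Y, and anything else has to be JIT-compiled from embedded PTX. The names `ComputeCapability`, `runsNatively`, and `needsJit` are made up for illustration and are not part of CUDA or CUDPP; this models the rule only and does not query a real device or binary.

```cpp
#include <vector>

// Hypothetical helper type: compute capability as (major, minor) revision,
// e.g. {3, 5} for sm_35.
struct ComputeCapability {
    int majorRev;
    int minorRev;
};

// Assumed rule: a cubin for sm_XY runs natively on a device with the same
// major revision whose minor revision is at least Y.
bool runsNatively(ComputeCapability cubin, ComputeCapability device) {
    return cubin.majorRev == device.majorRev &&
           cubin.minorRev <= device.minorRev;
}

// Returns true if no embedded cubin matches the device, in which case the
// driver has to fall back to JIT-compiling embedded PTX.
bool needsJit(const std::vector<ComputeCapability>& cubins,
              ComputeCapability device) {
    for (const ComputeCapability& cubin : cubins) {
        if (runsNatively(cubin, device)) {
            return false;
        }
    }
    return true;
}
```

With the targets from this thread, `needsJit({{2, 0}, {3, 0}, {3, 5}}, {5, 0})` is true, which matches the slow start-up on the GTX 750 Ti; once `{5, 0}` is among the embedded targets it becomes false.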

You may want to consider filing a bug regarding the compiler behavior on Windows. Orderly abnormal termination with an error message seems an appropriate response when the JIT compiler runs out of memory, but an outright crash should not happen.

cbuchner1 · April 17, 2014, 8:56am

It seems the CUDPP team is considering adding official support for CUDA 6.0 and the sm_50 build target.
