NVPROF randomly fails (no kernels/API calls, code 139) on the exact same code

zjw518 · June 9, 2017, 9:37pm

I’m running nvprof on my code, and ~70% of the time it quits, reporting no kernels or API calls and “application received signal 139.” The rest of the time it works. I am not oversubscribing the GPU’s memory, shared memory, or registers. I’m literally compiling and running nvprof repeatedly without changing anything.

I can’t replicate this behavior in cuda-memcheck; it runs without a problem every time. And when I run my code sans memcheck or nvprof, it also runs fine.

I thought the issue was related to transfers from global (device) memory, but now I have no way of knowing because nvprof is not behaving deterministically. I was trying the good ol’ comment-out-all-but-one-line method.

What on earth could cause this to happen?

zjw518 · June 10, 2017, 1:08am

I’m back to thinking this is a memory issue. Somehow the compiler (or whatever) isn’t catching that I’m trying to access an unallocated address. Indeed, if I try to access the Nth, (N+1)th, etc., element of the length-N array (living in device memory) that I pass to the kernel, no error is thrown. My guess is that these arrays are occasionally positioned close enough to “the edge” of device memory to throw errors. Increasing N by a large factor pushes the nvprof failure rate to almost 100%.

I have no idea how this is happening. I have wrapped the contents of my kernel in a conditional to ensure that all accessed array indices are less than N. It makes no difference. I don’t know how I can 1) be able to access unallocated memory, 2) not have any threads corresponding to these illegal indices, but 3) still have NVPROF crash randomly.

For context, I am doing a 3D integration of partial differential equations. The change that induced this behavior was increasing the dimension of the thread block to cover all elements needed by the spatial derivatives (stencils), while only updating the threads inside this “halo.” I believe this is the standard/optimal practice (it sure beats manually loading the halo elements, as I was before). And yes, I am certain that my blocks have the correct dimension, while the dimension of my grid is set as if the blocks did not include the halo. I have additionally checked whether blockIdx.x * (dimension of the inner/enclosed block) is equal to the dimension of the full lattice.

Any insight or suggestions would be greatly appreciated.
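One way to double-check the launch geometry described above without a GPU is to replay the index arithmetic on the host. The following is only a minimal sketch along a single axis, not the poster's code: the names `n` (lattice points), `inner` (points updated per block), `halo` (stencil radius), and `check_halo_launch` are hypothetical, and the mapping `g = b * inner + t - halo` is an assumed convention for a block that includes its halo.

```cpp
#include <vector>

// Result of replaying a halo-inclusive launch on the host (one axis only).
struct CoverageReport {
    bool every_point_updated_once;   // each lattice point written by exactly one thread
    bool halo_thread_out_of_range;   // some thread maps to an index outside [0, n)
};

// Hypothetical helper: n lattice points, `inner` updated points per block,
// `halo` extra threads on each side of the block for stencil loads.
CoverageReport check_halo_launch(int n, int inner, int halo) {
    const int block_dim = inner + 2 * halo;        // threads per block, halo included
    const int grid_dim = (n + inner - 1) / inner;  // grid sized from the inner block only
    std::vector<int> updates(n, 0);
    bool out_of_range = false;

    for (int b = 0; b < grid_dim; ++b) {           // plays the role of blockIdx.x
        for (int t = 0; t < block_dim; ++t) {      // plays the role of threadIdx.x
            const int g = b * inner + t - halo;    // assumed global index mapping
            if (g < 0 || g >= n) {
                // A real kernel must guard, clamp, or wrap here before loading.
                out_of_range = true;
                continue;
            }
            const bool interior = (t >= halo) && (t < halo + inner);
            if (interior) ++updates[g];            // only interior threads write
        }
    }

    bool once = true;
    for (int count : updates) once = once && (count == 1);
    return {once, out_of_range};
}
```

Under these assumptions, the halo threads of the first and last blocks map to indices below 0 or at and above `n` whenever `halo > 0`, so any load those threads perform needs its own guard or wrap, separate from the guard on the update. Whether this applies to the kernel in question cannot be determined from the thread.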

Robert_Crovella · June 11, 2017, 5:03pm

cuda-memcheck can help with illegal/out-of-bounds accesses, for both global and shared memory.

zjw518 · June 11, 2017, 5:04pm

Like I said, memcheck runs perfectly every time.

Robert_Crovella · June 11, 2017, 5:38pm

If you access the (n+1)th element of a length-n array, cuda-memcheck will catch that. If cuda-memcheck reports no errors, it’s likely that your code is not making any out-of-bounds accesses.

If you Google “application received signal 139” you may find some things worth reading.
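For readers decoding the number: on Linux, shells report a process killed by a signal as exit status 128 plus the signal number, and 139 − 128 = 11 is SIGSEGV. The sketch below assumes nvprof's "signal 139" follows that shell convention, which the thread does not confirm; `signal_from_status` is a hypothetical helper name.

```cpp
#include <csignal>

// Hypothetical helper: returns the signal number encoded in a shell-style
// exit status (128 + signal), or 0 if the status is not of that form.
// Assumes the Linux range of signal numbers (1..64).
int signal_from_status(int status) {
    const bool looks_like_signal = (status > 128) && (status <= 128 + 64);
    return looks_like_signal ? status - 128 : 0;
}
```

If that reading is right, `signal_from_status(139)` compares equal to `SIGSEGV`, meaning a segmentation fault in a host process rather than an out-of-bounds access reported by the device.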

zjw518 · June 11, 2017, 5:42pm

Thanks, but I’ve read every single result on Google, and nothing has helped. From all I’ve seen, nothing would make sense other than this being a bug in nvprof, but I’m no expert, of course.

Robert_Crovella · June 12, 2017, 4:11am

Perhaps you should file a bug at developer.nvidia.com.

However, that is unlikely to go anywhere unless you can provide a reproducible test case.
