cuda-gdb accessing non-resident UVM memory crashes cuda-gdb

Attempting to access non-resident UVM memory from cuda-gdb causes cuda-gdb to crash. Here’s an example test program:

#include <stdio.h>

__global__ void VecInc(float*  vec,
                       int     N)
   int i = blockDim.x * blockIdx.x + threadIdx.x;
   if (i < N) {
      vec[i] = vec[i] + 1.0;

#define blockCount   256
#define numBlocks     20

int main(int argc, char** argv)
   cudaDeviceProp  p;
   cudaGetDeviceProperties(&p, 0);
   if (!p.managedMemory) {
      fprintf(stderr, "managed memory not supported\n");

   float*  vec;
   cudaMallocManaged((void**)&vec, numBlocks * blockCount * sizeof(float));

   VecInc<<<numBlocks, blockCount>>>(vec, numBlocks * blockCount);
   cudaError_t err = cudaGetLastError();
   if (err != cudaSuccess) {
              "kernel launch failure: %s (%d)\n",
              cudaGetErrorString(err), err);

   // Forgo the cudaThreadSynchronize and/or cudaDeviceSynchronize.  Oops!

   // Trigger a Bus error
   if (fabs(vec[0]) > 0.000001) {

   return 0;

And its Makefile:

NVCC = /usr/local/cuda/bin/nvcc
NVCCGEN = -gencode=arch=compute_53,code=\"sm_53,compute_53\"
NVCCFLAGS = $(NVCCGEN) --compiler-options -fno-strict-aliasing -DUNIX -g -G -rdc=true = $(NVCC) $(NVCCFLAGS) $(TARGET_ARCH) -c
NVLINK = $(NVCC) $(NVCCGEN) -rdc=true

it: it.o
	$(NVLINK) $^ $(LD35LIBS) -o $@

	$( -c $<

	rm -f it it.o

And an example cuda-gdb session:

/usr/local/cuda/bin/cuda-gdb ./it
NVIDIA (R) CUDA Debugger
7.0 release
Portions Copyright (C) 2007-2015 NVIDIA Corporation
GNU gdb (GDB) 7.6.2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-elf-linux-gnu".
For bug reporting instructions, please see:
Reading symbols from /mag1/todd/test/uvm/it...done.
(cuda-gdb) r
Starting program: /mag1/todd/test/uvm/./it 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/".
[New Thread 0x200129d1f0 (LWP 3105)]
[New Thread 0x20014be1f0 (LWP 3106)]

Program received signal CUDA_EXCEPTION_15, Invalid Managed Memory Access.
0x0000000000402e90 in main (argc=1, argv=0x7fffffc158) at
39         if (fabs(vec[0]) > 0.000001) {
(cuda-gdb) p vec[0]
Error: Failed to read 4 bytes of global memory from 0x1017e0000

(cuda-gdb) p vec[0]
Error: received unexpected signal: Broken pipe
BACKTRACE (1 frames):

When the “p vec[0]” command was attempted from cuda-gdb, the driver issued these errors on the console:

pgd = ffffffc0e2825000
[2001e6c000] *pgd=00000001651eb003, *pmd=0000000164666003, *pte=0000000000000000
Library at 0x2000459710: 0x20001a2000 /usr/lib/aarch64-linux-gnu/tegra/
Library at 0x20004594ec: 0x20001a2000 /usr/lib/aarch64-linux-gnu/tegra/
vdso base = 0x200001b000

This problem persits in R24.2. Although you will have to add “set cuda software_preemption on” before running the program.

Hi Todd Allen,

We’re investigating the this issue, and will try to have fix being included in coming release.
Once any clear schedule, I will post to you.
Please stay tuned.


This issue persists in L4T R24.2.1.

Tracking this with new email.


This issue is fixed in rel-27 @2017.
Would you mind to update your system with newer JetPack?