[Jetson-TK1] Bus error when using zero copy (solved)

The Jetson-TK1 GPU shares memory with the CPU, and I recently saw an article on how to avoid memory copying between the device memory and the host memory (http://arrayfire.com/zero-copy-on-tegra-k1/). However, executing the program gives me a SIGBUS error. The following is output from gdb:

(gdb) run -x 352 -y 288 -i foreman.yuv -o out.c63 -g -s 0
Starting program: /home/***/c63_power_profiler/source/c63enc -x 352 -y 288 -i foreman.yuv -o out.c63 -g -s 0
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
padw={352, 176, 176} padh={288, 144, 144}
[New Thread 0xb5743450 (LWP 9465)]
cudaSetDeviceFlags failed (cannot set while device is active in this process)
Resetting..
[Thread 0xb5743450 (LWP 9465) exited]
Success
Found 1 CUDA devices.
[New Thread 0xb5743450 (LWP 9466)]
Multiprocessorcount: 349000

Total number of macroblocks: 1584.000000 (44, 36).
Optimal number of threads per block is: 0.004539, with maximum threads per block: 0.
Reading...
Encoding frame 0, 
 (keyframe) 
GPU_INIT_FRAME
ME FRAME
MC FRAME
DCT FRAME

Program received signal SIGBUS, Bus error.
__memcpy_neon () at ../ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S:590
590     ../ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S: No such file or directory.
(gdb) bt
#0  __memcpy_neon () at ../ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S:590
#1  0xb5c25288 in ?? () from /usr/lib/arm-linux-gnueabihf/libcuda.so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

The SIGBUS does not appear if I use normal memory copying instead. I have made sure to perform the required operations, as described in the CUDA Runtime API documentation:

  1. cudaSetDeviceFlags(cudaDeviceMapHost). The gdb output above shows that this call fails the first time, but succeeds after a device reset.
  2. cudaHostAlloc(&ptr, size, cudaHostAllocMapped)
  3. cudaHostGetDevicePointer(&dPtr, ptr, 0)
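
For reference, the three steps above can be sketched as a small standalone program. This is only a minimal sketch, not the original c63enc code: the kernel scale, the CHECK macro, and the buffer names h_buf and d_buf are made up for illustration. It assumes that cudaSetDeviceFlags is the first CUDA runtime call in the process. The "cannot set while device is active in this process" message in the log above is what the runtime reports when the flags are set after the device has already been initialized; whether the reset that follows is related to the SIGBUS is a guess, not something the log confirms.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a readable message if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Illustrative kernel: multiplies every element in place.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main()
{
    const int n = 1024;

    // Step 1: request host-memory mapping before anything else touches
    // the device (assumption: no earlier CUDA call in this process).
    CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));

    // Step 2: allocate pinned host memory that the GPU can map.
    float *h_buf = NULL;
    CHECK(cudaHostAlloc((void **)&h_buf, n * sizeof(float),
                        cudaHostAllocMapped));

    // Step 3: get the device-side pointer for the same allocation.
    float *d_buf = NULL;
    CHECK(cudaHostGetDevicePointer((void **)&d_buf, h_buf, 0));

    // The CPU fills the buffer directly; no cudaMemcpy is involved.
    for (int i = 0; i < n; ++i)
        h_buf[i] = 1.0f;

    // The kernel works on the same memory through the device pointer.
    scale<<<(n + 255) / 256, 256>>>(d_buf, n, 2.0f);
    CHECK(cudaGetLastError());

    // Wait for the kernel to finish before the CPU reads the results.
    CHECK(cudaDeviceSynchronize());
    printf("h_buf[0] = %f\n", h_buf[0]);

    CHECK(cudaFreeHost(h_buf));
    return 0;
}
```

The cudaDeviceSynchronize() call matters here: with mapped memory there is no copy back to act as an implicit synchronization point, so the CPU has to wait explicitly before it touches the buffer again.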

I’m stumped as to what is happening… Any ideas? Is it a bug?

After some further experimentation, I concluded that the cudaSetDeviceFlags(cudaDeviceMapHost) call causes this problem. Without that call, zero copy works and the code seems to run fine.

Hi memstick,

Thanks for following up on this, it sounds like a pretty useful trick!

Hi memstick and krisrst,
I’m also getting the same kind of error when I run my code, but I don’t call cudaSetDeviceFlags() anywhere in it. Below is the relevant gdb output.

Program received signal SIGBUS, Bus error.
[Switching to Thread 0xf29662a0 (LWP 17866)]
0xf697cb2c in __memcpy_neon ()
    at ../ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S:595
595	../ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S: No such file or directory.
(gdb) bt
#0  0xf697cb2c in __memcpy_neon ()
    at ../ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S:595
#1  0xf5ef8c6c in ?? ()
   from /usr/lib/arm-linux-gnueabihf/tegra/libnvidia-glcore.so.23.2.0
#2  0xf5ef8f08 in ?? ()
   from /usr/lib/arm-linux-gnueabihf/tegra/libnvidia-glcore.so.23.2.0
#3  0xf5f0a096 in ?? ()
   from /usr/lib/arm-linux-gnueabihf/tegra/libnvidia-glcore.so.23.2.0
#4  0xf5f0a9b4 in ?? ()
   from /usr/lib/arm-linux-gnueabihf/tegra/libnvidia-glcore.so.23.2.0
#5  0xf5f86104 in ?? ()
   from /usr/lib/arm-linux-gnueabihf/tegra/libnvidia-glcore.so.23.2.0
#6  0xf5f8b052 in ?? ()
   from /usr/lib/arm-linux-gnueabihf/tegra/libnvidia-glcore.so.23.2.0
#7  0xf5c1ebb6 in ?? ()
   from /usr/lib/arm-linux-gnueabihf/tegra/libnvidia-glcore.so.23.2.0
#8  0xf5d17fe6 in ?? ()
   from /usr/lib/arm-linux-gnueabihf/tegra/libnvidia-glcore.so.23.2.0
#9  0xf5d18b4e in ?? ()
   from /usr/lib/arm-linux-gnueabihf/tegra/libnvidia-glcore.so.23.2.0
#10 0xf5d1911e in ?? ()
   from /usr/lib/arm-linux-gnueabihf/tegra/libnvidia-glcore.so.23.2.0
#11 0xf770f852 in glTexSubImage2D ()
---Type <return> to continue, or q <return> to quit---

I also looked at the backtrace but couldn’t find exactly where it is going wrong.
I’m trying to display an image using OpenGL, and this error occurs at random. Any help on this topic would be really appreciated.
Thanks in advance.