Unhandled instruction by Valgrind in libcuda

Hello,

I’m trying to debug a segmentation fault in one of my programs.

I ran it under Valgrind to check for memory issues, but Valgrind stops with an unhandled instruction error.

Here is the Valgrind output:

==10979== Warning: set address range perms: large range [0x3e4ee000, 0x5584c000) (defined)
ARM64 front end: load_store
disInstr(arm64): unhandled instruction 0xB8A18002
disInstr(arm64): 1011'1000 1010'0001 1000'0000 0000'0010
==10979== valgrind: Unrecognised instruction at address 0x68229f8.
==10979==    at 0x68229F8: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67A0BC3: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x697C643: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67E87CB: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x17ABE397: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x4BB33B7: __pthread_once_slow (pthread_once.c:116)
==10979==    by 0x17B0364B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AB4AAF: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AD950B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17A00F6F: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x400E8B3: call_init.part.0 (dl-init.c:72)
==10979==    by 0x400E9B3: call_init (dl-init.c:30)
==10979==    by 0x400E9B3: _dl_init (dl-init.c:119)
==10979== Your program just tried to execute an instruction that Valgrind
==10979== did not recognise.  There are two possible reasons for this.
==10979== 1. Your program has a bug and erroneously jumped to a non-code
==10979==    location.  If you are running Memcheck and you just saw a
==10979==    warning about a bad jump, it's probably your program's fault.
==10979== 2. The instruction is legitimate but Valgrind doesn't handle it,
==10979==    i.e. it's Valgrind's fault.  If you think this is the case or
==10979==    you are not sure, please let us know and we'll try to fix it.
==10979== Either way, Valgrind will now raise a SIGILL signal which will
==10979== probably kill your program.
==10979==
==10979== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==10979==   /path/to/gdb gst-launch-1.0
==10979== and then give GDB the following command
==10979==   target remote | /usr/lib/aarch64-linux-gnu/valgrind/../../bin/vgdb --pid=10979
==10979== --pid is optional if only one valgrind process is running
==10979==
==10979== valgrind: Unrecognised instruction at address 0x68229f8.
==10979==    at 0x68229F8: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67A0BC3: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x697C643: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67E87CB: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x17ABE397: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x4BB33B7: __pthread_once_slow (pthread_once.c:116)
==10979==    by 0x17B0364B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AB4AAF: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AD950B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17A00F6F: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x400E8B3: call_init.part.0 (dl-init.c:72)
==10979==    by 0x400E9B3: call_init (dl-init.c:30)
==10979==    by 0x400E9B3: _dl_init (dl-init.c:119)
==10979== Your program just tried to execute an instruction that Valgrind
==10979== did not recognise.  There are two possible reasons for this.
==10979== 1. Your program has a bug and erroneously jumped to a non-code
==10979==    location.  If you are running Memcheck and you just saw a
==10979==    warning about a bad jump, it's probably your program's fault.
==10979== 2. The instruction is legitimate but Valgrind doesn't handle it,
==10979==    i.e. it's Valgrind's fault.  If you think this is the case or
==10979==    you are not sure, please let us know and we'll try to fix it.
==10979== Either way, Valgrind will now raise a SIGILL signal which will
==10979== probably kill your program.
==10979==
==10979== Process terminating with default action of signal 4 (SIGILL)
==10979==  Illegal opcode at address 0x68229F8
==10979==    at 0x68229F8: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67A0BC3: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x697C643: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67E87CB: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x17ABE397: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x4BB33B7: __pthread_once_slow (pthread_once.c:116)
==10979==    by 0x17B0364B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AB4AAF: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AD950B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17A00F6F: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x400E8B3: call_init.part.0 (dl-init.c:72)
==10979==    by 0x400E9B3: call_init (dl-init.c:30)
==10979==    by 0x400E9B3: _dl_init (dl-init.c:119)
==10979== (action on fatal signal) vgdb me ...

I disassembled the program in gdb and the instruction at 0x68229F8 is this:

0x68229f8 swpa w1, w2, [x0]
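
As a cross-check that gdb and Valgrind are looking at the same instruction, the word Valgrind prints (0xB8A18002) can be decoded by hand. The sketch below assumes the field layout of the atomic swap (SWP) encoding from the Armv8.1 Large System Extensions: size in bits 31:30, acquire and release flags in bits 23 and 22, Rs in bits 20:16, Rn in bits 9:5 and Rt in bits 4:0. The helper name `decode_swp` is made up for illustration, and it only recognises the SWP family, not other atomics.

```python
def decode_swp(word):
    """Return assembly text for a SWP-family instruction word, or None."""
    # Fixed bits assumed for SWP: 29:27 = 111, 26:24 = 000, 21 = 1, 15:10 = 100000.
    if (word & 0x3F20FC00) != 0x38208000:
        return None

    size = (word >> 30) & 0x3     # 0 = byte, 1 = halfword, 2 = 32-bit, 3 = 64-bit
    acquire = (word >> 23) & 0x1  # "a" suffix
    release = (word >> 22) & 0x1  # "l" suffix
    rs = (word >> 16) & 0x1F      # register holding the value to store
    rn = (word >> 5) & 0x1F       # base address register
    rt = word & 0x1F              # register receiving the old memory value

    mnemonic = "swp"
    mnemonic += "a" if acquire else ""
    mnemonic += "l" if release else ""
    mnemonic += ("b", "h", "", "")[size]

    # Only the 64-bit form uses X data registers; register 31 is the zero register.
    prefix = "x" if size == 3 else "w"

    def data_reg(n):
        return prefix + ("zr" if n == 31 else str(n))

    # Register 31 as a base address means the stack pointer.
    base = "sp" if rn == 31 else "x%d" % rn
    return "%s %s, %s, [%s]" % (mnemonic, data_reg(rs), data_reg(rt), base)


print(decode_swp(0xB8A18002))  # prints: swpa w1, w2, [x0]
```

Under that assumed layout the word decodes to the same swpa that gdb shows, which points to reason 2 in the Valgrind message (a legitimate instruction this Valgrind build does not handle) rather than a jump into non-code memory.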

I searched for documentation about swpa but couldn’t find any.

Do you have any pointers about how I could debug this?

There has been no update from you for a while, so we assume this is no longer an issue.
We are therefore closing this topic. If you need further support, please open a new one.
Thanks

Hi,

Could you share a simple source sample that reproduces the issue, so we can check it in our environment?
Thanks.
