Unhandled instruction by Valgrind in libcuda

Hello,

I’m trying to debug a segmentation fault in one of my programs.

I ran it under Valgrind to check for memory issues, but Valgrind stops with an unhandled instruction error.

Here is the Valgrind output:

==10979== Warning: set address range perms: large range [0x3e4ee000, 0x5584c000) (defined)
ARM64 front end: load_store
disInstr(arm64): unhandled instruction 0xB8A18002
disInstr(arm64): 1011'1000 1010'0001 1000'0000 0000'0010
==10979== valgrind: Unrecognised instruction at address 0x68229f8.
==10979==    at 0x68229F8: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67A0BC3: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x697C643: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67E87CB: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x17ABE397: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x4BB33B7: __pthread_once_slow (pthread_once.c:116)
==10979==    by 0x17B0364B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AB4AAF: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AD950B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17A00F6F: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x400E8B3: call_init.part.0 (dl-init.c:72)
==10979==    by 0x400E9B3: call_init (dl-init.c:30)
==10979==    by 0x400E9B3: _dl_init (dl-init.c:119)
==10979== Your program just tried to execute an instruction that Valgrind
==10979== did not recognise.  There are two possible reasons for this.
==10979== 1. Your program has a bug and erroneously jumped to a non-code
==10979==    location.  If you are running Memcheck and you just saw a
==10979==    warning about a bad jump, it's probably your program's fault.
==10979== 2. The instruction is legitimate but Valgrind doesn't handle it,
==10979==    i.e. it's Valgrind's fault.  If you think this is the case or
==10979==    you are not sure, please let us know and we'll try to fix it.
==10979== Either way, Valgrind will now raise a SIGILL signal which will
==10979== probably kill your program.
==10979==
==10979== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==10979==   /path/to/gdb gst-launch-1.0
==10979== and then give GDB the following command
==10979==   target remote | /usr/lib/aarch64-linux-gnu/valgrind/../../bin/vgdb --pid=10979
==10979== --pid is optional if only one valgrind process is running
==10979==
==10979== valgrind: Unrecognised instruction at address 0x68229f8.
==10979==    at 0x68229F8: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67A0BC3: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x697C643: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67E87CB: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x17ABE397: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x4BB33B7: __pthread_once_slow (pthread_once.c:116)
==10979==    by 0x17B0364B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AB4AAF: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AD950B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17A00F6F: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x400E8B3: call_init.part.0 (dl-init.c:72)
==10979==    by 0x400E9B3: call_init (dl-init.c:30)
==10979==    by 0x400E9B3: _dl_init (dl-init.c:119)
==10979== Your program just tried to execute an instruction that Valgrind
==10979== did not recognise.  There are two possible reasons for this.
==10979== 1. Your program has a bug and erroneously jumped to a non-code
==10979==    location.  If you are running Memcheck and you just saw a
==10979==    warning about a bad jump, it's probably your program's fault.
==10979== 2. The instruction is legitimate but Valgrind doesn't handle it,
==10979==    i.e. it's Valgrind's fault.  If you think this is the case or
==10979==    you are not sure, please let us know and we'll try to fix it.
==10979== Either way, Valgrind will now raise a SIGILL signal which will
==10979== probably kill your program.
==10979==
==10979== Process terminating with default action of signal 4 (SIGILL)
==10979==  Illegal opcode at address 0x68229F8
==10979==    at 0x68229F8: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67A0BC3: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x697C643: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x67E87CB: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1)
==10979==    by 0x17ABE397: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x4BB33B7: __pthread_once_slow (pthread_once.c:116)
==10979==    by 0x17B0364B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AB4AAF: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17AD950B: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x17A00F6F: ??? (in /usr/lib/aarch64-linux-gnu/tegra/libnvbufsurftransform.so.1.0.0)
==10979==    by 0x400E8B3: call_init.part.0 (dl-init.c:72)
==10979==    by 0x400E9B3: call_init (dl-init.c:30)
==10979==    by 0x400E9B3: _dl_init (dl-init.c:119)
==10979== (action on fatal signal) vgdb me ...

I disassembled the program in gdb and the instruction at 0x68229F8 is this:

0x68229f8 swpa w1, w2, [x0]

I searched for documentation about swpa but couldn’t find any.
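
As a cross-check on the disassembly, the raw word that Valgrind printed (0xB8A18002) can be decoded by hand. The following is only a minimal sketch: it assumes the bit layout of the A64 SWP family from the ARMv8.1 atomics (LSE) extension, and the function name decode_swp is made up for illustration. It does not explain the segmentation fault; it only suggests that the word is a legitimate instruction rather than a jump into non-code memory.

```python
def decode_swp(word: int) -> str:
    """Decode a 32-bit A64 word as an SWP-family instruction.

    Assumed layout (A64 SWP, atomics extension):
      [31:30] size  [29:24] 111000  [23] A  [22] R  [21] 1
      [20:16] Rs    [15:10] 100000  [9:5] Rn  [4:0] Rt
    Raises ValueError if the fixed bits do not match.
    """
    fixed_mask = 0x3F20FC00   # bits 29:24, bit 21, bits 15:10
    fixed_value = 0x38208000  # 111000 .. 1 .. 100000
    if (word & fixed_mask) != fixed_value:
        raise ValueError(f"0x{word:08X} is not an SWP-family encoding")

    size = (word >> 30) & 0x3
    acquire = (word >> 23) & 0x1
    release = (word >> 22) & 0x1
    rs = (word >> 16) & 0x1F
    rn = (word >> 5) & 0x1F
    rt = word & 0x1F

    # size 0 -> byte, 1 -> halfword, 2 -> 32-bit, 3 -> 64-bit
    width_suffix = {0: "b", 1: "h", 2: "", 3: ""}[size]
    mnemonic = "swp" + ("a" if acquire else "") + ("l" if release else "") + width_suffix

    # Data registers are X only for the 64-bit form; 31 means the zero register.
    prefix = "x" if size == 3 else "w"

    def data_reg(n: int) -> str:
        return f"{prefix}zr" if n == 31 else f"{prefix}{n}"

    # The base register is always 64-bit; 31 means the stack pointer.
    base = "sp" if rn == 31 else f"x{rn}"
    return f"{mnemonic} {data_reg(rs)}, {data_reg(rt)}, [{base}]"


print(decode_swp(0xB8A18002))  # prints: swpa w1, w2, [x0]
```

The result matches what gdb shows, which points at reason 2 in the Valgrind message (a valid instruction that this Valgrind build does not handle) rather than a bad jump.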

Do you have any pointers about how I could debug this?

There has been no update from you for a while, so we assume this is no longer an issue.
We are therefore closing this topic. If you need further support, please open a new one.
Thanks

Hi,

Could you share a simple source sample that reproduces the issue so we can check it in our environment?
Thanks.