Possible bug inside libnvfnet.so or nearby


I built my latency-measuring app with Clang's AddressSanitizer enabled and got this report:

==8454==ERROR: AddressSanitizer: alloc-dealloc-mismatch (malloc vs operator delete) on 0x007faed05a90
    #0 0x4e0a28 in operator delete(void*, unsigned long) /home/tcwg-buildslave/workspace/tcwg-llvm-release/tcwg-amp/final/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:172:3

0x007faed05a90 is located 0 bytes inside of 40-byte region [0x007faed05a90,0x007faed05ab8)
allocated by thread T0 here:
    #0 0x4b2674 in malloc /home/tcwg-buildslave/workspace/tcwg-llvm-release/tcwg-amp/final/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x7fad02a830 in fnet::String::String(char const*) (/usr/lib/aarch64-linux-gnu/tegra/libnvfnet.so+0x7830)
    #2 0x7fb330db34  (/lib/ld-linux-aarch64.so.1+0xdb34)
    #3 0x7fb3311cd4  (/lib/ld-linux-aarch64.so.1+0x11cd4)
    #4 0x7fb29a4690 in _dl_catch_exception /build/glibc-D9JkfM/glibc-2.27/elf/dl-error-skeleton.c:196
    #5 0x7fb3311414  (/lib/ld-linux-aarch64.so.1+0x11414)
    #6 0x7fb2bf1010 in dlopen_doit /build/glibc-D9JkfM/glibc-2.27/dlfcn/dlopen.c:66
    #7 0x7fb29a4690 in _dl_catch_exception /build/glibc-D9JkfM/glibc-2.27/elf/dl-error-skeleton.c:196
    #8 0x7fb29a4734 in _dl_catch_error /build/glibc-D9JkfM/glibc-2.27/elf/dl-error-skeleton.c:215
    #9 0x7fb2bf277c in _dlerror_run /build/glibc-D9JkfM/glibc-2.27/dlfcn/dlerror.c:162
    #10 0x7fb2bf10e4 in dlopen /build/glibc-D9JkfM/glibc-2.27/dlfcn/dlopen.c:87
    #11 0x48aecc in dlopen /home/tcwg-buildslave/workspace/tcwg-llvm-release/tcwg-amp/final/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:6251:15
    #12 0x7fb3148a64  (/usr/lib/aarch64-linux-gnu/libv4l2.so.0+0x7a64)
    #13 0x5731d4 in NvVideoDecoder::NvVideoDecoder(char const*, int) /mnt/data_nvme/projects/uavTech2/Experimental/jetson_samples/deps/common/classes/NvVideoDecoder.cpp:69:6
    #14 0x573240 in NvVideoDecoder::createVideoDecoder(char const*, int) /mnt/data_nvme/projects/uavTech2/Experimental/jetson_samples/deps/common/classes/NvVideoDecoder.cpp:76:31
    #15 0x4e3f4c in DecoderContext::DecoderContext(int, int) /mnt/data_nvme/projects/uavTech2/Experimental/jetson_samples/JetLagTest/main.cpp:38:19
    #16 0x4e2734 in RunDecoder(std::__1::basic_string_view<char, std::__1::char_traits<char> >) /mnt/data_nvme/projects/uavTech2/Experimental/jetson_samples/JetLagTest/main.cpp:264:17
    #17 0x4e3498 in main /mnt/data_nvme/projects/uavTech2/Experimental/jetson_samples/JetLagTest/main.cpp:303:3
    #18 0x7fb28b971c in __libc_start_main /build/glibc-D9JkfM/glibc-2.27/csu/../csu/libc-start.c:310
    #19 0x442824 in _start /home/tcwg-buildslave/workspace/tcwg-make-release/builder_arch/amd64/label/tcwg-x86_64-build/target/aarch64-linux-gnu/snapshots/glibc.git~release~2.25~master/csu/../sysdeps/aarch64/start.S:83

SUMMARY: AddressSanitizer: alloc-dealloc-mismatch /home/tcwg-buildslave/workspace/tcwg-llvm-release/tcwg-amp/final/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:172:3 in operator delete(void*, unsigned long)

The backtrace between frames #12 and #13 is not precise enough: the NvV4l2Element::NvV4l2Element constructor should appear in between. GDB shows that this happens inside the v4l2_open call, i.e. frame #12 is actually v4l2_open inside libv4l2.so. I don’t have the source for the Tegra multimedia drivers or libnvfnet, so I can’t fix this myself.

Could you please verify whether this is a bug or a false positive?

Do you hit a system crash or a memory leak? Please share more details about the issue you observe.

For reference, please also share the release version:

cat /etc/nv_tegra_release

No, there is no crash or memory leak as far as I can see. This error means some code uses operator delete to deallocate memory that was allocated with malloc (or vice versa). It is still a serious issue because:

  • The C++ standard makes this undefined behavior.
  • These errors interfere with using AddressSanitizer, a very important tool for detecting bugs in software.
  • Even if the current implementation of operator delete doesn’t crash in this scenario, there is no guarantee that this won’t change in the future.
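
To make the error class concrete, here is a minimal sketch of the pairing rule. The Payload type and both helper functions are hypothetical and only mimic the 40-byte region from the report; the real layout of fnet::String is unknown to me. The mismatched call is left as a comment so the snippet itself stays well defined.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical 40-byte payload, sized like the region in the report purely
// for illustration; this is NOT the real fnet::String layout.
struct Payload {
    char text[40];
};

// Allocates with malloc, like frame #1 of the "allocated by" stack.
Payload* make_payload(const char* s) {
    assert(s != nullptr);
    void* raw = std::malloc(sizeof(Payload));
    if (raw == nullptr) {
        return nullptr;
    }
    Payload* p = static_cast<Payload*>(raw);
    // Copy at most 39 characters and always terminate the buffer.
    std::strncpy(p->text, s, sizeof(p->text) - 1);
    p->text[sizeof(p->text) - 1] = '\0';
    return p;
}

// Correct pairing: memory obtained from malloc is returned through free.
void release_payload(Payload* p) {
    std::free(p);
}

// Mismatched pairing, shown only as a comment because it is undefined
// behavior and is what ASan reports as alloc-dealloc-mismatch:
//     delete p;   // operator delete on a pointer that came from malloc
```

If the commented-out `delete p;` replaced the `std::free(p)` call, a build with -fsanitize=address should flag it the same way as the report above.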

The issue can be easily reproduced with the Jetson multimedia samples. Steps to reproduce on a Jetson device:

$ sudo apt install clang-10
$ cp -r /usr/src/jetson_multimedia_api /tmp
$ cp clang_with_sanitizers.patch /tmp
$ cd /tmp/jetson_multimedia_api
$ patch -p1 < ../clang_with_sanitizers.patch
$ cd samples/00_video_decode
$ make -j 4
$ ln -s /usr/bin/llvm-symbolizer-10 /tmp/llvm-symbolizer
$ env ASAN_SYMBOLIZER_PATH=/tmp/llvm-symbolizer ./video_decode H264 --disable-rendering ~/data/jellyfish-5-mbps-hd-h264.mkv

My tegra version:

$ cat /etc/nv_tegra_release
# R32 (release), REVISION: 6.1, GCID: 27863751, BOARD: t210ref, EABI: aarch64, DATE: Mon Jul 26 19:20:30 UTC 2021

patch: clang_with_sanitizers.patch (909 Bytes)

Please apply this and try again:
Memory Leak (Alloc/free mismatch) in Tegra multimedia API (encoder) - #6 by DaneLLL

It is a known issue and this seems related. Please give it a try.
