I'm trying to use the profiling_injection sample from CUPTI to profile NVIDIA Triton server, or trtexec from the TensorRT project, with one of the following commands:
sudo env LD_PRELOAD=/usr/local/cuda/extras/CUPTI/samples/profiling_injection/libinjection_2.so ./build/tritonserver/build/server/build/tritonserver/install/bin/tritonserver (run from the Triton server directory, version 21.07)
or
sudo env LD_PRELOAD=/usr/local/cuda/extras/CUPTI/samples/profiling_injection/libinjection_2.so ./trtexec (run from TensorRT-7.2.3.4/bin/)
(I have added the corresponding directory to LD_LIBRARY_PATH.)
Both commands end with a segmentation fault:
7978 segmentation fault sudo env
or
7847 segmentation fault sudo env ./trtexec
Running under gdb gives the following call stack:
(gdb) bt
#0 0x00007fc145f30e1a in ?? () from /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.11.4
#1 0x00007fc145c0ee85 in ?? () from /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.11.4
#2 0x00007fc145c0d516 in ?? () from /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.11.4
#3 0x00007fc145c0daa5 in ?? () from /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.11.4
#4 0x00007fc145bbbfc4 in cuptiEnableCallback () from /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.11.4
#5 0x00007fc148d1fde0 in register_callbacks() () from /usr/local/cuda/extras/CUPTI/samples/profiling_injection/libinjection_2.so
#6 0x00007fc148d20093 in InitializeInjection () from /usr/local/cuda/extras/CUPTI/samples/profiling_injection/libinjection_2.so
#7 0x00007fc148d20152 in dlsym () from /usr/local/cuda/extras/CUPTI/samples/profiling_injection/libinjection_2.so
#8 0x00007fc0fcae9086 in ?? () from /usr/local/cuda/targets/x86_64-linux/lib/libcublasLt.so.11
#9 0x00007fc148fda8d3 in call_init (env=0x7ffc183fa258, argv=0x7ffc183fa248, argc=1, l=) at dl-init.c:72
#10 _dl_init (main_map=0x7fc1491f5170, argc=1, argv=0x7ffc183fa248, env=0x7ffc183fa258) at dl-init.c:119
#11 0x00007fc148fcb0ca in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#12 0x0000000000000001 in ?? ()
#13 0x00007ffc183fb657 in ?? ()
#14 0x0000000000000000 in ?? ()
No segmentation fault occurs when I comment out the following two calls, at lines 446 and 448 of injection_2.cpp:
CUPTI_API_CALL(cuptiEnableCallback(1, subscriber, CUPTI_CB_DOMAIN_DRIVER_API, CUPTI_DRIVER_TRACE_CBID_cuLaunchKernel));
CUPTI_API_CALL(cuptiEnableCallback(1, subscriber, CUPTI_CB_DOMAIN_RESOURCE, CUPTI_CBID_RESOURCE_CONTEXT_CREATED));
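To find out which of the two calls is actually responsible, the registration could be made switchable at run time rather than by recompiling. The following is only a minimal sketch of that idea, not the sample's actual code: CallbackToggle, registerCallbacks, isEnabled, and the environment variable mechanism are hypothetical names I made up, and each enable function is a stand-in for one of the CUPTI_API_CALL(cuptiEnableCallback(...)) lines.

```cpp
#include <cstdlib>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for one cuptiEnableCallback(...) call in register_callbacks().
struct CallbackToggle {
    std::string name;              // label printed in the log
    std::string envVar;            // hypothetical variable; set it to "0" to skip this callback
    std::function<void()> enable;  // in the real sample this would wrap CUPTI_API_CALL(cuptiEnableCallback(...))
};

// Returns true unless the environment variable is set to exactly "0".
bool isEnabled(const std::string& envVar) {
    const char* value = std::getenv(envVar.c_str());
    return value == nullptr || std::string(value) != "0";
}

// Enables every callback that is not switched off and returns the names that were enabled.
std::vector<std::string> registerCallbacks(const std::vector<CallbackToggle>& toggles) {
    std::vector<std::string> enabled;
    for (const auto& toggle : toggles) {
        if (!isEnabled(toggle.envVar)) {
            std::cerr << "skipping " << toggle.name << '\n';
            continue;
        }
        // Log before the call so that a crash inside enable() can be attributed to it.
        std::cerr << "enabling " << toggle.name << '\n';
        toggle.enable();
        enabled.push_back(toggle.name);
    }
    return enabled;
}
```

With something like this in place, each of the two callbacks could be disabled separately from the command line (next to LD_PRELOAD in the env invocation), and the last "enabling" line printed before the crash would show which call faults.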
Does this mean that CUPTI callbacks cannot be used to profile NVIDIA Triton server or a TensorRT engine? If they can, how should I use CUPTI to profile Triton server?
The GPU is a V100, the driver version is 470.42.01, and the CUDA version is 11.4.
Thank you.