I am trying to profile the kernels of my TensorRT engine with Nsight Compute on an AGX Orin. I run the profile through the trtexec executable, and the engine depends on a plugin library, but NCU fails to launch the task:

Jetson AGX Orin Developer Kit - Jetpack 5.1.2 (L4T 35.4.1)

Command: sudo /opt/nvidia/nsight-compute/2022.2.1/ncu /usr/src/tensorrt/bin/trtexec --plugins=/usr/path/to/lib/plugin_lib.so --useSpinWait --iterations=10 --duration=0 --loadEngine=my.engine

Error log:

[02/27/2024-11:53:36] [I] TensorRT version: 8.5.2
[02/27/2024-11:53:36] [I] Loading supplied plugin library: /usr/path/to/lib/plugin_lib.so
[02/27/2024-11:53:36] [E] Could not load plugin library: /usr/path/to/lib/plugin_lib.so, due to: /lib/aarch64-linux-gnu/libgomp.so.1: cannot allocate memory in static TLS block
[02/27/2024-11:53:36] [I] Engine loaded in 0.00654357 sec.
[02/27/2024-11:53:37] [I] [TRT] Loaded engine size: 11 MiB
[02/27/2024-11:53:37] [W] [TRT] Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
[02/27/2024-11:53:37] [E] Error[1]: [pluginV2Runner.cpp::load::299] Error Code 1: Serialization (Serialization assertion creator failed.Cannot deserialize plugin since corresponding IPluginCreator not found in Plugin Registry)
[02/27/2024-11:53:37] [E] Error[4]: [runtime.cpp::deserializeCudaEngine::65] Error Code 4: Internal Error (Engine deserialization failed.)
[02/27/2024-11:53:37] [E] Engine deserialization failed
[02/27/2024-11:53:37] [E] Got invalid engine!
[02/27/2024-11:53:37] [E] Inference set up failed

What I have already tried:
Searching the web for the error /lib/aarch64-linux-gnu/libgomp.so.1: cannot allocate memory in static TLS block, I found the suggestion to set the LD_PRELOAD environment variable to libgomp.so.1 plus my plugin library. I tried that, but the error did not change.

This could be related to missing support for DT_RUNPATH in the version of Nsight Compute you are using. This was resolved in version 2023.1.1, so I would recommend switching to a newer version of the tool. You could also check whether adding the directory that contains your plugin to the LD_LIBRARY_PATH environment variable works around the problem.

If that doesn’t help, you can set the environment variable LD_DEBUG=libs for further debugging, i.e. to understand why the shared object is not found.
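
As a rough sketch of both suggestions combined, something like the following could be used. The helper name debug_launch and the PLUGIN_DIR and DRY_RUN variables are assumptions made for illustration, not part of Nsight Compute or trtexec; the plugin directory is taken from the path in the question. By default the helper only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: debug_launch, PLUGIN_DIR and DRY_RUN are hypothetical names.
# Prefixes a command with `env` so the plugin directory is on
# LD_LIBRARY_PATH and the glibc loader traces its library search.

debug_launch() {
    # Directory holding plugin_lib.so (path taken from the question).
    plugin_dir="${PLUGIN_DIR:-/usr/path/to/lib}"
    # Prepend the plugin directory, keeping any existing search path.
    lib_path="$plugin_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    # LD_DEBUG=libs makes the loader report where it looks for each library.
    set -- env "LD_LIBRARY_PATH=$lib_path" LD_DEBUG=libs "$@"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"      # dry run: print the command instead of running it
    else
        "$@"           # real run: loader trace goes to stderr
    fi
}

# Print the command that would be run for the engine from the question.
debug_launch ncu /usr/src/tensorrt/bin/trtexec --loadEngine=my.engine
```

Setting DRY_RUN=0 runs the command instead of printing it. If the profiler has to run under sudo, the variables need to be set on the far side of sudo (for example by putting sudo in front of env), since sudo normally does not pass LD_* variables through.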

Thanks for your reply! I have just discovered that, because sudo prunes user-defined environment variables, my LD_PRELOAD was never actually passed through. Running sudo --preserve-env=LD_PRELOAD LD_PRELOAD=/lib/aarch64-linux-gnu/libgomp.so.1 ncu ... solves the problem.
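
For anyone who wants to reuse this, here is a minimal sketch that composes the same sudo invocation. The helper name preload_cmd is made up for this example, and the trtexec arguments in the usage line are placeholders standing in for whatever follows ncu in your own command. It only prints the command; it does not run sudo.

```shell
#!/bin/sh
# Sketch only: preload_cmd is a hypothetical helper, not an NVIDIA tool.
# libgomp path as reported in the error log above.
GOMP_LIB=/lib/aarch64-linux-gnu/libgomp.so.1

preload_cmd() {
    # --preserve-env=LD_PRELOAD asks sudo to keep that one variable;
    # the assignment after it gives the value for the command sudo runs.
    echo "sudo --preserve-env=LD_PRELOAD LD_PRELOAD=$GOMP_LIB $*"
}

# Print the full command for review before copying it into a terminal.
preload_cmd ncu /usr/src/tensorrt/bin/trtexec --loadEngine=my.engine
```

Printing rather than executing keeps the helper harmless on machines without sudo or ncu, and makes it easy to check that LD_PRELOAD ends up after sudo rather than before it.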

