Hello,
I am trying to launch and profile the kernels of my TensorRT engine with Nsight Compute on an AGX Orin. I profile using the trtexec executable, and my TRT engine requires a plugin library, but NCU fails to launch the task:
Platform:
Jetson AGX Orin Developer Kit - Jetpack 5.1.2 (L4T 35.4.1)
Nsight Compute 2022.2.1
Command: sudo /opt/nvidia/nsight-compute/2022.2.1/ncu /usr/src/tensorrt/bin/trtexec --plugins=/usr/path/to/lib/plugin_lib.so --useSpinWait --iterations=10 --duration=0 --loadEngine=my.engine
Error log:
[02/27/2024-11:53:36] [I] TensorRT version: 8.5.2
[02/27/2024-11:53:36] [I] Loading supplied plugin library: /usr/path/to/lib/plugin_lib.so
[02/27/2024-11:53:36] [E] Could not load plugin library: /usr/path/to/lib/plugin_lib.so, due to: /lib/aarch64-linux-gnu/libgomp.so.1: cannot allocate memory in static TLS block
[02/27/2024-11:53:36] [I] Engine loaded in 0.00654357 sec.
[02/27/2024-11:53:37] [I] [TRT] Loaded engine size: 11 MiB
[02/27/2024-11:53:37] [W] [TRT] Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
[02/27/2024-11:53:37] [E] Error[1]: [pluginV2Runner.cpp::load::299] Error Code 1: Serialization (Serialization assertion creator failed.Cannot deserialize plugin since corresponding IPluginCreator not found in Plugin Registry)
[02/27/2024-11:53:37] [E] Error[4]: [runtime.cpp::deserializeCudaEngine::65] Error Code 4: Internal Error (Engine deserialization failed.)
[02/27/2024-11:53:37] [E] Engine deserialization failed
[02/27/2024-11:53:37] [E] Got invalid engine!
[02/27/2024-11:53:37] [E] Inference set up failed
What I have already tried:
Searching the web for the error /lib/aarch64-linux-gnu/libgomp.so.1: cannot allocate memory in static TLS block, I found the suggestion of setting the LD_PRELOAD environment variable to libgomp.so.1 plus my plugin library, but it does not change the error.
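Concretely, the invocation I tried looked roughly like the sketch below (paths shortened as in the command above). Since ncu runs under sudo, which does not forward most environment variables by default, I passed LD_PRELOAD through sudo env so that it reaches the trtexec child process:

# LD_PRELOAD takes a colon-separated list; sudo env forwards it to the child
sudo env LD_PRELOAD=/lib/aarch64-linux-gnu/libgomp.so.1:/usr/path/to/lib/plugin_lib.so \
  /opt/nvidia/nsight-compute/2022.2.1/ncu /usr/src/tensorrt/bin/trtexec \
  --plugins=/usr/path/to/lib/plugin_lib.so --useSpinWait --iterations=10 \
  --duration=0 --loadEngine=my.engine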