Could not find any implementation for node {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]}

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): NVIDIA RTX A5000
• DeepStream Version: 6.3
• TensorRT Version: 8.6.1.6
• CUDA Version: 12.2
• NVIDIA GPU Driver Version (valid for GPU only): 535.129.03
• Issue Type (questions, new requirements, bugs): bugs
• How to reproduce the issue? (For bugs: include which sample app is used, the content of the configuration files, the command line used, and other details for reproducing)
• Requirement details (For new requirements: include the module name, i.e. for which plugin or which sample application, and the function description)

Hello,

I’m trying to use an ONNX model in DeepStream, but building the TensorRT engine fails for any batch size greater than one, with the following error:

cuda_codegen.hpp:604: DCHECK(it != shape_name_map_.end()) failed. Shape tensor is not found in shape_name_map: __mye85697-HOST-(i64[1][1]so[0]p[0], mem_prop=100)
cuda_codegen.hpp:604: DCHECK(it != shape_name_map_.end()) failed. Shape tensor is not found in shape_name_map: __mye85697-HOST-(i64[1][1]so[0]p[0], mem_prop=100)
cuda_codegen.hpp:604: DCHECK(it != shape_name_map_.end()) failed. Shape tensor is not found in shape_name_map: __mye85660-HOST-(i64[1][1]so[0]p[0], mem_prop=100)
ERROR: [TRT]: 10: Could not find any implementation for node {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]}.
ERROR: [TRT]: 10: [optimizer.cpp::computeCosts::3869] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]}.)
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:1124 Build engine failed from config file
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:816 failed to build trt engine.
0:05:14.693655320   332 0x5624d87c0090 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2022> [UID = 1]: build engine file failed
0:05:14.856545171   332 0x5624d87c0090 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2108> [UID = 1]: build backend context failed
0:05:14.856564079   332 0x5624d87c0090 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1282> [UID = 1]: generate backend failed, check config file settings
0:05:14.856579200   332 0x5624d87c0090 WARN                 nvinfer gstnvinfer.cpp:898:gst_nvinfer_start:<nvinfer0> error: Failed to create NvDsInferContext instance
0:05:14.856582381   332 0x5624d87c0090 WARN                 nvinfer gstnvinfer.cpp:898:gst_nvinfer_start:<nvinfer0> error: Config file path: /build/TestModels/cltr/config_infer_crowd_count_cltr.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

I increased workspace-size to 6144 (MB), but it didn’t fix the issue.
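For reference, the workspace size is set in the [property] section of the nvinfer config file; a minimal excerpt (all other keys omitted) looks roughly like this:

```
[property]
# Engine build workspace in MB
workspace-size=6144
```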

I also ran two other experiments with different DeepStream and TensorRT versions.

Experiment #1

Using nvcr.io/nvidia/deepstream:6.3-samples docker image

  • DeepStream Version: 6.3

  • TensorRT Version: 8.5.3.1

The engine build failed with the same error when using the default workspace-size, but succeeded after setting workspace-size to 6144.

Experiment #2

Using nvcr.io/nvidia/deepstream:6.4-samples-multiarch docker image

  • DeepStream Version: 6.4

  • TensorRT Version: 8.6.1.6

The engine build failed with the same error both with the default workspace-size and with workspace-size set to 6144.

You can find the model here.

This seems to be a model-related issue. Have you tried building the engine with trtexec?

The command may look like this:
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --minShapes=samples:1x3x768x1024 --optShapes=samples:4x3x768x1024 --maxShapes=samples:4x3x768x1024 --fp16 --saveEngine=./model.onnx_fp16_b4.engine

Please also upload your nvinfer configuration file.

Here is the nvinfer configuration file: config.txt (685 Bytes)

The build failed with trtexec, but I was able to build the engine with the TensorRT Python API.

Here is the Python script I used to generate the engine:
ONNX_to_tensorRT.txt (6.4 KB)
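(The script follows the standard TensorRT 8.x Python build flow with an explicit optimization profile; a rough, untested sketch of that flow is below. The input name "samples" and the 3x768x1024 shape are taken from the trtexec command above; file names and everything else are illustrative, not the actual attached script.)

```python
# Hypothetical sketch of a dynamic-batch engine build with the
# TensorRT 8.x Python API. Requires a TensorRT install and a GPU.
import tensorrt as trt

LOGGER = trt.Logger(trt.Logger.VERBOSE)

def build_engine(onnx_path="model.onnx", engine_path="model_fp16_b4.engine"):
    builder = trt.Builder(LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("ONNX parse failed")

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)
    # 6 GiB workspace, matching the workspace-size experiments above
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 6 << 30)

    # Dynamic batch: min 1, opt/max 4, as in the trtexec command
    profile = builder.create_optimization_profile()
    profile.set_shape("samples",
                      min=(1, 3, 768, 1024),
                      opt=(4, 3, 768, 1024),
                      max=(4, 3, 768, 1024))
    config.add_optimization_profile(profile)

    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("Engine build failed")
    with open(engine_path, "wb") as f:
        f.write(serialized)
```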

I could not reproduce the issue on our devices with your configuration file.

With TensorRT version 8.6.1.6?

With the two Docker containers you mentioned earlier in this thread.

I tried generating the engine with different batch sizes in the nvcr.io/nvidia/deepstream:6.4-samples-multiarch docker container, and it only worked with batch-size=1.

Here is the pipeline I used

gst-launch-1.0 uridecodebin uri=file:///video_file.mp4 ! muxer.sink_0 nvstreammux name=muxer width=1280 height=720 batch-size=1 ! nvinfer config-file-path=config.txt ! nvvideoconvert ! fakesink
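For the larger batch sizes I raised the muxer batch size accordingly; a sketch of the batch-4 variant (assuming the batch-size in config.txt is also set to 4 to match) would be:

```
gst-launch-1.0 uridecodebin uri=file:///video_file.mp4 ! muxer.sink_0 nvstreammux name=muxer width=1280 height=720 batch-size=4 ! nvinfer config-file-path=config.txt ! nvvideoconvert ! fakesink
```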

It works with batch-size=4 on our side; no issue was found with the nvcr.io/nvidia/deepstream:6.4-samples-multiarch docker container.

I enabled verbose logging in trtexec and got the following logs:

[03/13/2024-14:40:46] [V] [TRT] =============== Computing costs for {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]}
[03/13/2024-14:40:46] [V] [TRT] *************** Autotuning format combination: Bool(64,8,1), Float(16384,64,8,1), Float(4096,512,64,1), Float(4096,512,64,1), Float(4096,512,64,1), Float(4096,512,64,1), Float((* 768 N),64,1), Float((* 768 N),64,1), Float((* 768 N),64,1), Float((* 768 N),64,1), Float(1400,2,1) -> Float(25200,2100,3,1) where E0=(* 768 N) ***************
[03/13/2024-14:40:46] [V] [TRT] --------------- Timing Runner: {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]} (Myelin[0x80000023])
cuda_codegen.hpp:604: DCHECK(it != shape_name_map_.end()) failed. Shape tensor is not found in shape_name_map: __mye85777-HOST-(i64[1][1]so[0]p[0], mem_prop=100)
[03/13/2024-14:42:08] [V] [TRT] Skipping tactic 0x0000000000000000 due to exception No Myelin Error exists
[03/13/2024-14:42:08] [V] [TRT] {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]} (Myelin[0x80000023]) profiling completed in 82.4017 seconds. Fastest Tactic: 0xd15ea5edd15ea5ed Time: inf
[03/13/2024-14:42:08] [V] [TRT] *************** Autotuning format combination: Bool(64,8,1), Half(16384,64,8,1), Half(4096,512,64,1), Half(4096,512,64,1), Half(4096,512,64,1), Half(4096,512,64,1), Half((* 768 N),64,1), Half((* 768 N),64,1), Half((* 768 N),64,1), Half((* 768 N),64,1), Half(1400,2,1) -> Half(25200,2100,3,1) where E0=(* 768 N) ***************
[03/13/2024-14:42:08] [V] [TRT] --------------- Timing Runner: {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]} (Myelin[0x80000023])
cuda_codegen.hpp:604: DCHECK(it != shape_name_map_.end()) failed. Shape tensor is not found in shape_name_map: __mye85777-HOST-(i64[1][1]so[0]p[0], mem_prop=100)
[03/13/2024-14:43:02] [V] [TRT] Skipping tactic 0x0000000000000000 due to exception No Myelin Error exists
[03/13/2024-14:43:02] [V] [TRT] {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]} (Myelin[0x80000023]) profiling completed in 54.4271 seconds. Fastest Tactic: 0xd15ea5edd15ea5ed Time: inf
[03/13/2024-14:43:02] [V] [TRT] *************** Autotuning format combination: Bool(64,8,1), Half(2048,1:8,256,32), Half(512,1:8,64,1), Half(512,1:8,64,1), Half(512,1:8,64,1), Half(512,1:8,64,1), Float((* 768 N),64,1), Float((* 768 N),64,1), Float((* 768 N),64,1), Float((* 768 N),64,1), Float(1400,2,1) -> Half(4200,1:8,6,2) where E0=(* 768 N) ***************
[03/13/2024-14:43:02] [V] [TRT] --------------- Timing Runner: {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]} (Myelin[0x80000023])
cuda_codegen.hpp:604: DCHECK(it != shape_name_map_.end()) failed. Shape tensor is not found in shape_name_map: __mye85740-HOST-(i64[1][1]so[0]p[0], mem_prop=100)
[03/13/2024-14:43:56] [V] [TRT] Skipping tactic 0x0000000000000000 due to exception No Myelin Error exists
[03/13/2024-14:43:56] [V] [TRT] {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]} (Myelin[0x80000023]) profiling completed in 53.0703 seconds. Fastest Tactic: 0xd15ea5edd15ea5ed Time: inf
[03/13/2024-14:43:56] [V] [TRT] Deleting timing cache: 269 entries, served 197 hits since creation.
[03/13/2024-14:43:56] [E] Error[10]: Could not find any implementation for node {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]}.
[03/13/2024-14:43:56] [E] Error[10]: [optimizer.cpp::computeCosts::3869] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[onnx::MatMul_8444 + (Unnamed Layer* 1931) [Shuffle].../ScatterND_14]}.)

We could not reproduce the error.