As Matt explained, we are not able to upgrade TensorRT: this model is destined for a device which has already been through substantial compliance testing on JetPack 5.1, and the customer cannot update to JetPack 6.1 (the only version which provides TensorRT 10). We can confirm that the model does currently execute with TensorRT 10, however. Can you please advise on how we might make this model work with JetPack 5.1 (i.e. TensorRT 8)?
Confirmed that we can reproduce the same issue locally.
Based on the output, we suspect this is related to a known issue that is fixed in JetPack 6.
As we need more time to verify the cause, are you able to run the converter a few more times?
The known issue has a roughly 50% failure rate (it is time-related), so it should be possible to get and serialize a working engine if you try multiple times.
Thanks @AastaLLL - it’s great that you can reproduce. Would you mind pointing us to the relevant statement in the release notes which shows the issue is known and resolved? This will help us build a case with our customer to upgrade to JetPack 6.x in the medium term. In the short term they will not be able to upgrade as they have been through significant compliance testing with JetPack 5.1, so we will need to find a workaround. Do you have any suggestions?
I will ask an engineer this morning to rerun a few times to see if it ever converts.
@AastaLLL We’ve run it around 20 times, and so far it’s failed every time with the same error. Could you perhaps expand on what you mean by “time-related”? I’m going to set it up so that it will run continuously overnight, so we should get several thousand runs, and I’ll let you know if any succeed.
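(In case it is useful to anyone following along, a rough sketch of the overnight loop we are using; it assumes the engine is built with trtexec, and the command, flags, and run count are placeholders for whatever conversion command is actually failing.)

    import subprocess

    # Repeatedly rebuild the engine and count successes/failures overnight.
    # Replace CMD with the actual conversion command being used.
    CMD = ["trtexec", "--onnx=model.onnx", "--saveEngine=model.engine"]

    successes = failures = 0
    for i in range(1000):
        result = subprocess.run(CMD, capture_output=True, text=True)
        if result.returncode == 0:
            successes += 1
            print(f"run {i}: succeeded")
            break  # keep the first working engine
        failures += 1
        print(f"run {i}: failed")

    print(f"{successes} succeeded, {failures} failed")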
Hi @AastaLLL, are you any closer to finding a workaround? Or perhaps do you have some suggestions for things we might try which might affect the outcome so that we can have a look ourselves?
We need more time for this issue, as we are currently short on resources.
To find a workaround (WAR) for this issue, it’s recommended to try our ONNX GraphSurgeon tool.
As your model fails around the ‘/ScatterND_6’ layer, we recommend marking the input and output tensors of ‘/ScatterND_6’ as model outputs.
If the conversion passes without the layer but fails once the layer is added back, you can try to replace or rework that layer to find a way to WAR this issue.
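Something along these lines with ONNX GraphSurgeon should work as a starting point (a sketch only; the node name ‘/ScatterND_6’ and file names are taken from this thread, and the exact edits needed for your model are an assumption):

    import onnx
    import onnx_graphsurgeon as gs

    # Run shape inference so intermediate tensors get dtypes/shapes, which is
    # needed when they are promoted to graph outputs.
    model = onnx.shape_inference.infer_shapes(onnx.load("model.onnx"))
    graph = gs.import_onnx(model)

    # Locate the suspect node reported by TensorRT.
    node = next(n for n in graph.nodes if n.name == "/ScatterND_6")

    # Variant A: expose the node's input and output tensors as graph outputs
    # (keeps the layer in the graph, lets you inspect what feeds it).
    graph.outputs = [t for t in node.inputs if isinstance(t, gs.Variable)] + list(node.outputs)

    # Variant B: to test the conversion *without* the layer, make only its
    # inputs the graph outputs; cleanup() will then prune the ScatterND node.
    # graph.outputs = [t for t in node.inputs if isinstance(t, gs.Variable)]

    graph.cleanup()  # drop anything that no longer contributes to the outputs
    onnx.save(gs.export_onnx(graph), "model_debug.onnx")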
Here is a minimal reproducible example of the operation that causes the error:
import torch


class DummyOp(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        batch = features.size(0)
        n = features.size(1)
        diag = torch.eye(n, device=features.device)
        diag = diag.repeat(batch, 1, 1)
        non_diag_mask = diag == 0
        matrix = torch.zeros_like(diag)
        # Boolean-mask assignment; this is what ends up as the failing ScatterND in the exported graph
        matrix[non_diag_mask] = features.flatten()  # error
        return matrix


def main():
    n = 10
    model = DummyOp()
    data = torch.rand([1, n, n - 1]).to(torch.float32)
    torch.onnx.export(
        model,
        data,
        "model.onnx",
        export_params=True,
        opset_version=11,
        do_constant_folding=True,
        input_names=["input"],
        output_names=["outputs"],
        verbose=True,
    )


if __name__ == "__main__":
    main()
This warning is printed before the error, so it may be a useful hint:
[W] [TRT] Skipping tactic 0x0000000000000000 due to exception Assertion sliceOutDims[i] <= inputDims.d[i] failed.
I managed to get around this by flattening the tensors and indexing with a flat index. However, I think it would be good to know the root cause of this.
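For completeness, a sketch of that workaround applied to the DummyOp example above (the real model uses its own indexing scheme, so treat this as illustrative only; whether it sidesteps the TensorRT issue depends on how the exporter lowers the integer-index assignment):

    import torch


    class DummyOpFlatIndex(torch.nn.Module):
        # Same result as DummyOp, but the off-diagonal entries are written via
        # integer indices into a flattened (batch, n*n) tensor instead of a
        # boolean mask.
        def forward(self, features: torch.Tensor) -> torch.Tensor:
            batch, n = features.size(0), features.size(1)
            idx = torch.arange(n * n, device=features.device)
            # Row-major flat positions of the main diagonal are multiples of n + 1.
            non_diag_idx = idx[idx % (n + 1) != 0]
            flat = torch.zeros(batch, n * n, dtype=features.dtype, device=features.device)
            flat[:, non_diag_idx] = features.reshape(batch, -1)
            return flat.reshape(batch, n, n)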