Error Code 1: Cudnn (CUDNN_STATUS_EXECUTION_FAILED)

Description

I want to try the TensorRT C++ implementation of ByteTrack on Windows. However, it only supports a Linux workflow, so I converted its model to ONNX and then converted the ONNX file to a TensorRT (TRT) engine using the trtexec command.

trtexec.exe --onnx=bytetrack.onnx --saveEngine=bytetrack.engine --workspace=16384 --buildOnly --explicitBatch --verbose

But when I try to run inference in the project, which is built in VS2019 from the CMakeLists.txt as the tutorial describes, I encounter the error shown in the figure below. How can I fix it?

[Screenshot: Error Code 1: Cudnn (CUDNN_STATUS_EXECUTION_FAILED)]

Environment

TensorRT Version: 8.2.4.2
GPU Type: RTX
Nvidia Driver Version: 511.65
CUDA Version: 11.6
CUDNN Version: 8.4.0.27
Operating System + Version: Windows 10
Visual Studio: 2019

Hi,
Request you to share the ONNX model and the script, if not shared already, so that we can assist you better.
Alongside, you can try a few things:

  1. Validating your model with the below snippet (a short execution test follows this list).

check_model.py

import sys
import onnx

# Path to the ONNX model, passed on the command line.
filename = sys.argv[1]
model = onnx.load(filename)
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command.
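
Additionally, once the checker passes, a short onnxruntime run can confirm that the graph actually executes end-to-end, not just that it is structurally valid. The below is only a sketch: the model path is a placeholder and a static input shape is assumed.

import numpy as np
import onnxruntime as ort

# Placeholder path; point it at your exported ONNX file.
sess = ort.InferenceSession("bytetrack_s.onnx", providers=["CPUExecutionProvider"])

inp = sess.get_inputs()[0]
# Random tensor matching the static input shape stored in the model.
x = np.random.randn(*inp.shape).astype(np.float32)

outs = sess.run(None, {inp.name: x})
print(inp.name, list(x.shape), "->", [o.shape for o in outs])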

In case you are still facing the issue, we request you to share the trtexec "--verbose" log for further debugging.
Thanks!

Thanks for answering!

My results are as below.

1. ONNX model

It is too large to upload, so I will only show the ByteTrack ONNXRuntime Python export script.

from loguru import logger

import torch
from torch import nn

from yolox.exp import get_exp
from yolox.models.network_blocks import SiLU
from yolox.utils import replace_module

import argparse
import os


def make_parser():
    parser = argparse.ArgumentParser("YOLOX onnx deploy")
    parser.add_argument(
        "--output-name", type=str, default="bytetrack_s.onnx", help="output name of models"
    )
    parser.add_argument(
        "--input", default="images", type=str, help="input node name of onnx model"
    )
    parser.add_argument(
        "--output", default="output", type=str, help="output node name of onnx model"
    )
    parser.add_argument(
        "-o", "--opset", default=11, type=int, help="onnx opset version"
    )
    parser.add_argument("--no-onnxsim", action="store_true", help="use onnxsim or not")
    parser.add_argument(
        "-f",
        "--exp_file",
        default=None,
        type=str,
        help="expriment description file",
    )
    parser.add_argument("-expn", "--experiment-name", type=str, default=None)
    parser.add_argument("-n", "--name", type=str, default=None, help="model name")
    parser.add_argument("-c", "--ckpt", default=None, type=str, help="ckpt path")
    parser.add_argument(
        "opts",
        help="Modify config options using the command-line",
        default=None,
        nargs=argparse.REMAINDER,
    )

    return parser


@logger.catch
def main():
    args = make_parser().parse_args()
    logger.info("args value: {}".format(args))
    exp = get_exp(args.exp_file, args.name)
    exp.merge(args.opts)

    if not args.experiment_name:
        args.experiment_name = exp.exp_name

    model = exp.get_model()
    if args.ckpt is None:
        file_name = os.path.join(exp.output_dir, args.experiment_name)
        ckpt_file = os.path.join(file_name, "best_ckpt.pth.tar")
    else:
        ckpt_file = args.ckpt

    # load the model state dict
    ckpt = torch.load(ckpt_file, map_location="cpu")

    model.eval()
    if "model" in ckpt:
        ckpt = ckpt["model"]
    model.load_state_dict(ckpt)
    model = replace_module(model, nn.SiLU, SiLU)
    model.head.decode_in_inference = False

    logger.info("loading checkpoint done.")
    dummy_input = torch.randn(1, 3, exp.test_size[0], exp.test_size[1])
    torch.onnx._export(
        model,
        dummy_input,
        args.output_name,
        input_names=[args.input],
        output_names=[args.output],
        opset_version=args.opset,
    )
    logger.info("generated onnx model named {}".format(args.output_name))

    if not args.no_onnxsim:
        import onnx

        from onnxsim import simplify

        # use onnx-simplifier to remove redundant nodes from the model.
        onnx_model = onnx.load(args.output_name)
        model_simp, check = simplify(onnx_model)
        assert check, "Simplified ONNX model could not be validated"
        onnx.save(model_simp, args.output_name)
        logger.info("generated simplified onnx model named {}".format(args.output_name))


if __name__ == "__main__":
    main()
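
For reference, the script above (saved here as export_onnx.py; the exp file and checkpoint paths are placeholders for your own files) is invoked along these lines:

python export_onnx.py --output-name bytetrack_s.onnx -f your_exp_file.py -c your_checkpoint.pth.tar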

2. check_model.py

My ONNX model passes onnx.checker.check_model(model) without errors.

3. trtexec performance

This is my trtexec run with the --verbose option:
verbose.txt (2.3 MB)

Hi,

This issue looks more related to the environment setup.
Please make sure CUDA and TensorRT are set up correctly.
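
One way to check whether the failure comes from the environment rather than your C++ code is to run the same serialized engine once from Python. The below is only a minimal sketch, not the ByteTrack code: it assumes the static-shape bytetrack.engine built by your trtexec command, and that the tensorrt Python wheel and pycuda are installed. If this also fails with CUDNN_STATUS_EXECUTION_FAILED, the problem is likely in the CUDA/cuDNN installation.

import numpy as np
import pycuda.autoinit  # creates a CUDA context for this process
import pycuda.driver as cuda
import tensorrt as trt

print("TensorRT version:", trt.__version__)

logger = trt.Logger(trt.Logger.VERBOSE)
with open("bytetrack.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
assert engine is not None, "engine failed to deserialize (version mismatch?)"

context = engine.create_execution_context()

# Allocate one host/device buffer per binding; random data for inputs.
bindings = []
for i in range(engine.num_bindings):
    shape = tuple(engine.get_binding_shape(i))
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = np.random.random(shape).astype(dtype)
    dev = cuda.mem_alloc(host.nbytes)
    if engine.binding_is_input(i):
        cuda.memcpy_htod(dev, host)
    bindings.append(int(dev))

# Synchronous execution; returns False (and logs the error) on failure.
print("execute_v2 returned:", context.execute_v2(bindings))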
A similar issue for your reference:

Thank you.