[TensorRT] ERROR: …/rtSafe/cuda/reformat.cu (925) - Cuda Error in NCHWToNCHHW2: 400 (invalid resource handle)

bhargav.ravat · October 22, 2020, 7:36am

I have converted a YOLOv3 model to ONNX, and from ONNX to a TRT model.

Now I am trying to run inference on images received over socketio. Here is my code:

import os
import time
import argparse
import numpy as np
import cv2
import pycuda.autoinit # This is needed for initializing CUDA driver
import socketio
import base64
from utils.yolo_classes import get_cls_dict
from utils.camera import add_camera_args, Camera
from utils.display import open_window, set_display, show_fps
from utils.visualization import BBoxVisualization
from utils.yolo_with_plugins import TrtYOLO

global conf_th

conf_th = 0.3
'''
Loading Model
'''

global trt_yolo

trt_yolo = TrtYOLO("yolov3-custom-416", (416, 416), 3)

print("trt_yolo ==>", trt_yolo)

WINDOW_NAME = 'TrtYOLODemo'

inputShape = (300, 300)

'''
Shinobi Plugin Variables
'''
shinobiPLuginName = "NoMask"
shinobiPluginKey = "NoMask123123"
shinobiHost = 'http://192.168.0.109:9090'

'''
Socket IO Connection with Reconnection
'''
sio = socketio.Client(reconnection=True, reconnection_delay=1, ssl_verify=False)
sio.connect(shinobiHost, transports='websocket')
sio.emit('ocv',
         {'f': 'init', 'plug': shinobiPLuginName, 'type': 'detector', 'connectionType': 'websocket', 'pluginKey': shinobiPluginKey})

# Socket IO Connection Event, Built-in Reconnection Logic
@sio.event
def connect():
    print('connection established :')
    sio.emit('ocv',
             {'f': 'init', 'plug': shinobiPLuginName, 'type': 'detector', 'connectionType': 'websocket', 'pluginKey': shinobiPluginKey})

# Socket IO Reconnection Event
@sio.event
def reconnect():
    print("Reconnection established :")
    sio.emit('ocv',
             {'f': 'init', 'plug': shinobiPLuginName, 'type': 'detector', 'connectionType': 'websocket', 'pluginKey': shinobiPluginKey})

# Socket IO Disconnect Event
@sio.event
def disconnect():
    print('disconnected from server')

def yolo_detection(img_np, trt_yolo, recvdImg, height, width, shinobiId, shonibiKe):
    frame = img_np
    trt_yolo = trt_yolo
    print("trt_yolo_YOLODETECTION", trt_yolo)
    (h, w) = frame.shape[:2]
    #shinobiIdSend = sId
    #shonibiKeSend = ske
    recvdImg = recvdImg
    boxes, confs, clss = trt_yolo.detect(frame, conf_th)
    print("boxes ", boxes)
    print("confs", confs)
    print("clss", clss)

# f event: frames are received in this function
@sio.event
def f(data):
    # print("on_f")
    # print("Data", data)
    # print("type(data)", type(data))
    # print("data[ke]", data.get("ke"))
    # print("data[f]", data.get("f"))
    # print("data[id]", data.get("id"))
    shinobiId = data.get("id")
    shonibiKe = data.get("ke")
    # print("data[frame]", data.get("frame"))
    recvdImg = data.get("frame")
    # print("Type of Image", type(recvdImg))
    # print("Length of Image", len(recvdImg))
    nparr = np.fromstring(recvdImg, np.uint8)
    print("trt_yolo ON F!! ==>", trt_yolo)
    img_np = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    # img_np = cv2.resize(img_np, inputShape, interpolation=cv2.INTER_AREA)
    # print("Image Received !!!")
    # cv2.imwrite('recvdImg.jpg', img_np)
    yolo_detection(img_np, trt_yolo, recvdImg, img_np.shape[0], img_np.shape[1], shinobiId, shonibiKe)

# It will wait for socket io events!!

sio.wait()

AastaLLL · October 22, 2020, 8:19am

Hi,

Could you share a complete error log with us first?
Thanks.

bhargav.ravat · October 22, 2020, 8:34am

Hi @AastaLLL :

Here is the Error Log !

[TensorRT] WARNING: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.

[TensorRT] ERROR: …/rtSafe/cuda/reformat.cu (925) - Cuda Error in NCHWToNCHHW2: 400 (invalid resource handle)

[TensorRT] ERROR: FAILED_EXECUTION: std::exception

AastaLLL · October 23, 2020, 3:47am

Hi,

Based on the log, did you generate the TensorRT plan file on the same platform and with the same software version?
Please note that a TensorRT engine is not portable. You will need to generate the file in the same environment.

Thanks.

bhargav.ravat · October 23, 2020, 4:04am

Hi @AastaLLL,

Yes, I am using the same platform and the same environment (Jetson Nano).
After converting the model I checked it with local images, and it works fine.

The issue I am facing is with the socketio connection in combination with the model.

As you can see in the code I shared, I load the model once at startup, and every time I receive a frame from the websocket I try to run inference on it. That is the point at which I get this error.

If I run on local images and/or RTSP feeds, the model works fine.

Is there any mistake in how I load the model in the above code?

I have attached my Python code file: trt_mask_plugin.txt (3.1 KB)

bhargav.ravat · October 23, 2020, 4:11am

Here is the complete details of board:

NVIDIA Jetson Nano (Developer Kit Version)
L4T 32.4.4 [ JetPack UNKNOWN ]
Ubuntu 18.04.5 LTS
Kernel Version: 4.9.140-tegra
CUDA 10.2.89
CUDA Architecture: 5.3
OpenCV version: 4.1.1
OpenCV Cuda: YES
CUDNN: 8.0.0.180
TensorRT: 7.1.3.0
Vision Works: 1.6.0.501
VPI: 0.4.4

bhargav.ravat · October 27, 2020, 3:01am

Hi @AastaLLL,

If I am not wrong, I am getting this error because I am not handling async events. Can you please help me with how to handle asynchronously received images and run inference on them?
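One possible way to handle frames that arrive asynchronously is sketched below. It assumes the socketio event handler runs on a different thread than the one that loaded the engine, so the handler only enqueues frames and a single loop on the engine's thread consumes them. The names `on_frame`, `inference_loop`, and `fake_socket_thread` are hypothetical, and `len` stands in for the real detection call so the sketch runs without a GPU.

```python
import queue
import threading

frames = queue.Queue(maxsize=4)  # small buffer so stale frames do not pile up
results = []

def on_frame(data):
    """Stand-in for the `f` event handler: enqueue the frame, never run inference here."""
    try:
        frames.put_nowait(data)
    except queue.Full:
        pass  # drop the frame instead of blocking the socket thread

def inference_loop(detect):
    """Run on the thread that loaded the engine, so the CUDA context matches."""
    while True:
        data = frames.get()
        if data is None:  # sentinel: stop the loop
            break
        results.append(detect(data))

def fake_socket_thread():
    """Simulates the socketio thread delivering two frames, then a stop sentinel."""
    on_frame(b"abc")
    on_frame(b"de")
    frames.put(None)

producer = threading.Thread(target=fake_socket_thread)
producer.start()
inference_loop(len)  # `len` stands in for a call such as trt_yolo.detect(frame, conf_th)
producer.join()
```

In the script above, this would roughly mean replacing the blocking `sio.wait()` on the main thread with the inference loop, while the `f` handler only pushes the decoded image into the queue.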

AastaLLL · October 27, 2020, 5:23am

Hi,

I am not sure I understand your problem correctly.
It seems that you are trying to run inference inside a callback triggered by network events.

If so, a common cause is that the CUDA context gets refreshed or mixed up with other applications.
Please store the CUDA context before leaving the yolo_detection function and restore it when you come back.
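A minimal sketch of that store/restore idea, assuming the context object exposes `push()` and `pop()` the way a pycuda context does. `FakeContext`, `active_cuda_context`, and `detect_in_callback` are hypothetical names; the fake context only records calls so the example runs without a GPU.

```python
from contextlib import contextmanager

@contextmanager
def active_cuda_context(ctx):
    """Make `ctx` current for the calling thread, then release it again."""
    ctx.push()
    try:
        yield ctx
    finally:
        ctx.pop()  # always pop, even if inference raises

class FakeContext:
    """Stand-in for a CUDA context object (assumption: it has push() and pop())."""
    def __init__(self):
        self.calls = []

    def push(self):
        self.calls.append("push")

    def pop(self):
        self.calls.append("pop")

def detect_in_callback(ctx, detect, frame):
    # Run `detect` with the context active, whichever thread calls this function
    with active_cuda_context(ctx):
        return detect(frame)

ctx = FakeContext()
result = detect_in_callback(ctx, lambda frame: len(frame), [1, 2, 3])
```

With a real engine, `ctx` would be the context that was current when the engine was loaded, and `detect` would be the call that runs inference.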

A similar example can be found in this topic:

Thanks.

studentparth18 · January 6, 2021, 4:31am

Hey @AastaLLL, I have a similar problem. Could you please take a look and let me know whether the CUDA context is what is causing the issue?

So, the original workflow was as per this repository:

Create TensorRT backends for YOLOv4 and a feature extraction model

Use asynchronous processing for object detection on the Jetson Nano
Use async processing for getting feature embeddings for each detection on the Jetson Nano
Carrying out object tracking

I tried to modify the code to create a new workflow incorporating Google Coral. My aim was to run the detection on the Coral using TFLite and the feature extraction on the Nano using TensorRT:

Allocate buffers for the TFLite interpreter
Create a TensorRT backend for the feature extraction model
Use synchronous processing to infer detections using TFLite
Use async processing for getting feature embeddings for each detection on the Jetson Nano
Carrying out object tracking

The original workflow was working perfectly, but in the new workflow, I am facing the CUDA error mentioned in this thread. Is this issue due to the context being refreshed?

studentparth18 · January 6, 2021, 1:26pm

I did try using the CUDA context push and pop functions to make it work. The issue I am facing is that the Jetson Nano becomes unresponsive whenever I try to push the CUDA context. I have no idea why this occurs. Do you have any tips for this?
