"Cuda Error in NCHWTONCHHW2: 33 (invalid resource handle)", how to solve it?

gzchenjiajun · March 5, 2020, 7:02am

What’s the problem?

AastaLLL · March 6, 2020, 2:04am

Hi,

cudaErrorInvalidResourceHandle = 400 (CUDA 10.0 and earlier report the same error as code 33)
This indicates that a resource handle passed to the API call was not valid. Resource handles are opaque types like cudaStream_t and cudaEvent_t.

The error suggests that an invalid CUDA stream or CUDA event is being used.
Could you describe how you ran into this error?

Thanks.

gzchenjiajun · March 6, 2020, 8:23am

I need to run inference with the TensorRT engine many times. At present, my code loads the engine only once, at startup, and keeps it in a variable. The error is then reported on the second inference. What should I do?

gzchenjiajun · March 6, 2020, 8:24am

The code looks something like this:
if index == 0:
    print('first allocate_buffers')
    inputs, outputs, bindings, stream = common.allocate_buffers(engine)

The first inference runs normally, but the second one reports the error. What should I do?

gzchenjiajun · March 6, 2020, 8:28am

Incidentally, I also load another model (a Keras tracking model). Could the error be caused by insufficient memory, and if so, how can I solve it?

AastaLLL · March 9, 2020, 2:01am

Hi,

You can reuse these buffers instead of allocating new ones for each inference:

inputs, outputs, bindings, stream = common.allocate_buffers(engine)
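
A minimal sketch of that pattern, under stated assumptions: `allocate_buffers` and `do_inference` below are hypothetical stand-ins for `common.allocate_buffers(engine)` and `common.do_inference(...)`, so the example runs without TensorRT. It only illustrates allocating once, before the loop, and reusing the same buffers and stream for every frame.

```python
# Stand-ins for the TensorRT sample helpers; hypothetical, for illustration only.
allocation_count = 0


def allocate_buffers():
    """Pretend to allocate host/device buffers and a CUDA stream."""
    global allocation_count
    allocation_count += 1
    return {"inputs": [None], "outputs": [None], "bindings": [], "stream": object()}


def do_inference(buffers, frame):
    """Pretend to run inference: store the frame as input and return a dummy result."""
    buffers["inputs"][0] = frame
    return len(frame)


def run(frames):
    # Allocate once, before the loop, rather than guarding with `if index == 0`.
    buffers = allocate_buffers()
    return [do_inference(buffers, frame) for frame in frames]


results = run([[1, 2, 3], [4, 5], [6]])
```

Allocating before the loop also removes the dependence on an `index` counter that is updated elsewhere in the loop body.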

Could you give it a try?

Thanks.

gzchenjiajun · March 10, 2020, 2:27am

That is what I already do: allocate_buffers is called only on the first iteration of the loop, and I still get the error above.

AastaLLL · March 10, 2020, 5:26am

Hi,

We want to reproduce this issue in our environment to check it further.
Would you mind sharing a simple script that reproduces it?

Thanks.

gzchenjiajun · March 10, 2020, 6:49am

I load a tracking model (.pb file) and a detection model (TensorRT). When inference with the detection model runs for the second time, a CUDA error is reported. The code is as follows:

model_filename = 'model_data/mars-small128.pb'

        metric = nn_matching.NearestNeighborDistanceMetric("cosine", max_cosine_distance, nn_budget)
        tracker = Tracker(metric)

        conn = redis.Redis(host='127.0.0.1', port=6379, decode_responses=True)

        with self.get_engine(onnx_file_path, batch_size, fp16_on,
                             engine_file_path) as engine, engine.create_execution_context() as context:
            while True:
                    # if conn.get('time_interval') is not None:
                    #     continue

                    t1 = time.time()
                    print('start_time：' + str(time.time()))
                    # frame[:, :, [0, 1, 2]] = frame[:, :, [2, 1, 0]]
                    global current_frame
                    frame = copy.deepcopy(current_frame)
                    if frame is not None:
                        image = Image.fromarray(frame)
                        # print(time.time())
                        if index == 0:
                            b = BytesIO()
                            inputs, outputs, bindings, stream = common.allocate_buffers(engine)
                        # image.save(b, format="jpeg")

                        # print(time.time())

                        images = []
                        images_raw = []

                        image_raw, image = preprocessor.process(image)
                        # print(182)
                        # print(time.time())

                        images_raw.append(image_raw)
                        images.append(image)

                        index += 1
                        # if index != nums and len(images_raw) != batch_size:
                        #     continue
                        images_batch = np.concatenate(images, axis=0)
                        # print(time.time())

                        inputs[0].host = images_batch
                        # print(input_size)
                        # print(inputs)
                        # print(outputs)
                        print('common.do_inference start')
                        trt_outputs = common.do_inference(context, bindings=bindings, inputs=inputs, outputs=outputs,
                                                          stream=stream,
                                                          batch_size=batch_size)

I ported and wrote this based on the following repository:
Qidian213/deep_sort_yolov3: Real-time Multi-person tracker using YOLO v3 and deep_sort with tensorflow
https://github.com/Qidian213/deep_sort_yolov3

This question has been bothering me for a long time. My current ideas are:

Solve the memory contention between the .pb model and the TensorRT engine?
Convert the tracking model’s .pb file to TensorRT?

gzchenjiajun · March 10, 2020, 6:54am

I need your help. Thank you

gzchenjiajun · March 10, 2020, 7:00am

I want to make it clear that the code works when the tracking model (.pb file) is not loaded.

How to solve this problem?

gzchenjiajun · March 11, 2020, 10:25am

Hello, how can I solve it?

AastaLLL · March 24, 2020, 8:48am

Hi,

Sorry for keeping you waiting.
Would you mind sharing the complete source so we can reproduce this more easily?

Thanks.

AastaLLL · March 24, 2020, 9:01am

Hi,

Just for your reference:
Another topic reports the same “NCHWTONCHHW2: 33” error.

The root cause there was that the CUDA context was closed by another framework when it terminated (TensorFlow in that case).
Is it possible that the CUDA context in your app is also being closed by another framework?
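
One workaround often suggested for this kind of conflict, and not confirmed in this thread, is to give TensorRT its own CUDA context and make it current only around each inference call. The sketch below assumes a context object with `push()` and `pop()` methods, as returned by pycuda's `pycuda.driver.Device(0).make_context()`. `FakeContext` is a hypothetical stand-in so the example runs without a GPU, and the `"infer"` marker is where `common.do_inference(...)` would go.

```python
from contextlib import contextmanager


@contextmanager
def active_context(ctx):
    """Make `ctx` the current CUDA context for the block, then deactivate it."""
    ctx.push()
    try:
        yield ctx
    finally:
        ctx.pop()


class FakeContext:
    """Hypothetical stand-in that records push/pop calls (no GPU needed)."""

    def __init__(self):
        self.calls = []

    def push(self):
        self.calls.append("push")

    def pop(self):
        self.calls.append("pop")


trt_ctx = FakeContext()
for _ in range(2):
    with active_context(trt_ctx):
        # With a real context, common.do_inference(...) would run here.
        trt_ctx.calls.append("infer")
```

With pycuda, `make_context()` also makes the new context current, so it would typically be popped once after setup. The engine, buffers, and stream would need to be created while that same context is active.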

Thanks.

gzchenjiajun · March 26, 2020, 6:41am

Can you give some specific, practical advice? How can I use both the TensorRT and .pb models?
Is it a matter of controlling GPU memory, or do I need to convert the .pb model to TensorRT? Thanks for your advice.

gzchenjiajun · March 26, 2020, 10:06am

I have read the topic you shared, but that problem is a little different from mine: it only runs a simple sess.run call, whereas I want to run the whole .pb tracking model. I still haven’t solved this problem.

gzchenjiajun · March 26, 2020, 11:09am

This problem still has not been solved. How should I deal with it?
See also my other topic:
"Cuda Error in NCHWTONCHHW2: 33 (invalid resource handle) ", How to solve it? - Jetson & Embedded Systems/Jetson Nano - NVIDIA Developer associations

gzchenjiajun · March 26, 2020, 11:40am

My idea is to run the other model on the CPU, but I can’t find a way to run TensorFlow in CPU mode on the Jetson Nano. Can you recommend a tutorial?

AastaLLL · March 27, 2020, 8:40am

Hi,

You can set this environment variable to force TensorFlow to run in CPU mode:

CUDA_VISIBLE_DEVICES=1
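
A hedged sketch of how this might be applied: the variable is read when CUDA initializes, and hiding the GPU hides it from every library in that process, TensorRT included. The example therefore sets it only for a separate child process that would host the CPU-only model. The value `-1` is an assumption here: it is a commonly used way to hide all devices, whereas `1` has that effect only on a board with a single GPU at index 0. `cpu_only_env` is a hypothetical helper name, and the child command just echoes the variable instead of importing TensorFlow.

```python
import os
import subprocess
import sys


def cpu_only_env(base_env=None):
    """Return a copy of the environment with all CUDA devices hidden."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = "-1"  # assumption: -1 hides every GPU
    return env


# The child process stands in for the script that would load the .pb model.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
    env=cpu_only_env(),
    capture_output=True,
    text=True,
    check=True,
)
child_value = child.stdout.strip()
```

The parent process keeps its own environment unchanged, so the TensorRT engine there can still use the GPU.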

Here is a good tutorial for your reference:

Thanks.

gzchenjiajun · March 27, 2020, 9:52am

I mean that I have not installed a CPU build of TensorFlow for the Jetson Nano. I need to install one first, but I searched and could not find one. How can I install it?
