cudaErrorInvalidResourceHandle = 400
This indicates that a resource handle passed to the API call was not valid. Resource handles are opaque types like cudaStream_t and cudaEvent_t.
This error suggests that you may be using an invalid CUDA stream or CUDA event.
May I know how you ran into this error?
I need to run the TensorRT engine many times. At the moment my code loads the TensorRT engine only once, on first start, and keeps it in a variable. The error is then reported on the second inference. What should I do?
The code looks something like this:
if index == 0:
    print('first allocate_buffers')
    inputs, outputs, bindings, stream = common.allocate_buffers(engine)
The first inference runs normally, but the second one reports an error. What should I do?
I load a tracking model (a .pb file) and a detection model (TRT). When the detection model runs inference for the second time, a "cuda error" is reported. The code is as follows:
model_filename = 'model_data/mars-small128.pb'
metric = nn_matching.NearestNeighborDistanceMetric("cosine", max_cosine_distance, nn_budget)
tracker = Tracker(metric)
conn = redis.Redis(host='127.0.0.1', port=6379, decode_responses=True)
with self.get_engine(onnx_file_path, batch_size, fp16_on,
                     engine_file_path) as engine, engine.create_execution_context() as context:
    while True:
        # if conn.get('time_interval') is not None:
        #     continue
        t1 = time.time()
        print('start_time:' + str(time.time()))
        # frame[:, :, [0, 1, 2]] = frame[:, :, [2, 1, 0]]
        global current_frame
        frame = copy.deepcopy(current_frame)
        if frame is not None:
            image = Image.fromarray(frame)
            # print(time.time())
            if index == 0:
                b = BytesIO()
                inputs, outputs, bindings, stream = common.allocate_buffers(engine)
            # image.save(b, format="jpeg")
            # print(time.time())
            images = []
            images_raw = []
            image_raw, image = preprocessor.process(image)
            # print(182)
            # print(time.time())
            images_raw.append(image_raw)
            images.append(image)
            index += 1
            # if index != nums and len(images_raw) != batch_size:
            #     continue
            images_batch = np.concatenate(images, axis=0)
            # print(time.time())
            inputs[0].host = images_batch
            # print(input_size)
            # print(inputs)
            # print(outputs)
            print('common.do_inference start')
            trt_outputs = common.do_inference(context, bindings=bindings, inputs=inputs, outputs=outputs,
                                              stream=stream,
                                              batch_size=batch_size)
I ported my code from this repository:
Qidian213/deep_sort_yolov3: Real-time Multi-person tracker using YOLO v3 and deep_sort with tensorflow https://github.com/Qidian213/deep_sort_yolov3
This problem has been bothering me for a long time. My general thinking is as follows:
Do I need to solve a GPU memory contention problem between the pb model and the TRT model?
Just for your reference:
We have another topic that hits the same “NCHWTONCHHW2: 33” error.
The root cause there was that the CUDA context was closed by another framework when it terminated (TensorFlow in their case).
Is it possible that the CUDA context in your app is also being closed by another framework?
Can you give some specific, practical advice? How can I use both the TRT and pb models together?
Is it a matter of controlling GPU memory, or do I need to convert the pb model to TRT? Thanks for your advice.
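One way to act on the context explanation above is to make sure the CUDA context that owns the TensorRT buffers and stream is current every time inference runs, regardless of what another framework did in between. The following is only a minimal sketch of that pattern, not a confirmed fix for this thread: `active_cuda_context` is a hypothetical helper, and `FakeContext` is a stand-in so the call order can be shown without a GPU. It assumes a context object with `push()` and `pop()` methods, as PyCUDA's `pycuda.driver.Context` provides.

```python
from contextlib import contextmanager


@contextmanager
def active_cuda_context(ctx):
    """Make `ctx` the current CUDA context for the duration of the block.

    Hypothetical helper: `ctx` is assumed to expose push() and pop(),
    as a pycuda.driver.Context does.
    """
    ctx.push()  # make our context current on this thread
    try:
        yield ctx
    finally:
        ctx.pop()  # remove it again, restoring the previous context


class FakeContext:
    """Stand-in used only to demonstrate the call order without a GPU."""

    def __init__(self):
        self.calls = []

    def push(self):
        self.calls.append("push")

    def pop(self):
        self.calls.append("pop")


demo_ctx = FakeContext()
with active_cuda_context(demo_ctx):
    # In the real loop, common.do_inference(...) would be called here.
    demo_ctx.calls.append("infer")
```

With PyCUDA, the context could be created once via `pycuda.driver.init()` followed by `pycuda.driver.Device(0).make_context()` instead of relying on `import pycuda.autoinit`. Note that `make_context()` also makes the new context current, so it is usually popped once after setup. The buffers and stream returned by `common.allocate_buffers(engine)` would need to be created while that same context is current.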
I have read the issue you shared, but that problem is a little different from mine: there, only a simple sess.run call is executed, whereas I want to run the whole pb tracking model. I still haven’t solved this problem.
This problem still has not been solved. How should I deal with it?
Check out my other topic
"Cuda Error in NCHWTONCHHW2: 33 (invalid resource handle) ", How to solve it? - Jetson & Embedded Systems/Jetson Nano - NVIDIA Developer associations
My idea is to run the other model on the CPU, but I can’t find a way to run TensorFlow on the CPU on Jetson Nano. Can you recommend a tutorial?
I mean, I have not installed a CPU build of TensorFlow for Jetson Nano, and I would need to install one first. I searched but could not find such a package. How can I install it?
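A separate CPU build may not be strictly necessary. One commonly used approach is to run the TensorFlow model in its own process with the GPU hidden from it, so that only the TensorRT process touches CUDA. The sketch below is unverified on Jetson Nano and rests on the assumption that the installed GPU build of TensorFlow falls back to the CPU when no CUDA device is visible. `run_cpu_only` is a hypothetical helper, and the one-line child command stands in for a real tracker script.

```python
import os
import subprocess
import sys


def run_cpu_only(python_args):
    """Run a Python child process that cannot see any CUDA device.

    Hypothetical helper: only the child's environment is changed, so the
    parent process (e.g. the TensorRT detector) keeps its GPU access.
    """
    env = dict(os.environ, CUDA_VISIBLE_DEVICES="-1")
    return subprocess.run(
        [sys.executable] + list(python_args),
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )


# Demonstration: the child reports the value it sees. A real setup would
# pass the path of a tracker script (hypothetical) instead of "-c ...".
result = run_cpu_only(
    ["-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
)
child_value = result.stdout.strip()
```

Setting the variable inside the main process instead would also hide the GPU from TensorRT, which is why the sketch changes only the child's environment. The two processes would then need some channel to exchange detections, for example the Redis connection the code above already opens.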