TensorRT Python Client Runtime Error

leonard.yeo · September 3, 2019, 6:00am

I am having issues running the TensorRT Python client on one of our systems. More specifically, the TensorRT Python runtime API works in a single process but fails when it is used across multiple processes.

Here are the workarounds and tests that I have tried:
1. Creating the TensorRT runtime outside of multiprocessing first, then passing the runtime object into the child process, where it deserialises a CUDA engine. This gives me a runtime error: [TensorRT] ERROR: cudaDeviceProfile.cpp (52) - Cuda Error in generateForCurrent: 3 (initialization error)
2. Running in the Python console: works fine.
3. Running the TensorRT Python sample code: works fine.
4. Creating the TensorRT engine and context outside of the multiprocessing process and injecting the objects into the process. During inference, this error occurred:
[TensorRT] ERROR: engine.cpp (370) - Cuda Error in ~ExecutionContext: 3 (initialization error)
terminate called after throwing an instance of 'nvinfer1::CudaError'
what(): std::exception

Please assist on this.

mchi · September 5, 2019, 2:01pm

Please refer to the NVIDIA Deep Learning TensorRT Documentation:

The TensorRT runtime can be used by multiple threads simultaneously, so long as each object uses a different execution context.

So, do your runtimes run on different contexts?

luisyin · September 5, 2019, 3:23pm

Please check my post: Can tensorrt do inference in python thread or subprocess? - TensorRT - NVIDIA Developer Forums

It could be useful to you.

leonard.yeo · September 6, 2019, 8:56am

My multiprocessing implementation uses the 'fork' start method instead of 'spawn'. May I know if the TensorRT Python client is compatible with the 'fork' method of multiprocessing? Please assist. Thank you.

mchi · September 6, 2019, 1:11pm

Sorry! I misread it as multithreading! A CUDA context can't be shared between processes; in other words, each process owns its own CUDA context. So TensorRT, which is built on CUDA, can't share any objects between processes.
In your case, when a new process accesses a CUDA resource that was created in another CUDA context, it is expected to report: Cuda Error in generateForCurrent: 3 (initialization error).
BTW, I would highlight that a CUDA kernel from one CUDA context cannot execute concurrently with a CUDA kernel from another CUDA context, so multiple CPU processes with multiple CUDA contexts may not be able to utilize the GPU 100%.
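
The ownership rule above (a CUDA-backed object belongs to the process that created it) can be enforced in plain Python. Below is a minimal sketch, not a TensorRT API: `PerProcessResource` and `factory` are hypothetical names, and the factory stands in for whatever builds the runtime, engine, and context. The wrapper remembers which PID built the resource and rebuilds it when called from a different process, for example a forked child.

```python
import os


class PerProcessResource:
    """Lazily builds a resource once per process (hypothetical helper).

    If the object is inherited by another process, for example through
    fork, the next call to get() rebuilds the resource there instead of
    reusing the one created by the parent.
    """

    def __init__(self, factory):
        self._factory = factory    # zero-argument callable that builds the resource
        self._owner_pid = None     # PID of the process that built the resource
        self._resource = None

    def get(self):
        pid = os.getpid()
        if self._owner_pid != pid:
            # First use in this process: build a fresh resource here.
            self._resource = self._factory()
            self._owner_pid = pid
        return self._resource
```

With TensorRT, the factory would be the function that deserialises the engine, and it should only ever run inside the process that performs inference.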

mchi · September 6, 2019, 1:30pm

If you still want to run with multiple processes, you need to create separate TensorRT resources in each process.
Also, refer to Frequently Asked Questions about PyCUDA - Andreas Klöckner's Former Wiki and luisyin's comment for how to create the CUDA context in each process.
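
One way to follow this advice is sketched below, under several assumptions: `tensorrt` and `pycuda` are installed, a serialized engine file exists at a path you supply, and the names `infer_worker` and `build_workers` are hypothetical. The 'spawn' start method is chosen so that children start from a fresh interpreter and do not inherit CUDA state from the parent. The actual inference call is omitted and replaced by a placeholder.

```python
import multiprocessing as mp


def infer_worker(engine_path, jobs, results):
    """Runs in the child: every CUDA/TensorRT object is created here."""
    # Imported inside the worker so CUDA is initialised only in this process.
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)
    with open(engine_path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    # Process jobs until the parent sends None as a stop signal.
    for job in iter(jobs.get, None):
        # Placeholder: real code would run inference with `context` here.
        results.put(job)


def build_workers(engine_path, n_workers, target=infer_worker):
    """Creates (but does not start) spawn-based worker processes."""
    ctx = mp.get_context("spawn")
    jobs, results = ctx.Queue(), ctx.Queue()
    procs = [
        ctx.Process(target=target, args=(engine_path, jobs, results))
        for _ in range(n_workers)
    ]
    return procs, jobs, results
```

To use it, call `build_workers(...)` under an `if __name__ == "__main__":` guard, start each process with `p.start()`, put jobs on the `jobs` queue, and send one `None` per worker to stop them. Only plain data such as file paths and arrays should cross the process boundary, never runtime, engine, or context objects.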

leonard.yeo · September 19, 2019, 8:32am

Noted. I have tried creating the TensorRT runtime inside the multiprocessing process. However, it still did not resolve the issue. Would this be resolved in future releases of TensorRT?
