Hi everyone,
I am new to TensorRT programming.
Description
If I run the TensorRT demo standalone, it works fine. But when I use TensorRT with Celery, something goes wrong with the PyCUDA context. I know that Celery's worker pool uses prefork, not spawn; are there any methods to handle this situation?
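My (hedged) understanding of why prefork is the problem: CUDA state touched in the parent process is not usable in a fork()ed child, while a spawn()ed child initializes CUDA from scratch. A tiny standalone sketch of the difference (hypothetical code, not part of my project; note that multiprocessing.set_start_method needs Python 3.4+, and my environment is Python 2.7, where Linux multiprocessing only offers fork):

# Hypothetical standalone illustration of fork vs spawn with CUDA.
import multiprocessing as mp

def use_gpu():
    import pycuda.driver as cuda
    cuda.init()
    ctx = cuda.Device(0).make_context()  # fresh context inside the child
    print(cuda.mem_get_info())           # free/total GPU memory as a smoke test
    ctx.pop()
    ctx.detach()

if __name__ == "__main__":
    mp.set_start_method("spawn")  # with "fork" plus CUDA use in the parent,
                                  # the child typically crashes instead
    p = mp.Process(target=use_gpu)
    p.start()
    p.join()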
Key code (celery task.py):
@celery.signals.worker_process_init.connect
def worker_process_init(sender, **kwargs):  # runs once in each freshly forked worker process
    import onnx
    import onnx_tensorrt.backend as backend
    import pycuda.driver as cuda
    import pycuda.autoinit  # creates and pushes a CUDA context for this process
    model1 = onnx.load("/data/new/tensorRT/ir152_test_op11.onnx")
    engine1 = backend.prepare(model1, device='CUDA:0')  # ERROR occurs here
Trace:
Segmentation fault: 11
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x39008a) [0x7f47213b208a]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x31b3f46) [0x7f47241d5f46]
[bt] (2) /lib/x86_64-linux-gnu/libc.so.6(+0x354b0) [0x7f477523d4b0]
[bt] (3) /usr/local/lib/python2.7/dist-packages/torch/lib/libcaffe2.so(std::_Hashtable<std::string, std::pair<std::string const, std::pair<std::unordered_set<std::string const*, std::hash<std::string const*>, std::equal_to<std::string const*>, std::allocator<std::string const*> >, std::string> >, std::allocator<std::pair<std::string const, std::pair<std::unordered_set<std::string const*, std::hash<std::string const*>, std::equal_to<std::string const*>, std::allocator<std::string const*> >, std::string> > >, std::__detail::_Select1st, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::clear()+0x79) [0x7f46877112a9]
[bt] (4) /usr/local/lib/python2.7/dist-packages/onnx/onnx_cpp2py_export.so(+0x458a7) [0x7f465d4ac8a7]
[bt] (5) /usr/local/lib/python2.7/dist-packages/onnx/onnx_cpp2py_export.so(+0x1407a6) [0x7f465d5a77a6]
[bt] (6) /usr/local/lib/python2.7/dist-packages/onnx/onnx_cpp2py_export.so(+0xf8837) [0x7f465d55f837]
[bt] (7) /usr/local/lib/python2.7/dist-packages/onnx/onnx_cpp2py_export.so(+0xf39ce) [0x7f465d55a9ce]
[bt] (8) /usr/local/lib/python2.7/dist-packages/onnx/onnx_cpp2py_export.so(+0xfb6c5) [0x7f465d5626c5]
[bt] (9) /usr/local/lib/python2.7/dist-packages/onnx/onnx_cpp2py_export.so(+0x167e2d) [0x7f465d5cee2d]
PyCUDA ERROR: The context stack was not empty upon module cleanup.
A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.
[2020-09-18 12:39:15,811: ERROR/MainProcess] Process 'ForkPoolWorker-10' pid:75998 exited with 'signal 6 (SIGABRT)'
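For reference, the pattern the PyCUDA message points at would look roughly like this in my handler: manage the context explicitly instead of importing pycuda.autoinit, and pop it in a worker_process_shutdown hook so the context stack is empty at cleanup. This is an untested sketch; whether onnx_tensorrt tolerates an externally managed context like this is an assumption on my part.

# Untested sketch: explicit per-worker context management instead of pycuda.autoinit.
import celery.signals

_ctx = None
engine1 = None

@celery.signals.worker_process_init.connect
def worker_process_init(sender, **kwargs):
    global _ctx, engine1
    import onnx
    import onnx_tensorrt.backend as backend
    import pycuda.driver as cuda
    cuda.init()
    _ctx = cuda.Device(0).make_context()  # pushed; current for this worker process
    model1 = onnx.load("/data/new/tensorRT/ir152_test_op11.onnx")
    engine1 = backend.prepare(model1, device='CUDA:0')

@celery.signals.worker_process_shutdown.connect
def worker_process_shutdown(sender, **kwargs):
    global _ctx
    if _ctx is not None:
        _ctx.pop()     # leave the context stack empty, as the PyCUDA message suggests
        _ctx.detach()
        _ctx = None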
Environment
TensorRT Version: 7.0.0.11
GPU Type: Tesla P4
Nvidia Driver Version: 384.130
CUDA Version: 9.0
CUDNN Version: 7.4.3
Operating System + Version: Ubuntu 16.04
Python Version (if applicable): 2.7.12
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Steps To Reproduce
Start a Celery worker with the default prefork pool and register the worker_process_init handler shown above; the segmentation fault occurs when backend.prepare() runs inside the forked worker process.